DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 26 November 2025 has been entered.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 9-13, and 18-21 are rejected under 35 U.S.C. 103 as being unpatentable over US 20210334142 A1 to Wang et al. (hereinafter “Wang”) in view of F. Tu, W. Wu, S. Yin, L. Liu and S. Wei, "RANA: Towards Efficient Neural Acceleration with Refresh-Optimized Embedded DRAM," 2018 ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA), Los Angeles, CA, USA, 2018, pp. 340-352 (hereinafter “Tu”), in view of A. Ahmad and M. A. Pasha, "Optimizing Hardware Accelerated General Matrix-Matrix Multiplication for CNNs on FPGAs," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 11, pp. 2692-2696, Nov. 2020, doi: 10.1109/TCSII.2020.2965154 (hereinafter “Ahmad”), in view of US 20210319317 A1 to Power et al. (hereinafter “Power”), in view of L. Bai, Y. Zhao and X. Huang, "A CNN Accelerator on FPGA Using Depthwise Separable Convolution," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 65, no. 10, pp. 1415-1419, Oct. 2018 (hereinafter “Bai”), in view of US 10635739 B1 to Batruni (hereinafter “Batruni”), and further in view of US 20230236984 A1 to Katchi et al. (hereinafter “Katchi”).
Regarding claim 1, Wang teaches a neural processor circuit, comprising:
a plurality of neural engine circuits, at least one of the neural engine circuits (Fig. 2A/2D, 200, [0033], [0036]) configured to perform a convolution operation to generate output data (Fig. 1, 106, [0025-0026], [0031]); and
a planar engine circuit coupled to the plurality of neural engine circuits, the planar engine circuit (Fig. 2A/2B, 202, [0034]) including:
a first filter circuit (Fig. 2B/2C, 2020; [0048]) configured to perform a reduction operation for each patch of a plurality of patches of a tensor for a dataset of an input data (Fig. 1, 102, [0026]) originating from the output data to generate a respective reduced value of a plurality of reduced values ([0049] element-wise operations, matrices), the respective reduced value associated with a corresponding channel of a plurality of channels of the tensor (Fig. 2C, Channels 1-k; [0068]) for the dataset (Fig. 1, 102, [0026]), the respective reduced value being generated based on values of elementwise operation ([0049] element-wise operations) between a first work unit of each patch and a second work unit of another patch; and
a second filter circuit (Fig. 2B, 2022, [0049]) coupled to the first filter circuit;
a line buffer (Fig. 2B/2C, 2030; [0048]) coupled to the first filter circuit, the line buffer configured to: store the plurality of reduced values for the plurality of channels being transferred ([0054]) from one or more registers within (Fig. 2B/2C, 2034, 2036, [0056-0059]) the first filter circuit (Fig. 2B/2C, 2020; [0048]),
a data processor circuit (Fig. 2B/2C, 2032, [0055]) coupled to the planar engine circuit and the plurality of neural engine circuits, the data processor circuit configured to:
store the output data obtained from the at least one neural engine circuit ([0051]); and
send the output data to the planar engine circuit ([0055]), the output data comprising the dataset as the input data (Fig. 1, 102, [0026], [0028], [0108]) for the plurality of patches and the plurality of channels of the tensor (Fig. 1, 106, [0030]).
Wang is silent regarding retaining the plurality of reduced values for a defined number of operating cycles as indicated by a refresh flag defining resetting of the line buffer. Wang is further silent regarding the second filter circuit configured to: receive a first tensor and a second tensor as part of the output data and perform an elementwise operation between the first tensor and the second tensor to generate the tensor for the first filter circuit; and to reduce a patch tensor of a first rank to another tensor of a second rank lower than the first rank by removing one or more dimensions of the patch tensor of the first rank. Although Wang generally discloses tensor operations, it does not explicitly disclose tensor reduction operations, reduced values, and the line buffer. Wang is also silent regarding a patch of a plurality of patches; a first work unit of each patch and a second work unit of another patch; wherein the tensor of the input data comprises more than 3 dimensions and a plurality of patches, at least one patch of the plurality of patches includes 3 dimensions and a plurality of work units, and at least one work unit of the plurality of work units represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.
Tu teaches retain the plurality of reduced values for a defined number of operating cycles as indicated by a refresh flag stored in another register external (Fig. 14, eDRAM Refresh Flag; Pg. 348, Col. 2, Para. 3; Pg. 342, Col. 2, Sec. D, Para. 1) defining resetting (Pg. 348, Col. 2, Para. 2-3) of the line buffer.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang’s neural processing circuitry with Tu’s refresh flag because they are in the claimed invention’s same field of endeavor of convolutional neural networks (abstract). It would have been obvious to one of ordinary skill in the art to implement the refresh flag because it allows the buffer in Wang’s circuit to more efficiently decide whether to refresh the data (Pg. 345, Col. 1, Para. 2), as different data types have different lifetimes (Pg. 348, Col. 2, Para. 3). Thus, making this modification would be beneficial, as Wang’s circuitry would be better optimized at making only necessary refreshes, reducing wasted time and energy (Pg. 348, Col. 2, Para. 4).
Wang in view of Tu is silent regarding tensor reduction operations, reduced values, and the line buffer. Wang in view of Tu is further silent regarding the second filter circuit configured to: receive a first tensor and a second tensor as part of the output data and perform an elementwise operation between the first tensor and the second tensor to generate the tensor for the first filter circuit; and to reduce a patch tensor of a first rank to another tensor of a second rank lower than the first rank by removing one or more dimensions of the patch tensor of the first rank. Wang in view of Tu is also silent regarding a patch of a plurality of patches; a first work unit of each patch and a second work unit of another patch; wherein the tensor of the input data comprises more than 3 dimensions and a plurality of patches, at least one patch of the plurality of patches includes 3 dimensions and a plurality of work units, and at least one work unit of the plurality of work units represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.
Ahmad teaches tensor reduction operations, reduced values, and reducing (Pg. 2693, Col. 2, IV-A, Para. 1; Pg. 2694, Col. 1, Para. 1-2) a patch tensor of a first rank (Fig. 2(a) and (b), left-hand-side, rank-3; Pg. 2694, Col. 1, Para. 1-2) to another tensor of a second rank (Fig. 2(a) and (b), right-hand-side, rank-2; Pg. 2694, Col. 1, Para. 1-2) lower than the first rank by removing one or more dimensions of the patch tensor of the first rank (Pg. 2693, Col. 2, IV-A, Para. 1; Pg. 2694, Col. 1, Para. 1-2).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu’s neural processing system with Ahmad’s rank reduction techniques because they are in the claimed invention’s same field of endeavor of convolutional neural networks (Abstract). While Wang teaches the computations and generating values, it is silent regarding those computations as related to reduction of the dimensionality of the data. Ahmad teaches operations directed to dimensionality reduction (Pg. 2693, Col. 2, IV-A, Para. 1; Pg. 2694, Col. 1, Para. 1-2). Modifying Wang’s operations with Ahmad’s reduction operations would allow the system to reshape the data to be processed and simplify operations, resulting in faster computations (Pg. 2693, Col. 2, IV-A). Thus, it would have been obvious to make the modification, as doing so would be beneficial in reducing computational complexity.
Further, Wang in view of Tu in view of Ahmad is silent regarding the line buffer, and the second filter circuit configured to: receive a first tensor and a second tensor as part of the output data and perform an elementwise operation between the first tensor and the second tensor to generate the tensor for the first filter circuit. Wang in view of Tu in view of Ahmad is also silent regarding a patch of a plurality of patches; a first work unit of each patch and a second work unit of another patch; wherein the tensor of the input data comprises more than 3 dimensions and a plurality of patches, at least one patch of the plurality of patches includes 3 dimensions and a plurality of work units, and at least one work unit of the plurality of work units represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.
Power teaches receive a first tensor and a second tensor as part of the output data; and perform an element-wise operation between the first tensor and the second tensor (Fig. 10, 1002, 1004, [0149-0150]; [0029], [0073]).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu in view of Ahmad’s neural processing system with Power’s element-wise techniques because they are in the claimed invention’s same field of endeavor of convolutional neural networks ([0028-0029]). Wang teaches that further processing of the output data includes element-wise operations to generate the final output data ([0031]), in effect presenting a finite number of post-processing operations to choose from. Therefore, it would have been obvious to try Power’s element-wise techniques, as a person with ordinary skill in the art would recognize that doing so would yield predictable results, because element-wise post-processing operations are taught by Wang. Making this modification would be beneficial, as Wang in view of Tu in view of Ahmad’s system would also have improved element-wise operations, since such techniques are executed in acceleration circuitry implemented to accelerate AI tasks and workloads ([0043]).
Wang in view of Tu in view of Ahmad in view of Power is silent regarding the line buffer. Wang in view of Tu in view of Ahmad in view of Power is further silent regarding a patch of a plurality of patches; a first work unit of each patch and a second work unit of another patch; wherein the tensor of the input data comprises more than 3 dimensions and a plurality of patches, at least one patch of the plurality of patches includes 3 dimensions and a plurality of work units, and at least one work unit of the plurality of work units represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.
Bai teaches the line buffer (Fig. 5; Pg. 1417, “1) Line Buffer” section).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu in view of Ahmad in view of Power’s neural processing circuitry with Bai’s line buffer because they are in the claimed invention’s same field of endeavor of convolutional neural networks (abstract). It would have been obvious to one of ordinary skill in the art to implement a line buffer in place of the buffer in Wang in view of Tu in view of Ahmad in view of Power’s circuit because Bai’s line buffer is capable of fitting different input sizes (Pg. 1417, Col. 1-2, “1) Line Buffer” section), thus providing the capability to handle various precisions and data types. Making this modification would be beneficial, as Wang in view of Tu in view of Ahmad in view of Power’s circuitry would be better suited for handling matrix multiplications and various data precisions.
Wang in view of Tu in view of Ahmad in view of Power in view of Bai is silent regarding a patch of a plurality of patches; a first work unit of each patch and a second work unit of another patch; wherein the tensor of the input data comprises more than 3 dimensions and a plurality of patches, at least one patch of the plurality of patches includes 3 dimensions and a plurality of work units, and at least one work unit of the plurality of work units represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.
Batruni discloses a patch of a plurality of patches (Fig. 4A “406” Tensor (4-D) contains 3 examples of patches where 1 patch is of the form “404”; col. 10, ln. 62-67); a first work unit of each patch (Fig. 4A the top, middle, and bottom patches of “406” that include at least data corresponding to dim 1 x variates x dim 2, for example including the front and middle matrices of data; col. 10, ln. 62-67) and a second work unit of another patch (Fig. 4A the middle patch of “406” that includes at least data corresponding to dim 1 x variates x dim 2, for example including the back matrices of data, i.e., the one not selected by the first work unit; col. 10, ln. 62-67); wherein the tensor of the input data comprises more than 3 dimensions (col. 10, ln. 62-67) and a plurality of patches, at least one patch of the plurality of patches includes 3 dimensions and a plurality of work units (Fig. 4A the top, middle, and bottom patches of “406” that include at least data corresponding to dim 1 x variates x dim 2, for example including the front and middle matrices of data; col. 10, ln. 62-67), and at least one work unit of the plurality of work units (Fig. 4A the middle patch of “406” that includes at least data corresponding to dim 1 x variates x dim 2, for example including the back matrices of data, i.e., the one not selected by the first work unit; col. 10, ln. 62-67) represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.
For reference, annotated versions of Batruni’s Fig. 4A are reproduced below:
[media_image1.png (greyscale): annotated figure depicting the 3 patches.]
[media_image2.png (greyscale): annotated figure depicting the first work units of each patch.]
[media_image3.png (greyscale): annotated figure depicting the second work units of each patch.]
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu in view of Ahmad in view of Power in view of Bai’s neural processing circuitry with Batruni’s tensor characteristics because they are in the claimed invention’s same field of endeavor of convolution operations (abstract). It would have been obvious to one of ordinary skill in the art to try the specific tensor characteristics of data ordering given the finite practicality of dimensions in tensor representation (col. 10, ln. 7-43; col. 12, ln. 16-45), thus providing the capability to handle more data by incorporating more dimensions. Making this modification would have been obvious to try given the finite number of dimensions that are practical to implement, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai’s circuitry would be better suited for including more dimensions to more accurately describe the data.
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni is silent regarding “represents a dataset of a size capable of being processed by the planar engine circuit in a single operating cycle.”
Katchi discloses represents a dataset of a size ([0134] three 8-bit pixels) capable of being processed by the planar engine circuit in a single operating cycle ([0134] one cycle).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni’s neural processing circuitry with Katchi’s dataset size feature because they are in the claimed invention’s same field of endeavor of convolutional neural networks ([0062]). It would have been obvious to one of ordinary skill in the art to implement a dataset of a size capable of being processed in a single operating cycle because Batruni notes that increasing the dimensional size of tensors proportionally increases the processor cycles needed to compute (Batruni, col. 1, ln. 31-48). Thus, implementing a dataset of a size capable of being processed in a single operating cycle would be beneficial in reducing processing time (Katchi, [0134]). Making this modification would be beneficial, as Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni’s circuitry would be better suited for handling operations more efficiently and would reduce processing time.
Regarding claim 2, the rejection of claim 1 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the planar engine circuit, the line buffer, reduced values, and during the defined number of operating cycles (see claim 1 mapping). The motivation to combine provided with respect to claim 1 equally applies.
Wang further teaches the neural processor circuit further comprises:
a post-processor circuit (Fig. 2C, 2038, [0057]; Fig. 2A, [0034] from a 2038 found in an adjacent core) coupled to the line buffer, the post-processor circuit configured to:
perform a first post-processing operation on a channel vector comprising the reduced values from the line buffer to generate a first output vector (Fig. 2C, Channel 1, [0060], first layer); and
perform a second post-processing operation on the channel vector following the first post-processing operation to generate a second output vector (Fig. 2C, Channel k, [0060], second).
Regarding claim 3, the rejection of claim 1 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the planar engine circuit, the first filter circuit, the output data, and the tensor (see claim 1 mapping). The motivation to combine provided with respect to claim 1 equally applies.
Wang teaches the neural processor circuit wherein:
each patch comprises a first number of work units, the respective reduced value ([0049] element-wise operations, matrices) being stored ([0054]) in the one or more registers (Fig. 2B/2C, 2034, 2036, [0056-0059]) within the first filter circuit (Fig. 2B/2C, 2020; [0048]) configured to store ([0054]) the plurality of reduced values ([0049] element-wise operations, matrices) generated in a second number of operating cycles ([0052]) before the respective reduced value being transferred ([0054]) to the line buffer (Fig. 2B/2C, 2030; [0048]), wherein the first number ([0028][0062] first and second work unit – work items) is same as the second number.
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni discloses each patch comprises a first number of work units, as in claim 1.
Wang is silent regarding “reduced,” “in a second number of operating cycles,” and “is same as the second number.”
Tu teaches in a second number of operating cycles, and is same as the second number (Pg. 348, Col. 2, Para. 3; Pg. 342, Col. 2, Sec. D, Para. 1; Pg. 343, Col. 2, Sec. III-A, Para. 1; Pg. 347, Col. 1, Para. 1-2).
The motivation to combine provided with respect to claim 1 equally applies.
Wang in view of Tu is silent regarding “reduced.”
Ahmad teaches reduced (Pg. 2693, Col. 2, IV-A, Para. 1; Pg. 2694, Col. 1, Para. 1-2).
The motivation to combine provided with respect to claim 1 equally applies.
Regarding claim 9, the rejection of claim 1 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the reduction operation, the reduced values, the line buffer (see claim 1 mapping), and a plurality of post-processing operations (see claim 2 mapping).
Wang is silent regarding a first value of the refresh flag initiating an initialization of the line buffer prior to; and a second value of the refresh flag configuring the line buffer to skip the initialization and retain.
Tu teaches a first value of the refresh flag initiates an initialization of the line buffer prior to the reduction operation (Pg. 345, Col. 1, Para. 2, flag is valid; Pg. 348, Col. 2, Para. 2); and a second value of the refresh flag configures the line buffer to skip the initialization (Pg. 345, Col. 1, Para. 2, flag is not set to valid) and retain (Pg. 348, Col. 2, Para. 2-3) the reduced values for a plurality of post-processing operations performed after the reduction operation.
The motivation to combine provided with respect to claim 1 equally applies.
Regarding claim 10, the rejection of claim 1 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the first filter circuit, the reduction operation, the reduced values, the line buffer, retained, and during the defined number of operating cycles (see claim 1 mapping). The motivation to combine provided with respect to claim 1 equally applies.
Wang further teaches the neural processor circuit wherein the at least one reduction operation comprises operations not affecting the plurality of reduced values ([0058]).
Claims 11-13, and 18 are directed to a method that would be performed by the device of claim 1. All limitations recited in claims 11-13, and 18 are taught by device claims 1-3, 9-10 (both claims 9 and 10 for claim 18), respectively. The claims 1-3, 9-10 analysis equally applies.
Claim 19 is directed to a device that similarly recites the limitations of claim 1. The claim 1 analysis equally applies. Additionally, Wang teaches: a system memory storing input data (Fig. 2A, 221, [0040]); and a neural processor circuit (Fig. 2D, plurality of 200 connected to 220, [0040] 220 associated with 221) coupled to the system memory, the neural processor circuit including: a data processor circuit (Fig. 2B/2C, 2032, [0055]) configured to receive the input data from the system memory ([0057]) for storage into the data processor circuit ([0043]).
Claims 20-21 are directed to a device that similarly recites the limitations of claims 2-3. The claims 2-3 analysis equally applies.
Claims 5-6, 15, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi as applied to claims 1, 11, and 19 above, and further in view of US 10411709 B1 Ghasemi et al. (hereinafter “Ghasemi”).
Regarding claim 5, the rejection of claim 1 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the first filter circuit, the reduced value of the plurality of reduced values, the line buffer, and for the defined number of operating cycles as indicated by the refresh flag (see claim 1 mapping).
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi discloses determine indexing information for each reduced value of the plurality of reduced values; and the line buffer is further configured to: store the indexing information for each reduced value, and retain the indexing information.
Ghasemi teaches determine indexing information for each reduced value of the plurality of reduced values; and the line buffer is further configured to: store the indexing information for each reduced value, and retain the indexing information (Col. 11, lines 21-35, request generator circuit generates an ordered set of addresses, parameters include address offset which are dependent on the line buffer size; Col. 5, lines 63-67, Col. 6, lines 1-11, data elements of 2-D IFM planes).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi’s neural processing system with Ghasemi’s indexing techniques because they are in the claimed invention’s same field of endeavor of convolutional neural networks (Col. 2, lines 59-65; Col. 3, lines 5-13). While Bai generally teaches the line buffer (Fig. 5, Pg. 1417, “1) Line Buffer” section), Bai is silent regarding specific details of its indexing and addressing. By incorporating Ghasemi’s indexing techniques, the line buffer is better able to accommodate and index larger data sets, including those that do not fit normally. Ghasemi specifically teaches configuring the line buffer for data sets that do not fit normally, applying approaches to divide the data (Col. 4, lines 32-48). Thus, it is essential to consider sizing parameters related to the size of the data, as they are used for addressing purposes (Col. 11, lines 21-35). Making this modification would be beneficial, as Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi’s system could accommodate larger data sets.
Regarding claim 6, the rejection of claim 5 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the reduced value in the tensor (see claim 1 mapping).
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi discloses the indexing information for each reduced value comprise information about spatial coordinates of that reduced value in the tensor.
Ghasemi teaches the indexing information for each reduced value comprise information about spatial coordinates of that reduced value in the tensor (Col. 11, lines 21-35, ordered set of addresses; Col. 5, lines 63-67, Col. 6, lines 1-11, data elements of 2-D IFM planes).
The motivation to combine provided with respect to claim 5 equally applies.
Claim 15 is directed to a method that would be performed by the device of claims 5-6. All limitations recited in claim 15 are taught by device claims 5-6 (both claims 5 and 6 for claim 15). The claim 5-6 analysis equally applies.
Claim 22 is directed to a device that similarly recites the limitations of claim 5. The claim 5 analysis equally applies.
Claims 7-8 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi, and further in view of US 20220044109 A1 Donnelly (hereinafter “Donnelly”).
Regarding claim 7, the rejection of claim 1 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the first filter circuit further configured to perform the reduction operation, and a corresponding channel of the plurality of channels of the tensor for the dataset (see claim 1 mapping).
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi discloses by finding a respective minimum value or a respective maximum value for a corresponding channel of the plurality of channels of the tensor; and determine corresponding indexing information for the respective minimum value or the respective maximum value.
Ghasemi teaches determine corresponding indexing information (Col. 11, lines 21-35, request generator circuit generates an ordered set of addresses, parameters include address offset which are dependent on the line buffer size; Col. 5, lines 63-67, Col. 6, lines 1-11, data elements of 2-D IFM planes).
The motivation to combine provided with respect to claim 5 equally applies.
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi in view of Donnelly discloses by finding a respective minimum value or a respective maximum value for a corresponding channel of the plurality of channels of the tensor; and determine corresponding indexing information for the respective minimum value or the respective maximum value.
Donnelly teaches finding a respective minimum value or a respective maximum value for a corresponding channel of the plurality of channels of the tensor; determine corresponding indexing information for the respective minimum value or the respective maximum value ([0075], [0134]).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi’s neural processing system with Donnelly’s maximum/minimum techniques because they are in the claimed invention’s same field of endeavor of convolutional neural networks ([0019]). Wang teaches that the first operation unit (Fig. 2C, 2020) is capable of performing quantization (Fig. 2C, 280) and de-quantization (Fig. 2C, 270) operations on the data ([0064]). Donnelly teaches that the maximum and minimum tensors are utilized in determining the range of the quantization operations ([0027], [0057]), as quantizing some extrema-case values could introduce error into the computations by grossly approximating ([0028], [0030]). By incorporating Donnelly’s maximum/minimum techniques, Wang’s circuitry is able to determine a range within which values should lie, and is thus better able to perform quantization techniques without reducing the accuracy of the neural network ([0016-0018]), as taught by Donnelly. Thus, making this modification would be beneficial, as Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi’s system would have fewer incorrect approximations during quantization, and thereby improved network accuracy.
Regarding claim 8, the rejection of claim 7 is incorporated, and Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi teaches the line buffer and for the defined number of operating cycles as indicated by the refresh flag (see claim 1 mapping).
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi discloses store the respective minimum value or the respective maximum value and the corresponding indexing information; and retain the respective minimum value or the respective maximum value and the corresponding indexing information.
Ghasemi teaches store the respective minimum value or the respective maximum value and the corresponding indexing information; and retain the respective minimum value or the respective maximum value and the corresponding indexing information (Col. 11, lines 21-35, request generator circuit generates an ordered set of addresses, parameters include address offset which are dependent on the line buffer size; Col. 5, lines 63-67, Col. 6, lines 1-11, data elements of 2-D IFM planes).
The motivation to combine provided with respect to claim 5 equally applies.
Wang in view of Tu in view of Ahmad in view of Power in view of Bai in view of Batruni in view of Katchi in view of Ghasemi in view of Donnelly discloses store the respective minimum value or the respective maximum value and the corresponding indexing information; and retain the respective minimum value or the respective maximum value and the corresponding indexing information.
Donnelly teaches store the respective minimum value or the respective maximum value and the corresponding indexing information; and retain the respective minimum value or the respective maximum value and the corresponding indexing information ([0075], [0134]).
The motivation to combine provided with respect to claim 7 equally applies.
Claims 16-17 are directed to a method that would be performed by the device of claims 7-8. All limitations recited in claims 16-17 are taught by device claims 7-8, respectively. The claim 7-8 analysis equally applies.
Response to Arguments
35 USC 103. Applicant’s arguments, see Remarks pp. 12-13, filed 11/26/2025, with respect to the rejection(s) of claim(s) 1, 11, and 19 under 35 USC 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Batruni and in view of Katchi, as necessitated by the amendment.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARKUS A VILLANUEVA whose telephone number is (703)756-1603. The examiner can normally be reached M - F 8:30 am - 5:30 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, James Trujillo can be reached at (571) 272-3677. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MARKUS ANTHONY VILLANUEVA/Examiner, Art Unit 2151
/James Trujillo/Supervisory Patent Examiner, Art Unit 2151