Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is in response to the Request for Continued Examination filed 10/23/2025. Claims 1, 4-7, 11-12, 14, and 16-17 have been amended. Claims 8-10, 18, and 20 have been canceled. Claims 1-7, 11-17, 19, and 21-25 are pending and have been considered below.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 10/23/2025 has been entered.
35 USC § 101
Paragraph 0013 of Applicant’s specification discloses that the system comprising an initial processor is manufactured on silicon. Therefore, the system of independent Claim 1 falls into one of the statutory categories of invention, i.e., a process, machine, manufacture, or composition of matter.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-3, 7, 11, 16-17, and 21-25 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (US 2019/0180170 A1) in view of Zmora et al. (US 2018/0307624 A1) and further in view of Booss et al. (US 2019/0258580 A1) and Rutenberg (US 4,965,725).
Claim 1. Huang discloses
A convolutional neural network system, a convolutional neural network (P 0043), comprising:
a first part of the convolutional neural network system comprising: a memory configured to store a set of weight factors locally at the first part of the convolutional neural network system, the weight values for one neural network can be stored in the local memory banks of two or more neural network processing engines including a first neural network (P 0061) weights can be divided and stored in the memory of a first neural network and in the memory of a second neural network (P 0138);
a first processor configured to: … different subcomponents of the input data, an input image is applied to a neural network, a filter is applied to a particular “neighborhood” of the input image to produce an output activation (P 0042) a plurality of filters may be applied to the input image to produce a plurality of corresponding outputs (P 0044) a neural network processing engine is designated for processing the neural network (P 0061) the first neural network executes computations on the first set of weights (P 0139). It is clear that a plurality of filters may be applied to different “neighborhoods” of the input image to produce a plurality of corresponding outputs, that is, input activations, and this is equivalent to the claimed different subcomponents, and
distribute a first subcomponent of the different subcomponents of the input data to a first … memory controller and a second subcomponent of the different subcomponents of the input data to a second … memory controller, the weights are applied in convolutions to different neighborhoods of the input feature map to create different activations which are used to create an output feature map (P 0042) results of the activation are stored in the memory system (P 0071) results of an output array can be stored in each of a plurality of memory locations in a memory bank, output from a processing engine array can be written into memory banks that can then subsequently provide input data for the processing engine array (P 0075) for additional computation (P 0082) wherein the activation can combine the results from the processing engine array into one or more output activations (P 0083) performing a first task (P 0105). The application of the weights to different neighborhoods of the input feature map to produce different activation inputs is analogous to the different subcomponents; the plurality of activation outputs, which are the results of filters being applied to an input image, are processed by the processing engine and each is stored in a location in a memory subsystem array;
the first [far side] memory controller configured to: access a first weight factor set from the set of weight factors that are stored locally in the memory of the first part of the convolutional neural network system, the neural network processing engine can further include a set of memory banks local to the array of processing engines used to store intermediate results (P 0059) the first neural network executes computations on the first set of weights (P 0139) performing a task by a neural network includes reading a first weight value from a first memory bank and reading a second weight value from a second memory bank (P 0147). Activations are created by applying the filters of Figs 3A and 3B to an input feature map to generate output feature maps, activation blocks produce the activation outputs that are stored in the memory subsystem and applied back to the processing engine (Fig 5), and the output activations are placed in a plurality of memory banks in the memory system and then read from the memory banks to compute the task;
process the first subcomponent of the different subcomponents of the input data set using the first weight factor set from the set of weight factors, the stored weights can include fewer than all of the weights for the neural network, with the remaining weights being read from an additional memory when a computation is in-progress (P 0104) the first neural network executes computations on the first set of weights (P 0138, 0139); and
the second [far side] memory controller configured to: access a second weight factor set from the set of weight factors that are stored locally in the memory of the first part of the convolutional neural network system, the neural network processing engine can load additional weights into the available space from an off-chip memory (P 0060) a second portion of the divided weights are stored in the memory of a second neural network (P 0138),
process the second subcomponent of the different subcomponents of the input data set using the second weight factor set from the set of weight factors, results of the activation are stored in the memory system (P 0071) results of an output array can be stored in each of a plurality of memory locations in a memory bank, output from a processing engine array can be written into memory banks that can then subsequently provide input data for the processing engine array (P 0075) for additional computation (P 0082) for performing a second task (P 0107) the second neural network executes computations on the second set of weights (P 0139).
Huang does not disclose crop apart input data into different subcomponents of the input data; a first far side memory controller; a second far side memory controller, as disclosed in the claims. However, Huang discloses an input image is applied to a neural network, a filter is applied to a particular “neighborhood” of the input image to produce an output activation (P 0042) a plurality of filters may be applied to the input image to produce a plurality of corresponding outputs (P 0044) a neural network processing engine is designated for processing the neural network (P 0061). In the same field of invention, Zmora discloses using far memory such as double data rate memory (DDR) (P 0139) data blobs are broken into smaller blocks and each input block at a time is fetched from external memory, i.e. DDR, to the engine's local memory to perform deep learning when input/output data cannot fit into the engine’s local memory (P 0146) computations are split to enable training of very large neural networks in which the weights of all layers would not fit into the memory of a single computational node (P 0185). Therefore, considering the teachings of Huang and Zmora, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine crop apart input data into different subcomponents of the input data; a first far side memory controller; a second far side memory controller with the teachings of Huang with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
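For purposes of illustration only, and not as a reproduction of any cited reference's implementation, the following sketch models the crop-and-distribute operation discussed above: input data is split into subcomponents, each of which is handed to a controller holding its own locally stored weight factor set. All names (crop_apart, FarSideController), shapes, and values are invented for the example.

    import numpy as np

    def crop_apart(image, num_parts=2):
        """Split an input image into vertical strips ("subcomponents")."""
        return np.array_split(image, num_parts, axis=1)

    class FarSideController:
        """Hypothetical stand-in for a far side memory controller that
        stores its own weight factor set locally and processes one
        subcomponent of the input data."""
        def __init__(self, weight_factor_set):
            self.weights = weight_factor_set  # stored locally

        def process(self, subcomponent):
            # Toy computation standing in for convolution layers.
            return subcomponent * self.weights

    image = np.ones((4, 8))                  # toy input data
    first = FarSideController(0.5)           # first weight factor set
    second = FarSideController(2.0)          # second weight factor set
    sub1, sub2 = crop_apart(image)           # different subcomponents
    out1, out2 = first.process(sub1), second.process(sub2)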
Huang does not disclose wherein the convolutional neural network system is configured to divide the set of weight factors according to a ratio of amounts of computing power to be provided by the first and second far side memory controllers to weight factors, as disclosed in the claims. Huang discloses the computational capacity of a processing engine array is comprised of an array of processing engines (P 0077, 0079) wherein weights are stored in different proportions and are sent to the processing engine array (P 0090). While it would be obvious to relate the proportions of the weights stored in the memory to a number of processing engines in the processing engine array, Huang does not explicitly disclose this feature. However, in the same field of invention, Booss discloses data may be assigned a weight class based on the importance or priority of the data, wherein data stored in cache is assigned a different weight class than data stored in memory pages (P 0028) dividing, into a first portion of memory resources and a second portion of memory resources, a plurality of memory resources included in a cache coupled with a database, the plurality of memory resources storing data from the database, the first portion of memory resources being occupied by data assigned to a first weight class, and the second portion of memory resources being occupied by data assigned to a second weight class (Claim 1) the first portion of memory resources is selected based at least on the ratio associated with the first portion of memory resources being higher than a respective ratio of the second portion of memory resources (Claim 4). Adding Booss to Huang would assign weighted data to respective processing engines based on the weights assigned to the data. Therefore, considering the teachings of Huang, Zmora and Booss, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the convolutional neural network system is configured to divide the set of weight factors according to a ratio of amounts of computing power to be provided by the first and second far side memory controllers to weight factors with the teachings of Huang and Zmora with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
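As a purely hypothetical illustration of the disputed limitation (not taken from Huang, Zmora, or Booss), dividing a set of weight factors according to a ratio of computing power might be sketched as follows, where compute_ratio stands in for the relative computing power of the two controllers and all names and values are invented:

    def divide_weights(weight_factors, compute_ratio):
        """Divide a flat list of weight factors according to a ratio of
        computing power, e.g. compute_ratio = (3, 1) yields a 75%/25% split."""
        split = len(weight_factors) * compute_ratio[0] // sum(compute_ratio)
        return weight_factors[:split], weight_factors[split:]

    weight_factors = list(range(100))
    first_set, second_set = divide_weights(weight_factors, (3, 1))
    # first_set holds 75 weight factors; second_set holds the remaining 25.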
Huang does not disclose a second part of the convolutional neural network system comprising: a main computing system comprising a second processor configured to cause the main computing system to: receive the processed different subcomponents of the input data from the first [far side] memory controller and the second [far side] memory controller; process a set of final layers of a second convolution neural network system using the processed different subcomponents of the input data provided from the first part of the convolutional neural network system; and generate a classification of the input data using the set of final layers of the second part of the convolution neural network system, as disclosed in the claims. Huang discloses fully connected layers can be applied after the convolutional layers, for classification purposes (P 0045). Huang does not disclose transmitting the output of the convolutions of the neural networks to a second part of the convolutional neural network comprising a main computing system. However, in the same field of invention, Rutenberg discloses a primary classifier and a secondary classifier comprising a neural computer (C 2 L 13-21) uses images to classify a specimen (C 2 L 22-36) wherein a feed-forwarding mode is used to classify input patterns by the neural network into exemplar categories (C 5 L 48-60) the neural network can be a "Delta" processor neurocomputer of Science Applications International Corp. (SAIC) in feed-forward (i.e., non-training) mode (C 6 L 6-12) using the primary classifier implemented as a SAIC neurocomputer optimized for feed-forwarding to a separate unmodified neurocomputer which contains both learning and feed-forwarding functions (C 7 L 20-27 Fig 6). Therefore, considering the teachings of Huang, Zmora, Booss and Rutenberg, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine a second part of the convolutional neural network system comprising: a main computing system comprising a second processor configured to cause the main computing system to: receive the processed different subcomponents of the input data from the first [far side] memory controller and the second [far side] memory controller; process a set of final layers of a second convolution neural network system using the processed different subcomponents of the input data provided from the first part of the convolutional neural network system; and generate a classification of the input data using the set of final layers of the second part of the convolution neural network system with the teachings of Huang, Zmora and Booss with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
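To make the claimed two-part architecture concrete, a schematic and entirely hypothetical main computing system might recombine the processed subcomponents and apply a set of final layers to generate a classification; nothing below is asserted to appear in the cited references, and all names and shapes are invented for the example.

    import numpy as np

    def main_computing_system(processed_subcomponents, final_layer_weights):
        """Hypothetical second part: recombine the processed subcomponents
        received from the far side memory controllers, apply a final fully
        connected layer, and generate a classification."""
        recombined = np.concatenate(processed_subcomponents, axis=1)
        logits = final_layer_weights @ recombined.flatten()
        return int(np.argmax(logits))  # index of the predicted class

    received = [np.random.rand(4, 4), np.random.rand(4, 4)]  # from the first part
    fc = np.random.rand(10, 4 * 8)  # 10 classes, flattened 4x8 recombined input
    label = main_computing_system(received, fc)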
Claim 2. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and the combination of Huang in view of Rutenberg further discloses wherein the first processor is connected to the main computing system by a memory fabric, a neural network processor may be connected to the system through dynamic RAM and DMA controllers (P 0137) all weights for a neural network are stored on-chip to minimize or reduce reads from processor memory (P 0142) neural network engines may be hosted on a system with a memory and one or more processing units (P 0138, 0179, 0186, Fig 11) wherein components are connected by one or more communication channels (P 0188 Fig 15). Rutenberg has been combined with Huang for the limitations of the second part of the convolutional neural network comprising a main computing system.
Claim 3. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and the combination of Huang in view of Rutenberg further discloses wherein the first processor and the main computing system are disposed on the same silicon, by having the weight values on chip, the computations may be limited only by the relatively short on-chip memory latency (P 0064) the weights may be stored in memory on the same chip or on separate chips (P 0134, 0142, 0172). Rutenberg has been combined with Huang for the limitations of the second part of the convolutional neural network comprising a main computing system.
Claim 7 is directed to a method claim similar to the system claim of Claim 1 and is rejected with the same rationale.
Claims 8-10. Canceled.
Claim 11. Huang, Zmora, Booss and Rutenberg disclose the method of claim 7, and the combination of Huang in view of Booss discloses wherein the division point further defines a number of weight factors in a first portion of the set of weight factors and the number of weight factors in a second portion of the set of weight factors, the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). Booss has been combined with Huang for limitations directed to a ratio of memory resources.
Claim 16 is directed to a method claim similar to the system claim of Claim 1 and is rejected with the same rationale. Claim 16 additionally includes the limitation dividing the set of weight factors based on at least one division parameter. Huang discloses this limitation: the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170).
Claim 17. Huang, Zmora, Booss and Rutenberg disclose the method of claim 16, and the combination of Huang in view of Booss further discloses wherein a second ratio is based on the ratio of maximum computing power available and processing speed, as a computation progresses and memory space becomes available, the neural network processing engine can load additional weights into the available space (P 0060) space that becomes available, over the course of a computation, in the memory subsystem of a neural network processing engine can be used to store the weights of another neural network (P 0098) a determination is made that the memory banks of a neural network processor have available space, and wherein available space may be from intermediate results requiring less memory space and/or from weight values that are no longer needed (P 0150). Booss has been combined with Huang for limitations directed to a ratio of memory resources.
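As an illustration only of the load-as-space-frees behavior attributed to Huang above, with all capacities and names invented for the example, weights beyond the local capacity can be streamed in as earlier weights are consumed:

    from collections import deque

    def stream_weights(all_weights, local_capacity):
        """Load an initial weight set into local memory, then load additional
        weights into the space freed as earlier weights are consumed."""
        pending = deque(all_weights)
        local = deque(pending.popleft() for _ in range(min(local_capacity, len(pending))))
        while local:
            yield local.popleft()                # consume a weight; space frees up
            if pending:
                local.append(pending.popleft())  # load another into the freed space

    consumed = list(stream_weights(range(10), local_capacity=4))  # yields 0..9 in order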
Claim 18. Canceled.
Claim 20. Canceled.
Claim 21. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and Huang discloses performing a task by a neural network includes reading a first weight value from a first memory bank and reading a second weight value from a second memory bank (P 0147) for a first task (P 0105) and a second task wherein the second task may be different from the first task (P 0107) and during processing, additional weight values are read from the memory (P 0148) and Zmora discloses using far memory such as double data rate memory (DDR) (P 0139) data blobs are broken into smaller blocks and each input block at a time is fetched from external memory, i.e. DDR, to the engine's local memory to perform deep learning when input/output data cannot fit into the engine’s local memory (P 0146) computations are split to enable training of very large neural networks in which the weights of all layers would not fit into the memory of a single computational node (P 0185). While Huang discloses that different weights are stored in different memories, Huang does not disclose that the memories are far side memories. Therefore, considering the teachings of Huang, Zmora, Booss and Rutenberg, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the first weight factor set is unique to the first far side memory controller and the second weight factor set is unique to the second far side memory controller with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
Claim 22. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and Huang discloses memory can be in an on-chip hierarchy (P 0051) for storing the weights and intermediate results (P 0057). Therefore, considering the teachings of Huang, Zmora, Booss and Rutenberg, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the memory configured to store the set of weight factors locally at the first part of the convolutional neural network system is partitioned into a first memory located in situ with respect to the first far side memory controller and a second memory located in situ with respect to the second far side memory controller with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
Claim 23. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and Huang discloses the weights are applied in convolutions to different neighborhoods of the input feature map to create different activations which are used to create an output feature map (P 0042) results of the activation are stored in the memory system (P 0071) results of an output array can be stored in each of a plurality of memory locations in a memory bank, output from a processing engine array can be written into memory banks that can then subsequently provide input data for the processing engine array (P 0075). Therefore, considering the teachings of Huang, Zmora, Booss and Rutenberg, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the first subcomponent of the different subcomponents of the input data are transmitted in batch form to the first far side memory controller and the second far side memory controller with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
Claim 24. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and Huang discloses the input activations are combined to produce an output activation in an output feature map (P 0042) the results of the convolution of each point are summed across all channels to produce output activations that together form one channel, M, of the output feature map (P 0044) the activation block can combine the results from the processing engine array into one or more output activations (P 0083) and Zmora discloses using far memory such as double data rate memory (DDR) (P 0139) data blobs are broken into smaller blocks and each input block at a time is fetched from external memory, i.e. DDR, to the engine's local memory to perform deep learning when input/output data cannot fit into the engine’s local memory (P 0146) computations are split to enable training of very large neural networks in which the weights of all layers would not fit into the memory of a single computational node (P 0185). Therefore, considering the teachings of Huang, Zmora, Booss and Rutenberg, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the main computing system is further configured to: upon receiving the processed different subcomponents of the input data from the first far side memory controller and the second far side memory controller, recombine the processed different subcomponents of the input data; and provide the recombined processed different subcomponents of the input data to the convolution neural network system with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
Claim 25. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, and Huang discloses the transfer of an in-progress computation from one neural network processing engine to another can include transferring between individual neural network processors (P 0063) weights can be copied from one neural network processing engine to another (P 0064) as a computation for a first neural network progresses and memory space becomes available in the memory of a neural network processor, the weights for a second neural network can be stored in the available space (P 0101) once the first neural network processing engine has executed computations for each layer for which the first neural network processing engine has weights, the first neural network processing engine can cause the in-progress computation to be moved to the second neural network processing engine (P 0130) resuming an in-progress computation can include inputting weights from a middle layer into the processing engine array, along with intermediate results from the first neural network processing engine (P 0132) and Rutenberg discloses the primary classifier is used in conjunction with morphological classification and area classification, wherein the primary classifier transmits output to a morphological classifier, which transmits its output to a secondary and/or alternate secondary classifier (C 8 L 22-29 Fig 7). The second morphological classifier 19, which receives output from the primary classifier and then transmits data to the secondary classifier and/or an alternate secondary classifier, is analogous to the claimed intermittent processor. Therefore, considering the teachings of Huang, Zmora, Booss and Rutenberg, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the first part of the convolution neural network system is configured to send the processed different subcomponents of the input data from the first far side memory controller and the second far side memory controller to an intermittent processor, and from the intermittent processor, send the processed different subcomponents of the input data to the main computer system with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more efficient system of parallel processing and increase the sizes of the sets of training data available to the systems (Zmora: P 0002).
Claims 4-6, 12-15, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (US 2019/0180170 A1) in view of Zmora et al. (US 2018/0307624 A1) and Booss et al. (US 2019/0258580 A1) and Rutenberg (US 4,965,725) and further in view of Sarkar et al. (US 2018/0349788 A1).
Claim 4. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, but Huang does not disclose wherein the ratio is set to include 10 percent or less of the first weight factor set and provide at least 50 percent of a computing effort, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the ratio is set to include 10 percent or less of the first weight factor set and provide at least 50 percent of a computing effort with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 5. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, but Huang does not disclose wherein the ratio is set to include 50 percent or less of the first weight factor set and provide at least 90 percent of a computing effort, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the ratio is set to include 50 percent or less of the first weight factor set and provide at least 90 percent of a computing effort with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 6. Huang, Zmora, Booss and Rutenberg disclose the convolutional neural network system of claim 1, but Huang does not disclose wherein the ratio is set to include 50 percent or less of the first weight factor set and provide at least 50 percent of a computing effort, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the ratio is set to include 50 percent or less of the first weight factor set and provide at least 50 percent of a computing effort with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 12. Huang, Zmora, Booss and Rutenberg disclose the method of claim 11, but Huang does not disclose wherein the ratio divides the weight factor set to include 10 percent or less of weight factors in the first part of the first convolutional neural network, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the ratio divides the weight factor set to include 10 percent or less of weight factors in the first part of the first convolutional neural network with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 13. Huang, Zmora, Booss, Rutenberg and Sarkar disclose the method of claim 12, but Huang does not disclose wherein processing the data set in the first part of the first convolutional neural network comprises at least 50 percent of a computing effort, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein processing the data set in the first part of the first convolutional neural network comprises at least 50 percent of a computing effort with the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 14. Huang, Zmora, Booss and Rutenberg disclose the method of claim 11, but Huang does not disclose wherein the ratio divides the set of weight factors to include 50 percent or less of weight factors in the first part of the first convolutional neural network, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein the ratio divides the set of weight factors to include 50 percent or less of weight factors in the first part of the first convolutional neural network with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 15. Huang, Zmora, Booss, Rutenberg and Sarkar disclose the method of claim 14, but Huang does not disclose wherein processing the data set in the first part of the first convolutional neural network comprises at least 90 percent of the computing effort, as disclosed in the claims. However, Huang discloses the neural network generates an output only when the inputs cross some threshold (P 0027) the weights stored in the memory banks can include all of the weights for the neural network or fewer than all of the weights for the neural network (P 0104) the available memory in the second neural network processing engine can be used to store weights for a neural network that is being executed by the first neural network processing engine (P 0124) the sets of weights are determined according to a defined task to be performed by the neural network (P 0146) for a specific context (P 0163, 0164) a second set of weights are used by the neural network to perform a second task (P 0167) the number of weights are reduced to select only needed weights to complete a computation of the result (P 0170). That is, while Huang clearly discloses the weights used in each of the first and second neural networks, Huang does not disclose specific percentages. However, in the same field of invention, Sarkar discloses training weights are provided from a source neural network to a target neural network (P 0005) a percentage of training weights are selected from the set of training weights, for example 50%, 25% of the 50% to 75% percentiles, or from the bottom 25% (P 0025). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine wherein processing the data set in the first part of the first convolutional neural network comprises at least 90 percent of the computing effort with the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
Claim 19. Huang, Zmora, Booss and Rutenberg disclose the method of claim 16, but Huang does not explicitly disclose placing a higher percentage of the set of weight factors on the first part of the convolutional neural network relative to the second part of the convolutional neural network, as disclosed in the claims. However, Huang discloses weights are stored first in the memory of the first neural network until the capacity is filled, then the remainder of the weights for the first neural network are placed in the memory of the second neural network, and per Fig 9, the percent of weights in the memory of the first neural network is higher than the percent in the memory of the second neural network (P 0124 Fig 9) the weights of two neural network processors can be divided in two, with the two parts being the same in size or different in size (P 0138). That is, while there may be cases where the number of weights for the first neural network is more than the number of weights for the second neural network, there is no indication that this allocation may be specifically set. However, in the same field of invention, Sarkar discloses sample training sets for a second neural network are generated by a first neural network and the generated sample training sets are fewer than all the training sets in the weight histories used by the first neural network to generate the sample training sets (P 0024). Therefore, considering the teachings of Huang, Zmora, Booss, Rutenberg and Sarkar, one having ordinary skill in the art before the effective filing date of the invention would have been motivated to combine placing a higher percentage of the set of weight factors on the first part of the convolutional neural network relative to the second part of the convolutional neural network with the teachings of Huang, Zmora, Booss and Rutenberg with the motivation to provide a more optimal system for providing the necessary weights for training a neural network (Sarkar: P 0004).
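Purely for illustration of the fill-then-spill allocation the examiner reads into Huang's Fig 9 (P 0124), and with the capacity value invented for the example:

    def allocate_weights(weights, first_memory_capacity):
        """Store weights in the first memory until its capacity is filled,
        then place the remainder in the second memory."""
        return weights[:first_memory_capacity], weights[first_memory_capacity:]

    weights = list(range(100))
    first_mem, second_mem = allocate_weights(weights, 80)
    # A higher percentage (80%) lands in the first memory than in the second (20%).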
Response to Arguments
Applicant's arguments filed 10/23/2025 have been fully considered but they are not persuasive.
The applicant argues:
[C]ontrary to the Final Action's interpretation of Huang, Huang does not actually teach or suggest applying anything to different portions/aspects of an image, i.e., the claimed first and second subcomponents.
Applicant notes that on p. 7, the Final Action admits that Huang does not disclose "crop apart input data into," but if this is the case, "a first processor configured to: … different subcomponents of the input data," where the ellipsis is meant to suggest ignoring the limitations, is nonsensical, and vitiates the meaning of the limitation.
The examiner respectfully disagrees. Huang discloses performing windowing on an input image applied to a neural network. Areas of the image, or a series of images that are selected via a filter, are referred to as neighborhoods, and then used as input activations. Weights are applied in convolutions to different neighborhoods of the input feature map to create different activations which are used to create an output feature map (P 0042, 0044). The results of the activation are stored in the memory system (P 0071) in a plurality of memory locations (P 0075). The selection of neighborhoods of input images that are then used as input activations and stored in a memory array is analogous to the claimed first and second subcomponents of an image.
The examiner notes that the phrase “a first processor configured to: …” is included merely to preserve the language of the claim. As noted above, Huang clearly discloses different subcomponents of the input data. The examiner rejected the limitations including “crop apart input data” simply because Huang does not disclose the language “crop”, although Huang does disclose the functionality.
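Solely to illustrate the windowing mechanism described in the response above, and not as a reproduction of Huang's implementation, a filter applied to successive neighborhoods of an input image can be sketched as follows (all names and sizes are hypothetical):

    import numpy as np

    def convolve_neighborhoods(image, kernel):
        """Apply a filter to each neighborhood of the input image, producing
        one output activation per neighborhood (a sliding window)."""
        kh, kw = kernel.shape
        h, w = image.shape
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                neighborhood = image[i:i + kh, j:j + kw]   # one "neighborhood"
                out[i, j] = np.sum(neighborhood * kernel)  # one output activation
        return out

    activations = convolve_neighborhoods(np.random.rand(8, 8), np.ones((3, 3)))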
The applicant argues:
Modifying the teachings of Huang with Zmora would again, ruin Huang's disclosed method of operation. Applicant notes that as described at, e.g., [0011] of the present application, storage space is not a concern - rather, when weight factors are stored separately, e.g., weight factors stored in a first part of the network locally do not need to be updated between input data sets. Hence, as will be discussed below, there is a particular manner in which the weight factors are divided.
The examiner respectfully disagrees. As noted above, Zmora specifically discloses breaking data into smaller blocks (P 0146) and splitting computations to enable training of very large neural networks in which the weights of all layers would not fit into the memory of a single computational node (P 0185). Each of Huang and Zmora describes a similar embodiment of selecting subcomponents of input data and storing the subcomponents in different memory locations for the purpose of applying the data to neural network computations. Therefore, the examiner disagrees that combining Zmora with Huang would ruin Huang's disclosed method of operation.
The applicant argues:
The Final Action additionally asserts that Huang contemplates "distribute a first subcomponent of the different subcomponents of the input data to a first.. memory controller and a second subcomponent of the different subcomponents of the input data to a second memory controller." The Final Action at p. 5 further explains that "[t]he application of the weights to the different neighborhoods.. to produce different activation inputs[sic], which are the results of filters being applied Applicant disagrees.
The examiner respectfully disagrees. As explained above, the activation inputs derived from the filtered neighborhoods of the input image are in fact distributed to different memory locations of the memory array. This is evident in Figures 6A – 7C of Huang.
Applicant’s arguments with respect to claims 1, 7, and 16 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Independent Claims 1, 7 and 16 have been amended to include the following or similar limitations:
wherein the convolutional neural network system is configured to divide the set of weight factors according to a ratio of amounts of computing power to be provided by the first and second far side memory controllers to weight factors
The examiner has combined new prior art reference Booss with Huang to reject these limitations. As noted in the rejection, Huang discloses the computational capacity of a processing engine array is comprised of an array of processing engines (P 0077, 0079) wherein weights are stored in different proportions and are sent to the processing engine array (P 0090). While it would be obvious to relate the proportions of the weights stored in the memory to a number of processing engines in the processing engine array, Huang does not explicitly disclose this feature. Booss discloses data may be assigned a weight class based on the importance or priority of the data, wherein data stored in cache is assigned a different weight class than data stored in memory pages (P 0028) dividing, into a first portion of memory resources and a second portion of memory resources, a plurality of memory resources included in a cache coupled with a database, the plurality of memory resources storing data from the database, the first portion of memory resources being occupied by data assigned to a first weight class, and the second portion of memory resources being occupied by data assigned to a second weight class (Claim 1) the first portion of memory resources is selected based at least on the ratio associated with the first portion of memory resources being higher than a respective ratio of the second portion of memory resources (Claim 4). Adding Booss to Huang would assign weighted data to respective processing engines based on the weights assigned to the data.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHN M HEFFINGTON whose telephone number is (571)270-1696. The examiner can normally be reached on Monday through Friday from 9:30 am to 5:30 pm Eastern.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Cesar B Paula, can be reached at telephone number (571)272-4128. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center and the Private Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from Patent Center or Private PAIR. Status information for unpublished applications is available through Patent Center and Private PAIR for authorized users only. Should you have questions about access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form.
/J.M.H/Examiner, Art Unit 2145 1/9/2026
/CESAR B PAULA/Supervisory Patent Examiner, Art Unit 2145