DETAILED ACTION
Status of the Claims
The following is a non-final Office Action in response to claims filed 17 July 2023.
Claims 1-20 are pending.
Claims 1-20 have been examined.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 17 July 2023 has been considered by the Examiner.
Priority
Applicant’s claim for the benefit of a prior-filed application, Korean Application 10-2022-0176169, filed 15 December 2022, which claims priority to provisional Application 61/928,308, under 35 U.S.C. 119(e) or under 35 U.S.C. 120, 121, or 365(c) is acknowledged.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims are directed to a process (an act, or series of acts or steps), a machine (a concrete thing, consisting of parts, or of certain devices and combination of devices), and a manufacture (an article produced from raw or prepared materials by giving these materials new forms, qualities, properties, or combinations, whether by hand labor or by machinery). Thus, each of the claims falls within one of the four statutory categories (Step 1). Although the claims recite a method (process) and an apparatus, the claims recite optimizing partitions of a neural network, which is an abstract idea in the mathematical-concepts grouping.
The limitations of “converting, a neural network, stored in the storage hardware, from a first neural network format into a second neural network format; obtaining, information about hardware configured to perform a neural network operation for the neural network and obtaining partition information; dividing the neural network in the second neural network format into partitions, wherein the dividing is based on the information about the hardware and the partition information, wherein each partition comprises a respective layer with an input thereto and an output thereof; optimizing each of the partitions based on a relationship between the input and the output of the corresponding layer; and converting the optimized partitions into the first neural network format,” as drafted, describe a process that, under its broadest reasonable interpretation, covers mathematical concepts (mathematical relationships, mathematical formulas or equations, and mathematical calculations) but for the recitation of generic computer components (Step 2A, Prong One). That is, other than reciting “A method of processing data performed by a computing device comprising a processing hardware” (or “An apparatus comprising: one or more processors; memory storing instructions configured to cause the one or more processors to perform a process comprising:” in claim 13), nothing in the claim elements precludes the steps from falling within the mathematical-concepts grouping. For example, but for that language, “converting,” “obtaining,” “dividing,” and “optimizing” in the context of these claims encompass mathematical analysis, i.e., optimization, which is the mathematical concept of using mathematical relationships and calculations to achieve a most desired result. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation as a mathematical concept but for the recitation of generic computer components, then it falls within the grouping of abstract ideas (Step 2A, Prong One: YES). Accordingly, the claims recite an abstract idea.
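For illustration of why the recited steps, apart from the generic computer components, amount to an algorithmic sequence of mathematical/data transformations, they can be rendered as the following generic sketch; every function and variable name below is a hypothetical stand-in and is not drawn from the claims or the record:

    # Illustrative sketch only; all names are hypothetical.
    def convert_format(layers, target):
        # Re-tag each layer with a different serialization format (first <-> second).
        return [dict(layer, fmt=target) for layer in layers]

    def divide(layers, num_devices):
        # Split the layer sequence into one contiguous partition per hardware element.
        size = max(1, len(layers) // num_devices)
        return [layers[i:i + size] for i in range(0, len(layers), size)]

    def optimize(partition):
        # Hypothetical per-partition optimization keyed to each layer's input/output relationship.
        return [layer for layer in partition if layer["in"] != layer["out"]]

    network = [{"in": 64, "out": 32, "fmt": "first"}, {"in": 32, "out": 32, "fmt": "first"}]
    partitions = divide(convert_format(network, "second"), num_devices=2)
    optimized = [optimize(p) for p in partitions]
    result = [convert_format(p, "first") for p in optimized]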
This judicial exception is not integrated into a practical application (Step 2A, Prong Two). In particular, the claim recites only one additional element: using processing hardware or one or more processors to perform the steps. The processing hardware or one or more processors is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component. Specifically, the claims amount to nothing more than an instruction to apply the abstract idea using a generic computer, invoking computers as tools by adding the words “apply it” (or an equivalent) to the judicial exception; see MPEP 2106.04(d)(I), discussing MPEP 2106.05(f). The recitation of a machine learning model in the limitations likewise merely indicates a field of use or technological environment in which the judicial exception is performed. Although the additional element “neural network” limits the identified judicial exception, this type of limitation merely confines the use of the abstract idea to a particular technological environment (neural networks) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(h). Accordingly, the combination of these additional elements does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea, even when considered as a whole (Step 2A, Prong Two: NO).
The claim does not include a combination of additional elements sufficient to amount to significantly more than the judicial exception (Step 2B). As discussed above with respect to integration of the abstract idea into a practical application (Step 2A, Prong Two), the combination of additional elements of using processing hardware or one or more processors to perform the steps amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Therefore, considering the additional elements alone and in combination, there is no inventive concept in the claim. As such, the claims are not patent eligible, even when considered as a whole (Step 2B: NO).
Claims 2-12 and 14-20 recite additional limitations that further limit the abstract idea previously identified (i.e., how the dividing/partitions are performed, determined, and optimized); these limitations are still directed to the abstract idea and do not supply an inventive concept that meaningfully limits it. Again, as discussed with respect to claims 1 and 13, these limitations are no more than mere instructions to apply the exception using a computer or computing components. Accordingly, the additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. Even when considered as a whole, the claims do not integrate the judicial exception into a practical application at Step 2A or provide an inventive concept at Step 2B.
Claims 1-20 are therefore not eligible subject matter, even when considered as a whole.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3 and 9-15 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US PG Pub. 2021/0374518) in view of Darvish et al. (US PG Pub. 2020/0210840).
As per claims 1 and 13, Zhu discloses a method of processing data performed by a computing device comprising processing hardware and storage hardware, and an apparatus comprising one or more processors and memory storing instructions configured to cause the one or more processors to perform a process, the method comprising (processors, model, memory, GPUs, Zhu ¶57-¶58):
obtaining, by the processing hardware, information about hardware configured to perform a neural network operation for the neural network and obtaining partition information (In at least one embodiment, deep networks comprise L layers in sequence. In at least one embodiment, each layer L.sub.i is modeled by a forward computation function ƒ.sub.i with parameters w.sub.i. In at least one embodiment, given a number of partitions K, e.g., a number of GPUs typically, a model is partitioned into K parts, Zhu ¶67-¶69);
dividing the neural network in the second neural network format into partitions, wherein the dividing is based on the information about the hardware and the partition information, wherein each partition comprises a respective layer with an input thereto and an output thereof (In at least one embodiment, deep networks comprise L layers in sequence. In at least one embodiment, each layer L.sub.i is modeled by a forward computation function ƒ.sub.i with parameters w.sub.i. In at least one embodiment, given a number of partitions K, e.g., a number of GPUs typically, a model is partitioned into K parts, Zhu ¶67-¶69; In at least one embodiment, a dependency of intermediate encoder in skip s.sub.1 is present in a pipeline-based parallel U-Net. In at least one embodiment, while using a synchronous stochastic gradient descent and pipeline parallelism technique, it requires that a model needs to be implemented in a sequential way. In at least one embodiment, however, each e.sub.i, i=1, 2, . . . , 4, is used in both encoder and decoder, which affects automated partition while using a synchronous stochastic gradient descent and pipeline parallelism technique, ¶71; based upon devices, ¶74) (Examiner notes the ability to divide the model among the determined number of GPUs as the ability to divide the neural network into partitions based upon information about the hardware; an illustrative sketch of such K-way partitioning appears after this claim mapping);
optimizing each of the partitions based on a relationship between the input and the output of the corresponding layer (additional fixed function logic 3116 can also include machine-learning acceleration logic, such as fixed function matrix multiplication logic, for implementations including optimizations for machine learning training or inferencing, ¶431); and
converting the optimized partitions into the first neural network format (neural network is modified, for precedent layer, Zhu ¶72; model can be updated, retrained, fine-tuned, ¶561).
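By way of a minimal illustrative sketch (not Zhu’s code; all names here are hypothetical), the K-way partitioning Zhu describes at ¶67-¶69, in which a model of L sequential layers is split into K parts, e.g., one per GPU, can be expressed as:

    def partition_layers(layers, k):
        # Split a sequence of L layers into k contiguous parts of near-equal size.
        base, extra = divmod(len(layers), k)
        parts, start = [], 0
        for i in range(k):
            end = start + base + (1 if i < extra else 0)
            parts.append(layers[start:end])
            start = end
        return parts

    layers = ["layer_%d" % i for i in range(10)]   # L = 10 layers
    print(partition_layers(layers, k=4))           # K = 4 (e.g., 4 GPUs) -> 4 parts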
While Zhu discloses the ability to modify neural networks, as well as how the models can be updated, retrained, and fine-tuned (Zhu ¶72 and ¶561), Zhu does not expressly disclose converting, by the processing hardware, a neural network, stored in the storage hardware, from a first neural network format into a second neural network format, or converting the optimized partitions into the first neural network format.
However, Darvish teaches converting, by the processing hardware, a neural network, stored in the storage hardware, from a first neural network format into a second neural network format; converting the optimized partitions into the first neural network format (In one example of the disclosed technology, a neural network accelerator is configured to perform training operations for layers of a neural network, including forward propagation and back propagation. The values of one or more of the neural network layers can be expressed in a quantized format that has lower precision than normal-precision floating-point formats. For example, block floating-point formats can be used to accelerate computations performed in training and inference operations using the neural network accelerator. Use of quantized formats can improve neural network processing by, for example, allowing for faster hardware, reduced memory overhead, simpler hardware design, reduced energy use, reduced integrated circuit area, cost savings and other technological improvements. It is often desirable that operations be performed to mitigate noise or other inaccuracies introduced by using lower-precision quantized formats. Further, portions of neural network training, such as temporary storage of activation values, can be improved by compressing a portion of these values (e.g., for an input, hidden, or output layer of a neural network), either from normal-precision floating-point or from a first block floating-point, to a lower precision number format. The activation values can be later retrieved for use during, for example, back propagation during the training phase. An input tensor for the given layer can be converted from a normal-precision floating-point format to a quantized-precision floating-point format. A tensor operation can be performed using the converted input tensor. A result of the tensor operation can be converted from the block floating-point format to the normal-precision floating-point format. The tensor operation can be performed during a forward-propagation mode or a back-propagation mode of the neural network. For example, during a back-propagation mode, the input tensor can be an output error term from a layer adjacent to (e.g., following) the given layer or weights of the given layer. As another example, during a forward-propagation mode, the input tensor can be an output term from a layer adjacent to (e.g., preceding) the given layer or weights of the given layer. The converted result can be used to generate an output tensor of the layer of the neural network, where the output tensor is in normal-precision floating-point format. In this manner, the neural network accelerator can potentially be made smaller and more efficient than a comparable accelerator that uses only a normal-precision floating-point format. A smaller and more efficient accelerator may have increased computational performance and/or increased energy efficiency. Additionally, the neural network accelerator can potentially have increased accuracy compared to an accelerator that uses only a quantized-precision floating-point format. By increasing the accuracy of the accelerator, a convergence time for training may be decreased and the accelerator may be more accurate when classifying inputs to the neural network. Reducing the computational complexity of using the models can potentially decrease the time to extract a feature during inference, decrease the time for adjustment during training, and/or reduce energy consumption during training and/or inference, Darvish ¶41-¶42).
The Darvish and Zhu references are analogous in that both are directed to optimizing neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use Darvish’s ability to convert the formats of different layers of a neural network in Zhu’s system, with a reasonable expectation of success, to yield a neural network optimization technique that results in improvements and cost savings.
The motivation is that the use of quantized formats can improve neural network processing by, for example, allowing for faster hardware, reduced memory overhead, simpler hardware design, reduced energy use, reduced integrated circuit area, cost savings, and other technological improvements. It is often desirable that operations be performed to mitigate noise or other inaccuracies introduced by using lower-precision quantized formats. Further, portions of neural network training, such as temporary storage of activation values, can be improved by compressing a portion of these values (e.g., for an input, hidden, or output layer of a neural network), either from normal-precision floating-point or from a first block floating-point, to a lower-precision number format. The activation values can be later retrieved for use during, for example, back propagation during the training phase (Darvish ¶41).
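For illustration, a minimal sketch of the general kind of conversion Darvish describes, in which a normal-precision floating-point tensor is quantized to a block floating-point format (shared exponent, low-precision mantissas) and later converted back; the block granularity and mantissa width below are hypothetical choices, not Darvish’s parameters:

    import numpy as np

    def to_block_fp(x, mantissa_bits=4):
        # Quantize a tensor to a shared-exponent, low-precision-mantissa format.
        shared_exp = np.floor(np.log2(np.max(np.abs(x)) + 1e-30))
        scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
        return np.round(x / scale), scale

    def from_block_fp(mantissas, scale):
        # Convert the quantized result back to normal-precision floating point.
        return mantissas * scale

    x = np.array([0.11, -0.52, 0.33, 0.98])
    m, s = to_block_fp(x)
    print(from_block_fp(m, s))   # approximate reconstruction of x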
As per claims 2 and 14, Zhu and Darvish disclose as shown above with respect to claims 1 and 13. Darvish further teaches wherein the partition information comprises data division direction information, the dividing of the neural network in the second format is based on the data division direction information, and the data division direction information comprises a height direction of the data, a width direction of the data, or a channel direction of the data (It is often desirable that operations be performed to mitigate noise or other inaccuracies introduced by using lower-precision quantized formats. Further, portions of neural network training, such as temporary storage of activation values, can be improved by compressing a portion of these values (e.g., for an input, hidden, or output layer of a neural network), either from normal-precision floating-point or from a first block floating-point, to a lower precision number format. The activation values can be later retrieved for use during, for example, back propagation during the training phase, Darvish ¶41; matrix or vector, ¶32) (Examiner notes that back propagation serves as the direction information that dictates the format).
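As an illustrative sketch of the claimed data division direction information (the axis convention below is a hypothetical assignment, not taken from Darvish):

    import numpy as np

    AXIS = {"height": 0, "width": 1, "channel": 2}

    def divide_data(tensor, direction, num_parts):
        # Divide layer data along the height, width, or channel direction.
        return np.array_split(tensor, num_parts, axis=AXIS[direction])

    data = np.zeros((8, 8, 16))                  # (height, width, channels)
    parts = divide_data(data, "channel", 4)      # four partitions of shape (8, 8, 4)
    print([p.shape for p in parts])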
As per claims 3 and 15, Zhu and Darvish disclose as shown above with respect to claims 1 and 13. Zhu further discloses wherein the information about the hardware comprises a number of elements of the hardware, and the dividing of the neural network comprises: determining a number of partitions to be formed based on the number of the hardware; and dividing the neural network in the second format into the partitions based on the determined number of partitions to be formed (In at least one embodiment, deep networks comprise L layers in sequence. In at least one embodiment, each layer L.sub.i is modeled by a forward computation function ƒ.sub.i with parameters w.sub.i. In at least one embodiment, given a number of partitions K, e.g., a number of GPUs typically, a model is partitioned into K parts, Zhu ¶67-¶69; In at least one embodiment, a dependency of intermediate encoder in skip s.sub.1 is present in a pipeline-based parallel U-Net. In at least one embodiment, while using a synchronous stochastic gradient descent and pipeline parallelism technique, it requires that a model needs to be implemented in a sequential way. In at least one embodiment, however, each e.sub.i, i=1, 2, . . . , 4, is used in both encoder and decoder, which affects automated partition while using a synchronous stochastic gradient descent and pipeline parallelism technique, ¶71; based upon devices, ¶74).
As per claim 9, Zhu and Darvish disclose as shown above with respect to claim 1. Zhu further discloses wherein the converting of the optimized partitions into the first neural network format is based on information corresponding to a weight dimension, an operator type, and/or a size of a feature of the neural network (weight, Zhu ¶72; size, ¶68).
As per claim 10, Zhu and Darvish disclose as shown above with respect to claim 1. Zhu further teaches wherein the converting of the optimized partitions into the first neural network format comprises adding a real-time operator for synchronization between the optimized partitions in the first neural network format when executed by the hardware (synchronized command, Zhu ¶642; additional storage and computation pairs can be included, ¶100).
As per claim 11, Zhu and Darvish disclose as shown above with respect to claim 2. Zhu further teaches wherein the dividing of the neural network comprises converting the partitions into multi-directional division partitions by setting a data division direction to multiple directions (bi-directional, Zhu ¶376; see also back-propagation ¶68).
As per claim 12, Zhu and Darvish disclose as shown above with respect to claim 2. Zhu further teaches wherein the dividing of the neural network comprises generating an intermediate data transmission division partition in a data division direction using multiple directions and multiple layers (intermediate, Zhu ¶329).
Claims 4-8 and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US PG Pub. 2021/0374518) and Darvish et al. (US PG Pub. 2020/0210840), further in view of Dutta et al. (US PG Pub. 2024/0046099).
As per claims 4 and 16, Zhu and Darvish disclose as shown above with respect to claims 1 and 13. The combination of Zhu and Darvish does not expressly disclose wherein the optimizing of the partitions comprises removing an operator that satisfies a predetermined condition among operators comprised in each of the partitions.
However, Dutta teaches wherein the optimizing of the partitions comprises removing an operator that satisfies a predetermined condition among operators comprised in each of the partitions (neural network pruning and hardware acceleration, Dutta ¶32; The set of layer-wise pruning ratios Δ for each DNN model is the pruning configuration, and finding the optimal set of Δ can minimize the objectives. The method of the present disclosure first generates a random population batch of configurations and selects the best-fit instance for further mutation. The optimal pruning search technique uses a cost function to determine the fitness of each DNN model. Here, selection of the best-fit individual element from the entire population is shown to be faster. The user selection process returns one individual configuration called the parent configuration, or simply the parent. The values of such a configuration (like some Δs in its encoding) may require a random modification, which is called mutation. This results in a new individual element configuration that can produce a better global sparsity at an accuracy closer to the base model. Typically, mutation happens when a change in probability value falls below a threshold probability known as the mutation rate; the lower the mutation rate, the lower the chance of mutation for any member of the configuration, ¶70) (Examiner notes the ability to prune layers as the ability to remove operations/functions).
The Dutta, Darvish, and Zhu references are analogous in that all are directed to optimizing neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to use Dutta’s ability to prune neural networks in Zhu’s system, with a reasonable expectation of success, to yield a neural network optimization technique that results in improvements and cost savings.
The motivation is that it is not always feasible to change the hardware to accommodate a new DNN model for inference on such edge devices. Selecting deployment hardware based on the DNN inference workload, multiple dependencies with many different stakeholders, mandatory testing cycles, and tight schedules make it difficult to completely replace an existing edge hardware setup. In another approach, automated transformation required a route through the model architecture dynamically composed of different network operations, making a series of decisions using reinforcement learning; however, such an approach requires training data to redesign the model (Dutta ¶6).
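For illustration, a minimal sketch of the evolutionary pruning search Dutta describes at ¶70 (a population of layer-wise pruning-ratio configurations, fitness via a cost function, selection of a parent, and mutation below a mutation rate); the cost function and constants here are hypothetical stand-ins, not Dutta’s equations:

    import random

    NUM_LAYERS, POP_SIZE, MUTATION_RATE = 5, 8, 0.3

    def cost(deltas):
        # Hypothetical fitness: reward global sparsity, penalize presumed accuracy loss.
        sparsity = sum(deltas) / len(deltas)
        return -sparsity + max(0.0, sparsity - 0.8) * 10

    population = [[random.uniform(0.0, 0.9) for _ in range(NUM_LAYERS)]
                  for _ in range(POP_SIZE)]
    parent = min(population, key=cost)           # best-fit configuration
    child = [random.uniform(0.0, 0.9) if random.random() < MUTATION_RATE else d
             for d in parent]                    # mutate some ratios of the parent
    print(parent, child)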
As per claims 5 and 17, Zhu and Darvish disclose as shown above with respect to claims 1 and 13. The combination of Zhu and Darvish does not expressly disclose wherein the optimizing of the partitions comprises determining whether to remove a crop operator or a concat operator among operators comprised in the partitions.
However, Dutta teaches wherein the optimizing of the partitions comprises determining whether to remove an [operator] among operators comprised in the partitions (neural network pruning and hardware acceleration, Dutta ¶32; layer-wise pruning ratios and the optimal pruning search via a cost function and mutation, Dutta ¶70, as quoted above with respect to claims 4 and 16) (Examiner notes the ability to prune layers as the ability to remove operations/functions).
While Dutta teaches the ability to prune layers of a neural network, Dutta does not expressly teach the “crop operator or a concat operator.”
However, the Examiner asserts that the type of operator, such as crop or concat, is simply a label for the components and adds little, if anything, to the claimed acts or steps, and thus does not serve to distinguish over the prior art. Any difference relating merely to the meaning and information conveyed through labels (i.e., the specific type of operator), which does not explicitly alter or impact the steps of the method, does not patentably distinguish the claimed invention from the prior art (MPEP 2144.04).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the operators or functions to include a crop or concat operator, since the specific type of component does not functionally alter or relate to the steps of the method, and merely labeling the information differently from that in the prior art does not patentably distinguish the claimed invention.
As per claims 6 and 18, Zhu, Darvish, and Dutta disclose as shown above with respect to claims 5 and 17. Zhu further discloses wherein, for one of the layers, the optimizing of the partitions comprises adjusting a size of the output of the one layer to correspond to a size of the input of the one layer by adding a dependent operator to the output of the one layer in response to the size of the output of the one layer being less than the size of the input of the one layer (additional or subsequent layers, Zhu ¶65).
As per claims 7 and 19, Zhu, Darvish, and Dutta disclose as shown above with respect to claims 5 and 17. Dutta further teaches wherein the optimizing of the partitions comprises removing the crop operator and the concat operator in response to the size of the output of the one layer being the same as the size of the input of the one layer (neural network pruning and hardware acceleration, Dutta ¶32; layer-wise pruning ratios and the optimal pruning search via a cost function and mutation, Dutta ¶70, as quoted above with respect to claims 4 and 16) (Examiner notes the ability to prune layers as the ability to remove operations/functions).
While Dutta teaches the ability to prune layers of a neural network, Dutta does not expressly teach the “crop operator or a concat operator.”
However, the Examiner asserts that the type of operator, such as crop or concat, is simply a label for the components and adds little, if anything, to the claimed acts or steps, and thus does not serve to distinguish over the prior art. Any difference relating merely to the meaning and information conveyed through labels (i.e., the specific type of operator), which does not explicitly alter or impact the steps of the method, does not patentably distinguish the claimed invention from the prior art (MPEP 2144.04).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the operators or functions to include a crop or concat operator, since the specific type of component does not functionally alter or relate to the steps of the method, and merely labeling the information differently from that in the prior art does not patentably distinguish the claimed invention.
As per claims 8 and 20, Zhu, Darvish, and Dutta disclose as shown above with respect to claims 5 and 17. Dutta further teaches wherein the optimizing of the partitions comprises removing the concat operator in response to the size of the output of the one layer being greater than the size of the input of the one layer (neural network pruning and hardware acceleration, Dutta ¶32; layer-wise pruning ratios and the optimal pruning search via a cost function and mutation, Dutta ¶70, as quoted above with respect to claims 4 and 16) (Examiner notes the ability to prune layers as the ability to remove operations/functions).
While Dutta teaches the ability to prune layers of a neural network, Dutta does not expressly teach the “concat operator.”
However, the Examiner asserts that the type of operator, such as concat, is simply a label for the components and adds little, if anything, to the claimed acts or steps, and thus does not serve to distinguish over the prior art. Any difference relating merely to the meaning and information conveyed through labels (i.e., the specific type of operator), which does not explicitly alter or impact the steps of the method, does not patentably distinguish the claimed invention from the prior art (MPEP 2144.04).
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the operators or functions to include a concat operator, since the specific type of component does not functionally alter or relate to the steps of the method, and merely labeling the information differently from that in the prior art does not patentably distinguish the claimed invention.
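For reference, the size-based operator handling recited in claims 6-8 and 18-20 can be sketched as follows; the operator names and decision rules below simply restate the claim language and are not drawn from Zhu, Darvish, or Dutta:

    def adjust_partition(ops, in_size, out_size):
        if out_size < in_size:
            # Claims 6/18: add a dependent operator so the output matches the input size.
            return ops + [("pad_to", in_size)]
        if out_size == in_size:
            # Claims 7/19: remove both the crop operator and the concat operator.
            return [op for op in ops if op[0] not in ("crop", "concat")]
        # Claims 8/20: output greater than input -> remove the concat operator.
        return [op for op in ops if op[0] != "concat"]

    print(adjust_partition([("conv", 3), ("crop", 2), ("concat", 2)], 64, 64))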
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure (additional art can be located on the PTO-892):
Ambardekar et al. (US PG Pub. 2018/0300616) Dynamically partitioning workload in a deep neural network module to reduce power consumption.
Croxford et al. (US PG Pub. 2023/0040673) Optimised machine learning processing.
Exner et al. (US PG Pub. 2021/0390402) Updating a neural network model on a computation device.
Any inquiry concerning this communication or earlier communications from the Examiner should be directed to ANDREW B WHITAKER whose telephone number is (571)270-7563. The examiner can normally be reached on M-F, 8am-5pm, EST.
If attempts to reach the examiner by telephone are unsuccessful, the Examiner’s supervisor, Lynda Jasmin can be reached on (571) 272-6782. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from Patent Center. Status information for published applications may be obtained from Patent Center. Status information for unpublished applications is available through Patent Center for authorized users only. Should you have questions about access to Patent Center, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) Form at https://www.uspto.gov/patents/uspto-automated-interview-request-air-form.
/ANDREW B WHITAKER/Primary Examiner, Art Unit 3629