Prosecution Insights
Last updated: April 19, 2026
Application No. 17/463,341

RECONFIGURABLE EXECUTION OF MACHINE LEARNING NETWORKS

Status: Final Rejection — §103
Filed: Aug 31, 2021
Examiner: MAC, GARY
Art Unit: 2127
Tech Center: 2100 — Computer Architecture & Software
Assignee: Texas Instruments Incorporated
OA Round: 4 (Final)

Grant Probability: 36% (At Risk)
Expected OA Rounds: 5-6
Time to Grant: 5y 0m
With Interview: 61%

Examiner Intelligence

Career Allow Rate: 36% (5 granted / 14 resolved; -19.3% vs TC avg)
Interview Lift: +25.0% (strong lift among resolved cases with interview)
Avg Prosecution: 5y 0m
Career History: 50 total applications across all art units; 36 currently pending
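The headline figures above appear to be simple arithmetic on the career record, assuming the dashboard equates grant probability with the career allow rate and adds the interview lift in percentage points. A minimal sketch of that reading (the methodology is an assumption, not documented by the page):

```python
# Sketch of how the dashboard's numbers seem to be derived. Assumptions:
# grant probability = career allow rate, and the with-interview figure adds
# the interview lift in percentage points.

granted = 5
resolved = 14
interview_lift = 0.25  # +25.0 points, per the dashboard

allow_rate = granted / resolved                # 0.357... -> shown as 36%
with_interview = allow_rate + interview_lift   # 0.607... -> shown as 61%

print(f"Career allow rate: {allow_rate:.0%}")      # 36%
print(f"With interview:    {with_interview:.0%}")  # 61%
```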

Statute-Specific Performance

Statute   Rate     vs TC avg
§101      38.4%    -1.6%
§103      41.9%    +1.9%
§102       8.0%    -32.0%
§112      10.1%    -29.9%

Tech Center averages are estimates. Based on career data from 14 resolved cases.

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant’s arguments filed 11/25/2025 have been fully considered, but they are not persuasive with regard to the §103 rejections. The amendments have overcome the §101 rejections, and the §101 rejections have accordingly been withdrawn.

Applicant’s Argument: On pages 8-10 of Applicant’s response, applicant states “There is no grouping of layers in Lin. More specifically, Lin fails to teach grouping of layers into a first set of groups in executing a machine learning (ML) model in an initial execution configuration, and grouping the layers into a second set of groups in executing the ML model in a subsequent execution configuration that is different from the initial execution configuration, as recited in claim 1. Moreover, as further indicated in claim 1, one of these execution configurations is a configuration of the ML model that prioritizes a first performance criterion over a second performance criterion, while the other of these execution configurations is a configuration of the ML model that prioritizes the second performance criterion over the first. Lin also fails to show this feature of claim 1, in which different layer groupings are associated with different relative priorities of performance criteria. Huang, which is applied for layer grouping, does not offset the deficiencies of Lin. Huang indicates that a subset of layers of a neural network may be processed together. However, this falls well short of teaching that layers are grouped differently in different execution configurations, which have different priorities among first and second performance criterion.”

Examiner’s Response: Applicant’s argument is not persuasive. In response to applicant’s arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). Applicant does not explain how the combination of Lin in view of Huang fails to teach the limitation of applying different layer groupings and executing different configurations.

Applicant’s Argument: On pages 8-10 of Applicant’s response, applicant states “A further distinction over Lin is that the ‘machine learning (ML) model [is] configured for the electronic device according to first and second performance criteria enabling the ML model to be executed by the electronic device in different configurations,’ as recited in claim 1. In addition to the execution configurations noted above, the different configurations further includes ‘at least one compromise configuration on a Pareto curve that is a function of the first and second performance criteria,’ as recited in claim 1. Thus, claim 1 provides a more proactive approach in that different execution configurations configured for the electronic device are already stored on the electronic device, which may then select from, and switch between, the different execution conditions.”

Examiner’s Response: Applicant’s argument is not persuasive. Applicant’s arguments with respect to claim 1 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 5-9, 12-14, and 21-24 are rejected under 35 U.S.C. 103 as being unpatentable over Lin (US20160328644A1) in view of Huang (US20220121914A1) and Palm (US20220391555A1).

Regarding claim 1, Lin teaches:

“an electronic device, comprising: a cache memory” ([0027, 0096, Figures 1 & 2], A system is shown in Figure 1 with a dedicated memory block to store instructions. Each processor also has a local memory block (cache memory) associated with it for local storage. During execution, the processor may load some instructions into cache to increase access speed.)

“a second memory to store instructions for execution of a machine learning (ML) model configured for the electronic device according to first and second performance criteria enabling the ML model to be executed by the electronic device in different configurations including a configuration prioritizing the first performance criterion over the second performance criterion, a configuration prioritizing the second performance criterion over the first performance criterion, and” ([0027, 0046, 0054-0056, 0082, Figure 1], A program memory (second memory) may provide instructions to the general-purpose processor for executing a first or a second configuration. Configurations of a machine learning model contain information describing the model’s latency, processor requirements, and memory bandwidth requirements. A current configuration (first execution data) may have the model optimized to run based on a performance requirement of lowest latency. When it is determined that one or more applications on the same processor consume an increased amount of memory bandwidth, the system may select a new configuration (second execution data) of the model having a performance requirement of lowest memory bandwidth. The proposed configurations are intended to be an improvement over the previous configurations based on the system resources and performance specifications, which prioritizes one performance criterion over another. The deep convolutional network may include multiple types of layers.)

“a ML accelerator coupled to the cache memory and the second memory” ([0027, Figure 1], A system is shown in Figure 1 with a neural processing unit (ML accelerator) that includes a memory block to store task information and is coupled to the dedicated memory block to receive instructions for executing tasks.)
“one or more processors coupled to the cache memory, the second memory, and the ML accelerator, wherein the one or more processors are configured to execute instructions causing the electronic device to” ([0027, 0029, Figure 1], The adaptive selection method of an artificial neural network is performed using a system-on-a-chip, which may include a GPU and a CPU (one or more processors) with distributed memory blocks associated with each processor. It is implied that the memory blocks associated with each processor are on-chip memory (cache memory). The dedicated memory block (second memory) is an external memory. The NPU is the ML accelerator. These components cause the system to perform the adaptive selection technique.)

“select an initial execution configuration, from the different configurations, based on an initial condition associated with executing the ML model, and at least one of the first performance criterion and the second performance criterion” ([0052-0053, Figure 4], The system resources (condition) and performance requirements, such as latency and accuracy criteria (first performance criterion and second performance criterion), are factors used in the adaptive selection of configurations. The system runs the initial baseline model with a current configuration and continuously monitors changes in requirements and resource constraints. The system may choose from a plurality of configurations.)

“execute the ML model in the initial execution configuration” ([0053, 0058-0059], The system may execute the current configuration with a first processor.)

“detect a runtime condition different from the initial condition” ([0054-0057, Figure 4], A mapper collects information about resource availability and performance specifications to propose new configurations when conditions have changed. The system may dynamically select the proposed new configuration if it determines that the new configuration is an improvement over the previous configuration. Configurations contain information describing the model, such as latency, processor requirements, and memory bandwidth requirements. A configuration may be selected based on the memory bandwidth factor (runtime condition).)

“select a subsequent execution configuration, from different configurations, based on the runtime condition associated with executing the ML model and at least one of the first performance criterion and the second performance criterion, the subsequent execution configuration being different than the initial execution configuration” ([0054-0057, Figure 4], The system may dynamically select the proposed new configuration if it determines that the new configuration is an improvement over the previous configuration. Configurations contain information describing the model, such as latency, processor requirements, and memory bandwidth requirements. A configuration may be selected based on the memory bandwidth factor (runtime condition).)

“switch execution of the ML model from the initial execution configuration to the subsequent execution configuration” ([0055], The controller may dynamically select the proposed new configurations.)
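To make the mapped technique concrete, the rejection reads Lin as storing several pre-built execution configurations and switching among them when runtime conditions change. Below is a minimal sketch of that selection loop, assuming lower-is-better latency and bandwidth scores; every name and number is a hypothetical illustration, not code from Lin.

```python
# Hypothetical sketch of adaptive configuration selection as the rejection
# characterizes Lin: pick the lowest-latency configuration that fits the
# currently available memory bandwidth, and switch when conditions change.

from dataclasses import dataclass

@dataclass
class ExecConfig:
    name: str
    latency_ms: float          # first performance criterion (execution time)
    mem_bandwidth_gbps: float  # second performance criterion (memory bandwidth)

CONFIGS = [
    ExecConfig("latency_first", latency_ms=4.0, mem_bandwidth_gbps=12.0),
    ExecConfig("bandwidth_first", latency_ms=9.0, mem_bandwidth_gbps=3.0),
]

def select_config(available_bandwidth_gbps: float) -> ExecConfig:
    """Choose the lowest-latency configuration within the bandwidth budget."""
    feasible = [c for c in CONFIGS if c.mem_bandwidth_gbps <= available_bandwidth_gbps]
    if not feasible:  # nothing fits: fall back to the most bandwidth-frugal config
        return min(CONFIGS, key=lambda c: c.mem_bandwidth_gbps)
    return min(feasible, key=lambda c: c.latency_ms)

initial = select_config(available_bandwidth_gbps=16.0)    # initial condition
subsequent = select_config(available_bandwidth_gbps=4.0)  # other apps now consume bandwidth
assert initial.name == "latency_first" and subsequent.name == "bandwidth_first"
```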
Lin does not explicitly disclose an implementation of “at least one compromise configuration on a Pareto curve that is a function of the first and second performance criteria”, “execute the ML model in the initial execution configuration including grouping the set of layers into a first set of groups”, and “switch execution of the ML model from the initial execution configuration to the subsequent execution configuration including grouping the set of layers into a second set of groups that is different than the first set of groups”. However, Huang discloses in the same field of endeavor:

“execute the ML model in the initial execution configuration including grouping the set of layers into a first set of groups” ([0002, 0004, 0006-0007, 0121-0129], When evaluating (executing) the neural network, the layers of the neural network may be grouped together as a subset. The specification (par. 122-128) shows the possible configurations of grouping the layers together.)

“switch execution of the ML model from the initial execution configuration to the subsequent execution configuration including grouping the set of layers into a second set of groups that is different than the first set of groups” ([0002, 0004, 0006-0007, 0121, 0130, Figs. 11A & 11B], When evaluating (executing) the neural network, the layers of the neural network may be grouped together as a subset. Figures 11A and 11B show two different configurations of grouping the layers for evaluation.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of “execute the ML model in the initial execution configuration including grouping the set of layers into a first set of groups” and “switch execution of the ML model from the initial execution configuration to the subsequent execution configuration including grouping the set of layers into a second set of groups that is different than the first set of groups” from Huang into the teaching of Lin. Doing so can optimize the execution of a neural network on multicore hardware (Huang, abstract).

Lin in view of Huang does not explicitly disclose an implementation of “at least one compromise configuration on a Pareto curve that is a function of the first and second performance criteria”. However, Palm discloses in the same field of endeavor:

“... and at least one compromise configuration on a Pareto curve that is a function of the first and second performance criteria, ...” ([0023, 0068, 0084, Figure 3], A multi-objective optimization is used to model a system by determining Pareto-optimal points. A Pareto front is used to optimize the configuration based on two target parameters.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “at least one compromise configuration on a Pareto curve that is a function of the first and second performance criteria” from Palm into the teaching of Lin in view of Huang. Doing so can improve the design of a neural network system by implementing a multi-objective optimization process using a Pareto-optimal layout (Palm, abstract).

Regarding claim 8: Claim 8 recites a process (“A method comprising”) that performs the same process as the system described in claim 1. Claim 8 is therefore rejected for the same reasons given for claim 1.
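The “Pareto curve” limitation that Palm is cited for has a simple computational reading: score each stored configuration on the two criteria, keep those no other configuration beats on both, and treat interior points on that front as the compromise configurations. A minimal sketch under that reading, with invented numbers (lower is better on both axes):

```python
# Sketch of a Pareto front over (latency, memory bandwidth) pairs, minimizing
# both axes. The candidate configurations are made up for illustration.

def pareto_front(points):
    """Return the points not dominated by any other point (minimize both)."""
    return [
        p for p in points
        if not any(q[0] <= p[0] and q[1] <= p[1] and q != p for q in points)
    ]

candidates = [(4.0, 12.0), (9.0, 3.0), (6.0, 6.0), (8.0, 8.0)]
print(pareto_front(candidates))
# -> [(4.0, 12.0), (9.0, 3.0), (6.0, 6.0)]: (8.0, 8.0) is dominated by (6.0, 6.0),
# and (6.0, 6.0) is the compromise configuration between the two extremes.
```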
Regarding claim 2, Lin teaches:

“wherein the first performance criterion comprises an execution time of the ML model using the electronic device, and wherein the second performance criterion comprises a memory access bandwidth” ([0026, 0053, 0055-0056, Figure 1], Performance requirements may include latency (execution time), memory bandwidth, and processor occupancy. The selection of a configuration for a machine learning model may be based on different factors. The performance requirement is related to the ML model running on the system.)

Regarding claims 5 and 12, Lin teaches:

“wherein each of the initial condition and the runtime condition is one of memory bandwidth utilization, temperature, amount of power, and amount of processor compute cycles available” ([0054], The system resource availability may include power consumption, amount of memory bandwidth, and processor occupancy.)

Regarding claims 6 and 13, Lin teaches:

“wherein the initial condition is processor compute cycles and the runtime condition is an amount of memory bandwidth in use” ([0051, 0054], The system resource availability may include power consumption, amount of memory bandwidth, and processor occupancy. Adaptive model conversion may take place when conditions change, such as when a model is transferred from a server to a mobile device. The model may be designed for a server, where a large amount of processor compute cycles is available. When the model is transferred to a mobile device, the device may be limited in memory bandwidth.)

Regarding claim 7, Lin in view of Huang and Palm teaches:

“wherein the at least one compromise configuration of the ML model that equally prioritizes the first performance criterion and the second performance criterion” ([0054-0057, Fig. 4], A mapper component continuously monitors system resource availability and model performance, such as latency, power consumption, processor occupancy, and memory bandwidth available. The mapper can propose a new configuration based on more than one performance criterion; for example, the proposed new model configuration can be based on running the model with the lowest latency while using the lowest amount of memory bandwidth. The determination of new configurations is based on many factors and is a multidimensional optimization problem. Many configurations are stored in a database, and each configuration is evaluated to determine whether the requirements are satisfied. The database can contain a third configuration of the ML model. Palm (par. 23) further teaches a multi-objective optimization method to optimize multiple target parameters.)

Regarding claim 9, Lin teaches:

“wherein the first performance criterion comprises an execution time of the ML model using the computing device, and wherein the second performance criterion comprises a memory access bandwidth” ([0026, 0053, 0055-0056, Figure 1], Performance requirements may include latency (execution time), memory bandwidth, and processor occupancy. The selection of a configuration for a machine learning model may be based on different factors. The performance requirement is related to the ML model running on the system.)

Regarding claim 14, Lin in view of Huang and Palm teaches:

“wherein the at least one compromise configuration of the ML model equally prioritizes the first performance criterion and the second performance criterion” ([0054-0057, Fig. 4], A mapper component continuously monitors system resource availability and model performance, such as latency, power consumption, processor occupancy, and memory bandwidth available. The mapper can propose a new configuration based on more than one performance criterion and on system resources; for example, the proposed new model configuration can be based on running the model with the lowest latency while using the lowest amount of memory bandwidth. The determination of new configurations is based on many factors and is a multidimensional optimization problem. Many configurations are stored in a database, and each configuration is evaluated to determine whether the requirements are satisfied. The database can contain a third configuration of the ML model. Palm (par. 23) further teaches a multi-objective optimization method to optimize multiple target parameters.)

Regarding claim 21, Lin teaches:

“wherein the execution of ML model in the initial execution configuration includes loading data” ([0030, 0046-0048, Figs. 2 & 3B], The local processing units can be used to execute a portion of the deep convolutional network. The routing connection processing unit is responsible for copying the output of one local processing unit to provide as input to another local processing unit.)

Lin does not explicitly disclose an implementation of “wherein the execution of ML model in the initial execution configuration includes loading data associated with layers of a group of the first set of groups into the cache memory, executing the layers of the group”. However, Huang discloses in the same field of endeavor:

“wherein the execution of ML model in the initial execution configuration includes loading data associated with layers of a group of the first set of groups into the cache memory, executing the layers of the group, and copying an output of the executing of the layers of the group from the cache memory to a second memory” ([0144-0145, Fig. 15], Input data for the first layer group is loaded into memory for processing. A portion of the output is written to the local on-chip memory and another portion of the output is written to the shared on-chip memory.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “wherein the execution of ML model in the initial execution configuration includes loading data associated with layers of a group of the first set of groups into the cache memory, executing the layers of the group” from Huang into the teaching of Lin. Doing so can optimize the execution of a neural network on multicore hardware (Huang, abstract).

Regarding claim 22, Lin teaches:

“wherein the loading of the data” ([0027, 0030, 0046-0048, Figs. 2 & 3B], The local processing units can be used to execute a portion of the deep convolutional network. The routing connection processing unit is responsible for copying the output of one local processing unit to provide as input to another local processing unit. The data is stored in local memory.)

Lin does not explicitly disclose an implementation of “wherein the loading of the data associated with the layers of the group of the first set of groups loads the data from the second memory”. However, Huang discloses in the same field of endeavor:

“wherein the loading of the data associated with the layers of the group of the first set of groups loads the data from the second memory” ([0146, Fig. 15], Data from the preceding layer group is read from the local on-chip memory and the shared on-chip memory.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “wherein the loading of the data associated with the layers of the group of the first set of groups loads the data from the second memory” from Huang into the teaching of Lin. Doing so can optimize the execution of a neural network on multicore hardware (Huang, abstract).

Regarding claim 23, Lin teaches:

“wherein the executing of the ML model in the subsequent execution configuration includes loading data” ([0030, 0046-0048, Figs. 2 & 3B], The local processing units can be used to execute a portion of the deep convolutional network. The routing connection processing unit is responsible for copying the output of one local processing unit to provide as input to another local processing unit.)

Lin does not explicitly disclose an implementation of “loading data associated with layers of a group of the second set of groups into the cache memory”. However, Huang discloses in the same field of endeavor:

“wherein the executing of the ML model in the subsequent execution configuration includes loading data associated with layers of a group of the second set of groups into the cache memory, executing the layers of the group, and copying an output of the executing of the layers of the group from the cache memory to a second memory” ([0144-0147, Fig. 15], The second and subsequent layer groups are processed when the previous layer group has finished. Data is loaded to evaluate the subsequent layer groups. A portion of the output is written to the local on-chip memory and another portion of the output is written to the shared on-chip memory.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “loading data associated with layers of a group of the second set of groups into the cache memory” from Huang into the teaching of Lin. Doing so can optimize the execution of a neural network on multicore hardware (Huang, abstract).

Regarding claim 24, Lin teaches:

“wherein the loading of the data” ([0027, 0030, 0046-0048, Figs. 2 & 3B], The local processing units can be used to execute a portion of the deep convolutional network. The routing connection processing unit is responsible for copying the output of one local processing unit to provide as input to another local processing unit. The data is stored in local memory.)

Lin does not explicitly disclose an implementation of “wherein the loading of the data associated with the layers of the group of the second set of groups includes loading the data from the second memory”. However, Huang discloses in the same field of endeavor:

“wherein the loading of the data associated with the layers of the group of the second set of groups includes loading the data from the second memory” ([0146-0147, Fig. 15], Data from the preceding layer group is read from the local on-chip memory and the shared on-chip memory. The processing loop continues until all of the layer groups have been evaluated. Therefore, the second layer group’s output will be loaded from memory to evaluate the third layer group.)

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teaching of “wherein the loading of the data associated with the layers of the group of the second set of groups includes loading the data from the second memory” from Huang into the teaching of Lin. Doing so can optimize the execution of a neural network on multicore hardware (Huang, abstract).
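The execution pattern claims 21-24 recite, as the rejection characterizes Huang’s Figure 15 loop, is a stage/execute/spill cycle per layer group, with the initial and subsequent configurations using different groupings. A toy sketch of that pattern follows; the dicts standing in for the cache and second memory, and all other names, are invented for illustration.

```python
# Toy model of grouped layer execution: for each group, load the group's input
# into the cache, run the group's layers, then copy the output to second memory.

def run_grouped(layers, grouping, x, second_memory):
    """Execute `layers` (a list of callables) group by group per `grouping`."""
    cache = {}
    for group in grouping:                         # e.g. [(0, 1), (2,)]
        cache["in"] = second_memory.get("out", x)  # load group input into cache
        for i in group:                            # execute the layers of the group
            cache["in"] = layers[i](cache["in"])
        second_memory["out"] = cache["in"]         # copy group output to second memory
        cache.clear()
    return second_memory["out"]

layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

print(run_grouped(layers, [(0, 1), (2,)], 5, {}))  # first set of groups  -> 9
print(run_grouped(layers, [(0,), (1, 2)], 5, {}))  # second set of groups -> 9
# Same mathematical result; what changes between configurations is how often
# data is staged into cache and spilled to second memory.
```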
Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to GARY MAC, whose telephone number is (703) 756-1517. The examiner can normally be reached Monday - Friday, 8:00 AM - 5:00 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abdullah Kawsar, can be reached on (571) 270-3169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/GARY MAC/
Examiner, Art Unit 2127

/ABDULLAH AL KAWSAR/
Supervisory Patent Examiner, Art Unit 2127

Prosecution Timeline

Aug 31, 2021
Application Filed
Sep 18, 2024
Non-Final Rejection — §103
Jan 22, 2025
Response Filed
Feb 14, 2025
Final Rejection — §103
Jun 20, 2025
Request for Continued Examination
Jun 24, 2025
Response after Non-Final Action
Jul 23, 2025
Non-Final Rejection — §103
Nov 25, 2025
Response Filed
Jan 28, 2026
Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596907
NEURAL NETWORK OPERATION APPARATUS AND METHOD
2y 5m to grant Granted Apr 07, 2026
Patent 12572842
METHODS AND SYSTEMS FOR DECENTRALIZED FEDERATED LEARNING
2y 5m to grant Granted Mar 10, 2026
Study what changed to get past this examiner. Based on 2 most recent grants.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 36%
With Interview: 61% (+25.0%)
Median Time to Grant: 5y 0m
PTA Risk: High

Based on 14 resolved cases by this examiner. Grant probability derived from career allow rate.
