Last updated: May 29, 2026
Application No. 17/669,301
NEURAL NETWORK PROCESSING

Non-Final OA §103
Filed: Feb 10, 2022
Examiner: PHAKOUSONH, DARAVANH
Art Unit: 2121
Tech Center: 2100 - Computer Architecture & Software
Assignee: Arm Limited
OA Round: 3 (Non-Final)
This examiner grants 50% of cases after interview (+100.0% interview lift, based on 2 resolved cases, 2023–2026). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Examiner Intelligence

PHAKOUSONH, DARAVANH
Career Allowance Rate: 50% of resolved cases (1 granted / 2 resolved), -5.0% vs TC avg
Interview Lift: +100.0% (strong; resolved cases with interview vs. without)
Typical timeline: 3y 10m average prosecution; 13 currently pending
Statute-Specific Performance

§101: 34.6% (-5.4% vs TC avg)
§103: 27.3% (-12.7% vs TC avg)
§102: 29.1% (-10.9% vs TC avg)
§112: 5.5% (-34.5% vs TC avg)
Based on career data from 2 resolved cases
Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on February 18, 2026 has been entered.

Response to Amendment/Arguments
	Applicant’s arguments filed on February 18, 2026 regarding the rejection of claims 1, 4-11, and 14-24 have been fully considered but are not persuasive. Applicant contends on pages 11-12 of the Remarks that Han does not disclose selecting one branch and using only that branch for neural network processing because the Mixture-of-Experts (MoE) structure described in Han relies on multiple expert branches whose outputs are fused. Applicant therefore asserts that Han does not disclose the amended limitations of independent claims 1 and 11 requiring that when the primary branch is selected, only the primary branch is used for the neural network processing for the part, and when the secondary branch is selected, only the secondary branch is used for the neural network processing for the part.
	However, Applicant’s contentions are not persuasive because they mischaracterize the rejection. The Office Action does not rely on Han alone for teaching exclusive branching execution. Rather, Huang expressly teaches selecting between alternative neural network branches and executing only the selected branch. For example, Huang discloses that “Layer-B 707 and Layer-C 737 illustrate two alternative execution flows or branches that the neural network can take depending on an outcome determined at the conditional layer 726” and that “only the branch including Layer-B 707 or the branch including Layer-C 737 is executed.” Thus, Huang teaches that when one branch is selected, only the selected branch is used for neural network processing, while the other branch is not executed.
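	For illustration only, the exclusive branch execution discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and not a characterization of Huang's actual implementation: the function names (`conditional_layer`, `branch_b`, `branch_c`), the mean-based condition, and the arithmetic inside each branch are all hypothetical.

```python
# Hypothetical sketch: a conditional layer picks exactly one branch to run.
# None of these names or operations come from Huang; they are assumptions.

executed = []  # records which branches actually ran


def branch_b(x):
    """Stand-in for the branch containing 'Layer-B' (assumed behaviour)."""
    executed.append("B")
    return [v * 2.0 for v in x]


def branch_c(x):
    """Stand-in for the branch containing 'Layer-C' (assumed behaviour)."""
    executed.append("C")
    return [v + 1.0 for v in x]


def conditional_layer(x, threshold=0.5):
    """Return the single branch to execute; the mean test is an assumption."""
    mean = sum(x) / len(x)
    return branch_b if mean > threshold else branch_c


def run(x):
    """Execute only the branch chosen by the conditional layer."""
    branch = conditional_layer(x)
    return branch(x)  # the unselected branch is never called


result = run([1.0, 2.0, 3.0])
```

	In this sketch the unselected branch is never invoked, so the `executed` log contains a single entry per call to `run`.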
	Applicant further contends on pages 11-12 of the Remarks that Han does not disclose processing an output feature map as a plurality of parts because certain embodiments describe fusing outputs from multiple branches. This contention is also not persuasive. The rejection does not rely on the MoE fusion embodiments for teaching the claimed “parts” limitations. Rather, Han was relied upon for teaching region-wise processing of feature map data. As discussed in the Office Action, Han discloses generating a segmentation mask that divides input data into multiple regions for which processing is performed separately [Han, section 3.1.2]. Under the broadest reasonable interpretation, these regions correspond to the claimed plurality of separate parts of an output feature map. 
	Accordingly, Applicant’s arguments are not persuasive because they attack Han individually, whereas the rejection relies on the combined teachings of Huang and Han, with Huang teaching exclusive branch execution and Han teaching processing with respect to different parts of an output feature map. 
	Applicant’s arguments regarding the rejection of claims 8 and 18 have been fully considered but are not persuasive. Applicant contends on pages 12-13 of the Remarks that Huang only discloses branches having different processing costs but does not disclose selecting which branch to use based on a measure of relative cost. Applicant further asserts that the amended claim 8 now requires determining whether to use the secondary branch based on a measure of relative cost between processing parts of an output feature map using the secondary branch and processing parts without using the secondary branch. 
	However, Applicant’s contentions are not persuasive. Under the broadest reasonable interpretation, the recited “measure of relative cost” encompasses differences in computational effort associated with executing different neural network branches. Huang expressly teaches neural network architectures in which different branches require different computational effort. For example, Huang discloses that one branch may have fewer layers and therefore require fewer computations, while other branches may require more processing depending on the task being performed (Huang, col. 13, lines 4-55). Huang further discloses that shorter branches having fewer layers may be selected when the neural network determines that less processing is needed (Huang, col. 14, lines 10-16). Under the broadest reasonable interpretation, selecting a branch having fewer layers and fewer computations when less processing is required corresponds to selecting a branch based on a relative processing cost between the available branches. Huang’s disclosure that shorter branches may be selected when less processing is needed inherently requires determining whether to proceed with that shorter branch or continue with an alternative branch requiring greater computational effort, which constitutes the recited relative cost comparison between processing using a secondary branch or processing without using the secondary branch. 
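	By way of a hedged illustration, one possible "measure of relative cost" between a shorter and a longer branch is a ratio of estimated multiply-accumulate (MAC) counts. The sketch below is an assumption made for explanatory purposes only: the layer shapes, the names `PRIMARY` and `SECONDARY`, and the `max_ratio` threshold are hypothetical and are not drawn from Huang or from the claims.

```python
# Hypothetical sketch: compare branch costs using a standard MAC estimate.
# Layer shapes and the threshold below are illustrative assumptions.


def layer_cost(in_ch, out_ch, kernel, height, width):
    """Estimated MACs for one convolution layer over a height x width part."""
    return in_ch * out_ch * kernel * kernel * height * width


def branch_cost(layers, height, width):
    """Total estimated MACs for a branch given as (in_ch, out_ch, kernel)."""
    return sum(layer_cost(i, o, k, height, width) for (i, o, k) in layers)


PRIMARY = [(8, 16, 3), (16, 16, 3), (16, 8, 3)]  # assumed longer branch
SECONDARY = [(8, 8, 1)]                          # assumed shorter branch


def relative_cost(height, width):
    """Cost of the secondary branch as a fraction of the primary branch."""
    primary = branch_cost(PRIMARY, height, width)
    return branch_cost(SECONDARY, height, width) / primary


def use_secondary(height, width, max_ratio=0.25):
    """Choose the secondary branch when it is sufficiently cheaper."""
    return relative_cost(height, width) <= max_ratio
```

	Under these assumed shapes the secondary branch costs 64 MACs per position against 4608 for the primary branch, so the ratio is 1/72 regardless of the size of the part.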
	Moreover, the rejection does not rely on Huang alone for determining which portions of the output feature map are processed using particular branches. As discussed in the Office Action, Han teaches selectively activating neural network branches conditioned on the input, and Nakvosas teaches processing different portions of the feature data using different neural network branches. Thus, the combined teachings of Huang, Han, and Nakvosas teach determining whether to use a particular branch for processing parts of an output feature map based on properties of those parts and the relative computational effort associated with different branches. 
	Accordingly, the combined teachings of Huang, Han, and Nakvosas render the subject matter of claims 8 and 18 obvious, and Applicant’s arguments are not persuasive. 
	Applicant’s arguments regarding the rejection of claims 9-10, 19-20, and 23-24 have been fully considered but are not persuasive. Applicant contends on pages 14-15 of the Remarks that Parashar does not disclose selecting which branch of a neural network is used for neural network processing based on availability of processing resources. Applicant further asserts that paragraph 98 of Parashar relates only to allocation of processors for asynchronous processing and that paragraph 99 relates to pipelined processing, and therefore these disclosures do not teach selecting a branch based on processor availability. Applicant additionally argues that the cited disclosures only determine which processor performs processing rather than selecting which neural network branch is used. 
	However, Applicant’s contentions are not persuasive because they mischaracterize the rejection. The rejection does not rely on Parashar alone for teaching the neural network branching architecture or the relative complexity of different neural network branches. As set forth in the Office Action, Huang teaches neural network architectures in which layers may be followed by multiple alternative branches having different computational complexity. For example, Huang discloses that one branch may be shorter and therefore require fewer computations than another branch, and that shorter branches may be selected when less processing is needed (Huang, col. 13, lines 44-55; col. 14, lines 10-16). Under the broadest reasonable interpretation, differences in the number of layers and computations required correspond to differences in the relative complexity of the processing performed by different branches. 
	The rejection relies on Parashar for teaching allocation of neural network processing operations across multiple processors based on available processing resources. Specifically, Parashar discloses dynamically allocating processing tasks based on processor availability, including selecting a faster processor and selecting another processor when a preferred processor is busy, based on parameters such as processing speed, processor load, and busy status (Parashar, paragraph [0098]). Parashar further discloses neural network processing structures including layers connected in series and/or parallel, thereby forming branched processing paths (Parashar, paragraph [0099]). Under the broadest reasonable interpretation, allocating neural network processing operations based on processor availability corresponds to determining which processing path or branch is executed based on available processing resources. 
	Thus, contrary to Applicant’s assertion, the rejection does not equate processor allocation alone with branch selection. Rather, Huang teaches the existence of multiple neural network branches having different computational complexity, while Parashar teaches determining execution of neural network processing operations based on available processing resources. When these teachings are combined, one of ordinary skill in the art would have found it obvious to determine which neural network branch to execute based on both the relative complexity of the branch and the availability of processing resources, as recited in claim 10. 
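	As a non-authoritative sketch of the combination described above, branch selection that takes into account both relative branch complexity and processor availability might look like the following. The `Processor` and `Branch` types, the `capacity` and `cost` fields, and the "most complex feasible branch" policy are hypothetical assumptions and do not appear in Huang or Parashar.

```python
# Hypothetical sketch: choose a branch from its cost and idle processors.
# Types, fields, and the selection policy are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    busy: bool
    capacity: int  # assumed units of work the processor can accept now


@dataclass
class Branch:
    name: str
    cost: int  # assumed relative complexity of the branch


def select_branch(branches, processors):
    """Return the most complex branch that fits on some idle processor."""
    idle = [p for p in processors if not p.busy]
    if not idle:
        return None  # nothing is available right now
    best_capacity = max(p.capacity for p in idle)
    feasible = [b for b in branches if b.cost <= best_capacity]
    if not feasible:
        return None
    return max(feasible, key=lambda b: b.cost)


branches = [Branch("primary", 100), Branch("secondary", 10)]
npu_busy = [Processor("npu", True, 200), Processor("cpu", False, 50)]
npu_idle = [Processor("npu", False, 200), Processor("cpu", False, 50)]
```

	With the assumed values, the simpler branch is chosen while the larger processor is busy, and the more complex branch is chosen once that processor becomes idle.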
	Accordingly, the combined teachings of Huang, Han, and Parashar teach or suggest selecting which branch of neural network processing to use based on both the availability of processing resources and the relative complexity of the branches, and Applicant’s arguments are not persuasive. 
	Applicant’s arguments regarding the rejection of claim 20 have been fully considered but are not persuasive. Applicant contends on pages 15-16 of the Remarks that Han does not disclose selecting only a single branch for processing a part of an output feature map and instead discloses fusing outputs from multiple branches. Applicant further asserts that Parashar only determines which processor performs processing based on processor availability and does not disclose selecting which neural network branch is used. 
	However, Applicant’s contentions are not persuasive because they do not address the rejection as set forth in the Office Action. The rejection does not rely on Han alone for teaching selection of a single branch. As discussed in the Office Action, Huang teaches a neural network architecture in which a conditional layer selects between alternative execution branches, and further teaches that in some examples only the selected branch is executed while the other branch is not executed (Huang, col. 12, lines 38-48). Under the broadest reasonable interpretation, this corresponds to selecting a single branch of the two or more branches for neural network processing, as recited in the claim. 
	With respect to processing parts of an output feature map, Han teaches spatially adaptive neural network processing in which feature map data is divided into spatial regions and processing operations are performed with respect to those regions. For example, Han describes dynamic region-aware convolution in which a segmentation mask divides the feature representation into multiple regions, each corresponding to a portion of the feature map that may be processed separately. Under the broadest reasonable interpretation, these regions correspond to the claimed plurality of parts of the output feature map. 
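	The region-wise processing described above can be illustrated with a simplified sketch. It assumes a 2-D feature map stored as nested lists, an integer segmentation mask of the same shape, and one hypothetical operation per region label. It does not reproduce DRConv's data-dependent kernel generation; the names `split_by_mask`, `process_regions`, and `region_ops` are assumptions.

```python
# Hypothetical sketch: a mask splits a feature map into regions, and each
# region is processed separately. The per-region operations are assumptions.


def split_by_mask(feature_map, mask):
    """Group (row, col, value) entries of a 2-D map by region label."""
    regions = {}
    for r, row in enumerate(feature_map):
        for c, value in enumerate(row):
            regions.setdefault(mask[r][c], []).append((r, c, value))
    return regions


def process_regions(feature_map, mask, region_ops):
    """Apply the operation assigned to each region to that region only."""
    out = [[0.0] * len(row) for row in feature_map]
    for label, entries in split_by_mask(feature_map, mask).items():
        op = region_ops[label]
        for r, c, value in entries:
            out[r][c] = op(value)
    return out


feature_map = [[1.0, 2.0], [3.0, 4.0]]
mask = [[0, 0], [1, 1]]  # two regions: top row and bottom row
region_ops = {0: lambda v: v * 10.0, 1: lambda v: -v}
output = process_regions(feature_map, mask, region_ops)
```

	Each region here covers some but not all of the feature map, and no position is processed by more than one region's operation.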
	Further, the rejection relies on Parashar for teaching allocation of neural network processing operations based on availability of processing resources. Parashar discloses dynamically allocating processing tasks across processors based on parameters including processing speed, processor load, and busy status (Parashar, paragraphs [0098]-[0099]). Under the broadest reasonable interpretation, allocating neural network processing operations based on processor availability corresponds to determining which neural network processing path is executed based on available processing resources.
	Accordingly, the combined teachings of Huang, Han, and Parashar teach or suggest selecting which neural network branch is used for processing a part of an output feature map based on properties of the part and based on availability of processing resources, as recited in claim 20. Therefore, Applicant’s arguments are not persuasive. 
	In view of the foregoing, the arguments presented by Applicant have been fully considered but are not persuasive. As set forth in the Office Action and further explained above, the combined teachings of Huang, Han, Nakvosas, and Parashar teach or suggest all of the limitations of claims 1, 4-11, and 14-24. Each of these references relates to neural network architectures and techniques for executing neural network processing, including conditional execution of neural network branches, processing of feature map data, and allocation of neural network processing operations based on available computing resources. One of ordinary skill in the art would have found it obvious, before the effective filing date of the claimed invention, to combine the teachings of Huang, Han, Nakvosas, and Parashar in order to enable adaptive neural network processing that selects among alternative processing branches and allocates processing operations based on characteristics of the data and available processing resources, thereby improving computational efficiency and flexibility of neural network execution. 
	Therefore, the combination of Huang in view of Han, further in view of Nakvosas and Parashar, renders the subject matter of claims 1, 4-11, and 14-24 obvious under 35 U.S.C. § 103, and the rejection of these claims is maintained.
 
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 4-7, 11, 14-17, 21, and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. (Pat. No.: US 12008466 B1 (Filed: 2018)) in view of Han et al. (NPL: “Dynamic Neural Networks: A Survey” (Published: 2021)).

Regarding claim 1, Huang teaches the following limitations:
A method of operating a data processing system, the data processing system comprising one or more processors operable to execute neural network processing (col. 1, lines 19-28 mentions CPUs and GPUs that can be used to execute a neural network), and memory for storing data (col. 3, lines 17-21 mentions memory banks for fast temporary storage) relating to the neural network processing being performed by the one or more processors, the method comprising:
one or more of the one or more processors executing a neural network comprising a sequence of plural layers (Fig. 4 displays a sequence of layers) of neural network processing to process an initial input data set to generate a final output data set that is a result of processing the initial input data set using the neural network (col. 2, lines 27-45 mentions the processing of input data from input layer to output layer via a neural network);
at least one of the layers of the sequence of plural layers of the neural network is followed by two or more branches of neural network processing (Figs. 8C and 8D show that at least one layer is followed by two or more branches), each branch comprising a different sequence of one or more layers of neural network processing (Fig. 7 and Figs. 8A-8D display one or more sequences of layers following the branches), whereby the neural network processing from the layer that is followed by two or more branches of neural network processing onwards can be selectively performed via one or more of the branches of neural network layers (Huang, col. 12, lines 38-51 and col. 14, lines 3-18 mentions various branches that can be used for processing data based on processing needs);
one of the branches of the two or more branches is a primary branch of neural network processing that is relatively more complex in terms of the neural network processing that it performs (Huang, col. 13, lines 46-55 “the example neural network 700 can be used to produce more accurate results, to reduce the number of computations that need to be executed for certain inputs, to enable the neural network 700 to perform multiple tasks or more complex tasks, or for another reason. For example, one branch may be optimized for recognizing dogs, while the other branch is optimized for recognizing cats. As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” – this teaches that the neural network includes multiple branches having different numbers of layers and computational requirements, where the longer branches execute more computations and perform more complex neural-network processing, corresponding to the claimed primary branch.);
another one of the branches of the two or more branches is a secondary branch of neural network processing that is relatively simpler in terms of the neural network processing that it performs (Huang, col. 13, lines 53-55 “As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” – teaches that another branch of the neural network performs fewer computations and is simpler in terms of the neural-network processing that it performs, corresponding to the claimed secondary branch.);
when the primary branch is selected to be used for the neural network processing for the part, using only the primary branch of the two or more branches for the neural network processing for the part (Huang, [col. 12, lines 38-48, FIGS. 8A-8D] “In the example of FIG. 7, Layer-B 707 and Layer C 737 illustrate two alternative execution flows or branches that the neural network can take depending on an outcome determined at the conditional layer 726. In the illustrated example, Layer-B 707 can lead to the first output layer 708 being executed to output a first result 710, and Layer-C 737 can lead to the second output layer 738 being executed to output a second result 730. The second result 730 can be different from the first result 710. In some examples, only the branch including Layer-B 707 or the branch including Layer-C 737 is executed.” – teaches a neural network architecture including multiple alternative execution branches that are selected based on a condition determined at a conditional layer. Huang further teaches that when one of the branches is selected, only the selected branch is executed while the other branch is not executed. Thus, Huang teaches executing the selected branch for neural network processing, corresponding to “using only the primary branch of the two or more branches for the neural network processing” when the primary branch is selected, as recited in the claim. As further illustrated in FIGS. 8A-8D, conditional layer 826 directs execution to a selected branch (e.g., Layer-B 808, Layer-C 810, or Layer-D 812) for neural network processing.);
when the secondary branch is selected to be used for the neural network processing for the part, using only the secondary branch of the two or more branches for the neural network processing for the part (Huang, [col. 12, lines 38-48, FIGS. 8A-8D] “In the example of FIG. 7, Layer-B 707 and Layer C 737 illustrate two alternative execution flows or branches that the neural network can take depending on an outcome determined at the conditional layer 726. In the illustrated example, Layer-B 707 can lead to the first output layer 708 being executed to output a first result 710, and Layer-C 737 can lead to the second output layer 738 being executed to output a second result 730. The second result 730 can be different from the first result 710. In some examples, only the branch including Layer-B 707 or the branch including Layer-C 737 is executed.” – teaches that a conditional layer determines which of multiple execution branches the neural network follows. Huang further teaches that when one of the branches is selected, only the selected branch is executed while the other branch is not executed. Thus, Huang teaches executing the selected branch for neural network processing, corresponding to “using only the secondary branch of the two or more branches for the neural network processing for the part” when the secondary branch is selected, as recited.).
However, Huang does not teach but Huang in view of Han teaches the following limitations:
the primary branch and the secondary branch perform the same overall processing operation (Han, pages 4-5 “the MoE [41], [76] structure, which means that multiple network branches are built as experts in parallel. These experts could be selectively executed, and their outputs are fused with data-dependent weights. Conventional soft MoEs [41], [76], [77] adopt real-valued weights to dynamically rescale the representations obtained from different experts (Fig. 5 (a)). In this way, all the branches still need to be executed, and thus the computation cannot be reduced at test time. Hard gates with only a fraction of non-zero elements are developed to increase the inference efficiency of the MoE structure (see Fig. 5 (b)) [78]” – teaches a configuration in which multiple neural network branches operate in parallel on the same representations and perform the same type of processing, differing only in how their computations are allocated or gated based on input conditions. This corresponds to the claimed limitation that the primary and secondary branches perform the same overall processing operation with differing relative complexity.); and 
the primary branch and the secondary branch are operable to process parts of an output feature map from the layer that is followed by the two or more branches, the primary branch and the secondary branch are operable to process separate ones of the parts to one another, and the parts each comprise some but not all of the output feature map from the layer that is followed by the two or more branches (Han, page 8, Fig. 1, section 3.1.2 “an $H \times W \times k^2$ kernel map to produce spatially dynamic weights ($H$, $W$ are the spatial size of the output feature and k is the kernel size). Considering the pixels belonging to the same object may share identical weights, dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel…adaptive connected network [156] realizes a dynamic trade-off among self transformation (e.g. $1 \times 1$ convolution), local inference (e.g. $3 \times 3$ convolution) and global inference (e.g. FC layer). The three branches of outputs are fused with data-dependent weighted summation.” – describes a neural network configuration in which computation on an output feature map is divided into multiple regions, and different branches (e.g., local vs. global inference paths) operate on separate subsets of the feature-map data, each handling some but not all of the output feature map from the preceding layer. This corresponds to the claimed arrangement in which the primary and secondary branches are operable to process different parts of an output feature map.);
the method further comprising, when executing the neural network: processing the output feature map from the layer that is followed by the two or more branches as a plurality of the parts of the output feature map (Han, section 3.1.2-3.2 “an $H \times W \times k^2$ kernel map to produce spatially dynamic weights ($H$, $W$ are the spatial size of the output feature and k is the kernel size). Considering the pixels belonging to the same object may share identical weights, dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel…adaptive connected network [156] realizes a dynamic trade-off among self transformation (e.g. $1 \times 1$ convolution), local inference (e.g. $3 \times 3$ convolution) and global inference (e.g. FC layer). The three branches of outputs are fused with data-dependent weighted summation.” – describes a neural network layer followed by multiple branches (e.g., self-transformation, local-inference, and global-inference paths) that each operate on different portions of the output feature map, which is divided into a plurality of regions (m regions), and each branch processes separate subsets of that map through distinct weight-generation networks, corresponding to the claimed configuration in which the primary and secondary branches are operable to process parts of an output feature map, each comprising some but not all of the output feature map, and in which the network processes the output feature map as a plurality of parts following the layer that fans out into two or more branches.); and
for a part of the plurality of the parts of the output feature map, selecting which of the two or more branches to use for the neural network processing for the part based on a property or properties of the part (Han, section 2.1.2 “In addition to dynamic network depth (Sec. 2.1.1), a finer-grained form of conditional computation is performing inference with dynamic width: although every layer is executed, its multiple units (e.g. neurons, channels or branches) are selectively activated conditioned on the input…the MoE [41], [76] structure, which means that multiple network branches are built as experts in parallel. These experts could be selectively executed, and their outputs are fused with data-dependent weights…Hard gates with only a fraction of non-zero elements are developed to increase the inference efficiency of the MoE structure (see Fig. 5 (b))… let G denote a gating module whose output is a N-dimensional vector $\alpha$ controlling the execution of $N$ experts $F_1, F_2, \ldots, F_N$, the final output can be written as (see formula 5)… and the n-th expert will not be executed if $\alpha_n = 0$.” Section 3.1.2 “dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel.” – describes that an output feature map is divided into multiple regions (parts), and that for each part, a gating module determines which branch (expert) is activated for neural network processing based on input-dependent characteristics (i.e., properties of that part of the feature map).)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having a combination of Huang and Han before them, to incorporate the adaptive gating and input-dependent branch selection techniques from Han into the multi-branch neural network architecture of Huang. One would have been motivated to make such a combination in order to enable dynamic selection between the simpler and the more complex branches in Huang’s network based on properties of the output feature map, as taught by Han, thereby optimizing computational efficiency while maintaining accuracy.
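By way of illustration only, the hard-gating behavior quoted from Han (the n-th expert is not executed when α_n = 0) can be sketched as follows. The sketch assumes the fused output is a gate-weighted sum of expert outputs, consistent with the quoted statement that outputs are "fused with data-dependent weights"; the names `hard_gated_output`, `double`, and `negate` are hypothetical and appear in neither reference or the application.

```python
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[Sequence[float]], List[float]]


def hard_gated_output(
    x: Sequence[float],
    experts: Sequence[Expert],
    alphas: Sequence[float],
) -> Tuple[List[float], List[int]]:
    """Fuse expert outputs with gate weights, skipping experts gated to 0.

    Returns the fused output and the indices of the experts that actually ran.
    Assumption: every expert returns a list of the same length as `x`.
    """
    if len(experts) != len(alphas):
        raise ValueError("need exactly one gate value per expert")
    fused = [0.0] * len(x)
    executed: List[int] = []
    for n, (expert, alpha) in enumerate(zip(experts, alphas)):
        if alpha == 0:
            continue  # hard gate: the n-th expert is not executed
        y = expert(x)
        executed.append(n)
        fused = [f + alpha * v for f, v in zip(fused, y)]
    return fused, executed


# Hypothetical experts used only for this illustration.
def double(x: Sequence[float]) -> List[float]:
    return [2.0 * v for v in x]


def negate(x: Sequence[float]) -> List[float]:
    return [-v for v in x]


# A one-hot gate vector runs exactly one expert.
fused, executed = hard_gated_output([1.0, 2.0], [double, negate], [1.0, 0.0])
```

With a one-hot gate vector, only a single expert runs and its output passes through unchanged, which is the case at issue in the "use only" limitations of claims 1 and 11.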
Regarding claim 4, Huang in view of Han teaches, as outlined above, all the elements of claim 1; therefore, claim 4 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the property that the selection of the branch to use is based on comprises a measure of the variability of data values in the part of the plurality of parts of the output feature map (Huang, col. 8, lines 11-13 mentions that output feature maps can describe the input image at a more abstract level; col. 14, lines 25-44 mentions that the selection of the branch is based on a condition computed from the output function 722, which reflects the uncertainty or confidence (e.g., probability values), and this relates to variability in the output feature map since greater uncertainty implies greater variation in activation values or features, prompting a branch choice. Han, section 3.2.1 teaches that “dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel.” – this combination describes that the selection of a branch can be based on a measure of variability in data values within a part of the output feature map.).
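By way of illustration only, one possible "measure of the variability of data values" in a part is the population variance of its values. Neither the claim language nor the cited passages fix a particular measure; the use of variance, the function name `select_branch_by_variability`, and the threshold value are assumptions made for this sketch.

```python
from statistics import pvariance
from typing import Sequence


def select_branch_by_variability(
    part: Sequence[Sequence[float]],
    threshold: float = 1.0,  # hypothetical cut-off, not taken from any reference
) -> str:
    """Pick a branch for one part of a feature map from the variance of its values.

    High-variance parts go to the more complex ("primary") branch and
    low-variance parts to the simpler ("secondary") branch.
    """
    values = [v for row in part for v in row]
    if not values:
        raise ValueError("part must contain at least one value")
    return "primary" if pvariance(values) > threshold else "secondary"


flat_part = [[1.0, 1.0], [1.0, 1.0]]      # variance 0
busy_part = [[0.0, 10.0], [0.0, 10.0]]    # variance 25
flat_choice = select_branch_by_variability(flat_part)
busy_choice = select_branch_by_variability(busy_part)
```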
Regarding claim 5, Huang in view of Han teaches, as outlined above, all the elements of claim 1; therefore, claim 5 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the property that the selection of the branch to use is based on comprises a measure of the relative compressibility of some or all of the output feature map (Huang, col. 7, lines 66-67 and col. 8, lines 1-25 mention pooling operations producing output feature maps and also mention that these output feature maps can describe the input image at a more abstract level; col. 8, lines 41-56 mentions that pooling can be used to progressively reduce the spatial size of the input representation (compressibility). As a further example, pooling can assist in determining an almost scale invariant representation of the image; col. 9, lines 2-4 mentions that the fully connected layer can classify the input image into various classes based on training data; col. 13, lines 25-44 mentions that the conditional layer determines the branch selection, where the condition could be the probability based on the input image).
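By way of illustration only, one way to obtain a "measure of the relative compressibility" of feature-map data is the ratio of compressed size to raw size under a general-purpose compressor. The cited passages do not describe this measure; the use of `zlib`, the 8-bit quantization of the values, and the name `compression_ratio` are assumptions made for this sketch.

```python
import random
import zlib
from typing import Sequence


def compression_ratio(values: Sequence[int]) -> float:
    """Compressed size divided by raw size for 8-bit values (lower = more compressible).

    Assumption: the feature-map values have already been quantized to 0..255.
    """
    raw = bytes(values)
    if not raw:
        raise ValueError("need at least one value")
    return len(zlib.compress(raw)) / len(raw)


# A constant part compresses far better than a noisy one.
uniform_part = [0] * 1024
rng = random.Random(0)  # fixed seed so the example is repeatable
noisy_part = [rng.randrange(256) for _ in range(1024)]

uniform_ratio = compression_ratio(uniform_part)
noisy_ratio = compression_ratio(noisy_part)
```

A selection rule could then compare this ratio against a threshold in the same way as any other per-part property.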
Regarding claim 6, Huang in view of Han teaches, as outlined above, all the elements of claim 1; therefore, claim 6 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
The method of claim 1, comprising using metadata for the output feature map as a measure of the property or properties that the selection of the branch to use is based on (Huang, Fig. 7, col. 13, lines 25-44 mentions that the function 722 determines the “metadata” needed to determine the branch selection. The function 722 determines the probability of the input to create the metadata needed for the branch decision within the conditional layer.).
Regarding claim 7, Huang in view of Han teaches, as outlined above, all the elements of claim 1; therefore, claim 7 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
selecting the branch or branches to use for the neural network processing based on a property or properties of the part of the plurality of parts of the output feature map, together with one or more further conditions or criteria (Huang, col. 11, lines 39-50 mentions that the additional condition could be the test values and statistical thresholds that can be used to test against probabilities of classification of the output feature map; col. 13, lines 25-44 mentions that the branch selection depends on the probability of the input image (output feature maps). Han, section 3.2.1 teaches that “dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel.” – the combination describes selecting the branch or branches to use for neural network processing based on a property of a part of the plurality of parts of the output feature map, together with one or more further conditions or criteria.).
Regarding claim 11, Huang discloses:
A data processing system, the data processing system comprising: one or more processors operable to execute neural network processing (col. 1, lines 19-28 mentions CPUs and GPUs that can be used to execute a neural network); memory for storing data (col. 3, lines 17-21 mentions memory banks for fast temporary storage) relating to the neural network processing being performed by the one or more processors;
a processing circuit (col. 3, lines 10-15 mentions an integrated circuit for a neural network processor) configured to: when one or more of the one or more processors is executing a neural network comprising a sequence of plural layers (Fig. 4 displays a sequence of layers) of neural network processing to process an initial input data set to generate a final output data set that is a result of processing the initial input data set using the neural network (col. 2, lines 27-45 mentions the processing of input data from input layer to output layer via a neural network),
at least one of the layers of the sequence of plural layers of the neural network is followed by two or more branches of neural network processing (Figs. 8C and 8D show at least one layer followed by two or more branches), each branch comprising a different sequence of one or more layers of neural network processing (Fig. 7 displays one or more sequences of layers following the branches), such that the neural network processing from the layer that is followed by two or more branches of neural network processing onwards can be selectively performed via one or more of the branches of neural network layers (col. 14, lines 3-18 mentions various branches that can be used for processing data based on processing needs):
one of the branches of the two or more branches is a primary branch of neural network processing that is relatively more complex in terms of the neural network processing that it performs (Huang, col. 13, lines 46-55 “the example neural network 700 can be used to produce more accurate results, to reduce the number of computations that need to be executed for certain inputs, to enable the neural network 700 to perform multiple tasks or more complex tasks, or for another reason. For example, one branch may be optimized for recognizing dogs, while the other branch is optimized for recognizing cats. As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” – this teaches that the neural network includes multiple branches having different numbers of layers and computational requirements, where the longer branches execute more computations and perform more complex neural-network processing, corresponding to the claimed primary branch.),
another one of the branches of the two or more branches is a secondary branch of neural network processing that is relatively simpler in terms of the neural network processing that it performs (Huang, col. 13, lines 53-55 “As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” – teaches that another branch of the neural network performs fewer computations and is simpler in terms of the neural-network processing that it performs, corresponding to the claimed secondary branch.),
when the primary branch is selected to be used for the neural network processing for the part, use only the primary branch of the two or more branches for the neural network processing for the part; and when the secondary branch is selected to be used for the neural network processing for the part, use only the secondary branch of the two or more branches for the neural network processing for the part (Huang, [col. 12, lines 39-48, FIGS. 8A-8D] “In the example of FIG. 7, Layer-B 707 and Layer C 737 illustrate two alternative execution flows or branches that the neural network can take depending on an outcome determined at the conditional layer 726. In the illustrated example, Layer-B 707 can lead to the first output layer 708 being executed to output a first result 710, and Layer-C 737 can lead to the second output layer 738 being executed to output a second result 730. The second result 730 can be different from the first result 710. In some examples, only the branch including Layer-B 707 or the branch including Layer-C 737 is executed.” – Huang teaches a neural network architecture in which a conditional layer determines which of multiple branches of neural network processing is executed. Huang further teaches that only the selected branch is executed while the other branch is not executed. Under BRI, executing only the selected branch corresponds to use only the selected primary branch when the primary branch is selected and use only the selected secondary branch when the secondary branch is selected, as recited in the claim.).
However, Huang does not teach but Huang in view of Han teaches the following limitations:
the primary branch and the secondary branch perform the same overall processing operation (Han, pages 4-5 “the MoE [41], [76] structure, which means that multiple network branches are built as experts in parallel. These experts could be selectively executed, and their outputs are fused with data-dependent weights. Conventional soft MoEs [41], [76], [77] adopt real-valued weights to dynamically rescale the representations obtained from different experts (Fig. 5 (a)). In this way, all the branches still need to be executed, and thus the computation cannot be reduced at test time. Hard gates with only a fraction of non-zero elements are developed to increase the inference efficiency of the MoE structure (see Fig. 5 (b)) [78]” – teaches a configuration in which multiple neural network branches operate in parallel on the same representations and perform the same type of processing, differing only in how their computations are allocated or gated based on input conditions. This corresponds to the claimed limitation that the primary and secondary branches perform the same overall processing operation with differing relative complexity.), and
the primary branch and the secondary branch are operable to process parts of an output feature map from the layer that is followed by the two or more branches, the primary branch and the secondary branch are operable to process separate ones of the parts to one another, and the parts each comprise some but not all of the output feature map from the layer that is followed by the two or more branches (Han, Fig. 1, page 8, section 3.1.2 “an H × W × k^2 kernel map to produce spatially dynamic weights (H, W are the spatial size of the output feature and k is the kernel size). Considering the pixels belonging to the same object may share identical weights, dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel…adaptive connected network [156] realizes a dynamic trade-off among self transformation (e.g. 1×1 convolution), local inference (e.g. 3×3 convolution) and global inference (e.g. FC layer). The three branches of outputs are fused with data-dependent weighted summation.” – describes a neural network configuration in which computation on an output feature map is divided into multiple regions, and different branches (e.g., local vs. global inference paths) operate on separate subsets of the feature-map data, each handling some but not all of the output feature map from the preceding layer. This corresponds to the claimed arrangement in which the primary and secondary branches are operable to process different parts of an output feature map.):
process the output feature map from the layer that is followed by the two or more branches as a plurality of the parts of the output feature map (Han, section 3.1.2-3.2 “an H × W × k^2 kernel map to produce spatially dynamic weights (H, W are the spatial size of the output feature and k is the kernel size). Considering the pixels belonging to the same object may share identical weights, dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel…adaptive connected network [156] realizes a dynamic trade-off among self transformation (e.g. 1×1 convolution), local inference (e.g. 3×3 convolution) and global inference (e.g. FC layer). The three branches of outputs are fused with data-dependent weighted summation.” – describes a neural network layer followed by multiple branches (e.g., self-transformation, local-inference, and global-inference paths) that each operate on different portions of the output feature map, which is divided into a plurality of regions (m regions), and each branch processes separate subsets of that map through distinct weight-generation networks, corresponding to the claimed configuration in which the primary and secondary branches are operable to process parts of an output feature map, each comprising some but not all of the output feature map, and in which the network processes the output feature map as a plurality of parts following the layer that fans out into two or more branches.); and
for a part of the plurality of the parts of the output feature map, select which of the two or more branches to use for the neural network processing for the part based on a property or properties of the part (Han, section 2.1.2 “In addition to dynamic network depth (Sec. 2.1.1), a finer-grained form of conditional computation is performing inference with dynamic width: although every layer is executed, its multiple units (e.g. neurons, channels or branches) are selectively activated conditioned on the input…the MoE [41], [76] structure, which means that multiple network branches are built as experts in parallel. These experts could be selectively executed, and their outputs are fused with data-dependent weights…Hard gates with only a fraction of non-zero elements are developed to increase the inference efficiency of the MoE structure (see Fig. 5 (b))… let G denote a gating module whose output is a N-dimensional vector α controlling the execution of N experts F_1, F_2, …, F_N, the final output can be written as (see formula 5)… and the n-th expert will not be executed if α_n = 0.” Section 3.1.2 “dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel.” – describes that an output feature map is divided into multiple regions (parts), and that for each part, a gating module determines which branch (expert) is activated for neural network processing based on input-dependent characteristics (i.e., properties of that part of the feature map).)
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having a combination of Huang and Han before them, to incorporate the adaptive gating and input-dependent branch selection techniques from Han into the multi-branch neural network architecture of Huang. One would have been motivated to make such a combination in order to enable dynamic selection between the simpler and the more complex branches in Huang’s network based on properties of the output feature map, as taught by Han, thereby optimizing computational efficiency while maintaining accuracy.
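By way of illustration only, processing an output feature map "as a plurality of parts", each comprising some but not all of the map, can be sketched as a split into rectangular tiles. Rectangular tiling is an assumption made for this sketch (the DRConv passage quoted from Han describes regions produced by a segmentation mask instead), and the name `split_into_parts` is hypothetical.

```python
from typing import List, Sequence

Part = List[List[float]]


def split_into_parts(
    fmap: Sequence[Sequence[float]], tile_h: int, tile_w: int
) -> List[Part]:
    """Split a 2-D feature map into non-overlapping tiles in row-major order.

    Edge tiles are smaller when the map size is not a multiple of the tile size.
    """
    if tile_h <= 0 or tile_w <= 0:
        raise ValueError("tile dimensions must be positive")
    if not fmap or not fmap[0]:
        return []
    parts: List[Part] = []
    for r in range(0, len(fmap), tile_h):
        for c in range(0, len(fmap[0]), tile_w):
            parts.append([list(row[c:c + tile_w]) for row in fmap[r:r + tile_h]])
    return parts


# A 4x4 map split into four 2x2 parts; each part holds 4 of the 16 values.
feature_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
parts = split_into_parts(feature_map, 2, 2)
```

Each resulting part can then be routed to exactly one branch by a per-part selection rule.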
Regarding claim 14, Huang in view of Han teaches, as outlined above, all the elements of claim 11; therefore, claim 14 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the property that the selection of the branch to use is based on comprises a measure of the variability of data values in the part of the plurality of parts of the output feature map (Huang, col. 8, lines 11-13 mentions that output feature maps can describe the input image at a more abstract level; col. 14, lines 25-44 mentions that the selection of the branch is based on a condition computed from the output function 722, which reflects the uncertainty or confidence (e.g., probability values), and this relates to variability in the output feature map since greater uncertainty implies greater variation in activation values or features, prompting a branch choice. Han, section 3.2.1 teaches that “dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel.” – this combination describes that the selection of a branch can be based on a measure of variability in data values within a part of the output feature map.).
Regarding claim 15, Huang in view of Han teaches, as outlined above, all the elements of claim 11; therefore, claim 15 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the property that the selection of the branch to use is based on comprises a measure of the relative compressibility of some or all of the output feature map (Huang, col. 8, lines 41-56 mentions that pooling can be used to progressively reduce the spatial size of the input representation (compressibility). As a further example, pooling can assist in determining an almost scale invariant representation of the image.).
Regarding claim 16, Huang in view of Han teaches, as outlined above, all the elements of claim 11; therefore, claim 16 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the processing circuit is configured to use metadata for the output feature map as a measure of the property or properties that the selection of the branch to use is based on (Huang, Fig. 7, col. 13, lines 25-44 mentions that the function 722 determines the “metadata” needed to determine the branch selection. The function 722 determines the probability of the input to create the metadata needed for the branch decision within the conditional layer).
Regarding claim 17, Huang in view of Han teaches, as outlined above, all the elements of claim 11; therefore, claim 17 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the processing circuit is configured to: select the branch or branches to use for the neural network processing based on a property or properties of the part of the plurality of parts of the output feature map, together with one or more further conditions or criteria (Huang, col. 11, lines 39-50 mentions that the additional condition could be the test values and statistical thresholds that can be used to test against probabilities of classification of the output feature map; col. 13, lines 25-44 mentions that the branch selection depends on the probability of the input image (output feature maps). Han, section 3.2.1 teaches that “dynamic region-aware convolution (DRConv) [155] generates a segmentation mask for an input image, dividing it into m regions, for each of which a weight generation network is responsible for producing a data-dependent kernel.” – the combination describes selecting the branch or branches to use for neural network processing based on a property of a part of the plurality of parts of the output feature map, together with one or more further conditions or criteria.).
Regarding claim 21, Huang in view of Han teaches, as outlined above, all the elements of claim 1; therefore, claim 21 is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the initial input data set comprises at least one of: image data; video data; and sound data (Huang, col. 5, lines 62-67 and col. 6, lines 1-10, “Neural networks have been used for a variety of applications, including, for example, in the areas of image and video, speech and language, medicine, game play, and robotics. In image and video, neural networks have been used for image classification, object localization and detection, image segmentation, and action recognition. In speech and language, neural networks have been used for speech recognition, machine translation, natural language processing, and audio generation. In the medical field, neural networks have been used in genomics and medical imaging. In game play, neural networks have been used to play video and board games, including games with immense numbers of possible moves such as Go. In robotics, neural networks have been used for motion planning of a robot, visual navigation, control stabilization, and driving strategies for autonomous vehicles”).
Regarding claim 22, Huang in view of Han teaches, as outlined above, all the elements of claim 21; therefore, claim 22 is rejected for the same reasons as those presented for claim 21, mutatis mutandis. Huang in view of Han further teaches the limitation:
wherein the same overall processing operation that the primary branch and the secondary branch perform contributes to identifying or classifying features present within the initial input data set (Huang, col. 13, lines 45-60 “The neural network configuration illustrated in FIG. 7 can be used for various purposes. For example, the example neural network 700 can be used to produce more accurate results, to reduce the number of computations that need to be executed for certain inputs, to enable the neural network 700 to perform multiple tasks or more complex tasks, or for another reason. For example, one branch may be optimized for recognizing dogs, while the other branch is optimized for recognizing cats. As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result. As another example, the condition 728 may different results for a set of input data. For example, for a set of input data, one branch can compute steering instructions for a self-driving car while another branch simultaneously computes acceleration instructions for the car.” – both branches (based on complexity) execute the same overall type of image-processing operation, namely identifying and classifying features within an input image.).


Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al., (Pat. No.: US 12008466 B1, 2018) in view of Han et al., (NPL: “Dynamic Neural Networks: A Survey” (Published: 2021)) further in view of Nakvosas, (Pub No.: US 20210326571 A1, 2019)

Regarding claim 8, Huang in view of Han teaches, as outlined above, all the elements of claim 1, therefore it is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang in view of Han does not teach but Huang in view of Han further in view of Nakvosas teaches the following limitations: 
selecting whether to use the secondary branch for the neural network processing for the output feature map by: using a property or properties of parts of the output feature map to determine a set of parts of the output feature map that can use the secondary branch for the neural network processing; (Nakvosas, paragraph [0035] “Output of the convolution layer from the last of said blocks (102) is propagated into different convolution branches (103). In a preferred embodiment each feature in the last activation map (103) has a spatial resolution roughly equal to ⅛ of the input resolution. It is possible to construct multiple versions of branching hierarchies or not to split last layers into separate branches at all but in a preferred embodiment each of said branches is responsible for a specific fingerprint feature estimation…These features may be decoded at least as fingerprint minutia orientation, location and class,”; Han, [section 2.1.2] “In addition to dynamic network depth (Sec. 2.1.1), a finer grained form of conditional computation is performing inference with dynamic width: although every layer is executed, its multiple units (e.g. neurons, channels or branches) are selectively activated conditioned on the input.” – Nakvosas teaches that the output of convolutional layers produces a feature map that is propagated into different convolutional branches, and that the feature map includes feature information such as orientation, location, and class. Under the broadest reasonable interpretation, such feature information corresponds to properties of parts of the output feature map. Han teaches selectively activating neural network branches conditioned on the input.
Therefore, the combination of Han and Nakvosas teaches using properties of parts of the output feature map to determine which parts can use a particular branch, corresponding to determining a set of parts of the output feature map that can use the secondary branch for neural network processing as recited in the claim.)
determining whether to use the secondary branch for the neural network processing for the set of parts based on a measure of the relative cost between: processing the parts of the output feature map by using the secondary branch for the set of parts; and processing the parts of the output feature map without using the secondary branch for processing any parts of the output feature map (Huang, [col. 13, lines 45-55] “The neural network configuration illustrated in FIG. 7 can be used for various purposes. For example, the example neural network 700 can be used to produce more accurate results, to reduce the number of computations that need to be executed for certain inputs, to enable the neural network 700 to perform multiple tasks or more complex tasks, or for another reason… As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” [col. 14, lines 10-16] “ In some examples, the example structures can be used to, for example, re-use layers or to have fewer layers in some branches than in others. Layers can be re-used, for example, to re-process data that has not produced enough information. Shorter branches (e.g., having fewer layers) can be selected when the neural network determines that data needs less processing.” – As discussed above with respect to Han and Nakvosas, the output feature map may be treated as comprising parts associated with particular properties. Huang teaches selecting between alternative neural network branches based on differences in computational effort, such as selecting branches that require fewer computations, reuse layers, or contain fewer layers when less processing is needed. Under BRI, differences in computational effort correspond to differences in processing cost. 
Thus, Huang teaches determining whether to use a particular branch based on the relative processing cost of executing that branch compared to alternative processing paths that do not use that branch. Accordingly, Huang teaches determining whether to use the secondary branch for neural network processing of the parts of the output feature map based on a measure of the relative cost between processing using the secondary branch and processing without using a secondary branch, as recited in the claim.).
Accordingly, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, having the combination of Huang, Han, and Nakvosas before them, to incorporate the processing of different parts or aspects of the output feature map down different branches, as taught by Nakvosas, into the spatially adaptive branching network of Huang and Han. One would have made such a combination in order to improve accuracy and computational efficiency by allocating processing based on local feature characteristics while balancing the relative cost between uniform and part-based branch processing.
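For illustration only, the relative-cost comparison discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Huang, Han, or Nakvosas: the per-part score, the threshold, the per-part costs, and the fixed routing overhead (`BranchCosts`, `eligible_parts`, `use_secondary`) are all hypothetical names and values chosen to show one way such a measure might be formed.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class BranchCosts:
    # All three fields are hypothetical cost figures in arbitrary units.
    primary_per_part: float    # cost of one part on the more complex branch
    secondary_per_part: float  # cost of one part on the simpler branch
    split_overhead: float      # fixed cost of routing parts to two branches


def eligible_parts(part_scores: List[float], threshold: float) -> Set[int]:
    """Return indices of parts whose property (a score) is below the threshold.

    Assumption: a low score marks a part as suitable for the simpler branch.
    """
    return {i for i, score in enumerate(part_scores) if score < threshold}


def use_secondary(num_parts: int, eligible: Set[int], costs: BranchCosts) -> bool:
    """Compare the cost with the secondary branch against the cost without it."""
    # Cost when no part uses the secondary branch.
    cost_without = num_parts * costs.primary_per_part
    # Cost when the eligible set uses the secondary branch and the rest do not.
    cost_with = (
        (num_parts - len(eligible)) * costs.primary_per_part
        + len(eligible) * costs.secondary_per_part
        + (costs.split_overhead if eligible else 0.0)
    )
    return cost_with < cost_without


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.2, 0.8]
    parts = eligible_parts(scores, threshold=0.5)
    print(sorted(parts))
    print(use_secondary(len(scores), parts, BranchCosts(10.0, 4.0, 5.0)))
```

In this sketch the secondary branch is used only when the saving on the eligible parts exceeds the assumed routing overhead; a different cost model would change the outcome.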

Regarding claim 18, Huang in view of Han teaches, as outlined above, all the elements of claim 11, therefore it is rejected for the same reasons as those presented for claim 1, mutatis mutandis. Huang further teaches:
select the branch or branches to use for the neural network processing based on a measure of the relative cost between processing all of the output feature map down the same, single branch (Huang, col. 8, lines 1-15 mentions output feature maps, which can describe the input image at a more abstract level; col. 14, lines 3-18 mentions that certain branches may reuse layers or use shorter paths when less processing is needed. This supports selecting between different branches based on a measure of relative cost, including complexity or processing time, for example, by selecting a branch optimized for recognizing cats versus a branch optimized for recognizing dogs (input image)).
However, Huang in view of Han does not teach, but Huang in view of Han further in view of Nakvosas teaches, the following limitations: 
processing different parts of the output feature map down different branches (Nakvosas, Paragraph [0035] discloses that the output of the neural network is a feature map (103) and that the output of the final convolution block (102) is propagated into different convolutional branches (103). Each branch is responsible for estimating a specific fingerprint feature – such as orientation, location, or class – which are decoded from the output feature map. This teaches that different parts of the output feature map are processed through different branches, based on the type of feature being extracted). 
Accordingly, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, having the combination of Huang, Han, and Nakvosas before them, to incorporate the processing of different parts or aspects of the output feature map down different branches, as taught by Nakvosas, into the spatially adaptive branching network of Huang and Han. One would have made such a combination in order to improve accuracy and computational efficiency by allocating processing based on local feature characteristics while balancing the relative cost between uniform and part-based branch processing.


	Claims 9, 10, 19, 20, 23, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al., (Pat. No.: US 12008466 B1, 2018) in view of Han et al., (NPL: “Dynamic Neural Networks: A Survey” published (2021)) in view of Parashar et al., (Pub. No.: US 20210232921 A1, 2020). 

Regarding claim 9, Huang in view of Han teaches, as outlined above, all the elements of claim 1, therefore it is rejected for the same reasons as those presented for claim 1, mutatis mutandis. However, Huang in view of Han does not teach, but Huang in view of Han further in view of Parashar teaches the limitation:
selecting a processing resource to use to execute the selected branch of neural network processing based on the type of processing required for the selected branch (Parashar, [0098] mentions selecting a processor based on the current parameters of the processors, which may include, but are not limited to, processing speed, processor load, and busy status of each processor).
Huang, Han, and Parashar are analogous art because they are from the same field of endeavor and their disclosures generally relate to neural network processing. 
Accordingly, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, having the combination of Huang and Parashar before them, to incorporate the selection of a processing resource based on processing requirements as taught by Parashar. One would have been motivated to make such a combination to improve the efficiency and responsiveness of neural network execution across varying hardware environments (Parashar, abstract). 

Regarding claim 10, Huang in view of Han teaches, as outlined above, similar elements as in claim 1, therefore it is rejected for the same reasons as those presented for claim 1, mutatis mutandis. However, Huang in view of Han does not teach but Huang in view of Han further in view of Parashar teaches the following limitation: 
for a layer of the neural network that is followed by two or more branches of neural network processing, selecting the branch or branches to use for the neural network processing from that layer onwards based on a relative complexity of the processing required by different ones of the two or more branches and an available processing resource of the one or more processors for performing the neural network processing, such that which of the branch or branches is used for the neural network processing is based on availability of one or more processing resources of the one or more processors and the relative complexity of the processing required by different ones of the two or more branches (Huang teaches that neural network architectures may include multiple branches of neural network processing that can be selectively executed. For example, Huang discloses that the neural network configuration of FIG. 7 can be used “to reduce the number of computations that need to be executed for certain inputs” and that “one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result” [Huang, col. 13, lines 44-55]. Huang further teaches that the structures may “re-use layers or have fewer layers in some branches than others” and that “shorter branches (e.g., having fewer layers) can be selected when the neural network determines that the data needs less processing.” Under BRI, differences in the number of layers executed and the number of computations required correspond to differences in the relative complexity of the processing required by different branches. 
Parashar, paragraphs [0098]-[0099], discloses that the controller dynamically allocates processing tasks based on available processing resources of multiple processors, where the controller “selects the fastest processor over a slower processor” and “selects a next preference free processor if the fastest processor is busy,” thereby performing allocation “based on the current parameters of the processors…including processing speed, processor load, and busy status.” Parashar further teaches that during pipelined processing, the controller splits the neural network into sub-networks…wherein the layers may include layers connected in series and/or layers connected in parallel (the branched connections). Together, these disclosures show that the neural network includes layers followed by multiple branched paths, where Huang teaches that the branches may have different processing complexity, and Parashar teaches selecting which processing path to activate based on available processing resources of processors. Accordingly, it would have been obvious to select which branch or branches to use for neural network processing based on both the relative complexity and the availability of processing resources, as recited in the claim.).
one of the branches of the two or more branches is a primary branch of neural network processing that is relatively more complex in terms of the neural network processing that it performs (Huang, col. 13, lines 46-55 “the example neural network 700 can be used to produce more accurate results, to reduce the number of computations that need to be executed for certain inputs, to enable the neural network 700 to perform multiple tasks or more complex tasks, or for another reason. For example, one branch may be optimized for recognizing dogs, while the other branch is optimized for recognizing cats. As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” – this teaches that the neural network includes multiple branches having different numbers of layers and computational requirements, where the longer branches execute more computations and perform more complex neural network processing, corresponding to the claimed primary branch.);
another one of the branches of the two or more branches is a secondary branch of neural network processing that is relatively simpler in terms of the neural network processing that it performs (Huang, col. 13, lines 53-55 “As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result.” – this teaches that another branch of the neural network performs fewer computations and is simpler in terms of the neural network processing that it performs, corresponding to the claimed secondary branch.);
the primary branch and the secondary branch perform the same overall processing operation (Han, pages 4-5 “the MoE [41], [76] structure, which means that multiple network branches are built as experts in parallel. These experts could be selectively executed, and their outputs are fused with data-dependent weights. Conventional soft MoEs [41], [76], [77] adopt real-valued weights to dynamically rescale the representations obtained  from different experts (Fig. 5 (a)). In this way, all the branches still need to be executed, and thus the computation cannot be reduced at test time. Hard gates with only a fraction of non-zero elements are developed to increase the inference efficiency of the MoE structure (see Fig. 5 (b)) [78]” – teaches a configuration in which multiple neural network branches operate in parallel on the same representations and perform the same type of processing, differing only in how their computations are allocated or gated based on input conditions. This corresponds to the claimed limitation that the primary and secondary branches perform the same overall processing operation with differing relative complexity.);
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having the combination of Huang, Han, and Parashar before them, to integrate Parashar’s resource-aware controller with the multi-branch neural network architectures of Huang and Han. One would have been motivated to make such a combination in order to dynamically select which branch or processing path to execute based on processor availability or load, thereby optimizing computational efficiency and maintaining inference throughput under varying hardware conditions. This combination would predictably yield a system that selects among more complex and less complex branches according to real-time resource conditions while performing the same overall neural network operation, improving adaptability and processing performance. 
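For illustration only, a selection that depends on both relative branch complexity and processing-resource availability might be sketched as below. This is a minimal sketch under stated assumptions, not the method of Huang, Han, or Parashar: the `Branch` and `Resource` types, the integer `complexity` and `capacity` units, and the `select_branch` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Branch:
    name: str
    complexity: int  # hypothetical measure, e.g. number of layers


@dataclass
class Resource:
    name: str
    capacity: int    # hypothetical budget in the same units as complexity
    busy: bool       # True when the resource is unavailable


def select_branch(
    branches: List[Branch], resources: List[Resource]
) -> Optional[Tuple[str, str]]:
    """Pick the most complex branch that fits the largest free resource.

    Returns (branch name, resource name), or None when nothing can run.
    """
    free = [r for r in resources if not r.busy]
    if not free:
        return None
    # Prefer the free resource with the most capacity.
    best = max(free, key=lambda r: r.capacity)
    # Keep only the branches whose complexity fits that resource.
    runnable = [b for b in branches if b.complexity <= best.capacity]
    if not runnable:
        return None
    chosen = max(runnable, key=lambda b: b.complexity)
    return chosen.name, best.name


if __name__ == "__main__":
    branches = [Branch("primary", 12), Branch("secondary", 4)]
    resources = [Resource("npu", 16, busy=True), Resource("cpu", 6, busy=False)]
    print(select_branch(branches, resources))
```

Here the simpler branch is chosen whenever the larger resource is busy, which is one possible reading of selecting "according to real-time resource conditions"; other policies are equally plausible.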

Regarding claim 19, Huang in view of Han teaches, as outlined above, all the elements of claim 11, therefore it is rejected for the same reasons as those presented for claim 11, mutatis mutandis. However, Huang in view of Han does not teach:
once a branch of the neural network processing has been selected, select a processing resource to use to execute the selected branch of neural network processing based on the type of processing required for the selected branch.
However, Huang in view of Han further in view of Parashar teaches the limitation:
selecting a processing resource to use to execute the selected branch of neural network processing based on the type of processing required for the selected branch (Parashar, [0098] mentions selecting a processor based on the current parameters of the processors, which may include, but are not limited to, processing speed, processor load, and busy status of each processor).
Huang and Parashar are analogous art because they are from the same field of endeavor and their disclosures generally relate to neural network processing. 
Accordingly, it would have been obvious to a person having ordinary skill in the art, before the effective filing date of the claimed invention, having the combination of Huang, Han, and Parashar before them, to incorporate the selection of a processing resource based on processing requirements as taught by Parashar. One would have been motivated to make such a combination to improve the efficiency and responsiveness of neural network execution across varying hardware environments (Parashar, abstract).

Regarding claim 20, Huang teaches the following limitations:
A non-transitory computer readable storage medium storing computer software code which when executing on one or more processors performs a method of generating a neural network that can be executed by one or more processors to perform neural network processing, the method comprising (Huang, col. 30, lines 19-31 mentions a non-transitory computer readable medium for software embodiments).
configuring the neural network for it to be selected which single branch of the two or more branches to use for the neural network processing for the part based on at least one of  (Huang, [col. 12, lines 38-41] “In the example of FIG. 7, Layer-B 707 and Layer C 737 illustrate two alternative execution flows or branches that the neural network can take depending on an outcome determined at the conditional layer 726.” – teaches a neural network architecture in which a conditional layer determines which of multiple branches of neural network processing is executed. The conditional layer evaluates an outcome and selects between alternative execution flows of the neural network. Under BRI, determining which branch to execute based on the outcome of the conditional layer corresponds to selecting which single branch of two or more branches to use for neural network processing, as recited in the claim.)
However, Huang does not teach, but Huang in view of Han teaches the following limitations:
for at least one set of two or more branches of neural network processing that follow a layer of the sequence of plural layers of the neural network (Fig. 8C, 8D shows at least one layer is followed by two or more branches and Fig. 7, Figs. 8A-8D, displays one or more sequence of layers following the branches), wherein all of the two or more branches of neural network processing perform the same overall processing operation (Han, pages 4-5 discloses that “multiple network branches are built as experts in parallel. These experts could be selectively executed, and their outputs are fused with data-dependent weights…all the branches still need to be executed” – Because each branch (expert) performs the same neural-inference operation to generate representations that are combined into a unified output, Han teaches that all of the two or more branches perform the same overall processing operation.)
configuring the neural network to be able to process a part of a plurality of separate parts of an output feature map from the layer that is followed by the two or more branches using only a single selected branch of the two or more branches (Han, [section 3] “spatial-wise dynamic networks are built to perform adaptive inference with respect to different spatial locations of images. According to the granularity of dynamic computation, we further categorize the relevant approaches into three levels: pixel level (Sec. 3.1), region level (Sec. 3.2) and resolution level (Sec. 3.3).” Huang, [col. 12, lines 38-48] “ Layer-B 707 and Layer C 737 illustrate two alternative execution flows or branches that the neural network can take depending on an outcome determined at the conditional layer 726…In some examples, only the branch including Layer-B 707 or the branch including Layer-C 737 is executed.” – Han teaches performing neural network inference with respect to different spatial locations of an image, including pixel-level or region-level portions. Because convolutional neural networks generate feature maps representing spatial regions of the input, these spatial locations correspond to separate parts of an output feature map. Huang teaches a neural network architecture in which a conditional layer selects between alternative branches of neural network processing, and further teaches that only the selected branch is executed. Under BRI, selecting and executing only one branch while processing different spatial portions of the feature map corresponds to configuring the neural network to  process a part of the plurality of separate parts of an output feature map using only a single selected branch of the two or more branches, as recited in the claim.),
a property or properties of the part of the output feature map (Han, section 3 “To this end, spatial-wise dynamic networks are built to perform adaptive inference with respect to different spatial locations of images. According to the granularity of dynamic computation, we further categorize the relevant approaches into three levels: pixel level (Sec. 3.1), region level (Sec. 3.2) and resolution level (Sec. 3.3).”);
However, Huang in view of Han does not teach, but Huang in view of Han further in view of Parashar teaches, the limitation:
configuring one of the branches of neural network processing for processing on a first type of processing resource; and configuring another one of the branches of neural network processing for processing on a second, different type of processing resource (Parashar, Paragraph [0018] mentions “configuring the neural network across the plurality of processors for each of the at least one input in parallel may include: determining parameters of each processor of the plurality of processors at a current instance of time, wherein the parameters of each processor comprise at least one of load on each processor, resources available on each processor, and a busy status of each processor; extracting an individual profile on each processor with respect to metadata of the neural network for the requested at least once task from the unified neural network profile…”).
an availability of the first type of processing resource; and an availability of the second, different type of processing resource (Parashar paragraph [0098] “Further, the electronic device 200 includes three processors 202a-202n (a processor 1, a processor 2, and a processor 3). In such a scenario, the controller 204 divides the received four (4) input video frames across the three processors 202a-202n based on the unified neural network profile, and the current parameters of the processors 202a-202n. Examples of the current parameters of the processors 202a-202n may be, but not limited to, processing speed, processor load, busy status, and so on of each processor 202a-202n. In an example, the controller 204 selects a fastest processor (i.e., the processing speed of the processor is higher than the other processors) over a slowest processor (i.e., the processing speed of the processor is lower than the other processors), as the fastest processor may perform processing of the neural network faster and may become free than slower one. In an example, the controller 204 may select a next preference free processor, if the fastest processor is busy or assigned with some other task (may or may not be for processing the neural network). In this example scenario, consider that the processor 1 may be the fastest processor and the processor 3 may be the slowest processor and the processor 1 is free. Therefore, the controller 204 may allocate two inputs to the processor 1, one input to the processor 2 and the processor 3. The controller 204 enables the processors 1-3 to process the allocated inputs in parallel to perform the requested task.”).
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, having a combination of Huang, Han, and Parashar before them, to incorporate the dynamic resource allocation logic of Parashar into the complexity-based adaptive branching architecture of Huang as reinforced by Han. One would have been motivated to make such a combination in order to dynamically select the optimal complexity path for a given computational segment (a part of a feature map) based on real-time available processing resources, and to configure the network with hardware-specific paths (e.g., NPU-optimized vs GPU-optimized) to maintain maximum efficiency and responsiveness under varying processor load conditions.
Regarding claim 23, Huang in view of Han teaches, as outlined above, all the elements of claim 10; therefore it is rejected for the same reasons as those presented for claim 10, mutatis mutandis. Huang in view of Han further teaches the limitation: 
wherein the initial input data set comprises at least one of: image data; video data; and sound data (Huang, col. 5 & 6, lines 62-67 and lines 1-10, “ Neural networks have been used for a variety of applications, including, for example, in the areas of image and video, speech and language, medicine, game play, and robotics. In image and video, neural networks have been used for image classification, object localization and detection, image segmentation, and action recognition. In speech and language, neural networks have been used for speech recognition, machine translation, natural language processing, and audio generation. In the medical field, neural networks have been used in genomics and medical imaging. In game play, neural networks have been used to play video and board games, including games with immense numbers of possible moves such as Go. In robotics, neural networks have been used for motion planning of a robot, visual navigation, control stabilization, and driving strategies for autonomous vehicles”).
Regarding claim 24, Huang in view of Han teaches, as outlined above, all the elements of claim 23; therefore it is rejected for the same reasons as those presented for claim 23, mutatis mutandis. Huang in view of Han further teaches the limitation: 
wherein the same overall processing operation that the primary branch and the secondary branch perform contributes to identifying or classifying features present within the initial input data set (Huang, col. 13, lines 45-60 “The neural network configuration illustrated in FIG. 7 can be used for various purposes. For example, the example neural network 700 can be used to produce more accurate results, to reduce the number of computations that need to be executed for certain inputs, to enable the neural network 700 to perform multiple tasks or more complex tasks, or for another reason. For example, one branch may be optimized for recognizing dogs, while the other branch is optimized for recognizing cats. As another example, one branch may be shorter (meaning, may have fewer layers) and thus requires fewer computations to produce a result. As another example, the condition 728 may different results for a set of input data. For example, for a set of input data, one branch can compute steering instructions for a self-driving car while another branch simultaneously computes acceleration instructions for the car.” – both branches (which differ in complexity) execute the same overall type of image-processing operation, namely identifying and classifying features within an input image.).

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Daravanh Phakousonh whose telephone number is (571)272-6324. The examiner can normally be reached Mon - Thurs 7 AM - 5 PM, Every other Friday 7 AM - 4PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Li B Zhen can be reached at 571-272-3768. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/Daravanh Phakousonh/Examiner, Art Unit 2121                                                                                                                                                                                                        




/Li B. Zhen/Supervisory Patent Examiner, Art Unit 2121
Prosecution Timeline

Feb 10, 2022: Application Filed
Apr 10, 2025: Non-Final Rejection mailed (§103)
Aug 04, 2025: Response Filed
Nov 18, 2025: Final Rejection mailed (§103)
Feb 18, 2026: Request for Continued Examination
Feb 27, 2026: Response after Non-Final Action
Mar 27, 2026: Non-Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/670,194
Patent 12572821
ACCURACY PRIOR AND DIVERSITY PRIOR BASED FUTURE PREDICTION
4y 0m to grant Granted Mar 10, 2026
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 50%
With Interview (+100.0%): 99%
Median Time to Grant: 3y 10m (~0m remaining)
PTA Risk: High
Based on 2 resolved cases by this examiner. Grant probability derived from career allowance rate.