DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
The present application is being examined based on the claims filed 10/14/2025.
Claims 11-19 are pending.
Response to Amendment
This Office Action is in response to Applicant’s communication filed 10/14/2025, responding to the Office Action mailed 05/14/2025. The Applicant’s remarks and any amendments to the claims or specification have been considered, with the results that follow.
Response to Arguments
Regarding 35 U.S.C. 101
In Remarks, pages 6-8, Argument 1
(Examiner summarizes Applicant’s arguments) Applicant argues that the limitations of the claim should not be construed as mental processes because they are necessarily performed by a neural network. Applicant argues that the Office’s justification is overly generalized, ignores the details of the claim limitations, and that the claimed subject matter is directed to improvements in latency. Applicant cites appeals cases (Ex parte Das and Ex parte Desjardins).
In response to Argument 1,
Examiner disagrees. Applicant’s comments amount to mere statements that the examiner’s justification is insufficient and not specific enough, but they do not substantially challenge the merits of the rejection itself. Merely stating that an examiner’s reasoning is faulty is not sufficient to overcome the rejections under 35 U.S.C. 101. Regarding the cited appeals cases, it is important to examine the facts on a case-by-case basis, and prior decisions on particular patent applications do not automatically render the claims of other applications eligible.
In the instant application, examiner maintains that the limitations identified under step 2A prong 1 could be performed in the human mind, even when considering the full level of detail provided by the limitations. Examiner disagrees that the descriptions provided were overly general, when in fact they were quite specific to each limitation. For example “selecting a first path through the deep neural network along the skip connection” can be performed by observing a neural network architecture along with known decision factors to make a judgement about the best path forward (see previous office action page 13). Propagating an input variable of the deep neural network along the first path is recited without the details of the neural network path. Matrix multiplications and activation functions, as well as other straightforward operations not tied to computer functionality are the building blocks of neural networks and accordingly propagating an input through a neural network can be performed by hand or in the human mind (see previous office action page 13). The same rationale may apply to a range of different limitations that involve broadly recited neural network functions. Using the same explanation for different limitations does not negate the applicability of the explanation to the limitation.
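For illustration only (not part of the claim mapping), the operations referenced above, a matrix-vector multiplication followed by an activation function, reduce to short sequences of elementary arithmetic of the kind that can be carried out with pen and paper. The following sketch uses entirely hypothetical weights and inputs:

```python
# Minimal sketch of forward propagation through two dense layers with ReLU.
# All weights, biases, and inputs are hypothetical illustrative values.
def relu(v):
    return [max(0.0, x) for x in v]

def dense(weights, bias, v):
    # One layer: matrix-vector multiply plus bias -- ordinary arithmetic.
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

x = [1.0, 2.0]                                    # input variable
W1, b1 = [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0]   # hidden layer parameters
W2, b2 = [[1.0, 1.0]], [0.5]                      # output layer parameters

h = relu(dense(W1, b1, x))   # hidden layer: two multiply-add rows, then ReLU
y = dense(W2, b2, h)         # output variable: one more multiply-add row
```

Each step is a handful of multiplications and additions followed by a comparison with zero, i.e., operations performable by hand.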
Furthermore, technical improvements cannot be provided by an abstract idea alone (see MPEP 2106.05(a)). In the instant application, any alleged benefits are obtained entirely by processes that could be performed in the human mind or by hand with pen and paper.
Therefore, the claims are still directed to a mental process under step 2A prong 1 and the rejections are maintained. A complete analysis can be found in the rejections under 35 U.S.C. 101 below.
In Remarks, pages 11-12, Argument 2
(Examiner summarizes Applicant’s arguments) Applicant argues that Examiner improperly applied step 2B, stating that the examiner failed to provide evidence that any additional elements are well-understood, routine, and conventional (citing Berkheimer v. HP, Inc.).
In response to Argument 2,
Examiner never stated nor relied upon the additional elements being directed to well-understood, routine, and conventional activity. Page 15 of the office action (steps 2A prong 2 and 2B) states that attaching verbiage such as “propagating a signal” to what would otherwise be a mental process amounts to mere instructions to apply a judicial exception, which cannot integrate the judicial exception nor provide significantly more. MPEP 2106.05(f) recites:
Another consideration when determining whether a claim integrates a judicial exception into a practical application in Step 2A Prong Two or recites significantly more than a judicial exception in Step 2B is whether the additional elements amount to more than a recitation of the words "apply it" (or an equivalent) or are more than mere instructions to implement an abstract idea or other exception on a computer. […] Thus, for example, claims that amount to nothing more than an instruction to apply the abstract idea using a generic computer do not render an abstract idea eligible.
[…]
For claim limitations that do not amount to more than a recitation of the words "apply it" (or an equivalent), such as mere instructions to implement an abstract idea on a computer, examiners should explain why they do not meaningfully limit the claim in an eligibility rejection. For example, an examiner could explain that implementing an abstract idea on a generic computer, does not integrate the abstract idea into a practical application in Step 2A Prong Two or add significantly more in Step 2B, similar to how the recitation of the computer in the claim in Alice amounted to mere instructions to apply the abstract idea of intermediated settlement on a generic computer.
That is, analysis of additional elements under MPEP 2106.05(f) does not require a showing that the additional element is well-understood, routine, and conventional, rendering Applicant’s arguments moot.
Regarding prior art rejections
In Remarks, pages 12-13, Argument 3
(Examiner summarizes Applicant’s arguments) Applicant argues that the claim is amended so that each of the paths outputs via the output layer. Applicant compares figure 1 of the specification to figure 1 of the BranchyNet reference, stating that BranchyNet exits via one of three output layers. Applicant further argues that the amendments to the independent claim obviate all prior art rejections.
Examiner’s response to Argument 3
Examiner disagrees. Applicant’s own specification describes the output layer as “The layer that receives the input variable (x) is referred to in the following as the input layer, and the layer that outputs the output variable (y) is referred to in the following as the output layer.” It is consistent with the specification to interpret a layer containing multiple parts (e.g. exit1, exit2, exit3 of BranchyNet) as a single output layer that outputs the output variable. Moreover, Applicant’s specification further discloses that “It is to be noted that, as shown in Figure 1, the output layer of the deep neural network(10) ascertains the output variable (y) as a function of three intermediate variables, but, given the use of the first path(10a), ascertains it as a function of only one intermediate variable.” Thus Applicant’s own specification discloses the output layer as being composed of different parts (at least a function of three variables, and a separate function of one variable).
Further, BranchyNet teaches applying a softmax function to the results of any prior processing of exit1, exit2, … exitn, which was mapped to the output layer in the prior rejection. Therefore it is entirely reasonable and consistent with the specification to interpret BranchyNet as teaching the claim as amended and the rejections are maintained.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 11-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding Claim 11:
Step 1 – Is the claim to a process, machine, manufacture, or composition of matter?
Yes, the claim is to a process.
Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites the abstract ideas of:
A method for operating a deep neural network that has at least one skip connection, the method comprising the following steps: — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because some of the component steps of the claimed method recite a mental process (as shown below).
selecting a first path through the deep neural network along the skip connection — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it corresponds to a judgement about which path should be the first path in a neural network. It is analogous to making a selection among an arbitrary number of options which can be performed in the human mind or by a human using pen and paper.
ascertaining an output variable by propagating […] an input variable of the deep neural network along the first path — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because propagating an input variable of a deep neural network along a path could be performed by, for example, a series of matrix multiplications and activation functions (e.g. ReLU). This operation could be performed by evaluation by a human using pen and paper as an aid.
checking whether the output variable meets a specifiable criterion — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to making an observation about whether a value meets certain conditions which can be performed in the human mind. For example, observing whether a number is greater than or less than a threshold.
and selecting, based on the specifiable criterion not being met, a second path through the deep neural network that differs from the first path, and ascertaining the output variable by propagating […] the input variable of the deep neural network along the second path — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it corresponds to making a judgement about which path should be the second path of the neural network. It is analogous to making a selection among an arbitrary number of options which can be performed in the human mind or by a human using pen and paper.
wherein the deep neural network determines the output variable depending on the input variable, wherein a layer that receives the input variable is an input layer and a layer that outputs the output variable is an output layer, wherein a third path characterizes an uninterrupted, forward-directed sequence of a multiplicity of layers of the deep neural network and the third path characterizes how the multiplicity of layers of the deep neural network are connected in succession in order to propagate the input variable through the deep neural network starting at the input layer up to the output layer — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it could be performed by, for example, a series of matrix multiplications and activation functions (e.g. ReLU). This operation could be performed by evaluation by a human using pen and paper as an aid.
wherein each one of the first path, the second path, and the third path outputs the output variable via the output layer — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it could be performed by, for example, a series of matrix multiplications and activation functions (e.g. ReLU). This operation could be performed by evaluation by a human using pen and paper as an aid.
Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No, the claim does not recite additional elements that integrate the judicial exception into a practical application. The additional elements:
propagating a signal representing — This limitation is directed to mere instructions to apply a judicial exception. Using signal propagation to apply a judicial exception (see MPEP 2106.05(f)) is insufficient to integrate the judicial exception into a practical application. Even if the signal propagation is implemented on a generic computer (see MPEP 2106.05(f)(2), 2106.04(d)), the limitation does not integrate the judicial exception into a practical application.
Step 2B – Does the claim recite additional elements that amount to significantly more than the abstract idea itself?
No, the claim does not recite additional elements which amount to significantly more than the abstract idea itself. The additional elements as identified in step 2A prong 2:
propagating a signal representing — Mere instructions to apply a judicial exception (see MPEP 2106.05(f)) and using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Regarding Claim 12
Claim 12 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 11, which included an abstract idea (see rejection for claim 11). The claim recites the additional limitations:
Step 2A Prong 1:
wherein layers of the neural network that are not required for a respective path used being deactivated during the propagation of […] the input variable along the first and second paths, and not being activated until that layer is required for the respective path — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it could be performed by making a judgement to stop evaluating certain model parameters (i.e. deciding not to evaluate a certain series of matrix multiplications/activation functions).
Step 2A Prong 2:
propagation of the signal representing — This limitation is directed to mere instructions to apply a judicial exception. Using signal propagation to apply a judicial exception (see MPEP 2106.05(f)) is insufficient to integrate the judicial exception into a practical application. Even if the signal propagation is implemented on a generic computer (see MPEP 2106.05(f)(2), 2106.04(d)), the limitation does not integrate the judicial exception into a practical application.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
The additional elements as identified in step 2A prong 2:
propagation of the signal representing — Mere instructions to apply a judicial exception (see MPEP 2106.05(f)) and using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 13
Claim 13 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 11 which included an abstract idea (see rejection for claim 11). The claim recites the additional limitations:
Step 2A Prong 2:
wherein intermediate variables of layers that were ascertained during the propagation along the first path are reused in the propagation of the signal representing the input variable along the second path — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the propagation of the input variable.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein intermediate variables of layers that were ascertained during the propagation along the first path are reused in the propagation of the signal representing the input variable along the second path — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 14:
Claim 14 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 13 which included an abstract idea (see rejection for claim 13). The claim recites the additional limitations:
Step 2A Prong 1:
wherein only intermediate variables of layers by which the second path differs from the first path are ascertained during the propagation along the second path — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to making a judgement about which neural network operations to perform which may include matrix multiplication/activation function operations.
and ascertaining, during the propagation along the second path, intermediate variables as a function of a provided intermediate variable of a respective immediately preceding connected layer of the second path and of a preceding provided intermediate variable of an immediately preceding connected layer of the first path — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to performing an evaluation of a neural network which may include matrix multiplications and activation functions.
Step 2A Prong 2:
and wherein the deep neural network comprises layers of the second path being connected to more than one preceding layer, at least one of the preceding layers were also contained in the first path — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the deep neural network.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
The additional elements as identified in step 2A prong 2:
and wherein the deep neural network comprises layers of the second path being connected to more than one preceding layer, at least one of the preceding layers were also contained in the first path — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 15
Claim 15 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 11 which included an abstract idea (see rejection for claim 11). The claim recites the additional limitations:
Step 2A Prong 1:
providing a plurality of different paths through the deep neural network — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to a judgement about which paths on a neural network are provided.
selecting at least the first path and the second path from the plurality of different paths — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to a judgement about which paths should be the first path and the second path (or making a selection from an arbitrary number of choices).
ascertaining the output variable by propagating […] the input variable of the deep neural network simultaneously along at least the first path and the second path, intermediate variables of common layers of the first path and second path being first ascertained, in succession, up to the layer starting from which a sequence of the layers of the first and second paths differ — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to performing an evaluation of a neural network which may include matrix multiplications and activation functions.
ascertaining intermediate variables of remaining layers of the first path and of remaining layers of the second path that have a same position in the sequence of layers of the first and second paths, respectively ascertained in parallel — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to performing an evaluation of a neural network which may include matrix multiplications and activation functions.
checking the specifiable criterion when the output variable of the first path is outputted — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to making an observation about a particular criteria based on values which can be understood by the human mind.
and continuing the propagation when the criterion is not met until the output variable of the second path is outputted — This limitation is directed to the abstract idea of a mental process (including an observation, evaluation, judgement, opinion) which can be performed by the human mind, or by a human using pen and paper (see MPEP 2106.04(a)(2) III. C.). This limitation is directed to a mental process because it amounts to performing an evaluation of a neural network which may include matrix multiplications and activation functions.
Step 2A Prong 2:
by propagating the signal representing — This limitation is directed to mere instructions to apply a judicial exception. Using signal propagation to apply a judicial exception (see MPEP 2106.05(f)) is insufficient to integrate the judicial exception into a practical application. Even if the signal propagation is implemented on a generic computer (see MPEP 2106.05(f)(2), 2106.04(d)), the limitation does not integrate the judicial exception into a practical application.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
The additional elements as identified in step 2A prong 2:
by propagating the signal representing — Mere instructions to apply a judicial exception (see MPEP 2106.05(f)) and using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 16
Claim 16 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 11 which included an abstract idea (see rejection for claim 11). The claim recites the additional limitations:
Step 2A Prong 1:
wherein: (i) the selection of the first and second paths is a function of the input variable and/or a function of a specifiable energy/time contingent that has a maximum permissible consumption level in order to propagate the signal representing the input variable through the deep neural network — This limitation is directed to the abstract idea of a mathematical concept, and a mathematical relationship in particular (see MPEP 2106.04(a)(2) I. A.). The claim describes, in words, the mathematical operation of evaluating a mathematical function.
Step 2A Prong 2:
and/or (ii) the specifiable criterion characterizes a specifiable minimum accuracy or reliability of the output variable — This limitation is directed to merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) as it merely limits the field of the specifiable criterion.
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
The additional elements as identified in step 2A prong 2:
and/or (ii) the specifiable criterion characterizes a specifiable minimum accuracy or reliability of the output variable — Merely limiting a judicial exception to a particular field of use (see MPEP 2106.05(h)) cannot amount to significantly more than the judicial exception.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 17
Claim 17 is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claim is dependent on claim 11 which included an abstract idea (see rejection for claim 11). The claim recites the additional limitations:
Step 2A Prong 2:
wherein each path of the deep neural network is trained separately from one another, and/or at least one group of paths is trained in common — This limitation is directed to mere instructions to apply a judicial exception. Using neural network training to apply a judicial exception (see MPEP 2106.05(f)) is insufficient to integrate the judicial exception into a practical application. Even if the neural network training is implemented on a generic computer (see MPEP 2106.05(f)(2), 2106.04(d)), the limitation does not integrate the judicial exception into a practical application.
Step 2B:
The additional elements as identified in step 2A prong 2:
wherein each path of the deep neural network is trained separately from one another, and/or at least one group of paths is trained in common — Mere instructions to apply a judicial exception (see MPEP 2106.05(f)) and using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 18
Independent claim 18 is a device claim corresponding to method claim 11, which was directed to an abstract idea; therefore, the same rejection and rationale apply. The only difference is that claim 18 recites the following additional elements, treated under step 2A prong 2 and step 2B:
Step 2A Prong 2:
A device configured to […], the device including a processor configured to — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
A device configured to […], the device including a processor configured to — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Regarding Claim 19
Independent claim 19 is a non-transitory computer-readable medium claim corresponding to method claim 11, which was directed to an abstract idea; therefore, the same rejection and rationale apply. The only difference is that claim 19 recites the following additional elements, treated under step 2A prong 2 and step 2B:
Step 2A Prong 2:
A non-transitory machine-readable storage element on which is stored a computer program for […] the computer program, when executed by a computer, causing the computer to perform the following steps: — This limitation is directed to merely applying an abstract idea using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.04(d)).
Thus, the judicial exception is not integrated into a practical application (see MPEP 2106.04(d) I.), failing step 2A prong 2.
Step 2B:
A non-transitory machine-readable storage element on which is stored a computer program for […] the computer program, when executed by a computer, causing the computer to perform the following steps: — Using a generic computer as a tool (see MPEP 2106.05(f)(2), 2106.05(d)) cannot amount to significantly more than the judicial exception itself.
Thus, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception under step 2B.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 11, 12, and 16-18 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Teerapittayanon et al., “BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks”, herein referred to as Teerapittayanon.
Regarding Claim 11
Teerapittayanon teaches:
A method for operating a deep neural network that has at least one skip connection, the method comprising the following steps: selecting a first path through the deep neural network along the skip connection,
(page 1 column 2, figure 1 caption) “A simple BranchyNet with two branches added to the baseline (original) AlexNet[*Examiner notes: deep neural network]. The first branch has two convolutional layers and the second branch has 1 convolutional layer. The “Exit” boxes denote the various exit points of BranchyNet[skip connections].”; Figure 1; [*Examiner note: A deep neural network has an input layer, an output layer, and at least one intermediate layer. The network in figure 1 has an input layer, an output layer (denoted by Exit1, Exit2, and Exit3), and intermediate layers (the various conv nxn blocks). Figure 1 shows that Exit1 and Exit2 allow for an expedited journey to the output layer, skipping subsequent neural network blocks. Thus the pathways of Exit1 and Exit2 are skip connections]
ascertaining an output variable by propagating a signal representing an input variable of the deep neural network along the first path;
(page 4 column 1 line 4) “For each exit point, the input sample is fed[*Examiner notes: signal representing an input variable] through the corresponding branch. The procedure then calculates the softmax and entropy of the output and checks if the entropy is below the exit point threshold Tn. If the entropy is less than Tn, the class label with the maximum score (probability) is returned.”; Figure 1
checking whether the output variable meets a specifiable criterion
(page 4 column 1 line 5) “The procedure then calculates the softmax and entropy of the output and checks if the entropy is below the exit point threshold Tn.”; Figure 2; [*Examiner note: in line 6 of figure 2, the reference checks “if e < Tn”. If the check passes, then skip to the output (“return arg max y”). If the check does not pass, then the for loop (line 2) progresses to the next branch (also, see the annotated figure below).]
and selecting, based on the specifiable criterion not being met, a second path through the deep neural network that differs from the first path
(page 4 column 1 line 7) “If the entropy is less than Tn, the class label with the maximum score (probability) is returned. Otherwise, the sample continues to the next exit point”; Figure 2; [*Examiner note: in line 6 of figure 2, the reference checks “if e < Tn”. If the check passes, then skip to the output (“return arg max y”). If the check does not pass, then the for loop (line 2) progresses to the next branch (see also the annotated figure below). The second path corresponds to the layers leading up to “Exit 2” in figure 1.]
and ascertaining the output variable by propagating the signal representing the input variable of the deep neural network along the second path
(page 4 column 1 line 4) “For each exit point, the input sample is fed through the corresponding branch. The procedure then calculates the softmax and entropy of the output and checks if the entropy is below the exit point threshold Tn. If the entropy is less than Tn, the class label with the maximum score (probability) is returned.”; Figure 1
[Examiner's annotated figures: media_image1.png, media_image2.png (greyscale)]
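[*Examiner note: For illustration only, the Figure 2 procedure of Teerapittayanon cited above may be sketched as follows. The function names, branch definitions, and threshold values are the examiner's hypothetical illustration and are not taken from the reference.]

```python
import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - np.max(z))
    return e / e.sum()

def entropy(p):
    # Shannon entropy of a probability vector
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def branchynet_infer(x, branches, thresholds):
    # Feed x through each exit branch in order (cf. Figure 2 of the
    # reference); return the first class label whose softmax entropy is
    # below that exit's threshold T_n.  Branches after the taken exit
    # are never evaluated.
    for branch_fn, t_n in zip(branches[:-1], thresholds):
        y = softmax(branch_fn(x))
        if entropy(y) < t_n:          # confident enough: early exit
            return int(np.argmax(y))
    # main branch (final exit) always returns a label
    return int(np.argmax(softmax(branches[-1](x))))
```

A confident first-branch output (low entropy) exits at the first path; a near-uniform output falls through to the second path, consistent with the mapping of the first and second paths above.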
wherein the deep neural network determines the output variable depending on the input variable
[*Examiner notes: In figure 2 below, y^ is the output variable and x is the input variable. The deep neural network procedure determines y^ using the input variable x]
[Examiner's annotated figure: media_image3.png (greyscale)]
wherein a layer that receives the input variable is an input layer and a layer that outputs the output variable is an output layer,
(page 5 column 1 paragraph 3) “For a simpler dataset, such as MNIST, we can place a branch directly after the first layer[*Examiner notes: input layer] and immediately see accurate classification”; Figure 1; (page 4 column 1 paragraph 1) “The procedure then calculates the softmax[*Examiner notes: mapped to output layer] and entropy of the output and checks if the entropy is below the exit point threshold Tn.”; [*Examiner notes: All three exit paths in figure 1 share common layers up until the first branch point. The input layer can be, for example, the “first layer” identified in the section above. The output layer is the softmax function applied to the output of each of the branches which outputs y^. Therefore, the machine learning network has an input layer that receives the input variable and an output layer that outputs the output variable.]
[Examiner's annotated figures: media_image4.png, media_image5.png (greyscale)]
wherein a third path characterizes an uninterrupted, forward-directed sequence of a multiplicity of layers of the deep neural network and the third path characterizes how the multiplicity of layers of the deep neural network are connected in succession in order to propagate the input variable through the deep neural network starting at the input layer up to the output layer.
[*Examiner notes: The term “uninterrupted” means without interruptions, pauses, or breaks in continuity. The third path (fexit3) is executed without any of the side branches and is thus “uninterrupted”. The broadest reasonable interpretation of the term “forward-directed” includes in the direction starting from the input and ending at the output. The third path executes the input in this direction. See figures 1 and 2 annotated below.]; (page 1 column 1 introduction paragraph 2) “To lessen these increasing costs, we present BranchyNet, a neural network architecture where side branches[*Examiner notes: first and second paths] are added to the main branch, the original baseline neural network[*Examiner notes: third path], to allow certain test samples to exit early.”; Figure 1; Figure 2
[Examiner's annotated figures: media_image6.png, media_image7.png (greyscale)]
wherein each one of the first path, the second path, and the third path outputs the output variable via the output layer.
[Examiner's annotated figures: media_image8.png, media_image9.png (greyscale)]
Regarding Claim 12:
Teerapittayanon teaches:
The method as recited in claim 11
(see rejection of claim 11)
wherein layers of the neural network that are not required for a respective path used being deactivated during the propagation of the signal representing the input variable along the first and second paths and not being activated until that layer is required for the respective path.
(page 4 column 1 line 4) “For each exit point, the input sample is fed through the corresponding branch. The procedure then calculates the softmax and entropy of the output and checks if the entropy is below the exit point threshold Tn. If the entropy is less than Tn, the class label with the maximum score (probability) is returned.”; (page 6 column 1 paragraph 2 line 1) “Since the majority of samples are exited at early branch points, the later branches are used more rarely.”; Figure 2; [*Examiner note: When the output is returned, the propagation through the neural network comes to a halt, and the layers that are not required for the path are not used thus are deactivated. For example, in line 6 of Figure 2 and n=1, if e < T1 is true then a value is returned and so z=fexit2(x) is never calculated.]
Regarding Claim 16:
Teerapittayanon teaches:
The method as recited in claim 11
(see rejection of claim 11)
wherein: (i) the selection of the first and second paths is a function of the input variable and/or a function of a specifiable energy/time contingent that has a maximum permissible consumption level in order to propagate the signal representing the input variable through the deep neural network, and/or (ii) the specifiable criterion characterizes a specifiable minimum accuracy or reliability of the output variable
(page 4 column 1 line 7) “If the entropy is less than Tn, the class label with the maximum score (probability) is returned. Otherwise, the sample continues to the next exit point” [*Examiner note: The entropy is a measure of a type of error. The higher the error, the lower the accuracy. Thus, specifying a maximum entropy criterion Tn effectively specifies a minimum accuracy.]
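[*Examiner note: For illustration only, the following sketch shows why an entropy threshold Tn operates as a minimum-accuracy criterion: a peaked (confident) softmax output has low entropy and is accepted at the exit, while a near-uniform (uncertain) one has high entropy and continues. The probability vectors and threshold value are the examiner's hypothetical illustration, not values from the reference.]

```python
import numpy as np

def entropy(p):
    # Shannon entropy: low for confident (peaked) outputs,
    # high for uncertain (near-uniform) outputs
    p = np.clip(p, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

T_n = 0.5                                  # illustrative exit threshold
confident = np.array([0.98, 0.01, 0.01])   # peaked softmax output
uncertain = np.array([0.4, 0.3, 0.3])      # near-uniform softmax output
exits_early = entropy(confident) < T_n     # accepted at this exit
continues = entropy(uncertain) >= T_n      # sent on to the next exit
```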
Regarding Claim 17:
Teerapittayanon teaches:
The method as recited in claim 11
(see rejection of claim 11)
wherein each path of the deep neural network is trained separately from one another, and/or at least one group of paths is trained in common.
(page 3 paragraph 2) “The design goal of each exit branch is to minimize this loss function. To train the entire BranchyNet, we form a joint optimization problem as a weighted sum of the loss functions of each exit branch”; (equation below the preceding paragraph cited by examiner); [*Examiner note: All paths are trained together, and thus form a group of paths trained in common.]
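[*Examiner note: For illustration only, the cited joint optimization (a weighted sum of the per-exit loss functions, training all paths in common) may be sketched as follows. The loss values and weights are the examiner's hypothetical illustration, not values from the reference.]

```python
def joint_loss(losses, weights):
    # BranchyNet-style joint objective: weighted sum of the loss at each
    # exit branch, so that all exit branches (paths) are trained together
    return sum(w * L for w, L in zip(weights, losses))

per_exit_losses = [0.2, 0.5, 0.9]   # illustrative loss at each exit branch
branch_weights = [1.0, 0.5, 0.5]    # illustrative branch weights
total = joint_loss(per_exit_losses, branch_weights)
```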
Regarding Claim 18:
Claim 18 is a device claim, corresponding to method claim 11. The only difference is that claim 18 recites a device including a processor.
Teerapittayanon further teaches:
A device configured to operate […], the device including a processor configured to:
(page 6 paragraph 2 line 7) “We present evaluation results for both CPU and GPU”
The rest of the limitations of claim 18 are rejected for the same reasons as claim 11.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 13 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Teerapittayanon in view of NPL reference Huang et al., “Multi-Scale Dense Networks for Resource Efficient Image Classification”, herein referred to as Huang.
Regarding Claim 13:
Teerapittayanon teaches:
The method as recited in claim 11
(see rejection of claim 11)
Teerapittayanon does not explicitly teach:
wherein intermediate variables of layers that were ascertained during the propagation along the first path are reused in the propagation of the signal representing the input variable along the second path
However, Huang teaches:
wherein intermediate variables of layers that were ascertained during the propagation along the first path are reused in the propagation of the signal representing the input variable along the second path.
(abstract) “To maximally re-use computation between the classifiers, we incorporate them as early-exits into a single deep convolutional neural network and inter-connect them with dense connectivity.”; (page 6, “subsequent layers” heading) “Following Huang et al. (2017), the output feature maps xsl produced at subsequent layers[mapped to propagation of input variable along second path], l>1, and scales, s, are a concatenation of transformed feature maps from all previous feature maps[mapped to layers ascertained during propagation along first path] of scale s and s − 1 (if s > 1)”; figure 2; [*Examiner note: see the annotated figure 2 below for an illustration of the first path, second path, and how variables are re-used.]
[Examiner's annotated figure: media_image10.png (greyscale)]
Teerapittayanon, Huang, and the instant application are analogous because they are all directed to neural networks.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural network of Teerapittayanon in view of Huang by reusing the intermediate variables from the first path as taught by Huang because (Huang page 5 “Solution: Multi-scale feature maps”) “To address this issue, MSDNets maintain a feature representation at multiple scales throughout the network, and all the classifiers only use the coarse-level features. The feature maps at a particular layer and scale are computed by concatenating the results of one or two convolutions…The vertical connections produce coarse features throughout that are amenable to classification. The dashed black line in Figure 3 shows that MSDNets substantially increase the accuracy of early classifiers.” In other words, re-using computations from prior layers maintains representations of the input in different levels of detail, which results in a more accurate classifier.
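[*Examiner note: For illustration only, the dense-connectivity reuse taught by Huang may be sketched as follows: intermediate features computed on the first path are cached and concatenated into the layers of the second path, so only the layer unique to the second path is newly evaluated. The layer transform and feature sizes are the examiner's hypothetical illustration, not Huang's architecture.]

```python
import numpy as np

def dense_layer(feature_list):
    # toy dense-connectivity layer: concatenate ALL incoming feature
    # maps, then apply a placeholder transform (scaling by 0.5)
    return 0.5 * np.concatenate(feature_list)

x0 = np.ones(4)                  # input features (shared by both paths)
h1 = dense_layer([x0])           # computed once on the first path
cache = {"x0": x0, "h1": h1}     # intermediates kept for later paths

# Second path: only its own new layer is evaluated; x0 and h1 are
# reused from the cache rather than recomputed.
h2 = dense_layer([cache["x0"], cache["h1"]])
```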
Regarding Claim 14:
Teerapittayanon in view of Huang teaches:
The method as recited in claim 13
(see rejection of claim 13)
And Huang further teaches:
wherein only intermediate variables of layers by which the second path differs from the first path are ascertained during the propagation along the second path
(page 1 abstract line 9) “To maximally re-use computation between the classifiers, we incorporate them as early-exits into a single deep convolutional neural network and inter-connect them with dense connectivity.”; (page 6, “subsequent layers” heading) “Following Huang et al. (2017), the output feature maps xsl produced at subsequent layers[mapped to propagation of input variable along second path], l>1, and scales, s, are a concatenation of transformed feature maps from all previous feature maps[mapped to layers ascertained during propagation along first path] of scale s and s − 1 (if s > 1)”; figure 2; [*Examiner note: propagation along the second path involves reusing outputs from layers of the first path by concatenating them. Thus further computations are only performed in layers in which the second path differs from the first path (dark blue arrows in the annotated figure 2 below). See annotated figure 2 for an illustration of the first path, second path, and how variables are re-used.]
[Examiner's annotated figure (repeated): media_image10.png (greyscale)]
and wherein the deep neural network comprises layers of the second path being connected to more than one preceding layer, at least one of the preceding layers were also contained in the first path
(page 5 last paragraph) "Dense connectivity (Huang et al., 2017) connects each layer with all subsequent layers and allows later layers to bypass features optimized for the short-term, to maintain the high accuracy of the final classifier”; [*Examiner notes: Take the neural network layer in column 2 row 2 of figure 2 (annotated above) as an example. This layer is connected to more than one preceding layer, both of which are also contained in the first path.]
and ascertaining, during the propagation along the second path intermediate variables as a function of a provided intermediate variable of a respective immediately preceding connected layer of the second path and of a preceding provided intermediate variable of an immediately preceding connected layer of the first path.
(page 5 last paragraph) “If an earlier layer[mapped to immediately preceding connected layer] collapses information to generate short-term features, the lost information can be recovered through the direct connection to its preceding layer[ascertaining as a function of preceding layer]. The final classifier’s[propagation along second path] performance becomes (more or less) independent of the location of the intermediate classifier”; Figure 2; [*Examiner note: See an illustration in the annotation of figure 2 below.]
[Examiner's annotated figure: media_image11.png (greyscale)]
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Teerapittayanon with Huang for the same reasons given in claim 13 above.
Claim 15 is rejected under 35 U.S.C. 103 as being unpatentable over Teerapittayanon in view of Huang, and further in view of NPL reference Chakradhar et al., “A Dynamically Configurable Coprocessor for Convolutional Neural Networks”, herein referred to as Chakradhar.
Regarding Claim 15:
Teerapittayanon teaches:
The method as recited in claim 11
(see rejection of claim 11)
And Teerapittayanon further teaches:
further comprising the following steps: providing a plurality of different paths through the deep neural network
(page 1 bottom of column 1) “To lessen these increasing costs, we present BranchyNet, a neural network architecture where side branches are added to the main branch, the original baseline neural network, to allow certain test samples to exit early”
selecting at least the first path and the second path from the plurality of different paths;
(page 4 paragraph 1 line 4) “For each exit point, the input sample is fed through the corresponding branch[first path]. The procedure then calculates the softmax and entropy of the output and checks if the entropy is below the exit point threshold Tn. If the entropy is less than Tn, the class label with the maximum score (probability) is returned. Otherwise, the sample continues to the next exit point[second path]”
intermediate variables of common layers of the first path and second path being first ascertained, in succession, up to the layer starting from which a sequence of the layers of the first and second paths differ
(page 3 column 2 line 1) “where fexitn is the output of the n-th exit branch and θ r