Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/26/2025 has been entered.
Response to Remarks
Claim Rejections – 35 U.S.C. 103
Applicant’s arguments have been fully considered, but they are not persuasive.
Applicant argues (pp. 9-11) that the cited references do not teach the newly amended limitation, which clarifies that, when the determination of whether to turn off a portion of the omission layer is made by comparing the current features with the previous features, the current and previous features have an equivalent number of channels.
Examiner respectfully disagrees. See Equation (1) in Cavigelli, Page 1455, Col. 1, for the computation of the change-based convolution layer. Note that the convolution is a summation over the input channels C_{in}. Furthermore, Cavigelli, Page 1456, Col. 2, Paragraph 3, states: “However, note that this structure allows gradually changing inputs (e.g. two images are morphed over several frames with increments below the change detection threshold)”. This shows that Cavigelli describes the “inputs” as the two images, of neighboring frames, that are being compared. Therefore, the inputs describe both the current and previous frames/features, and they necessarily have the same number of channels, because the summation in Equation (1) requires inputs of consistent dimensions.
The foregoing applies to all independent claims and their dependent claims.
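For illustration only, the channel-count reasoning above can be sketched with a generic multi-channel convolution evaluated at a single pixel. This is a simplified 1x1 case, not a reproduction of Cavigelli's Equation (1); the function name conv1x1_pixel, the weights, and the pixel values are hypothetical.

```python
# Illustrative sketch only: a 1x1 multi-channel convolution at a single pixel.
# Names, shapes, and values are hypothetical and are not taken from Cavigelli.

def conv1x1_pixel(pixel, weights):
    """Return one output value per output channel.

    pixel   : list of C_in input-channel values at one spatial location
    weights : list of C_out rows, each a list of C_in weights
    """
    c_in = len(weights[0])
    if len(pixel) != c_in:
        raise ValueError(f"expected {c_in} input channels, got {len(pixel)}")
    # Sum over the input channels, once per output channel.
    return [sum(w * x for w, x in zip(row, pixel)) for row in weights]


weights = [[1.0, 0.5, -1.0],   # output channel 0
           [0.0, 2.0, 1.0]]    # output channel 1
previous_pixel = [0.25, 0.5, 0.75]   # previous frame, 3 channels
current_pixel = [0.5, 0.5, 1.0]      # current frame, 3 channels

# Both frames pass through the same weights, so both need C_in = 3 channels.
previous_out = conv1x1_pixel(previous_pixel, weights)
current_out = conv1x1_pixel(current_pixel, weights)
```

Under these assumptions, an input with a different number of channels (for example, two values instead of three) cannot be summed against the same weights and is rejected with a ValueError.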
Claim Rejections – 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 7-24, and 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over Cavigelli et al. (“CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams”), hereinafter Cavigelli, in view of Habibian et al. (“Skip-Convolutions for Efficient Video Processing”), hereinafter Habibian, and further in view of Chidlovskii (US 20210019629 A1), hereinafter Chidlovskii.
Regarding independent claim 1, Cavigelli teaches:
…
identifying an omission layer of at least one of the downstream layers; (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches that the omission portion of the downstream layers is specifically the change map. Cavigelli teaches computing an updated value only for the affected output pixels, relative to the previous pixels. Doing this involves turning off the processing by skipping the omission portion of the downstream layers that processes the non-changed pixels. More specifically, the change map is the omission portion of the downstream layers because it can be omitted based on the criterion of whether the absolute difference of pixels exceeds a threshold.)
performing, …, a comparison between (1) current features output from one or more of the upstream layers processing a current frame and (2) previous features of the neural network associated with a previous frame; (Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold.” Cavigelli teaches making a comparison between the current features output and the previous features by taking the absolute difference between corresponding pixels of the current and previous inputs and determining whether that difference exceeds a threshold. The layer that outputs the current features is the upstream layer.)
and determining whether to turn off a portion of the omission layer based on the comparison between the current features the previous features, the current features and the previous feature having an equivalent number of channels; (Cavigelli [Page 1454, Paragraph 5]: “We thus propose to perform the change-detection not only at the input, but before each convolution layer— relative to its previous input—and to compute an updated value only for the affected output pixels. … We propose to replace all spatial convolution layers (conv layers) with change-based spatial convolution layers (CBconv layers).” Cavigelli [Page 1455, Figure 5]: Cavigelli teaches that the omission portion of the downstream layers is specifically the change map. Cavigelli teaches that when the output pixel is determined to be changed, or sufficiently different, relative to the previous frame, the omission portion of the downstream layers remains on for processing. However, when the output pixel is determined not to have changed relative to the previous frame, the omission portion of the downstream layers is turned off to stop processing. Cavigelli [Page 1456, Col. 2, Paragraph 3]: “However, note that this structure allows gradually changing inputs (e.g. two images are morphed over several frames with increments below the change detection threshold)”. Cavigelli describes the “inputs” as the two images, of neighboring frames, that are being compared. Therefore, the inputs describe both the current and previous frames/features, and they necessarily have the same number of channels, because the summation over channels in Equation (1) (Page 1455, Col. 1) requires inputs of consistent dimensions.)
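For illustration only, the changed-pixel definition quoted from Cavigelli above (absolute difference of the current to the previous input of any feature map/channel exceeding some threshold) can be sketched as follows. The function name change_map, the [channel][row][col] layout, and the threshold value are hypothetical and do not represent Cavigelli's implementation.

```python
# Illustrative sketch only; names, data layout, and values are hypothetical.

def change_map(current, previous, threshold):
    """Mark a pixel as changed if, in any channel, |current - previous| > threshold.

    current, previous : feature maps laid out as [channel][row][col]
    Returns a [row][col] grid of booleans.
    """
    if len(current) != len(previous):
        raise ValueError("current and previous features need the same channel count")
    rows, cols = len(current[0]), len(current[0][0])
    return [
        [
            any(abs(current[c][r][q] - previous[c][r][q]) > threshold
                for c in range(len(current)))
            for q in range(cols)
        ]
        for r in range(rows)
    ]


previous = [[[0.0, 0.0], [0.0, 0.0]],       # channel 0
            [[1.0, 1.0], [1.0, 1.0]]]       # channel 1
current = [[[0.0, 0.5], [0.0, 0.0]],        # channel 0: one pixel moved by 0.5
           [[1.0, 1.0], [1.0, 1.125]]]      # channel 1: one pixel moved by 0.125

mask = change_map(current, previous, threshold=0.25)
# Only the pixel at row 0, col 1 exceeds the threshold in some channel.
```

In this sketch, pixels whose mask entry is False are the ones for which downstream processing would be skipped.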
Cavigelli does not explicitly teach:
A computer-implemented method of image processing, comprising: inputting image data of frames of a video sequence into a neural network with one or more upstream layers and one or more downstream layers relative to the upstream layers;
However, Habibian teaches:
A computer-implemented method of image processing, comprising: inputting image data of frames of a video sequence into a neural network with one or more upstream layers and one or more downstream layers relative to the upstream layers; (Habibian [Page 2696, Column 2, Paragraph 3]: “Instead of considering a video as a sequence of still images, we represent it as a series of changes across frames and network activations, denoted as residual frames.” Habibian teaches inputting image data of frames of a video sequence into a neural network. Habibian [Page 2696, Column 1, Paragraph 3]: “We use residuals to represent features at every convolutional layer.” Habibian [Page 2696, Figure 2]: “Gates become more selective at deeper layers, concentrating on task specific regions” Habibian teaches a deep neural network, which has hierarchical layers, where the output of one layer becomes the input for the following layer. Habibian teaches the layers deeper in the neural network (further from the input) and this corresponds to the downstream layers. Compared to these downstream layers, the layers that are closer to the input are the upstream layers.)
Cavigelli and Habibian are in the same field of endeavor as the present invention, as the references are directed to processing video data using convolutional neural networks using skip connections. It would have been obvious, before the effective filing date of the claimed invention, to a person of ordinary skill in the art, to combine turning the omission processing layer on/off as taught in Cavigelli with inputting frames into the deep neural network as taught in Habibian. Habibian provides this additional functionality. As such, it would have been obvious to one of ordinary skill in the art to modify the teachings of Cavigelli to include the teachings of Habibian because the combination would allow for video data to be processed in a frame-by-frame manner, with the possibility of omitting processing when consecutive frames are alike. This has the potential benefit of speeding up the processing of video data by skipping processing when it is unneeded.
Cavigelli and Habibian do not explicitly teach:
… at an auxiliary neural network …
However, Chidlovskii teaches:
… at an auxiliary neural network, … (Chidlovskii [¶ 0039]: “In further features, outputting, by the auxiliary neural network, the second prediction of the latent code for each of the joint hidden representations comprises outputting a probability distribution for the latent code given the respective joint representation.” Chidlovskii teaches an auxiliary neural network for comparing multiple values by outputting a probability distribution of the values. Chidlovskii teaches the outsourcing of an operation using an extra, auxiliary, neural network.)
Chidlovskii is in the same field as the present invention, since it is directed to image processing using an auxiliary neural network for computations. It would have been obvious, before the effective filing date of the claimed invention, to a person of ordinary skill in the art, to combine skipping processing of consecutive frames that are similar as taught in Cavigelli as modified by Habibian with using an auxiliary neural network for related computations as taught in Chidlovskii. Chidlovskii provides this additional functionality. As such, it would have been obvious to one of ordinary skill in the art to modify the teachings of Cavigelli as modified by Habibian to include teachings of Chidlovskii because the combination would allow for the comparison/correlation between consecutive frames to be outsourced to the auxiliary neural network. This has the potential benefit of speeding up computations in the main neural network, as supporting operations are done on a separate neural network.
Regarding dependent claim 2, Cavigelli and Habibian teach:
The method of claim 1,
Cavigelli teaches:
wherein the previous features correspond to a previous features output from a same upstream layer providing the current features. (Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the version of previous features being the previous input of the same feature map/channel, which is the same upstream layer. The difference of the previous features and the current features is compared.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 3, Cavigelli and Habibian teach:
The method of claim 1,
Cavigelli teaches:
wherein the current features are those features arranged to be input into an available omission section of omission layers with omission layer portions that can be omitted. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part (c) of the figure, where the red arrows show that the change map can be omitted when no change is detected. The beginning of the arrows shows that the current features can be input to this change map. The change map is omitted if a pixel has not changed from the previous frame to the current frame, and is otherwise not omitted.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 4, Cavigelli and Habibian teach:
The method of claim 1,
Cavigelli teaches:
wherein the turning off of processing results in turning off power to accelerator circuits so that no power is being consumed to process data at the omission layer. (Cavigelli [Page 1462, Paragraph 6]: “We have measured the power consumption of the Tegra X2 module using the on-board sensors for two of its power modes: maximum performance (max-N) and maximum efficiency (max-Q). … The baseline uses around 9.6 J/frame in max-N mode and 6.1 J/frame in max-Q mode, whereas the CBinfer implementation uses an average of 1.1 J/frame and 0.8 J/frame, respectively.” Cavigelli teaches that the CBinfer uses much less power because it turns off the power to the data specifically at the omission layer.)
The reasons to combine are substantially similar to those of claim 1.
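For reference, the energy figures quoted from Cavigelli in the rejection of claim 4 correspond to the following reduction factors. The calculation uses only the quoted J/frame values; the variable names are hypothetical.

```python
# Figures quoted from Cavigelli (J/frame); the ratios are simple arithmetic.
baseline = {"max-N": 9.6, "max-Q": 6.1}
cbinfer = {"max-N": 1.1, "max-Q": 0.8}

# Ratio of baseline energy per frame to CBinfer energy per frame, per power mode.
reduction = {mode: baseline[mode] / cbinfer[mode] for mode in baseline}
for mode, factor in reduction.items():
    print(f"{mode}: {factor:.1f}x less energy per frame")
```

That is, roughly 8.7 times less energy per frame in max-N mode and roughly 7.6 times less in max-Q mode.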
Regarding dependent claim 5, Cavigelli and Habibian teach:
The method of claim 1,
Cavigelli teaches:
wherein the turning off of processing refers to effectively turning off processing by omitting processing of the omission layer of the neural network to increase throughput of the neural network. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part (c) of the figure, where the red arrows show that the change map can be omitted when no change is detected. This necessarily means that the throughput of the neural network is higher in cases where that layer is omitted.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 7, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 1,
Habibian teaches:
wherein previous features are compressed previous features associated with the previous frame and obtained as recurrent output from the auxiliary neural network and input back into the auxiliary neural network along with the current features of the current frame. (Habibian [Page 2695, Column 2, Paragraph 2]: “We reformulate standard convolution to be efficiently computed over such residual frames by limiting the computation only to the regions with significant changes while skipping the others.” Habibian teaches residual frames, which are analogous to obtaining the previous features from an auxiliary neural network and inputting them again for future use. This is because the residual frames show the relationship between any two frames, and include the previous and current frame in all cases given by this limitation.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 8, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 7,
Chidlovskii teaches:
wherein the compressed previous features are obtained from a last convolutional layer of the auxiliary neural network before an output layer of the auxiliary neural network that provides probability values as output. (Chidlovskii [¶ 0039]: “In further features, outputting, by the auxiliary neural network, the second prediction of the latent code for each of the joint hidden representations comprises outputting a probability distribution for the latent code given the respective joint representation.” Chidlovskii teaches an auxiliary neural network for comparing multiple values by outputting a probability distribution of the values. Before this output layer, Chidlovskii teaches representing the latent code in a previous layer, which is holding the information regarding the features of the data.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 9, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 7,
Cavigelli teaches:
wherein at least one of the downstream layers with at least one omission layer portion is the omission layer, and wherein … each associated with at least a region of the current frame, the one or more probabilities including a probability that using previous features as output from the omission layer rather than current features output from the omission layer allows for a main neural network to perform an intended task. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part (c) of the figure, where the red arrows show that the change map can be omitted when no change is detected. Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the version of previous features being the previous input of the same feature map/channel. The difference of the previous features and the current features is compared. Cavigelli teaches this comparison, which expresses the extent to which using the previous features as output is adequate for the main neural network: as the difference decreases, this probability/confidence increases.)
Chidlovskii teaches:
… the auxiliary neural network has an output layer … (Chidlovskii [¶ 0039]: “In further features, outputting, by the auxiliary neural network, the second prediction of the latent code for each of the joint hidden representations comprises outputting a probability distribution for the latent code given the respective joint representation.” Chidlovskii teaches an auxiliary neural network for comparing multiple values by outputting a probability distribution of the values. Chidlovskii teaches the outsourcing of an operation using an extra, auxiliary, neural network.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 10, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 9, wherein the one or more probabilities are compared to one or more thresholds to determine whether or not to omit processing at the omission layer of the downstream layers. (Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the version of previous features being the previous input of the same feature map/channel. The difference of the previous features and the current features is compared against some threshold.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 11, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 7, wherein the auxiliary neural network includes at least three convolutional layers. (Chidlovskii [¶ 0040]: “In further features, the auxiliary neural network includes a convolutional neural network comprising a final fully connected layer” Chidlovskii teaches the auxiliary neural network being comprised of convolutional layers. Chidlovskii [Figure 2]: Chidlovskii teaches the auxiliary neural network comprised of three layers – denoted G, Q, and D.)
The reasons to combine are substantially similar to those of claim 1.
Claim 12 is substantially similar to claim 1, but has the following additional elements:
Regarding independent claim 12, Cavigelli and Habibian teach:
A system for image processing, the system comprising: memory storing image data of frames of a video sequence and neural network features; and (Habibian [Page 2698, Column 2, Paragraph 2]: “block structures can be leveraged to reduce the memory overhead involved in gathering and scattering of input and output tensors” Habibian teaches memory that stores the image data of frames of a video sequence and neural network features.)
at least one processor communicatively coupled to the memory, the at least one processor being arranged to at least: (Habibian [Page 2701, Column 2, Paragraph 2]: “The runtimes are reported on CPU” Habibian teaches a CPU, which is necessarily coupled to the memory.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 13, Cavigelli and Habibian teach:
The system of claim 12,
Cavigelli teaches:
wherein the neural network layers available to have omission layer portions to be omitted are a plurality of consecutive downstream convolutional layers forming at least one available omission section of the neural network. (Cavigelli [Page 1454, Paragraph 5]: “We thus propose to perform the change-detection not only at the input, but before each convolution layer— relative to its previous input—and to compute an updated value only for the affected output pixels. … We propose to replace all spatial convolution layers (conv layers) with change-based spatial convolution layers (CBconv layers).” Cavigelli teaches computing an updated value only for the affected output pixels, relative to the previous pixels. Determining whether or not the output pixel is affected is determining whether or not to turn off processing of the omission layer portion. This omission portion is the change map in the change-based spatial convolution layers, which may be consecutive layers that are convolutional layers that make up the downstream portion.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 14, Cavigelli and Habibian teach:
The system of claim 12,
Cavigelli teaches:
wherein the neural network layers available to have omission layer portions to be omitted are layer blocks each with one convolutional layer and one or more convolutional supporting layers. (Cavigelli [Page 1456, Paragraph 3]: “To maximize throughput, we also include the ReLU activation of the affected pixels in this step, reducing the compute time by … 2) only applying the ReLU operation to changed pixels” Cavigelli teaches including, in the omission block, a convolutional supporting layer comprising the ReLU activation function, as indicated by the fact that the ReLU is applied only when a change is detected and the block is not omitted. As in claim 13 above, Cavigelli also teaches an omission block with convolutional layers. Cavigelli [Page 1455, Figure 5]: The omission layer portion also comprises the change map, which is also a convolutional layer as it outputs the convolution computations.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 15, Cavigelli and Habibian teach:
The system of claim 14,
Cavigelli teaches:
wherein the convolutional supporting layers include a supporting activation function layer of individual convolutional layers. (Cavigelli [Page 1456, Paragraph 3]: “To maximize throughput, we also include the ReLU activation of the affected pixels in this step, reducing the compute time by … 2) only applying the ReLU operation to changed pixels” Cavigelli teaches including a supporting layer that is comprised of ReLU in the omission block. The ReLU is an activation function of individual convolutional layers.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 16, Cavigelli, Habibian, and Chidlovskii teach:
The system of claim 12,
Cavigelli teaches:
wherein a plurality of consecutive convolutional blocks or convolutional layers is an available omission section, and the neural network may have multiple separate available omission sections … to determine the comparison for each available omission section. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part (c) of the figure, where the red arrows show that the change map can be omitted when no change is detected. Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the version of previous features being the previous input of the same feature map/channel. The difference of the previous features and the current features is compared, or correlated. Cavigelli teaches this comparison, which expresses the extent to which using the previous features as output is adequate for the main neural network: as the difference decreases, this probability/confidence increases.)
Chidlovskii teaches:
… each with its own auxiliary neural network operations … (Chidlovskii [¶ 0039]: “In further features, outputting, by the auxiliary neural network, the second prediction of the latent code for each of the joint hidden representations comprises outputting a probability distribution for the latent code given the respective joint representation.” Chidlovskii teaches an auxiliary neural network for comparing multiple values by outputting a probability distribution of the values. Chidlovskii teaches the outsourcing of an operation using an extra, auxiliary, neural network.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 17, Cavigelli and Habibian teach:
The system of claim 12,
Cavigelli teaches:
having at least one control to operate the neural network and to turn off processing at parts of the neural network associated with feature regions of a feature surface associated with one of the frames and to be turned off initially for reasons not related to the comparison, and wherein the at least one control is operable to turn off the omission layer due to the comparison. (Cavigelli [Page 1462, Paragraph 6]: “We have measured the power consumption of the Tegra X2 module using the on-board sensors for two of its power modes: maximum performance (max-N) and maximum efficiency (max-Q). … The baseline uses around 9.6 J/frame in max-N mode and 6.1 J/frame in max-Q mode, whereas the CBinfer implementation uses an average of 1.1 J/frame and 0.8 J/frame, respectively.” Cavigelli teaches that the CBinfer uses much less power because it turns off the processing specifically at the omission layer – due to the correlation. Prior to that, being able to turn on the neural network means that turning it off entirely would be unrelated to the correlations. This is the master control.)
The reasons to combine are substantially similar to those of claim 1.
Claim 18 is substantially similar to claim 1, but has the following additional elements:
Regarding independent claim 18, Cavigelli and Habibian teach:
At least one non-transitory machine readable storage device comprising instructions to cause at least one processor to at least: (Habibian [Page 2698, Column 2, Paragraph 2]: “block structures can be leveraged to reduce the memory overhead involved in gathering and scattering of input and output tensors” Habibian teaches non-transitory memory that stores the image data of frames of a video sequence and neural network features.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 19, Cavigelli, Habibian, and Chidlovskii teach:
The at least one non-transitory machine readable storage device of claim 18,
Cavigelli teaches:
wherein the instructions are to cause one or more of the at least one processor to input the previous features, the current features, and downstream features … to determine the comparison, wherein the downstream features are obtained from one or more layers downstream of the neural network layers available for providing the omission layer. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part (c) of the figure, where the red arrows show that the change map can be omitted when no change is detected. Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the version of previous features being the previous input of the same feature map/channel. The difference of the previous features and the current features is compared, or correlated. Cavigelli teaches this comparison, which expresses the extent to which using the previous features as output is adequate for the main neural network: as the difference decreases, this probability/confidence increases.)
Chidlovskii teaches:
… in the auxiliary neural network … (Chidlovskii [¶ 0039]: “In further features, outputting, by the auxiliary neural network, the second prediction of the latent code for each of the joint hidden representations comprises outputting a probability distribution for the latent code given the respective joint representation.” Chidlovskii teaches an auxiliary neural network for comparing multiple values by outputting a probability distribution of the values. Chidlovskii teaches the outsourcing of an operation using an extra, auxiliary, neural network.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 20, Cavigelli and Habibian teach:
The at least one non-transitory machine readable storage device of claim 18, wherein the instructions are to cause one or more of the at least one processor to change a decision to omit or not omit an omission layer portion when the omission layer portion and an adjacent area to the omission layer portion meet at least one criterium related to relative area to portion size or relative area to portion pixel image positions. (Cavigelli [Page 1455, Paragraph 7]: “Each of these changes affects a region equal to the filter size, and these output pixels are marked for updating” Cavigelli teaches that a given pixel’s filter size, which is an adjacent area, is related to the omit or not omit decision.)
The reasons to combine are substantially similar to those of claim 1.
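For illustration only, the passage quoted from Cavigelli for claim 20 (each change affects a region equal to the filter size, and those output pixels are marked for updating) can be sketched as a spreading of the change map. The function name propagate_changes is hypothetical, and the sketch assumes an odd, square filter with same-size output, which the quoted passage does not specify.

```python
# Illustrative sketch only; assumes an odd, square filter and same-size output.

def propagate_changes(changed, filter_size):
    """Mark every output pixel whose filter window covers a changed input pixel."""
    rows, cols = len(changed), len(changed[0])
    half = filter_size // 2
    update = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not changed[r][c]:
                continue
            # A changed input pixel affects a filter-sized block of outputs.
            for rr in range(max(0, r - half), min(rows, r + half + 1)):
                for cc in range(max(0, c - half), min(cols, c + half + 1)):
                    update[rr][cc] = True
    return update


changed = [[False, False, False, False, False],
           [False, False, False, False, False],
           [False, False, True,  False, False],
           [False, False, False, False, False],
           [False, False, False, False, False]]

update = propagate_changes(changed, filter_size=3)
marked = sum(cell for row in update for cell in row)  # number of outputs to update
```

With a single changed pixel at the center and a 3x3 filter, the nine surrounding output positions are marked for updating and the remaining positions are left unmarked.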
Regarding dependent claim 21, Cavigelli and Habibian teach:
The at least one non-transitory machine readable storage device of claim 18, wherein the instructions are to cause one or more of the at least one processor to make an individual omission decision for at least one of: (1) each pixel of an image, (2) individual 4x4 pixel regions, or (3) an entire frame. (Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the omission decision for each pixel that does not meet the changed threshold.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 22, Cavigelli, Habibian, and Chidlovskii teach:
The at least one non-transitory machine readable storage device of claim 18,
Cavigelli teaches:
wherein the instructions are to cause one or more of the at least one processor to input the current features and the previous features … that generates probabilities of success of using previous features output from a downstream layer available to have the omission layer rather than outputting current features from the downstream layer, and wherein the probabilities of success are compared to a threshold to make omission layer portion-level omission decisions. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part (c) of the figure, where the red arrows show that the change map can be omitted when no change is detected. Cavigelli [Page 1455, Paragraph 6]: “In this step, changed pixels are detected. We define a changed pixel as one where the absolute difference of the current to the previous input of any feature map/channel exceeds some threshold” Cavigelli teaches the version of previous features being the previous input of the same feature map/channel. The difference of the previous features and the current features is compared, or correlated. Cavigelli teaches this comparison, which expresses the extent to which using the previous features as output is adequate for the main neural network: as the difference decreases, this probability/confidence increases.)
Chidlovskii teaches:
… into an auxiliary neural network … (Chidlovskii [¶ 0039]: “In further features, outputting, by the auxiliary neural network, the second prediction of the latent code for each of the joint hidden representations comprises outputting a probability distribution for the latent code given the respective joint representation.” Chidlovskii teaches an auxiliary neural network for comparing multiple values by outputting a probability distribution of the values. Chidlovskii teaches the outsourcing of an operation using an extra, auxiliary, neural network.)
The reasons to combine are substantially similar to those of claim 1.
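For illustration only, the comparison of probabilities of success to a threshold recited in claim 22 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's auxiliary neural network and not code from any cited reference: the function name, the use of a sigmoid to turn scores into probabilities, and the default threshold of 0.5 are all hypothetical.

```python
import numpy as np

def omission_from_probabilities(logits, threshold=0.5):
    """Map per-portion scores to probabilities and omit decisions.

    logits: hypothetical raw scores, one per omission layer portion,
    such as an auxiliary network might output.
    Returns (probabilities, omit) where omit is True for each portion
    whose probability of success exceeds the threshold.
    """
    logits = np.asarray(logits, dtype=np.float64)
    probabilities = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
    return probabilities, probabilities > threshold
```

Each entry of the returned boolean array is one portion-level decision, so a portion is omitted only when reusing the previous features is estimated to be sufficiently likely to succeed.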
Regarding dependent claim 23, Cavigelli, Habibian, and Chidlovskii teach:
The at least one non-transitory machine readable storage device of claim 22,
Cavigelli teaches:
wherein the instructions are to cause one or more of the at least one processor to transmit an omit or no omit signal of multiple portions to each downstream layer available as an omission layer with omission layer portions. (Cavigelli [Page 1455, Figure 5]: Cavigelli teaches inputting the current features into the omission layers that can be omitted. This is evident in part c of this figure, where the red arrows show that the change map can be omitted when no change is detected. The omit or no omit signal has to be transmitted to the downstream omission layers to tell them whether or not to skip.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 24, Cavigelli, Habibian, and Chidlovskii teach:
The at least one non-transitory machine readable storage device of claim 23,
Cavigelli teaches:
wherein the instructions are to cause one or more of the at least one processor to use saved previous features output from a last omission layer of an available omission section of one or more multiple omission layers when no current features are output from the last omission layer. (Cavigelli [Page 1456, Paragraph 6]: “We also need to store the previous output to use it as a basis for the updated output and to use it as the previous input of the subsequent layer.” Cavigelli teaches storing the previous features output from the last omission layer. This is the case even when no current features are output from the last omission layer.)
The reasons to combine are substantially similar to those of claim 1.
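For illustration only, the reuse of a stored previous output described in the passage quoted from Cavigelli above can be sketched as follows. This is a minimal sketch, not Cavigelli's implementation or the claimed method: the class name `CachedLayer`, its `forward` method, the `omit` flag, and the `calls` counter are hypothetical names introduced for this example.

```python
import numpy as np

class CachedLayer:
    """Wrap a layer function and keep its most recent output (hypothetical)."""

    def __init__(self, fn):
        self.fn = fn              # the layer computation, e.g. a convolution
        self.saved_output = None  # previous features output by this layer
        self.calls = 0            # how many times fn was actually evaluated

    def forward(self, x, omit):
        # When omission is signaled and a previous output exists, return the
        # saved previous features without producing new ones.
        if omit and self.saved_output is not None:
            return self.saved_output
        # Otherwise compute the current features and save them for reuse.
        self.calls += 1
        self.saved_output = self.fn(x)
        return self.saved_output
```

In this sketch the first frame is always computed because nothing has been saved yet; later frames return the saved output whenever `omit` is true, so downstream consumers still receive features even though the layer produced none for the current frame.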
Regarding dependent claim 27, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 1, wherein the auxiliary network is to receive downstream features as input, the downstream features representing a semantic classification for an object segmentation task. (Cavigelli [Page 1454, Col. 1, Paragraph 1]: “for first application scenario - semantic segmentation - the dataset used in [47] provides ground truth labels for 10-class semantic segmentation from an urban street surveillance perspective” Cavigelli teaches that the features may be a semantic classification, where there is a choice of a label, out of 10 possible classes.)
The reasons to combine are substantially similar to those of claim 1.
Regarding dependent claim 28, Cavigelli, Habibian, and Chidlovskii teach:
The method of claim 1, wherein, when the auxiliary network is a recurrent neural network (RNN), the auxiliary network receives recurrent input at a convolutional layer, the recurrent input arranged into a number of output channels associated with an internal state of the auxiliary network, the auxiliary network to provide a single channel to an output layer from the convolutional layer. (Habibian [Page 2698, Col. 2, Paragraph 1]: “Similar to recurrent networks, we train the model over a fixed-length sequence of frames and do inference iteratively on an arbitrary number of frames.” Habibian teaches that, at a convolutional layer, the neural network receives recurrent input, which is sequential data, or frames. That the frames are organized over a fixed-length sequence shows that the recurrent input is arranged into a number of output channels.)
The reasons to combine are substantially similar to those of claim 1.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KYU HYUNG HAN whose telephone number is (703) 756-5529. The examiner can normally be reached on M-F 9-5.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alexey Shmatov, can be reached on (571) 270-3428. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Kyu Hyung Han/
Examiner
Art Unit 2123
/ALEXEY SHMATOV/Supervisory Patent Examiner, Art Unit 2123