DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
The present application is being examined on the basis of the claims filed 10/07/2025.
Claims 1-20 are pending.
Response to Amendment
This Office Action is in response to Applicant’s communication filed 10/07/2025, which responds to the Office Action mailed 06/10/2025. The Applicant’s remarks and any amendments to the claims or specification have been considered with the results that follow.
Response to Arguments
Regarding Objections and Informalities
In Remarks page 12, Argument 1
The objection to the wording of claim 6 is respectfully traversed. Respectfully, the proposed amendment (as understood) would have the claim read that "the perturbed region is confined to a single block ... or [is confined to] an integer number of whole blocks". However, this is not the intended scope of the claim, in which the perturbed region is [either] confined to (i.e. is located within) a single block, or the perturbed region directly corresponds to an integer whole number of (perturbed) blocks (e.g. such that the perturbation does not straddle two "partly perturbed" blocks, as discussed on page 28, lines 23-32 of the original application). As such, reconsideration of the objection to claim 6 is respectfully requested.
Examiner’s response to Argument 1
Applicant is thanked for clarifying the intended scope of dependent claim 6. Applicant’s arguments are convincing and the objection is withdrawn accordingly. However, a new objection is raised regarding claim 14 for different reasons.
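For illustration only, the clarified scope of claim 6 (the perturbed region is either located within a single block, or corresponds exactly to an integer number of whole blocks, with no partly perturbed block) can be expressed as the following check. The one-dimensional block layout and the function name are hypothetical simplifications introduced by the examiner, not language from the claims or the references.

```python
def perturbation_block_aligned(start, end, block_size):
    """Check the clarified claim 6 scope for a 1-D region [start, end).

    True if the region lies entirely within a single block, or if it
    covers an integer number of whole blocks (no partly perturbed block).
    """
    within_single_block = start // block_size == (end - 1) // block_size
    covers_whole_blocks = start % block_size == 0 and end % block_size == 0
    return within_single_block or covers_whole_blocks

# With 8-wide blocks: a region inside one block qualifies, a region that
# exactly covers two whole blocks qualifies, and a straddling region does not.
print(perturbation_block_aligned(2, 6, 8))   # True
print(perturbation_block_aligned(8, 24, 8))  # True
print(perturbation_block_aligned(6, 10, 8))  # False
```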
Regarding 35 U.S.C. 112
In Remarks page 12, Argument 2
Claims 1-20 are rejected under 35 U.S.C. §112(b) or 35 U.S.C. §112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which Applicant regards as the invention. It is respectfully submitted that the claims as amended particularly point out and distinctly claim the subject matter under 35 U.S.C. § 112.
In response to Argument 2
Applicant’s amendments are convincing for all claims except claims 9-10. Thus, outstanding issues under 35 U.S.C. 112(b) remain for those claims. See rejections below.
Regarding 35 U.S.C. 101
In Remarks page 13, Argument 3
(Examiner summarizes Applicant’s arguments) Applicant argues that claim 1 integrates any abstract idea into a practical application by subjecting only some (not all) of the perturbed data array to neural network processing which reduces the amount of processing performed and thereby improves neural network efficiency. Applicant further argues that claim 1 amounts to significantly more than any judicial exception because it includes a combination of elements which are not well-understood, routine, and conventional.
Applicant argues that claim 10 as amended recites performing comparisons on outputs of intermediate layers of neural network processing. Applicant argues that when the outputs are significantly similar, processing for the perturbed version can be terminated which results in reduced computation time and power expenditure. Applicant concludes that claim 10 is directed to a practical real-world application as a result.
Examiner’s response to Argument 3
Applicant’s arguments appear convincing. The rejections under 35 U.S.C. 101 for all claims are withdrawn accordingly.
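For illustration only, the early-termination behavior Applicant describes for claim 10 (comparing intermediate-layer outputs of the perturbed run against stored outputs of the unperturbed run, and stopping when they are sufficiently similar) can be sketched as follows. The layer representation, tolerance value, and function names are the examiner's hypothetical simplifications, not Applicant's or any reference's actual implementation.

```python
import numpy as np

def run_with_early_exit(layers, perturbed_input, stored_outputs, tol=1e-3):
    """Process the perturbed input layer by layer; after each layer, compare
    the output to the stored output of the unperturbed run and terminate
    early when they are sufficiently similar."""
    x = perturbed_input
    for i, layer in enumerate(layers):
        x = layer(x)
        if np.max(np.abs(x - stored_outputs[i])) < tol:
            # The perturbation no longer affects the result: reuse the
            # stored final output and skip the remaining layers.
            return stored_outputs[-1], i
    return x, len(layers) - 1

# Toy two-layer "network": a ReLU followed by an affine step.
layers = [lambda x: np.maximum(x, 0.0), lambda x: x + 1.0]
clean = np.array([0.0, 2.0])
stored = []
x = clean
for layer in layers:
    x = layer(x)
    stored.append(x)

# A negative perturbation is clipped away by the ReLU, so processing stops
# after the first layer and the stored final output is reused.
result, exit_layer = run_with_early_exit(layers, np.array([-0.5, 2.0]), stored)
print(exit_layer)  # 0
```

The point of the sketch is the asserted practical effect: when the comparison succeeds at an intermediate layer, the remaining layers are never executed, reducing computation time and power expenditure.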
Regarding 35 U.S.C. 103
In Remarks pages 21-24, Argument 6
(Examiner summarizes Applicant’s arguments) Applicant argues that none of the cited references teach the feature of subjecting only part of the perturbed input data to neural network processing. Applicant argues that Xu does not teach perturbed images at all because the input images are fundamentally different, and further that the second image of Xu is not generated by applying a perturbation to the first image as recited in the claims.
Examiner’s response to Argument 6
Examiner disagrees. Although Xu does not appear to teach using a computer algorithm to apply a perturbation, the second image of Xu is still a perturbation of the first. Instead of applying the perturbation by changing the pixel data of the image, the first image of Xu is perturbed by perturbing the environment itself and taking another picture. An image that is perturbed by means other than computationally changing its pixels is still a perturbed image.
Moreover, while Xu does not teach the specifics of how the first image was perturbed into the second image, Fong teaches algorithmic techniques for generating a perturbed image from an original image, with a perturbed part differing from the original image and a non-perturbed part that is the same as the original image (see rejections under 35 U.S.C. 103 below). That is, Fong teaches applying a perturbation to a given input data image, and Xu teaches subjecting only a part but not all of a perturbed input to neural network processing. Neither reference in isolation teaches the claim as a whole; however, when viewed in combination, Fong and Xu teach applying the claimed type of perturbation to an input image and subjecting some but not all of the image to neural network processing.
Applicant’s argument is directed to Xu not teaching the limitation for which Fong is relied upon. However, rejections under 35 U.S.C. 103 consider the references not only on their own merits, but also what the references would suggest to a person having ordinary skill in the art. Therefore, the rejection is maintained.
In Remarks page 20, Argument 7
Indeed, Xu does not relate to the application of perturbations to images at all, nor the effect of such perturbations on the output of neural network processing. Rather, Xu relates to a method of detecting similar blocks of pixels in consecutively-taken images, so that computations for these blocks can be re-used.
The abovementioned features of claim 1 are also not disclosed by Fong. Given that neither Fong nor Xu discloses or suggests the abovementioned features, nor the abovementioned advantages that these features provide, there is nothing in either document (alone or in combination) that could or would suggest to one of average skill in the art the extensive modifications to the method of Xu that would be required to arrive at the subject matter defined in claim 1.
Claims 11 and 20 include similarly defined features. Thus, it is respectfully submitted that claims 1-3, 11-13 and 20 are not obvious and are therefore allowable over the applied art.
Examiner’s response to Argument 7
Examiner disagrees. Firstly, Xu and Fong are both directed to the same technical field of convolutional neural networks, and more specifically to using convolutional neural networks on pairs of similar images. As argued above, the consecutive images of Xu can be readily interpreted as perturbations under the broadest reasonable interpretation. Accordingly, Xu and Fong are both analogous art to the instant application.
Further, as discussed above, while neither Fong nor Xu in isolation teaches the claimed subject matter, it has been shown that it would have been obvious to a person having ordinary skill in the art to use the techniques of Xu with the method of Fong due to the significant performance gains (“CNNCache can accelerate the execution of CNN models by 20.2% on average”). Applicant’s arguments do not convincingly show that Fong and Xu fail to teach the limitations for which they are respectively cited, nor do they dispute the rationale for obviousness provided. Therefore, the rejections of the claims are maintained (excepting the dependent claims already indicated as allowable subject matter).
In Remarks page 21, Argument 8
(Examiner summarizes Applicant’s arguments) Applicant argues that, due to the claim amendments, the prior art of record does not teach claim 10. Applicant further argues that Riera does not appear to teach determining whether to continue processing on the basis of a comparison.
Examiner’s response to Argument 8
Applicant’s arguments related to the claim amendments are convincing. While the examiner maintains that Riera teaches determining whether to continue processing on the basis of the comparison, the claim amendments change the scope of the claim so as to necessitate new art for this limitation, rendering the argument moot.
In Remarks, Argument 9
(Examiner summarizes Applicant’s arguments) Applicant argues that the dependent claims are allowable by virtue of the independent claims.
Examiner’s response to Argument 9
For the reasons provided in the responses to arguments above and in the rejections under 35 U.S.C. 103 below, the rejections of the independent claims are maintained. The rejections of the dependent claims (excepting the claims already marked as allowable subject matter) are maintained for similar reasons.
Allowable Subject Matter
Claims 8-9 and 18-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. Examiner notes that the claims must also be amended to overcome the rejections under 35 U.S.C. 112 before allowance. Reasons for allowance were included in a previous correspondence.
Claim Objections
Regarding Claim 14
Claim 14 is objected to because of the following informalities: “the stored output of the neural network processing of that layer of the neural network processing when processing the input data array;” should read “the stored output of the neural network processing of that layer when processing the input data array;”. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 9-10 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Regarding Claim 9
Claim 9 recites the limitation “storing the output of the neural network processing […]” in line 2 of the claim. Claim 19 recites a very similar limitation. There is insufficient antecedent basis for this limitation in the claim. In particular, there is insufficient antecedent basis for the term “the output of the neural network processing”. For purposes of examination, the examiner interprets the limitation as though it said “storing an output of the neural network processing […]”.
Claim 9 recites the limitation “retrieving the stored output from memory […]” in line 4 of the claim. Claim 19 recites a very similar limitation. There is insufficient antecedent basis for this limitation in the claim. In particular, there is insufficient antecedent basis for the term “the stored output from memory”. For purposes of examination, the examiner interprets the limitation as though it said “retrieving a stored output from memory […]”.
Claim 9 recites the limitation “reusing the retrieved output of the neural network processing […]” in line 5 of the claim. Claim 19 recites a very similar limitation. There is insufficient antecedent basis for this limitation in the claim. In particular, there is insufficient antecedent basis for the term “the retrieved output of the neural network processing”. For purposes of examination, the examiner interprets the limitation as though it said “reusing a retrieved output of the neural network processing […]”.
Claim 9 recites the limitation “reusing the retrieved output again when performing neural network processing […]” in line 10 of the claim. Claim 19 recites a very similar limitation. There is insufficient antecedent basis for this limitation in the claim. In particular, there is insufficient antecedent basis for the term “the retrieved output”. For purposes of examination, the examiner interprets the limitation as though it said “reusing a retrieved output again when performing neural network processing […]”.
Regarding Claim 10
Claim 10 recites the limitation "comparing an output for a layer of the neural network processing when processing the perturbed version of the input data array to the stored result of the processing of that layer when processing the input data array without the perturbation" in line 18 of the claim. There is insufficient antecedent basis for this limitation in the claim. In particular, there is insufficient antecedent basis for the term “the stored result of the processing of that layer”. For purposes of examination, the examiner interprets the limitation as though it said "comparing an output for a layer of the neural network processing when processing the perturbed version of the input data array to a stored result of the processing of that layer when processing the input data array without the perturbation".
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 11-13, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Fong et al., “Interpretable Explanations of Black Boxes by Meaningful Perturbation”, herein referred to as Fong, in view of Xu et al., “Accelerating Convolutional Neural Networks for Continuous Mobile Vision via Cache Reuse”, herein referred to as Xu.
Regarding Claim 1
Fong teaches:
A method of performing neural network processing in a data processing system, […], the method comprising: for an input data array to be processed by a neural network, subjecting the input data array to neural network processing to generate a result of the neural network processing for the input data array
(page 3429 column 1 “introduction” paragraph 1) “Given the powerful but often opaque nature of modern black box predictors such as deep neural networks [4,5], there is a considerable interest in explaining and understanding predictors a-posteriori, after they have been learned.”; (page 3430 column 1 section 3 paragraph 1) “A black box is a map f : X → Y[*Examiner notes: neural network processing] from an input space X to an output space Y, typically obtained from an opaque learning process”; (page 3432 column 2 section 4.1 paragraph 1) “The aim of saliency is to identify which regions of an image x0[*Examiner notes: input data array] are used by the black box to produce the output value f(x0)[*Examiner notes: result of neural network processing]”
and applying a perturbation to a part but not all of the input data array to generate a perturbed version of the input data array, the perturbed version of the input data array thereby being made up of a perturbed part that differs from the input data array, and a non-perturbed part that is the same as the input data array
(page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x) changes as x is obtained “deleting” different regions R of x0[*Examiner notes: applying perturbation to part but not all of the input data array].”; [*Examiner notes: x0 is the same as x except for the deleted (perturbed) portion]
and performing the neural network processing using the perturbed version of the input data array to generate a result of the neural network processing for the perturbed version of the input data array
(page 3430 column 1 section 3 paragraph 1) “A black box is a map f : X → Y[*Examiner notes: neural network processing] from an input space X to an output space Y, typically obtained from an opaque learning process”; (page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x) changes[*Examiner notes: performing the neural network processing using the perturbed version] as x is obtained “deleting” different regions R of x0.”
and comparing the result of the neural network processing of the perturbed version of the input data array with the result of the neural network processing of the input data array without the perturbation, to determine whether the perturbation of the input data array has an effect on the result of the neural network processing of the perturbed version of the input data array relative to the result of the neural network processing of the input data array without the perturbation.
(page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x)[*Examiner notes: processing of perturbed version] changes as x is obtained “deleting” different regions R of x0. For example, if f(x0) = +1 denotes a robin image, we expect that f(x) = +1[*Examiner notes: determine whether perturbation affects result] as well unless the choice of R deletes the robin from the image. Given that x is a perturbation of x0, this is a local explanation (sec. 3.2) and we expect the explanation to characterize the relationship between f and x0.”
Fong does not teach:
the data processing system comprising a processor operable to execute a neural network, and operable to store data relating to the neural network processing being performed by the processor to memory
wherein performing the neural network processing for the perturbed version of the input data array comprises: subjecting only some but not all of the perturbed version of the input data array to neural network processing when performing the neural network processing for the perturbed version of the input data array, based on the part of the input data array to which the perturbation has been applied
However, Xu teaches:
the data processing system comprising a processor operable to execute a neural network, and operable to store data relating to the neural network processing being performed by the processor to memory
(figure 4)
[Xu, Figure 4 reproduced as media_image1.png (greyscale)]
wherein performing the neural network processing for the perturbed version of the input data array comprises: subjecting only some but not all of the perturbed version of the input data array to neural network processing when performing the neural network processing for the perturbed version of the input data array, based on the part of the input data array to which the perturbation has been applied
(page 6 column 1 paragraph 1) “Figure 6 shows an output example of applying our matching algorithm on two consecutively captured images[*Examiner notes: corresponds to perturbed version of input data array]. As observed, the second frame image is different from the first one in two aspects. First, the camera is moving, so the overall background also moves in certain direction. This movement is captured in Step 3 by looking into the movement of each small block and combining them together. Second, the objects in sight are also moving. Those moved objects (regions) should be detected and marked as non-reusable. This detection is achieved in Step 4.”; [*Examiner notes: The second image is perturbed from the first because it was taken immediately after. The second image is similar, but not exactly the same as the first.] (page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices[*Examiner notes: subjecting some but not all of the perturbed version to neural network processing based on the part of the input to which perturbation has been applied], CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural networks and perturbations taught by Fong with the subjecting of only some but not all of the perturbed input to neural network processing taught by Xu because (Xu page 1 abstract) “The results show that CNNCache can accelerate the execution of CNN models by 20.2% on average and up to 47.1% under certain scenarios”.
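For illustration only, the combined teaching relied upon above (Fong's perturbation of part but not all of an image, together with Xu's reuse of cached computations for unchanged regions) can be sketched as follows. The block size, the per-block layer function, and the omission of receptive-field propagation are the examiner's hypothetical simplifications and do not reproduce CNNCache's actual inference engine.

```python
import numpy as np

BLOCK = 4  # hypothetical block size for the reuse granularity

def changed_blocks(original, perturbed):
    """Mark which BLOCK x BLOCK blocks of the image were perturbed."""
    h, w = original.shape
    mask = np.zeros((h // BLOCK, w // BLOCK), dtype=bool)
    for by in range(h // BLOCK):
        for bx in range(w // BLOCK):
            ys, xs = by * BLOCK, bx * BLOCK
            mask[by, bx] = not np.array_equal(
                original[ys:ys + BLOCK, xs:xs + BLOCK],
                perturbed[ys:ys + BLOCK, xs:xs + BLOCK])
    return mask

def process_with_reuse(perturbed, cached_output, mask, layer_fn):
    """Apply layer_fn only to the changed blocks of the perturbed image and
    copy the cached results for all unchanged blocks. (A real engine such
    as CNNCache also shrinks the reusable region by the layer's receptive
    field as it propagates through layers; that is omitted here.)"""
    out = cached_output.copy()
    for by, bx in zip(*np.nonzero(mask)):
        ys, xs = by * BLOCK, bx * BLOCK
        out[ys:ys + BLOCK, xs:xs + BLOCK] = layer_fn(
            perturbed[ys:ys + BLOCK, xs:xs + BLOCK])
    return out
```

With a purely per-element layer function, the result is identical to reprocessing the whole perturbed image, while only the perturbed blocks are actually recomputed, i.e. only some but not all of the perturbed version is subjected to processing.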
Regarding Claim 2
Fong in view of Xu teaches:
The method of claim 1
(see rejection of claim 1)
And Xu further teaches
comprising storing some or all of an output of neural network processing for a layer or layers of the neural network processing when processing the input data array, and reusing the output of the neural network processing for the layer or layers of the neural network processing stored from the processing of the input data array when performing the neural network processing for the perturbed version of the input data array
(page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices, CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Fong with Xu for the same reasons given in claim 1 above.
Regarding Claim 3
Fong in view of Xu teaches:
The method of claim 2,
(see rejection of claim 2)
And Xu further teaches:
comprising reusing the output of the neural network processing for the layer or layers of the neural network processing stored from the processing of the input data array as part of an input for a fully connected layer or layers when performing neural network processing for the perturbed version of the input data array
(page 3 column 1 section III B paragraph 1) “CNNCache is based on a key observation that consecutively captured images often have substantial overlapped (similar) regions. The reason is that mobile devices, e.g., smartphones and head-mounted devices, are in slow motion or even held still especially when users are using these devices for vision tasks such as augmenting reality [42], [20]. Thus, CNNCache tries to cache the intermediate computation results of previous frames[*Examiner notes: output stored from the processing of input data array], and reuse the results of unchanged regions to accelerate the processing of current frame.”; (page 4 column 1 paragraph 1) “In other words, there are still plenty of room (88.5%) to be improved via reusing the cache of layers before fully-connected layer[*Examiner notes: fully connected layer]”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Fong with Xu for the same reasons given in claim 1 above.
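For illustration only, the reuse of stored pre-fully-connected-layer outputs taught by Xu, as applied to claim 3, can be sketched as follows. The feature-map layout, function names, and splicing scheme are the examiner's hypothetical simplifications, not the implementation of either reference.

```python
import numpy as np

def fc_layer(features, weights, bias):
    """A plain fully-connected layer applied to a flattened feature map."""
    return features.ravel() @ weights + bias

def fc_input_with_reuse(stored_features, recomputed_features, perturbed_positions):
    """Build the input for the fully-connected layer by reusing the feature
    map stored from the unperturbed run and overwriting only the positions
    affected by the perturbation with freshly computed values."""
    features = stored_features.copy()
    for pos in perturbed_positions:
        features[pos] = recomputed_features[pos]
    return features

# Hypothetical 2x2 pre-FC feature map: only position (1, 1) was affected
# by the perturbation, so only that value is taken from the new run.
stored = np.arange(4.0).reshape(2, 2)  # [[0, 1], [2, 3]]
recomputed = stored + 10.0             # as if recomputed in full
feats = fc_input_with_reuse(stored, recomputed, [(1, 1)])
print(feats.tolist())  # [[0.0, 1.0], [2.0, 13.0]]
```

The spliced feature map then serves as part of the input for the fully-connected layer, so the stored outputs from the processing of the original input data array are reused when processing the perturbed version.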
Regarding Claim 11
Fong teaches:
subject an input data array to neural network processing to generate a result of the neural network processing for the input data array
(page 3429 column 1 “introduction” paragraph 1) “Given the powerful but often opaque nature of modern black box predictors such as deep neural networks [4,5], there is a considerable interest in explaining and understanding predictors a-posteriori, after they have been learned.”; (page 3430 column 1 section 3 paragraph 1) “A black box is a map f : X → Y[*Examiner notes: neural network processing] from an input space X to an output space Y, typically obtained from an opaque learning process”; (page 3432 column 2 section 4.1 paragraph 1) “The aim of saliency is to identify which regions of an image x0[*Examiner notes: input data array] are used by the black box to produce the output value f(x0)[*Examiner notes: result of neural network processing]”
and to subject a perturbed version of the input data array to the neural network processing to generate a result of the neural network processing for the perturbed version of the input data array
(page 3430 column 1 section 3 paragraph 1) “A black box is a map f : X → Y[*Examiner notes: neural network processing] from an input space X to an output space Y, typically obtained from an opaque learning process”; (page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x) changes[*Examiner notes: performing the neural network processing using the so-perturbed version] as x is obtained “deleting” different regions R of x0.”
the perturbed version of the input data array comprising a version of the input data array in which a perturbation has been applied to a part but not all of the input data array, the perturbed version of the input data array thereby being made up of a perturbed part that differs from the input data array, and a non-perturbed part that is the same as the input data array;
(page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x) changes as x is obtained “deleting” different regions R of x0[*Examiner notes: applying perturbation to part but not all of the input data array].”; [*Examiner notes: x0 is the same as x except for the deleted (perturbed) portion]
the data processing system further comprising: a processing circuit configured to compare the result of the neural network processing of the perturbed version of the input data array with the result of the neural network processing of the input data array without the perturbation, to determine whether the perturbation of the input data array has an effect on the result of the neural network processing of the perturbed version of the input data array relative to the result of the neural network processing of the input data array without the perturbation.
(page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x)[*Examiner notes: processing of perturbed version] changes as x is obtained “deleting” different regions R of x0. For example, if f(x0) = +1 denotes a robin image, we expect that f(x) = +1[*Examiner notes: determine whether perturbation affects result] as well unless the choice of R deletes the robin from the image. Given that x is a perturbation of x0, this is a local explanation (sec. 3.2) and we expect the explanation to characterize the relationship between f and x0.”
Fong does not explicitly teach:
A data processing system, the data processing system comprising: a processor operable to execute a neural network and operable to store data relating to the neural network processing being performed by the processor to memory; the data processing system further comprising a processing circuit configured to cause the processor to:
wherein performing the neural network processing for the perturbed version of the input data array comprises: subjecting only some but not all of the perturbed version of the input data array to neural network processing when performing the neural network processing for the perturbed version of the input data array, based on the part of the input data array to which the perturbation has been applied;
However, Xu teaches:
A data processing system, the data processing system comprising: a processor operable to execute a neural network and operable to store data relating to the neural network processing being performed by the processor to memory; the data processing system further comprising a processing circuit configured to cause the processor to:
(figure 4)
[Xu, Figure 4 reproduced as media_image1.png (greyscale)]
wherein performing the neural network processing for the perturbed version of the input data array comprises: subjecting only some but not all of the perturbed version of the input data array to neural network processing when performing the neural network processing for the perturbed version of the input data array, based on the part of the input data array to which the perturbation has been applied
(page 6 column 1 paragraph 1) “Figure 6 shows an output example of applying our matching algorithm on two consecutively captured images[*Examiner notes: corresponds to perturbed version of input data array]. As observed, the second frame image is different from the first one in two aspects. First, the camera is moving, so the overall background also moves in certain direction. This movement is captured in Step 3 by looking into the movement of each small block and combining them together. Second, the objects in sight are also moving. Those moved objects (regions) should be detected and marked as non-reusable. This detection is achieved in Step 4.”; [*Examiner notes: The second image is perturbed from the first because it was taken immediately after. The second image is similar, but not exactly the same as the first.] (page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices[*Examiner notes: subjecting some but not all of the perturbed version to neural network processing based on the part of the input to which perturbation has been applied], CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural networks and perturbations taught by Fong with the subjecting of only some but not all of the perturbed input to neural network processing taught by Xu because (Xu page 1 abstract) “The results show that CNNCache can accelerate the execution of CNN models by 20.2% on average and up to 47.1% under certain scenarios”.
Regarding Claim 12
Fong in view of Xu teaches:
The system of claim 11
(see claim 11)
And Xu further teaches:
wherein the processing circuit is configured to store some or all of an output of neural network processing for a layer or layers of the neural network processing when processing the input data array, and reuse the output of the neural network processing for the layer or layers of the neural network processing stored from the processing of the input data array when performing the neural network processing for the perturbed version of the input data array
(page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices, CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Fong with Xu for the same reasons given in claim 11 above.
Regarding Claim 13
Fong in view of Xu teaches:
The system of claim 12
(see rejection of claim 12)
And Xu further teaches:
wherein the processing circuit is configured to reuse the output of the neural network processing for the layer or layers of the neural network processing stored from the processing of the input data array as part of an input for a fully connected layer or layers when performing neural network processing for the perturbed version of the input data array
(page 3 column 1 section III B paragraph 1) “CNNCache is based on a key observation that consecutively captured images often have substantial overlapped (similar) regions. The reason is that mobile devices, e.g., smartphones and head-mounted devices, are in slow motion or even held still especially when users are using these devices for vision tasks such as augmenting reality [42], [20]. Thus, CNNCache tries to cache the intermediate computation results of previous frames[*Examiner notes: output stored from the processing of input data array], and reuse the results of unchanged regions to accelerate the processing of current frame.”; (page 4 column 1 paragraph 1) “In other words, there are still plenty of room (88.5%) to be improved via reusing the cache of layers before fully-connected layer[*Examiner notes: fully connected layer]”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Fong with Xu for the same reasons given in claim 11 above.
Regarding Claim 20
Fong teaches:
the method comprising: for an input data array to be processed by a neural network, subjecting the input data array to neural network processing to generate a result of the neural network processing for the input data array
(page 3429 column 1 “introduction” paragraph 1) “Given the powerful but often opaque nature of modern black box predictors such as deep neural networks [4,5], there is a considerable interest in explaining and understanding predictors a-posteriori, after they have been learned.”; (page 3430 column 1 section 3 paragraph 1) “A black box is a map f : X → Y[*Examiner notes: neural network processing] from an input space X to an output space Y, typically obtained from an opaque learning process”; (page 3432 column 2 section 4.1 paragraph 1) “The aim of saliency is to identify which regions of an image x0[*Examiner notes: input data array] are used by the black box to produce the output value f(x0)[*Examiner notes: result of neural network processing]”
and applying a perturbation to a part but not all of the input data array to generate a perturbed version of the input data array, the perturbed version of the input data array thereby being made up of a perturbed part that differs from the input data array, and a non-perturbed part that is the same as the input data array,
(page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x) changes as x is obtained “deleting” different regions R of x0[*Examiner notes: applying perturbation to part but not all of the input data array].”; [*Examiner notes: x0 is the same as x except for the deleted (perturbed) portion]
and performing the neural network processing using the perturbed version of the input data array to generate a result of the neural network processing for the perturbed version of the input data array
(page 3430 column 1 section 3 paragraph 1) “A black box is a map f : X → Y[*Examiner notes: neural network processing] from an input space X to an output space Y, typically obtained from an opaque learning process”; (page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x) changes[*Examiner notes: performing the neural network processing using the so-perturbed version] as x is obtained “deleting” different regions R of x0.”
and comparing the result of the neural network processing of the perturbed version of the input data array with the result of the neural network processing of the input data array without the perturbation, to determine whether the perturbation of the input data array has an effect on the result of the neural network processing of the perturbed version of the input data array relative to the result of the neural network processing of the input data array without the perturbation.
(page 3432 column 2 section 4.1 paragraph 1) “We can do so by observing how the value of f(x)[*Examiner notes: processing of perturbed version] changes as x is obtained “deleting” different regions R of x0. For example, if f(x0) = +1 denotes a robin image, we expect that f(x) = +1[*Examiner notes: determine whether perturbation affects result] as well unless the choice of R deletes the robin from the image. Given that x is a perturbation of x0, this is a local explanation (sec. 3.2) and we expect the explanation to characterize the relationship between f and x0.”
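For illustration only, the perturb-and-compare procedure Fong describes can be sketched in toy code (all names are hypothetical; `f` is a stand-in for the black-box predictor, not Fong's actual model):

```python
import numpy as np

def f(x):
    # Toy stand-in for a black-box classifier f : X -> Y: it outputs +1
    # iff the centre pixel is bright (the "robin" detector).
    return 1 if x[2, 2] > 0.5 else -1

x0 = np.zeros((5, 5))
x0[2, 2] = 1.0                       # the "robin" lives at the centre of x0

def perturb(x, region):
    # "Delete" a rectangular region R of x0 by zeroing it out.
    x = x.copy()
    (r0, r1), (c0, c1) = region
    x[r0:r1, c0:c1] = 0.0
    return x

baseline = f(x0)                     # f(x0): result without the perturbation
for region in [((0, 2), (0, 2)), ((1, 4), (1, 4))]:
    x = perturb(x0, region)
    changed = f(x) != baseline       # did deleting R affect the output?
    print(region, changed)
```

The first region misses the centre pixel, so `f(x) = f(x0)` and the comparison reports no effect; the second region deletes the "robin", so the output flips, mirroring Fong's robin example in the quoted passage.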
Fong does not teach:
A non-transitory computer readable storage medium storing computer software code which when executing on at least one processor performs a method of performing neural network processing in a data processing system, the data processing system comprising a processor operable to execute a neural network, and operable to store data relating to the neural network processing being performed by the processor to memory
wherein performing the neural network processing for the perturbed version of the input data array comprises: subjecting only some but not all of the perturbed version of the input data array to neural network processing when performing the neural network processing for the perturbed version of the input data array, based on the part of the input data array to which the perturbation has been applied
However, Xu teaches:
A non-transitory computer readable storage medium storing computer software code which when executing on at least one processor performs a method of performing neural network processing in a data processing system, the data processing system comprising a processor operable to execute a neural network, and operable to store data relating to the neural network processing being performed by the processor to memory
(figure 4)
wherein performing the neural network processing for the perturbed version of the input data array comprises: subjecting only some but not all of the perturbed version of the input data array to neural network processing when performing the neural network processing for the perturbed version of the input data array, based on the part of the input data array to which the perturbation has been applied
(page 6 column 1 paragraph 1) “Figure 6 shows an output example of applying our matching algorithm on two consecutively captured images[*Examiner notes: corresponds to perturbed version of input data array]. As observed, the second frame image is different from the first one in two aspects. First, the camera is moving, so the overall background also moves in certain direction. This movement is captured in Step 3 by looking into the movement of each small block and combining them together. Second, the objects in sight are also moving. Those moved objects (regions) should be detected and marked as non-reusable. This detection is achieved in Step 4.”; [*Examiner notes: The second image is perturbed from the first because it was taken immediately after. The second image is similar, but not exactly the same as the first.] (page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices[*Examiner notes: subjecting some but not all of the perturbed version to neural network processing based on the part of the input to which perturbation has been applied], CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural networks and perturbations taught by Fong with the subjecting of only some but not all of the perturbed input to neural network processing taught by Xu because (Xu page 1 abstract) “The results show that CNNCache can accelerate the execution of CNN models by 20.2% on average and up to 47.1% under certain scenarios”.
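The kind of partial re-processing relied on here can be illustrated with a minimal sketch (toy code only; CNNCache's actual matching and layer-propagation algorithms are more involved, and `conv_block` is a hypothetical stand-in for per-block network computation):

```python
import numpy as np

BLOCK = 4  # block size used for matching (illustrative value)

def changed_blocks(prev, curr):
    """Return the coordinates of blocks where curr differs from prev
    (a toy stand-in for CNNCache's image-matching step)."""
    h, w = prev.shape
    out = []
    for r in range(0, h, BLOCK):
        for c in range(0, w, BLOCK):
            if not np.array_equal(prev[r:r+BLOCK, c:c+BLOCK],
                                  curr[r:r+BLOCK, c:c+BLOCK]):
                out.append((r, c))
    return out

def conv_block(block):
    # Hypothetical stand-in for the neural network computation on a block.
    return block.sum()

prev = np.zeros((8, 8))
curr = prev.copy()
curr[0:2, 0:2] = 1.0                 # perturb only the top-left region

# Results cached from processing the unperturbed input.
cache = {(r, c): conv_block(prev[r:r+BLOCK, c:c+BLOCK])
         for r in range(0, 8, BLOCK) for c in range(0, 8, BLOCK)}

dirty = changed_blocks(prev, curr)   # only these blocks are re-processed
for (r, c) in dirty:
    cache[(r, c)] = conv_block(curr[r:r+BLOCK, c:c+BLOCK])
print(dirty)
```

Only the one block containing the perturbation is re-processed; the cached results for the unchanged blocks are reused, which is the source of the speed-up cited in the motivation to combine.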
Claims 4 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Fong in view of Xu and further in view of NPL reference Riera Villanueva “Low-Power Accelerators for Cognitive Computing” herein referred to as Riera.
Regarding Claim 4
Fong in view of Xu teaches:
The method of claim 1
(see rejection of claim 1)
And Xu further teaches:
comprising: storing an output of neural network processing for a layer or layers of the neural network processing when processing the input data array
(page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices, CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Fong with Xu for the same reasons given in claim 1 above.
Xu does not explicitly teach:
comparing an output of the neural network processing for a layer of the neural network processing when processing the perturbed version of the input data array to the stored output of the neural network processing of that layer of the neural network processing when processing the input data array;
and determining whether to continue the neural network processing for a part or parts of the perturbed version of the input data array on the basis of the comparison.
However, Riera teaches:
comparing an output of the neural network processing for a layer of the neural network processing when processing the perturbed version of the input data array to the stored output of the neural network processing of that layer of the neural network processing when processing the input data array;
and determining whether to continue the neural network processing for a part or parts of the perturbed version of the input data array on the basis of the comparison.
(page 64 section 4.2.4) “Therefore, RNNs only require extra storage for the inputs/outputs of one layer, whereas MLPs and CNNs require extra storage for all the layers where the computation reuse technique is applied. In other words, temporal locality of the redundant computations is higher in RNNs. Second, the four gates (four FC layers) in one LSTM cell share the same inputs. Hence, we only compare the inputs once with the previous values and, in case an input remains unmodified, computations and memory accesses are avoided in the four gates.”; [*Examiner notes: Comparing inputs with previous values is the same as comparing the outputs of layers because an input is an output of an input layer]
Fong, Xu, Riera, and the instant application are analogous because they are all directed to machine learning.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural networks and perturbations of Fong in view of Xu with the determining whether to continue the neural network processing taught by Riera because (Riera page 64 section 4.2.4) “Hence, we only compare the inputs once with the previous values and, in case an input remains unmodified, computations and memory accesses are avoided in all the gates”. That is, if processing is halted then fewer computations would be performed, leading to better efficiency of the neural network.
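The compare-then-skip scheme relied on in this rejection can be sketched as follows (hypothetical toy code; Riera's technique operates on LSTM gates and hardware caches rather than this generic fully-connected layer):

```python
import numpy as np

stored = {}  # layer outputs cached from the unperturbed run

def layer(name, x, weight):
    """Recompute a layer only if its input differs from the stored run;
    otherwise reuse the stored output (toy version of the reuse scheme)."""
    key = (name, x.tobytes())
    if key in stored:
        return stored[key], True     # input unmodified: computation skipped
    y = x @ weight
    stored[key] = y
    return y, False

w = np.eye(3)
x = np.ones(3)

y1, reused1 = layer("fc1", x, w)     # unperturbed pass: computed and stored
y2, reused2 = layer("fc1", x, w)     # same input again: stored output reused
print(reused1, reused2)
```

When the input to a layer is unchanged between the original and perturbed runs, the comparison succeeds and further computation for that part is avoided, which is the efficiency rationale given above.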
Regarding Claim 14
Fong in view of Xu teaches:
The system of claim 11
(see rejection of claim 11)
And Xu further teaches:
wherein the processing circuit is configured to: store an output of neural network processing for a layer or layers of the neural network processing when processing the input data array;
(page 1 abstract) “To cache and reuse the computations of the similar image regions which are consecutively captured by mobile devices, CNNCache leverages two novel techniques: an image matching algorithm that quickly identifies similar image regions between images, and a cache-aware CNN inference engine that propagates the reusable regions through varied layers and reuses the computation results at layer granularity.”
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to combine Fong with Xu for the same reasons given in claim 11 above.
Fong in view of Xu does not teach:
compare an output of the neural network processing for a layer of the neural network processing when processing the perturbed version of the input data array to the stored output of the neural network processing of that layer of the neural network processing when processing the input data array;
and determine whether to continue the neural network processing for a part or parts of the perturbed version of the input data array on the basis of the comparison.
However, Riera teaches:
compare an output of the neural network processing for a layer of the neural network processing when processing the perturbed version of the input data array to the stored output of the neural network processing of that layer of the neural network processing when processing the input data array;
and determine whether to continue the neural network processing for a part or parts of the perturbed version of the input data array on the basis of the comparison.
(page 64 section 4.2.4) “Therefore, RNNs only require extra storage for the inputs/outputs of one layer, whereas MLPs and CNNs require extra storage for all the layers where the computation reuse technique is applied. In other words, temporal locality of the redundant computations is higher in RNNs. Second, the four gates (four FC layers) in one LSTM cell share the same inputs. Hence, we only compare the inputs once with the previous values and, in case an input remains unmodified, computations and memory accesses are avoided in the four gates.”; [*Examiner notes: Comparing inputs with previous values is the same as comparing the outputs of layers because an input is an output of an input layer]
Fong, Xu, Riera, and the instant application are analogous because they are all directed to machine learning.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural networks and perturbations of Fong in view of Xu with the determining whether to continue the neural network processing taught by Riera because (Riera page 64 section 4.2.4) “Hence, we only compare the inputs once with the previous values and, in case an input remains unmodified, computations and memory accesses are avoided in all the gates”. That is, if processing is halted then fewer computations would be performed, leading to better efficiency of the neural network.
Claims 5 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Fong in view of Xu and Riera, and further in view of foreign patent reference Velic et al. (EP 3744410 A1) herein referred to as Velic.
Regarding Claim 5
Fong in view of Xu in view of Riera teaches:
The method of claim 4
(see rejection of claim 4)
Fong in view of Xu and Riera does not teach:
wherein the layer or layers of the neural network processing for which an output is stored when processing the input data array comprises a pooling layer, and the output of the neural network processing for that pooling layer when processing the perturbed version of the input data array is compared to the stored output of the neural network processing for that pooling layer when processing the input data array
However, Velic teaches:
wherein the layer or layers of the neural network processing for which an output is stored when processing the input data array comprises a pooling layer, and the output of the neural network processing for that pooling layer when processing the perturbed version of the input data array is compared to the stored output of the neural network processing for that pooling layer when processing the input data array
(column 39 line 16 “embodiment 19”) “wherein the recognition system is configured to estimate the one or more additional attributes of the real-world toy object depicted in the captured image by comparing an output of the convolutional stage of the trained convolutional classification model produced by the trained convolutional classification model based on the captured image with one or more of the stored reference representations associated with the predicted object identifier.”; (column 40 line 23 “embodiment 25”) “wherein the convolutional classification model comprises one or more pooling layers”
Fong, Xu, Riera, Velic, and the instant application are analogous because they are all directed to machine learning.
It would have been obvious to a person having ordinary skill in the art before the effective filing date of the present invention to modify the neural networks and perturbations of Fong in view of Xu and Riera with the pooling layer of Velic because (Velic paragraph [0035]) “After obtaining said feature maps with the convolution operation, the resulting images may be subsampled (or "pooled") to reduce the computational requirement for further processing.”
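Why a pooling layer is a natural point for the claimed comparison can be seen in a small sketch (illustrative only; `max_pool2x2` is a hypothetical helper, not code from Velic): a perturbation that does not change the maximum within any pooling window leaves the pooled output, and hence everything downstream of it, unchanged.

```python
import numpy as np

def max_pool2x2(x):
    # 2x2 max pooling with stride 2, written with plain NumPy reshaping.
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x0 = np.array([[9., 1., 0., 0.],
               [1., 1., 0., 0.],
               [0., 0., 5., 0.],
               [0., 0., 0., 5.]])

x = x0.copy()
x[0, 1] = 3.0                        # perturb a non-maximal element

p0 = max_pool2x2(x0)                 # stored output from the original run
p = max_pool2x2(x)                   # output for the perturbed version
print(np.array_equal(p0, p))        # pooling absorbed the perturbation
```

Because the perturbed value (3.0) is still below the window maximum (9.0), the stored and new pooling outputs match, so processing of that part of the perturbed input need not continue.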
Regarding Claim 15
Fong in view of Xu and Riera teaches:
The system of claim 14
(see rejection of claim 14)
Fong in view of Xu and Riera does not teach:
wherein the layer or layers of the neural network processing for which an output is stored when processing the input data array comprises a pooling layer, and the output of the neu