DETAILED ACTION
This nonfinal action is in response to application 17/996,533 filed 10/19/2022, which is the national stage of international application PCT/EP2021/063846 filed 05/25/2021.
Receipt of applicant’s preliminary amendment to the specification, abstract, drawing, and claims filed 12/12/2022 is acknowledged.
Claims 1-13 are cancelled. Claims 14-26 are pending in the application. Claims 14, 25, and 26 are independent claims.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) filed 10/19/2022, 12/12/2022, and 03/13/2025 have been fully considered by the examiner.
The examiner has further noted the presence of a written opinion and preliminary report on patentability for the parent application (PCT/EP2021/063846). A copy of the report has been included in the file wrapper.
Claim Objections
Claims 14, 17-18, and 25-26 are objected to because of the following informalities:
In claims 14, 25, and 26, “an input matrix having input data of the neural network” should read “an input matrix having input data of the convolutional neural network” to have clearer antecedent basis.
In claim 17, “the sum of the element in the line or the column” should read “the sum of the elements in the line or the column” to correct an apparent typographical error.
In claim 18, “the deviation ascertained in the comparison” should read “the deviation yielded in the comparison” to have clearer antecedent basis and maintain consistency with the independent claim.
Appropriate corrections are required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 14-26 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 14, it recites the limitation “comparing each element of the control matrix with a sum of elements corresponding to the element of the control matrix in the output matrices”. The intended interrelation of the recited elements is unclear; e.g., it is unclear whether the claim recites: (1) for each element of the control matrix, summing the elements within the output matrices that correspond to the element of the control matrix, and then comparing the element of the control matrix to the calculated sum; or (2) for each element of the control matrix, comparing it to a sum of some separate group of elements, wherein the sum somehow corresponds to the element of the control matrix, and wherein the element of the control matrix is somehow present in the output matrices. Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
For purposes of examination and as best understood in light of the specification, the limitation is interpreted as: for each element of the control matrix, identifying elements in the output matrices that correspond to the element of the control matrix, calculating a sum of the identified elements, and comparing the element of the control matrix to the calculated sum of the identified elements.
It is further noted by the examiner that in the following limitations “responsive to the comparison yielding a deviation for an element of the control matrix” and “whether an element of at least one output matrix corresponding to the element of the control matrix was correctly calculated”, the claim term “an element of the control matrix” is interpreted as encompassing any one or more elements of the control matrix, and the claim term “an element of at least one output matrix” is interpreted as encompassing any one or more of the identified elements in the output matrices that correspond to the element of the control matrix.
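For purposes of illustration only, the interpretation set forth above can be sketched as follows (a minimal, hypothetical NumPy example prepared for clarity; the matrix sizes, values, and variable names are not drawn from the claims or the cited art):

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid'-mode 2-D convolution (cross-correlation, as used in CNNs)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(0)
inp = rng.standard_normal((6, 6))                    # input matrix (hypothetical)
kernels = [rng.standard_normal((3, 3)) for _ in range(4)]

outputs = [conv2d_valid(inp, k) for k in kernels]    # multiplicity of output matrices
control_kernel = np.sum(kernels, axis=0)             # elementwise sum of the kernels
control = conv2d_valid(inp, control_kernel)          # control matrix

# Interpreted comparison: for each element of the control matrix, sum the
# corresponding (same-position) elements across the output matrices, then
# compare the control-matrix element with that calculated sum.
summed = np.sum(outputs, axis=0)
deviation = control - summed
# Absent calculation errors, the deviation is (numerically) zero everywhere,
# by linearity of the convolution.
assert np.allclose(deviation, 0.0)
```

Under this interpretation, a nonzero deviation at a given position indicates a potential calculation error in at least one of the corresponding output-matrix elements.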
Regarding claim 15, it recites the limitation “wherein, in the convolving with at least one of the convolution kernels”, which appears to be an intended reference to a limitation of the parent claim. However, parent claim 14 recites “convolving, by the acceleration module, an input matrix having input data of the neural network with a plurality of convolution kernels”, which appears to be different in scope than the recitation in the dependent claim. It is therefore unclear if the limitation is intending to simply refer to the parent claim, or is further specifying at least one convolution kernel within the plurality of convolution kernels. Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
For purposes of examination, the limitation is interpreted as: wherein, for at least one convolution kernel of the plurality of convolution kernels, the convolving further comprises [a bias value corresponding to the at least one convolution kernel being added].
Regarding claim 19, it recites the limitation “wherein elements of all of the output matrices corresponding to the element of the control matrix being checked as to whether they were correctly calculated”. Similarly to claim 15, it is unclear if this recitation is an intended reference to a limitation of the parent claim, which appears to be different in scope, or is further specifying an additional limitation with respect to the “checking” procedure. It is thus further unclear what elements are encompassed by “elements of all of the output matrices”. Consequently, one of ordinary skill in the art would not be reasonably apprised of the scope of the invention.
For purposes of examination, the limitation is interpreted as: further comprising checking all of the identified elements corresponding to the element of the control matrix as to whether they were correctly calculated.
Regarding claims 25 and 26, they have substantially similar deficiencies to those found in claim 14 above. Consequently, they are rejected for the same reasons and are likewise interpreted as detailed above.
Regarding claims 16-18 and 20-24, they inherit the deficiencies of their parent claim. Consequently, they are also rejected under 35 U.S.C. 112(b) as being indefinite for depending on an indefinite parent claim.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 14-21 and 23-26 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).
Independent Claims (Claim 14, Claim 25, Claim 26):
Step 1: Claim 14 is drawn to a method, claim 25 is drawn to a product, and claim 26 is drawn to an apparatus. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).
Step 2A Prong 1: Claims 14, 25, and 26 each recite a judicially recognized exception of an abstract idea.
Claim 14 recites, inter alia:
calculate a convolution of an input matrix with a convolution kernel, output a result of the convolving as a two-dimensional output matrix – These limitations amount to using mathematical operations (e.g., calculating dot product (a convolution) of matrices (input matrix and convolution kernel)) to determine a variable (output a result of the convolving), and therefore recite mathematical calculation.
convolving an input matrix with a plurality of convolution kernels, so that a multiplicity of two-dimensional output matrices results; summing the convolution kernels elementwise to form a control kernel; convolving the input matrix with the control kernel, so that a two-dimensional control matrix results; – These limitations similarly amount to using a series of mathematical operations (e.g., dot products (convolving an input matrix) and summations (summing the convolution kernels) with respect to matrices) to determine variables (output matrices, control kernel, control matrix), and therefore recite mathematical calculation.
comparing each element of the control matrix with a sum of elements corresponding to the element of the control matrix in the output matrices; responsive to the comparison yielding a deviation for an element of the control matrix, checking with at least one additional control calculation, whether an element of at least one output matrix corresponding to the element of the control matrix was correctly calculated – These limitations amount to a mental step of observing results of the performed calculations, and performing further processes of reasoning (e.g., comparing elements, re-checking and/or performing additional calculations) to draw conclusions with respect to the calculation procedure (e.g., determining whether elements were correctly calculated). They therefore recite a process of evaluation that a human could reasonably perform using pen and paper.
Claims 25 and 26 recite substantially similar abstract idea limitations to those found in claim 14, and therefore recite the same judicial exception.
Step 2A Prong 2: The following additional elements recited in claims 14, 25, and 26 also do not integrate the recited judicial exceptions into a practical application.
Claim 14 additionally recites:
A method for operating a hardware platform for an inference calculation of a convolutional neural network – This limitation does no more than generally link the recited judicial exception to the technological environment of implementing a convolutional neural network (CNN) on a generic hardware platform (e.g., computer).
the hardware platform having at least one acceleration module that is specialized to [calculate a convolution]; [convolving] by the acceleration module; [convolving] by the acceleration module – These limitations amount to an insignificant implementation step with regard to the recited hardware platform, wherein the hardware platform, and its corresponding acceleration module, are merely being invoked as tools to perform the recited abstract procedure of calculation and reasoning. They therefore recite insignificant extra-solution activity.
[calculate a convolution of an input matrix with a convolution kernel] by applying the convolution kernel to various positions within the input matrix – Where the claim recites an abstract procedure of calculation, this limitation does no more than recite an insignificant step with regard to applying said calculation in the given environment of convolutional neural networks, and therefore recites insignificant extra-solution activity.
an input matrix having input data of the neural network – This limitation amounts to steps of merely gathering input data to enable further analysis, and therefore recites insignificant extra-solution activity.
Claim 25 recites substantially similar additional elements to those recited in claim 14, and further recites:
A non-transitory machine-readable data carrier on which is stored a computer program for operating a hardware platform – This limitation amounts to mere instructions to implement an abstract idea on a computer or computer components.
Claim 26 recites substantially similar additional elements to those recited in claim 14, and further recites:
A computer configured to operate a hardware platform – This limitation amounts to mere instructions to implement an abstract idea on a computer or computer components.
Step 2B: The additional elements recited in claims 14, 25, and 26, viewed individually or as an ordered combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.
Claim 14 additionally recites:
A method for operating a hardware platform for an inference calculation of a convolutional neural network – Generally linking the recited judicial exception to the technological environment of implementing a convolutional neural network (CNN) on a generic hardware platform (e.g., computer), without providing anything more, does not provide an inventive concept or significantly more to the recited abstract idea.
the hardware platform having at least one acceleration module that is specialized to [calculate a convolution]; [convolving] by the acceleration module; [convolving] by the acceleration module – Use of specific hardware modules (e.g., GPU, FPGA, ASIC) in accelerating CNN operations is well-understood, routine, and conventional activity (see Zhang et al., “Recent advances in convolutional neural network acceleration” [pages 8-9 Implementation level]), and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
[calculate a convolution of an input matrix with a convolution kernel] by applying the convolution kernel to various positions within the input matrix – The described operations of applying a convolution kernel to positions of the input matrix (e.g., calculating dot product between two matrices) are inherent to the well-understood, routine, and conventional operation of a typical CNN (see Mishra, “Convolutional Neural Networks, Explained” [pages 2-4 Convolution Layer]), and therefore do not provide an inventive concept or significantly more to the recited abstract idea.
an input matrix having input data of the neural network – Receiving data is well-understood, routine, and conventional activity (see MPEP § 2106.05(d); “Receiving or transmitting data over a network”) and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 25 recites substantially similar additional elements to those recited in claim 14, and further recites:
A non-transitory machine-readable data carrier on which is stored a computer program for operating a hardware platform – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.
Claim 26 recites substantially similar additional elements to those recited in claim 14, and further recites:
A computer configured to operate a hardware platform – Mere instructions to implement an abstract idea on a computer or computer components do not provide an inventive concept or significantly more to the recited abstract idea.
Even when considered as an ordered combination, the additional elements recited in the claims ultimately do no more than place the claims in the context of applying an abstract procedure of calculation and reasoning via a convolutional neural network implemented on a generic hardware platform. As such, claims 14, 25, and 26 are not patent eligible.
Dependent Claims (Claims 15-21, Claims 23-24):
Dependent claims 15-21 and 23-24 narrow the scope of independent claim 14, and thereby merely narrow the recited judicial exceptions. As with the independent claim, the recited judicial exceptions are not meaningfully integrated into a practical application and do not amount to significantly more than the recited abstract ideas themselves. The dependent claims recite abstract idea limitations similar to those recited within the independent claims, providing nothing more than mathematical concepts or mental processes capable of being performed in the human mind and/or using pen and paper. The dependent claims also do not recite any further additional elements that integrate the recited judicial exceptions into a practical application or amount to significantly more than the recited abstract ideas themselves. Consequently, claims 15-21 and 23-24 are also rejected under 35 U.S.C. 101.
Step 1: Claims 15-21 and 23-24 are each drawn to a method. Therefore, each of these claims falls under one of the four categories of statutory subject matter (process/method, machine/apparatus, manufacture/product, or composition of matter).
Step 2A Prong 1: Claims 15-21 and 23-24 each recite a judicially recognized exception of an abstract idea.
Claim 15 recites, inter alia:
wherein, in the convolving with at least one of the convolution kernels, a bias value corresponding to the at least one convolution kernel is added to the elements of the output matrix produced with the at least one convolution kernel, and a sum of all bias values is also added to all elements of the control matrix – This limitation amounts to applying further mathematical operations to determine variables, and therefore recites mathematical calculation.
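For purposes of illustration only, the bias handling recited in claim 15 preserves the checksum identity underlying the comparison of claim 14, as can be sketched in the following hypothetical NumPy example (values and names invented for clarity; not drawn from the claims or the cited art):

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid'-mode 2-D convolution (cross-correlation, as used in CNNs)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(1)
inp = rng.standard_normal((5, 5))
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]
biases = [0.5, -1.0, 2.0]                 # one bias value per convolution kernel

# Each bias is added to all elements of the output matrix produced with
# the corresponding convolution kernel.
outputs = [conv2d_valid(inp, k) + b for k, b in zip(kernels, biases)]

# The sum of all bias values is added to all elements of the control matrix.
control = conv2d_valid(inp, np.sum(kernels, axis=0)) + sum(biases)

# The control matrix still equals the elementwise sum of the output matrices.
assert np.allclose(control, np.sum(outputs, axis=0))
```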
Claim 16 recites, inter alia:
wherein in the checking with the additional control calculation, checking whether a line or a column, containing the element to be checked, of the at least one output matrix was correctly calculated – This limitation amounts to a further mental step of observing results of the performed calculations, and performing further processes of reasoning (e.g., comparing elements, re-checking and/or performing additional calculations) to draw conclusions with respect to the calculation procedure (e.g., determining whether elements were correctly calculated). It therefore recites a process of evaluation that a human could reasonably perform using pen and paper.
Claim 17 recites, inter alia:
in the additional control calculation: the input matrix is expanded with verification elements, the verification elements are convolved, by the acceleration module, with the convolution kernel that corresponds to the at least one output matrix to obtain a control value; – This limitation amounts to applying further mathematical operations to determine variables, and therefore recites mathematical calculation.
a sum of the elements in the line or the column is compared with the control value; and responsive to the comparison of the sum of the element in the line or the column with the control value yielding a deviation, determining that the line or the column was not correctly calculated, and the element to be checked of the output matrix was also not correctly calculated – This limitation amounts to a further mental step of observing results of the performed calculations, and performing further processes of reasoning (e.g., comparing elements, re-checking and/or performing additional calculations) to draw conclusions with respect to the calculation procedure (e.g., determining whether elements were correctly calculated). It therefore recites a process of evaluation that a human could reasonably perform using pen and paper.
Claim 18 recites, inter alia:
wherein in which, in response to the determination that an element of an output matrix was not correctly calculated, the element is corrected by the deviation ascertained in the comparison – This limitation amounts to applying further mathematical operations to determine variables (e.g., the corrected element), and therefore recites mathematical calculation.
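For purposes of illustration only, the correction recited in claim 18 can be sketched as follows (a hypothetical NumPy example; the injected error value, sign convention for the deviation, and the assumption that the faulty element has already been localized, e.g., via the claim-17 procedure, are the undersigned's own and are not drawn from the claims or the cited art):

```python
import numpy as np

def conv2d_valid(x, k):
    """Plain 'valid'-mode 2-D convolution (cross-correlation, as used in CNNs)."""
    kh, kw = k.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

rng = np.random.default_rng(2)
inp = rng.standard_normal((5, 5))
kernels = [rng.standard_normal((3, 3)) for _ in range(3)]
outputs = [conv2d_valid(inp, k) for k in kernels]
control = conv2d_valid(inp, np.sum(kernels, axis=0))   # control matrix

truth = outputs[1][2, 2]
outputs[1][2, 2] += 3.0                  # simulate an incorrectly calculated element

# Deviation ascertained in the claim-14 comparison (sign convention chosen here):
deviation = np.sum(outputs, axis=0)[2, 2] - control[2, 2]

# Assuming the faulty element has been localized, it is corrected by the deviation.
outputs[1][2, 2] -= deviation
assert np.isclose(outputs[1][2, 2], truth)
```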
Claim 19 recites, inter alia:
wherein elements of all of the output matrices corresponding to the element of the control matrix being checked as to whether they were correctly calculated, and, in response to the determination that all of these elements were correctly calculated, determining that the element of the control matrix was not correctly calculated – This limitation amounts to a further mental step of observing results of the performed calculations, and performing further processes of reasoning (e.g., comparing elements, re-checking and/or performing additional calculations) to draw conclusions with respect to the calculation procedure (e.g., determining whether elements were correctly calculated). It therefore recites a process of evaluation that a human could reasonably perform using pen and paper.
Claim 20 recites, inter alia:
wherein when the comparison yields a deviation with regard to at least one hardware component or at least one memory area that can be regarded as the cause of the deviation, an error counter is incremented upward – This limitation amounts to a further mental step of observing results of the performed calculations, and based on analysis of the results, drawing conclusions with respect to the operational state of a computer (and/or its hardware components) on which they were performed. It therefore recites a process of evaluation (e.g., diagnosing and tracking operational issues) that a human could reasonably perform in the mind or using pen and paper.
Claim 21 recites, inter alia:
wherein, in response to a determination that the error counter has exceeded a specified threshold value, the hardware component or the memory area is recognized as defective – This limitation amounts to a further mental step of observing results of the performed calculations, and based on analysis of the results, drawing conclusions with respect to the operational state of a computer (and/or its hardware components) on which they were performed. It therefore recites a process of evaluation (e.g., diagnosing and tracking operational issues) that a human could reasonably perform in the mind or using pen and paper.
Claim 23 recites the same judicial exception as claim 14.
Claim 24 recites the same judicial exception as claim 14.
Step 2A Prong 2: Claims 15-21 do not recite any further additional elements besides those recited in independent claim 14, and the following additional elements recited in claims 23-24 also do not integrate the recited judicial exceptions into a practical application.
Claim 23 additionally recites:
wherein the input data includes optical image data and/or thermal image data and/or video data and/or radar data and/or ultrasonic data and/or lidar data, the input data having been obtained through a physical measurement process and/or through a partial or complete simulation of the physical measurement process, and/or through a partial or complete simulation of a technical system observable with the physical measurement process – This limitation amounts to steps of merely gathering data and specifying a type of data to be manipulated, and therefore recites insignificant extra-solution activity.
Claim 24 additionally recites:
processing the output matrices to form a control signal; and controlling, using the control signal, a vehicle and/or a system for quality control of mass-produced products and/or a system for medical imaging and/or an access control system – This limitation amounts to an insignificant post-solution addition to the claim that does no more than generically apply the results of the performed calculations to the fields of use of vehicles, quality control systems, medical imaging systems, and/or access control systems. It therefore recites insignificant extra-solution activity.
Step 2B: The additional elements recited in claims 23-24, viewed individually or as an ordered combination, do not provide an inventive concept or otherwise amount to significantly more than the recited abstract ideas themselves.
Claim 23 additionally recites:
wherein the input data includes optical image data and/or thermal image data and/or video data and/or radar data and/or ultrasonic data and/or lidar data, the input data having been obtained through a physical measurement process and/or through a partial or complete simulation of the physical measurement process, and/or through a partial or complete simulation of a technical system observable with the physical measurement process – Using CNNs for a wide range of applications (e.g., processing 3D medical images (i.e., optical image data), processing video data) is well-understood, routine, and conventional activity (see Mittal et al., “A survey of accelerator architectures for 3D convolution neural networks” [Abstract and page 1 Introduction]), and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Claim 24 additionally recites:
processing the output matrices to form a control signal; and controlling, using the control signal, a vehicle and/or a system for quality control of mass-produced products and/or a system for medical imaging and/or an access control system – Applying CNNs for processing data to output a control signal (e.g., in autonomous vehicular systems) is well-understood, routine, and conventional activity (see Kuutti et al., “A Survey of Deep Learning Applications to Autonomous Vehicle Control”, [pages 1-2 Introduction]), and therefore does not provide an inventive concept or significantly more to the recited abstract idea.
Even when considered as an ordered combination, the additional elements recited in the claims ultimately do no more than place the claims in the context of processing certain types of data and generically applying conclusions drawn from the recited abstract procedure to existing systems. As such, claims 23-24 also are not patent eligible.
Eligible Claim(s) (Claim 22):
Although claim 22 recites a judicial exception, the additional elements further recited in the claim (“the hardware platform is reconfigured in such a way that, for further calculations, instead of the hardware component recognized as defective, or the memory area recognized as defective, a reserve hardware component or a reserve memory area is used”) adequately reflect an improvement to the functioning of the hardware platform (i.e., underlying computer) itself, as detailed in the specification [page 9 line 20 – page 10 line 28]. The recited error checking procedure can be applied to recognize potentially failing components of the underlying hardware, such that said components can then be replaced with reserve hardware (e.g., as part of a maintenance procedure). As further detailed in the specification, this procedure has particular relevance to implementing neural networks in mobile environments (e.g., automated driving of vehicles [page 1 line 7 – page 2 line 3]), wherein the hardware platform is “specialized” for the given calculation tasks while also being vulnerable to outside sources of error (e.g., background radiation, electrical interference). Therefore, the recited abstract procedure of calculation and reasoning is integrated into a practical application, and claim 22 is found to be patent eligible at Step 2A Prong 2.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 14-19 and 25-26 are rejected under 35 U.S.C. 103 as being unpatentable over Hari et al. (“Making Convolutions Resilient via Algorithm-Based Error Detection Techniques”, published IEEE 2 Mar 2021, cited in IDS dated 10/19/2022), hereinafter Hari, in view of Zhao et al. (“FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural Networks”, published IEEE 31 Dec 2020), hereinafter Zhao.
Regarding claim 14, Hari teaches A method for operating a hardware platform for an inference calculation of a convolutional neural network (“this article, we focus on algorithmically verifying convolutions, the most resource-demanding operations in CNNs. We use checksums to verify convolutions. We identify the feasibility and performance related challenges that arise in algorithmically detecting errors in convolutions in optimized CNN inference deployment platforms (e.g., TensorFlow or TensorRT on GPUs) that fuse multiple network layers and use reduced-precision operations, and demonstrate how to overcome them” [Hari Abstract]), the hardware platform having at least one acceleration module that is specialized to calculate a convolution of an input matrix with a convolution kernel by applying the convolution kernel to various positions within the input matrix, and to output a result of the convolving as a two-dimensional output matrix (“Recent advancement in the ability of Convolutional Neural Networks (CNNs) to accurately process real-time telemetry has boosted their use in safety-critical and high-performance computing (HPC) systems…Processors deployed in safety-critical and HPC systems employ ECC and/or parity in major SRAM structures. However, this protection is typically not sufficient to meet the requirements of ISO 26262 for all hardware error sources. For intermittent and permanent faults, non-storage elements contribute significantly towards the total error rates in GPUs and DNN-accelerators that dedicate significant chip area to logic [6]. 
As CNNs are poised to dominate the runtimes of many safety-critical and HPC systems, the goal of this paper is to develop a low-cost CNN-specific resilience solution that allows the full system to meet the target markets’ resilience requirements without full duplication and is easy to implement…Over 90 percent of the computation during CNN inference is in convolutions [24]” [Hari pages 1-2 Introduction]; “Input fmaps are represented as a 4-D tensor in most CNNs. Each fmap is a 2-D tensor with height (H) and width (W)…Each output fmap value is produced by performing a dot product between a filter and a same-sized portion of the input fmap’s tensor. An example is shown in the highlighted cells in Fig. 1, along with the formula to compute each of the output fmap values” [Hari page 3 The Convolution Operation]; see Fig. 1 including Filters (F), Input feature maps (I), and Output feature maps (O) – “Fig. 1. A typical convolution operation used by most CNNs” [Hari page 3]; GPU and DNN accelerators dedicate processing power to convolution, wherein convolution follows the typical procedure of convolution of applying a filter tensor (i.e., convolution kernel) to portions of an input feature map (i.e., input matrix) to produce output feature maps (i.e., output matrix)) , the method comprising the following steps:
convolving, by the acceleration module, an input matrix having input data of the neural network with a plurality of convolution kernels, so that a multiplicity of two-dimensional output matrices results ([Hari pages 1-2 Introduction] and [Hari page 3 The Convolution Operation] and Fig. 1 [Hari page 3] as detailed above; The convolution operation is performed at the 2-D tensor (i.e., matrix) level, and fmaps are stacked and batched respectively to form 3-D and 4-D tensors)
summing the convolution kernels elementwise to form a control kernel; (“Verifying every output value of a convolution might require duplicating the entire operation. Instead, the focus of this work is to verify just the reduced output, i.e., sum of all the output elements…Based on this key insight, we explore the following three schemes to verify a convolution, which are summarized in Fig. 2” [Hari page 3 Convolution ABED Approach]; “In this scheme, a 3-D filter checksum tensor is computed by performing an element-wise sum (using sum as a checksum function) across all the 3-D filter tensors (1 in Fig. 2a)” [Hari page 3 Filter Checksum-Based (FC)]; see Fig. 2 - (a) Filter Checksum-based detection (FC) including (1) Filter checksum generation
[image: filter checksum generation expression from Hari Fig. 2(a)]
[Hari page 4]; The filter tensors (i.e., convolution kernel) are summed elementwise to form a filter checksum tensor
[image: filter checksum tensor symbol]
(i.e., control kernel))
convolving, by the acceleration module, the input matrix with the control kernel, so that a two-dimensional control matrix results; (“This new checksum filter is convolved with the input maps to compute an extra output fmap, which is used to verify the original fmaps’ values” [Hari page 3 Filter Checksum-Based (FC)]; see Fig. 2 - (a) Filter Checksum-based detection (FC) including extra output maps (shown in purple) of dimension P x Q (to which output fmaps (O) are then compared); The convolution of the checksum filter (i.e., control kernel) with input fmaps (i.e., matrices) produces additional output fmaps (i.e., control matrix))
comparing each element of the control matrix with a sum of elements corresponding to the element of the control matrix in the output matrices (“The original output fmaps’ values are reduced across the channel dimension to generate a reduced fmap, which is compared element-wise for equality with the extra output fmap for verification” [Hari pages 3-4 Filter Checksum-Based (FC)]; see Fig. 2 - (a) Filter Checksum-based detection (FC) including Element-wise compare of extra output maps
[image: extra output fmap symbol]
to the channel-wise reduction (i.e., sum) of the original output fmaps
[image: channel-wise reduction symbol]
and (3) Output verification
[image: output verification expression]
[Hari page 4])
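As an illustration of the filter-checksum relation mapped above, note that it rests on the linearity of convolution: convolving the input once with the elementwise sum of the kernels equals the channel-wise reduction (elementwise sum) of the individual output matrices. The following Python/NumPy sketch is examiner-supplied and purely illustrative; all names and dimensions are arbitrary assumptions, not drawn from Hari or the claims:

```python
import numpy as np

def convolve2d(inp, kernel):
    """Valid-mode 2-D convolution: a dot product of the kernel with each
    same-sized window of the input, as in Hari's Fig. 1."""
    kh, kw = kernel.shape
    oh, ow = inp.shape[0] - kh + 1, inp.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(inp[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
inp = rng.standard_normal((6, 6))             # input matrix (one input fmap)
kernels = rng.standard_normal((4, 3, 3))      # plurality of convolution kernels

control_kernel = kernels.sum(axis=0)          # elementwise sum of the kernels
control = convolve2d(inp, control_kernel)     # one extra convolution -> control matrix
reduced = sum(convolve2d(inp, k) for k in kernels)  # channel-wise reduction of outputs
ok = np.allclose(control, reduced)            # elementwise compare for verification
```

In a fault-free run the comparison holds exactly up to floating-point rounding; a single soft error in any output matrix perturbs the corresponding element of the reduction and is exposed by the elementwise compare.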
However, Hari does not expressly teach responsive to the comparison yielding a deviation for an element of the control matrix, checking with at least one additional control calculation, whether an element of at least one output matrix corresponding to the element of the control matrix was correctly calculated.
In the same field of endeavor, Zhao teaches a means of using checksum techniques to algorithmically detect errors in inference calculations of a CNN executed on accelerator hardware (“Traditional fault tolerance methods are not suitable for CNN inference because error-correcting code is unable to protect computational components, instruction duplication techniques incur high overhead, and existing algorithm-based fault tolerance (ABFT) techniques cannot protect all convolution implementations. In this article, we focus on how to protect the CNN inference process against soft errors as efficiently as possible, with the following three contributions. (1) We propose several systematic ABFT schemes based on checksum techniques and analyze their fault protection ability and runtime thoroughly” [Zhao Abstract]) that responsive to a comparison yielding a deviation for an element of an output checksum matrix (see Table 1 – “Notations and Symbols Used in This Paper” [Zhao page 2]; “At the block level, the convolution operation is similar to matrix-matrix multiplication. The element (i, j) of O is calculated by using the ith element of D and the jth element of W“ [Zhao page 3 Preliminary Analysis-Convolution]; “Soft error protection includes error detection and error correction. Error detection means that the scheme can detect soft errors without knowing the exact location. Error correction means that the scheme can locate the soft error locations and recover the incorrect result” [Zhao page 5 Fault Model]; “Specifically, Co5, Co6, and Co7 are the output checksums we will use in this scheme. Similar to Co1, using the distributive property can get three equations between the output checksums and output as follows
[image: Zhao equations relating output checksums Co5, Co6, Co7 to the output]
So5, So6, and So7 are defined as the output summations corresponding to Co5, Co6, and Co7…Using the output checksums, we can get the following.
[image: Zhao equations relating output checksums to output summations So5, So6, So7]
The location i, j can be obtained by
[image: Zhao equation for the error row location i]
and
[image: Zhao equation for the error column location j]
. Then the soft error can be fixed by adding
[image: correction term symbol]
to Oij. If only soft error detection is required, we do not need to compute Co6 and Co7, thus reducing the number of computations. Input checksums regarding Cd1, Cd2 and Cw1, Cw2, however, are still required for soft error detection. We denote such a detection scheme by CoC-D” [Zhao pages 4-5 Checksum-of-Checksum Scheme (CoC/Coc-D)]; “To achieve the highest protection ability and lowest overhead, we propose a multischeme workflow by integrating the four schemes, as shown in Fig. 7. The workflow is made up of two modules: error detection and error correction. In our designed workflow, we use CoC-D to detect errors because it has the lowest overhead. For the error correction, we put CoC in the beginning because it is the most lightweight method…The error detection modules will be executed for every execution whether there is a soft error or not…The error correction module will not be executed until some soft errors are detected” [Zhao page 7 Multischeme Workflow for Soft Error Protection]; In the disclosed workflow, to reduce computational overhead, steps required for error correction (including determining exact error location) are only executed in response to error detection, wherein error detection module CoC-D comprises comparing output checksum matrix Co5 to summation of output matrices So5 to determine deviation
[image: deviation term symbol]
), check[s] with at least one additional control calculation, whether an element of at least one output matrix corresponding to the element of the output checksum matrix was correctly calculated (see [Zhao pages 4-5 Checksum-of-Checksum Scheme (CoC/CoC-D)] and [Zhao page 7 Multischeme Workflow for Soft Error Protection] as detailed above; “Fig. 2 demonstrates the protection ability of the CoC scheme when soft errors strike the input or output data. As shown in Fig. 2a, multiple soft errors can be detected by using only Co5. A single soft error in O can be corrected by CoC using all checksums including Co5, Co6, and Co7, as shown in Fig. 2b” [Zhao page 5 Soft Error Protection Ability of CoC Scheme]; see Fig. 2 – (b) CoC Error Detection [Zhao page 5]; see Fig. 7 including CoC-D – Error is detected –> CoC – “Fig. 7. Multischeme workflow designed to detect/correct soft errors” [Zhao page 7]; Responsive to an error being detected, the CoC module serves as an additional calculation to further check and determine the location of the detected error in the output matrix).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated responsive to a comparison yielding a deviation for an element of an output checksum matrix, check[s] with at least one additional control calculation, whether an element of at least one output matrix corresponding to the element of the output checksum matrix was correctly calculated as taught by Zhao into Hari because they are both directed towards using checksum techniques to algorithmically detect errors in inference calculations of a CNN executed on accelerator hardware. Given that Hari already uses a substantially similar checksum comparison technique to Zhao for detecting errors in convolutions, and emphasizes a focus on error detection over error localization and correction due to the added computational overhead (“Lastly, processing the output matrix twice to generate both the row and column checksums for error correction capability can also introduce high overheads. Optimized implementations that process the output just once will reduce this overhead. By focusing on error detection alone, ABED significantly speeds up this step by generating a single checksum” [Hari page 11 Overhead Analysis of a Traditional ABFT Technique]), a person of ordinary skill in the art would recognize the value of incorporating the teachings of Zhao to thereby enable error localization and correction, while also continuing the objective of limiting computational overhead by only introducing additional checksum schemes once errors are actually detected ([Zhao page 7 Multischeme Workflow for Soft Error Protection] as detailed above) via, e.g., the low-cost ABED checksum technique of Hari, and also by starting with the most lightweight error correction checksum schemes first (“For the error correction, we put CoC in the beginning because it is the most lightweight method.
By comparison, FC has highest correction ability but also highest time overhead, so we put it at the end of the workflow” [Zhao page 7 Multischeme Workflow for Soft Error Protection]).
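The detect-cheaply-then-locate-and-correct structure of the cited multischeme workflow can be illustrated with a simplified Python/NumPy sketch. For brevity, the reference checksums below are precomputed from a trusted copy of the output rather than derived from input checksums as in Zhao; the names, sizes, and injected error are all examiner-supplied assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
O_ref = rng.standard_normal((4, 4))   # fault-free output (stand-in for checksum-predicted values)
total_ck = O_ref.sum()                # single scalar checksum: cheap detection (CoC-D-style)
row_ck = O_ref.sum(axis=1)            # heavier row/column checksums; in the workflow these
col_ck = O_ref.sum(axis=0)            # are only consulted after an error is detected

O = O_ref.copy()
O[2, 1] += 0.75                       # inject a single soft error

# Error detection runs on every execution...
if not np.isclose(O.sum(), total_ck):
    # ...but the heavier localization/correction path runs only after detection.
    i = int(np.argmax(np.abs(O.sum(axis=1) - row_ck)))   # deviating row -> error row
    j = int(np.argmax(np.abs(O.sum(axis=0) - col_ck)))   # deviating column -> error column
    O[i, j] += total_ck - O.sum()     # correct by adding back the deviation

corrected = np.allclose(O, O_ref)
```

The single-fault assumption shared by the references is what makes the row/column intersection unambiguous here.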
Regarding claim 15, the combination of Hari and Zhao teaches the limitations of parent claim 14, and Hari further teaches wherein, in the convolving with at least one of the convolution kernels, a bias value corresponding to the at least one convolution kernel is added to the elements of the output matrix produced with the at least one convolution kernel, and a sum of all bias values is also added to all elements of the control matrix (“Convolution, bias, and activation operations are typically fused together for performance. Such fused operations perform O = activation(conv(x) + bias). For int8 convolutions, I and F use int8, and O uses either int8 or fp32. Fig. 4 explains the logical flow of computation within such fused kernels. For int8 convolutions, the output of the convolution operation is an int32 result (ConvOut in the figure)…Bias is added to ScaledOut” [Hari page 6 Kernel Modifications]; see Fig. 4 including Fused convolution, bias, and activation kernel – “Fig. 4. The logical computation flow in a fused convolution, bias, and activation kernel” [Hari page 6]).
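The bias handling mapped above also follows from linearity: if each output matrix receives its own bias, the control matrix must receive the sum of all biases for the elementwise comparison to remain valid. An illustrative Python/NumPy sketch (examiner-supplied; names and sizes are assumptions, not from Hari):

```python
import numpy as np

def convolve2d(inp, kernel):
    """Valid-mode 2-D convolution."""
    kh, kw = kernel.shape
    oh, ow = inp.shape[0] - kh + 1, inp.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(inp[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
inp = rng.standard_normal((6, 6))
kernels = rng.standard_normal((3, 3, 3))
biases = rng.standard_normal(3)

# Per-kernel bias added to each output matrix (fused conv + bias, per Hari Fig. 4).
outputs = [convolve2d(inp, k) + b for k, b in zip(kernels, biases)]
# Adding the sum of all biases to the control matrix preserves the checksum relation.
control = convolve2d(inp, kernels.sum(axis=0)) + biases.sum()
ok = np.allclose(control, sum(outputs))
```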
Regarding claim 16, the combination of Hari and Zhao teaches the limitations of parent claim 14, and Zhao further teaches wherein, in the checking with the additional control calculation, checking whether a line or a column, containing the element to be checked, of the at least one output matrix was correctly calculated (see Fig. 7 including CoC-D – Error is detected –> CoC – Unable to correct error –> RC/CIC Controller [Zhao page 7]; see implementation of Row Checksum Scheme (RC) and Column Checksum Scheme (CIC) [Zhao page 4, sects. 3.4 and 3.5]; If, e.g., the CoC module is unsuccessful, the additional calculations may further comprise additional row (i.e., line) and column checksum schemes RC and CIC, that compare checksums to summations on a row-by-row or column-by-column basis to detect the location of errors in the output matrix).
Regarding claim 17, the combination of Hari and Zhao teaches the limitations of parent claim 16, and Zhao further teaches in the additional control calculation:
the input matrix is expanded with verification elements; (“Since the row checksum scheme and column checksum scheme are symmetric with each other, we discuss them together in this section. As shown in Fig. 4a, the row checksum scheme can detect and correct soft errors if they are in the same row” [Zhao page 5 Soft Error Protection Ability of Row Checksum Scheme and Column Checksum Scheme]; see Fig. 4 – (a) Row Checksum Scheme including additional checksum elements Cd1 and Cd2 (shown in yellow) added to D1, D2,…Dn (i.e., input matrix) [Zhao page 6])
the verification elements are convolved, by the acceleration module, with the convolution kernel that corresponds to the at least one output matrix to obtain a control value; (see Fig 4 – (a) Row Checksum Scheme including D1,D2..Dn,Cd1,Cd2 convolved with W1, W2,…,Wn (i.e., convolution kernel) to obtain output O [Zhao page 6])
a sum of the elements in the line or the column is compared with the control value; (see implementation of Row Checksum Scheme (RC) [Zhao page 4]) and
responsive to the comparison of the sum of the element in the line or the column with the control value yielding a deviation, determining that the line or the column was not correctly calculated, and the element to be checked of the output matrix was also not correctly calculated (see implementation of Row Checksum Scheme (RC) [Zhao page 4]).
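The input-expansion step mapped above can be illustrated in the block-level matrix-multiply view that Zhao invokes (“the convolution operation is similar to matrix-matrix multiplication”): appending a checksum row to the input produces, in the same pass, control values equal to the column sums of the output. The sketch below is examiner-supplied and simplified; it is not Zhao’s exact RC/CIC implementation, and all names and sizes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
D = rng.standard_normal((5, 3))   # input, in the im2col/block (matrix-multiply) view
W = rng.standard_normal((3, 4))   # kernel, in the same view

D_ext = np.vstack([D, D.sum(axis=0)])  # expand the input with a verification (checksum) row
O_ext = D_ext @ W                      # one pass yields the output plus control values
O, control = O_ext[:-1], O_ext[-1]     # control row equals the column sums of O

O[3, 2] += 1.0                         # inject an error into one output element
col_dev = O.sum(axis=0) - control      # compare column sums with the control values
```

Only the column containing the corrupted element deviates, so every element in that column (including the element to be checked) is flagged as suspect, mirroring the claimed line/column-level check.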
Regarding claim 18, the combination of Hari and Zhao teaches the limitations of parent claim 14, and Zhao further teaches wherein in which, in response to the determination that an element of an output matrix was not correctly calculated, the element is corrected by the deviation ascertained in the comparison (“Then the soft error can be fixed by adding
[image: correction term symbol]
to Oij” [Zhao page 6 Checksum-of-Checksum Scheme (CoC/CoC-D)]).
Regarding claim 19, the combination of Hari and Zhao teaches the limitations of parent claim 14, and Zhao further teaches wherein elements of all of the output matrices corresponding to the element of the control matrix being checked as to whether they were correctly calculated, and, in response to the determination that all of these elements were correctly calculated, determining that the element of the control matrix was not correctly calculated (“Fig. 3 illustrates the protection ability of the CoC scheme when soft errors happen inside the checksums. Such soft errors can cause inconsistency among the output checksums of CoC, which can be used for error detection. For example, in Fig. 3a, Cd1 is corrupted, leading to corrupted Co5 and Co6 with correct Co7. We can detect this abnormal pattern when comparing checksums with the summation of O to detect the input checksum corruption. The input D, W, and output O are clean and without soft errors since fault frequency is at most once per convolution. Thus, we can safely discard all the checksums and finish this convolution computation” [Zhao page 5 Soft Error Protection Ability of CoC Scheme]; Besides detecting soft errors within the elements of output matrices, the CoC scheme may detect errors that happen within the calculation of the output checksums (including, e.g., the control matrix) themselves, and given the assumption that only one fault may occur per convolutional layer (see [Zhao page 5 Fault Model] – note this assumption is synonymous with the assumption set forth in the instant specification [page 6 lines 15-26]), it logically follows that the elements of the output matrix themselves are error-free (i.e., correctly calculated)).
Regarding claim 25, it is a product claim that substantially corresponds to the method of claim 14, which is already taught by the combination of Hari and Zhao as detailed above. Hari further teaches A non-transitory machine-readable data carrier on which is stored a computer program for operating a hardware platform that performs the claimed functions (“Runtime Overhead Evaluation. We experimentally evaluate the runtime of convolutions by creating a cuDNN-based workload that sets up, initializes, and runs convolutions in a loop. We compile this workload using CUDA 10 and use cuDNN 7.3 on both a Jetson AGX Xavier system and an x86-based desktop with a V100-based GPU (Titan V) [29], [31]” [Hari page 7 Overhead Evaluation]). Consequently, claim 25 is rejected for the same reasons as claim 14.
Regarding claim 26, it is an apparatus claim that substantially corresponds to the method of claim 14, which is already taught by the combination of Hari and Zhao as detailed above. Hari further teaches A computer configured to operate a hardware platform that performs the claimed functions ([Hari page 7 Overhead Evaluation] as detailed in claim 25 above). Consequently, claim 26 is rejected for the same reasons as claim 14.
Claims 20-22 are rejected under 35 U.S.C. 103 as being unpatentable over the combination of Hari and Zhao, as applied to claim 14 above, further in view of Xu et al. (“Safety Design of a Convolutional Neural Network Accelerator with Error Localization and Correction”, available IEEE Xplore 17 Feb 2020), hereinafter Xu.
Regarding claim 20, the combination of Hari and Zhao teaches the limitations of parent claim 14.
However, the combination does not expressly teach wherein when the comparison yields a deviation with regard to at least one hardware component or at least one memory area that can be regarded as the cause of the deviation, an error counter is incremented upward.
In the same field of endeavor, Xu teaches a means of using checksum techniques to algorithmically detect errors in inference calculations of a CNN executed on accelerator hardware (“[10] extended ABFT and proposed Algorithm Based Error Checker (ABEC) for a WS CNN accelerator to identify errors of an Atomic operation for all the PEs during run-time and Algorithm Based Cluster Checker (ABCC) to isolate the errors to a PE cluster….In this paper, we proposed two design techniques for low latency error detection and error correction with enhanced error localization capability and minimal power and area overhead. Specifically, this paper makes four contributio