Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
The present application was filed on 05/02/2023. Claims 1-15 are pending and have been examined. Claims 1, 14, and 15 are the independent claims.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d). Applicant’s claim for the benefit of a prior-filed application under 35 U.S.C. 120, 121, 365(c), or 386(c) is also acknowledged. The present application is a continuation of international patent application No. PCT/KR2021/015706, filed on 11/02/2021, which claims foreign priority to Korean patent application No. KR10-2020-0147081, filed on 11/05/2020.
The examiner acknowledges that a certified copy of Korean patent application No. KR10-2020-0147081 has been retrieved (on 5/23/2023, in Korean), as required by 37 CFR 1.55. The examiner notes that an English translation of Korean patent application No. KR10-2020-0147081 does not appear to have been furnished to date.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 05/02/2023 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) are:
Claim 1:
an input feature map transformer configured to transform an input feature map (IFM) to a Winograd domain.
a weight kernel transformer configured to transform a weight kernel to the Winograd domain.
an inverse transformer configured to perform an inverse Winograd transform on a result output according to the collected MAC operation results from the computation data processor to thereby generate an output feature map (OFM) useable for performing a convolution operation.
Claim 2:
the transformation data processor is configured to: map the plurality of feature value groups and the plurality of weight value groups to the at least one type of MAC unit from among the first type MAC unit, the second type MAC unit, and the third type MAC unit included in the plurality of types of MAC units.
Claim 3:
wherein the transformation data processor is configured to map, based on a pre-generated mapping table, the plurality of feature value groups and the plurality of weight value groups to the at least one type of MAC unit from among the first type MAC unit, the second type MAC unit, and the third type MAC unit included in the plurality of types of MAC units.
Claim 4:
an accumulator configured to accumulate and add outputs respectively from the plurality of multiplier units.
a first shifter configured to receive a first fixed-point number and perform a right shift operation.
a second shifter configured to receive a second fixed-point number and perform a right shift operation.
a multiplier configured to receive an output of the first shifter and an output of the second shifter and perform a multiplication operation between the output of the first shifter and the output of the second shifter.
a restoration shifter configured to receive an output of the multiplier and restore a bit length by performing a left shift operation on the output of the multiplier.
Claim 5:
wherein the first shifter included in each of the plurality of multiplier units is configured to: when a bit length of the first fixed-point number exceeds a preset first bit length, reduce the bit length of the first fixed-point number by shifting the first fixed-point number to the right by a bit length in excess of the preset first bit length.
the second shifter included in each of the plurality of multiplier units is configured to: when a bit length of the second fixed-point number exceeds a preset second bit length, reduce the bit length of the second fixed-point number by shifting the second fixed-point number to the right by a bit length in excess of the preset second bit length.
the restoration shifter included in each of the plurality of multiplier units is configured to: restore the bit length of the first fixed-point number by shifting the output of the multiplier to the left by a sum of the bit length of the first fixed-point number by which the first shifter shifts the first fixed-point number to the right and the bit length of the second fixed-point number by which the second shifter shifts the second fixed-point number to the right.
Claim 7:
an accumulator configured to accumulate and add outputs respectively from the plurality of multiplier units.
a second restoration shifter configured to receive an output of the accumulator and restore a bit length by performing a left shift operation on the output of the accumulator.
a first shifter configured to receive a first fixed-point number and perform a right shift operation.
a multiplier configured to receive an output of the first shifter and an output of the second shifter and perform a multiplication operation between the output of the first shifter and the output of the second shifter.
a first restoration shifter configured to receive an output of the multiplier and increase a bit length by performing a left shift operation on the output of the multiplier.
Claim 8:
wherein the first shifter included in each of the plurality of multiplier units is configured to: reduce a bit length of the first fixed-point number by shifting the first fixed-point number to the right by a preset first bit length.
the second shifter included in each of the plurality of multiplier units is configured to: when a bit length of the second fixed-point number exceeds a preset second bit length, reduce the bit length of the second fixed-point number by shifting the second fixed-point number to the right by a bit length in excess of the preset second bit length.
the first restoration shifter included in each of the plurality of multiplier units is configured to: increase the bit length of the second fixed-point number by shifting the output of the multiplier to the left by the bit length of the second fixed-point number by which the second shifter shifts the second fixed-point number to the right.
the second restoration shifter is configured to: restore the bit length of the first fixed-point number by shifting the output of the accumulator to the left by the preset first bit length.
Claim 10:
an accumulator configured to accumulate and add outputs respectively from the plurality of multiplier units.
a restoration shifter configured to receive an output of the accumulator and restore a bit length by performing a left shift operation.
a first shifter configured to receive a first fixed-point number and perform a right shift operation.
a second shifter configured to receive a second fixed-point number and perform a right shift operation.
a multiplier configured to receive an output of the first shifter and an output of the second shifter and perform a multiplication operation between the output of the first shifter and the output of the second shifter.
Claim 11:
wherein the first shifter included in each of the plurality of multiplier units is configured to: reduce a bit length of the first fixed-point number by shifting the first fixed-point number to the right by a preset first bit length.
the second shifter included in each of the plurality of multiplier units is configured to: reduce a bit length of the second fixed-point number by shifting the second fixed-point number to the right by a preset second bit length.
the restoration shifter is configured to: restore the bit length by shifting the output of the accumulator to the left by a sum of the first bit length and the second bit length.
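For illustration only, the shift, multiply, and restore scheme recited in claims 10 and 11 can be sketched as follows. The function name and integer-list interface are hypothetical and are not drawn from the claims or the cited art; claims 5 and 8 recite conditional variants that shift only by the bits in excess of the preset lengths.

```python
def truncating_mac(first_vals, second_vals, first_len, second_len):
    """Hypothetical sketch of the claimed multiplier unit: each fixed-point
    operand is right-shifted by a preset bit length before multiplication,
    the products are accumulated, and the accumulator output is left-shifted
    by the sum of the two preset lengths to restore the bit length."""
    acc = 0
    for a, b in zip(first_vals, second_vals):
        a_reduced = a >> first_len          # first shifter: right shift
        b_reduced = b >> second_len         # second shifter: right shift
        acc += a_reduced * b_reduced        # multiplier and accumulator
    return acc << (first_len + second_len)  # restoration shifter: left shift
```

Because the right shifts discard low-order bits, the restored result approximates the full-precision sum of products; for example, `truncating_mac([52], [44], 2, 2)` equals the exact product 2288, while `truncating_mac([53], [45], 2, 2)` also yields 2288 rather than the exact 2385.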
A review of the specification shows that corresponding structure is not described in the specification for the following 35 U.S.C. 112(f) limitations:
Regarding the above-noted limitations recited in claims 1 and 10, aside from merely repeating the claim language in paragraphs 4-6 and mentioning examples in paragraphs 76, 78, and 80, applicant’s specification does not describe the corresponding structure capable of performing the claimed functions.
Because these claim limitations are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, they are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof.
If applicant does not intend to have these limitations interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitations to avoid interpretation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitations recite sufficient structure to perform the claimed function so as to avoid interpretation under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA ) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 2, 14, and 15 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by U.S. Patent Application Publication No. US 2017/0344876 A1 (“Brothers”).
Claim 1.
Brothers discloses an electronic device, comprising (Para [0051] “an electronic device 500 that includes one or more integrated circuits (chips) forming a system” teaches an electronic device):
an input feature map transformer configured to transform an input feature map (IFM) to a Winograd domain (Para [0020] “In each of the input feature maps, multiple patches may be transformed into, for example, the Winograd domain” teaches transforming an input feature map to the Winograd domain);
a weight kernel transformer configured to transform a weight kernel to the Winograd domain (Para [0026] “each IDP unit reads and transforms a single input feature map, but 16 different transformed weight kernels are applied to each input map, one kernel per output feature map. The IDP unit transforms the 16 patches of input data into, for example, the Winograd domain” teaches transforming weight kernels to the Winograd domain);
a transformation data processor configured to: map, a plurality of types of multiply-accumulate (MAC) units included in a computation unit, to a plurality of feature groups created by grouping feature values in a plurality of channels of the transformed input feature map and a plurality of weight value groups created by grouping weight values in a plurality of channels of the transformed weight kernel, the plurality of types of MAC units being configured to perform a MAC operation between a weight value group of the plurality of weight value groups and a feature value group of the plurality of feature groups (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results… each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle. Each IDP unit generates processing requests for different input feature maps and, since the corresponding weight kernels have 0 weights at different positions in the kernels, the requests generated by the input fetch units may be for processing of different output elements” teaches MAC units performing the computations between inputs from eight transformed input feature maps (corresponding to the plurality of feature value groups) and a corresponding set of eight weights (corresponding to the plurality of weight value groups));
a computation data processor configured to collect MAC operation results from the computation unit (Para [0026] “The IDP unit then sends 16 requests to 16 different request-assembly units, each corresponding to a different output feature map. Each request assembly unit feeds a different set of 16 MAUs. Each of the 16 request-assembly units and corresponding 16 MAUs act independently of the others, each generating 16 patches of one output map” teaches collecting the results of the multiply-accumulate units);
and an inverse transformer configured to perform an inverse Winograd transform on a result output according to the collected MAC operation results from the computation data processor to thereby generate an output feature map (OFM) useable for performing a convolution operation (Para [0026] “The IDP unit then sends 16 requests to 16 different request-assembly units, each corresponding to a different output feature map. Each request assembly unit feeds a different set of 16 MAUs. Each of the 16 request-assembly units and corresponding 16 MAUs act independently of the others, each generating 16 patches of one output map”; Para [0036] “The subject matter disclosed herein includes multiple IDPs 102a-102n in which the respective elements of a transformed feature-map data patch and the weight values in the corresponding transformed weight kernel are multiplied together to form a convolved matrix in the Winograd domain. The elements in the n convolved matrices are respectively summed and the result, indicated at 103, is inverse Winograd transformed to form a 2×2 output feature map patch” and Para [0029] “each output matrix represents a 2×2 patch in the output map—four elements—and it takes four multiply operations to apply the convolution with the Winograd scheme while skipping 0-valued weights, the overall computation per output element is just one multiply” teaches performing an inverse Winograd transform on the collected MAU results to generate an output feature map usable for performing a convolution operation).
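As technical context for the limitations above, the Winograd scheme referenced in both the claims and Brothers can be sketched as follows for a single F(2×2, 3×3) tile: transform the input patch and the weight kernel, multiply element-wise in the Winograd domain, and apply the inverse transform to obtain a 2×2 output patch. The transform matrices are the well-known constants for this tile size; the function name is hypothetical and this sketch is illustrative only, not part of the record.

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transform matrices (well-known constants,
# not taken from the claims or the Brothers reference).
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_2x2_3x3(d, g):
    """One 2x2 output patch from a 4x4 input patch d and a 3x3 kernel g."""
    U = G @ g @ G.T        # weight kernel transformer
    V = B_T @ d @ B_T.T    # input feature map transformer
    M = U * V              # element-wise multiplication in the Winograd domain
    return A_T @ M @ A_T.T # inverse transformer -> 2x2 OFM patch
```

Computed this way, each 2×2 output patch needs 16 element-wise multiplications (four per output element) instead of the nine per element required by direct convolution, which is the efficiency gain the Winograd domain provides.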
Claim 2.
As discussed above, Brothers discloses the electronic device of claim 1,
Brothers further discloses wherein the computation unit comprises: at least one type of MAC unit from among a first type MAC unit, a second type MAC unit, and a third type MAC unit included in the plurality of types of MAC units (Para [0025] “The eight results are all added into 1-of-16 accumulator registers maintained by the IDP unit, each corresponding to one of the 16 elements of the output patch being computed by the MAU” teaches multiple MAUs),
and the transformation data processor is configured to: map the plurality of feature value groups and the plurality of weight value groups to the at least one type of MAC unit from among the first type MAC unit, the second type MAC unit, and the third type MAC unit included in the plurality of types of MAC units (Para [0026] “each IDP unit reads and transforms a single input feature map, but 16 different transformed weight kernels are applied to each input map, one kernel per output feature map…The IDP unit then sends 16 requests to 16 different request-assembly units, each corresponding to a different output feature map. Each request assembly unit feeds a different set of 16 MAUs. Each of the 16 request-assembly units and corresponding 16 MAUs act independently of the others, each generating 16 patches of one output map” teaches mapping the feature value groups and the weight value groups to the MAUs, wherein more than three MAUs act independently of one another).
Claim 14.
Brothers discloses an operation method of an electronic device, comprising (Para [0051] “an electronic device 500 that includes one or more integrated circuits (chips) forming a system” teaches an electronic device):
transforming an input feature map (IFM) to a Winograd domain (Para [0020] “In each of the input feature maps, multiple patches may be transformed into, for example, the Winograd domain” teaches transforming input feature maps to the Winograd domain);
transforming a weight kernel to the Winograd domain (Para [0026] “each IDP unit reads and transforms a single input feature map, but 16 different transformed weight kernels are applied to each input map, one kernel per output feature map. The IDP unit transforms the 16 patches of input data into, for example, the Winograd domain” teaches transforming weight kernels to the Winograd domain);
creating a plurality of feature value groups by grouping feature values at same coordinates in a plurality of channels of the transformed input feature map (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results…each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle” and Para [0034] “parallelizing in the input channel dimension by having multiple IDPs each operate on different input maps in parallel (for example, eight input maps fed into the each MAU after reordering in the RAU), it is also possible and advantageous to parallelize in the output channel dimension” teaches grouping feature values at the same element position from eight different input feature maps, i.e., across a plurality of channels);
creating a plurality of weight value groups by grouping weight values at same coordinates in a plurality of channels of the transformed weight kernel (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results…each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle” and Para [0034] “parallelizing in the input channel dimension by having multiple IDPs each operate on different input maps in parallel (for example, eight input maps fed into the each MAU after reordering in the RAU), it is also possible and advantageous to parallelize in the output channel dimension” teaches grouping the corresponding sets of eight weight values at the same element position across a plurality of channels);
mapping the plurality of feature value groups and the plurality of weight value groups to a plurality of types of multiply-accumulate (MAC) units included in the electronic device (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results… each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle. Each IDP unit generates processing requests for different input feature maps and, since the corresponding weight kernels have 0 weights at different positions in the kernels, the requests generated by the input fetch units may be for processing of different output elements” teaches mapping the eight transformed input feature maps and the corresponding sets of eight weights to the multiply-accumulate units);
outputting a MAC operation value by performing, a MAC operation, for the plurality of feature value groups with the plurality of weight value groups, respectively (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results” teaches outputting MAC operation values by performing the operations on the eight sets of feature values with their corresponding weight values);
generating a transformed output feature map by collecting MAC operation results according to the outputting of the output MAC operation value (Para [0025] “The eight results are all added into 1-of-16 accumulator registers maintained by the IDP unit, each corresponding to one of the 16 elements of the output patch being computed by the MAU. Each of the 16 MAUs compute a different one of the 16 output patches being computed in parallel” and Para [0026] “16 output patches are processed in parallel to generate a portion of one output feature map” teaches generating a transformed output feature map by collecting the 16 outputs of the multiply-accumulate unit (MAU) operations);
and performing an inverse Winograd transform on the generated transformed output feature map to thereby generate an output feature map (OFM) useable for performing a convolution operation (Para [0026] “The IDP unit then sends 16 requests to 16 different request-assembly units, each corresponding to a different output feature map. Each request assembly unit feeds a different set of 16 MAUs. Each of the 16 request-assembly units and corresponding 16 MAUs act independently of the others, each generating 16 patches of one output map”; Para [0036] “The subject matter disclosed herein includes multiple IDPs 102a-102n in which the respective elements of a transformed feature-map data patch and the weight values in the corresponding transformed weight kernel are multiplied together to form a convolved matrix in the Winograd domain. The elements in the n convolved matrices are respectively summed and the result, indicated at 103, is inverse Winograd transformed to form a 2×2 output feature map patch” and Para [0029] “each output matrix represents a 2×2 patch in the output map—four elements—and it takes four multiply operations to apply the convolution with the Winograd scheme while skipping 0-valued weights, the overall computation per output element is just one multiply” teaches performing an inverse Winograd transform on the collected MAU results to generate an output feature map usable for performing a convolution operation).
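The channel-wise grouping recited in the method steps above can be illustrated with the following hypothetical sketch; the (channels, H, W) tensor layout and helper names are assumptions for illustration only, not drawn from the claims or the cited art.

```python
import numpy as np

def group_by_coordinate(tensor):
    """Collect, for each (row, col) coordinate, the values at that same
    coordinate across all channels of a (channels, H, W) tensor, yielding
    one value group per coordinate."""
    C, H, W = tensor.shape
    return {(r, c): tensor[:, r, c] for r in range(H) for c in range(W)}

def mac(feature_group, weight_group):
    """A MAC operation between a feature value group and a weight value
    group is then a dot product across the channel dimension."""
    return float(np.dot(feature_group, weight_group))
```

For a two-channel 2×2 tensor, `group_by_coordinate` produces four groups of two values each, one per coordinate, which can then be paired with similarly grouped weight values for the MAC operations.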
Claim 15.
Brothers discloses a non-transitory computer-readable recording medium having recorded thereon a program to execute a method by an electronic device, the method comprising (Para [0051] “an electronic device 500 that includes one or more integrated circuits (chips)…Electronic device 500 may be used in, but not limited to, a computing device, a personal digital assistant (PDA), a laptop computer, a mobile computer, a web tablet, a wireless phone, a cell phone, a smart phone, a digital music player, or a wireline or wireless electronic device. The electronic device 500 may include a controller 510, an input/output device 520 such as, but not limited to, a keypad, a keyboard, a display, a touch-screen display, a camera, and/or an image sensor, a memory 530” teaches an electronic device):
transforming an input feature map (IFM) to a Winograd domain (Para [0020] “In each of the input feature maps, multiple patches may be transformed into, for example, the Winograd domain” teaches transforming input feature maps to the Winograd domain);
transforming a weight kernel to the Winograd domain (Para [0026] “each IDP unit reads and transforms a single input feature map, but 16 different transformed weight kernels are applied to each input map, one kernel per output feature map. The IDP unit transforms the 16 patches of input data into, for example, the Winograd domain” teaches transforming weight kernels to the Winograd domain);
grouping feature values at same coordinates in a plurality of channels of the transformed input feature map to create a plurality of feature value groups (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results…each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle” and Para [0034] “parallelizing in the input channel dimension by having multiple IDPs each operate on different input maps in parallel (for example, eight input maps fed into the each MAU after reordering in the RAU), it is also possible and advantageous to parallelize in the output channel dimension” teaches grouping feature values at the same element position from eight different input feature maps, i.e., across a plurality of channels);
grouping weight values at same coordinates in a plurality of channels of the transformed weight kernel to create a plurality of weight value groups (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results…each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle” and Para [0034] “parallelizing in the input channel dimension by having multiple IDPs each operate on different input maps in parallel (for example, eight input maps fed into the each MAU after reordering in the RAU), it is also possible and advantageous to parallelize in the output channel dimension” teaches grouping the corresponding sets of eight weight values at the same element position across a plurality of channels);
mapping the plurality of feature value groups and the plurality of weight value groups to a plurality of types of multiply-accumulate (MAC) units included in the electronic device (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results… each of the eight inputs and their corresponding weights coming from eight different input feature maps might be for element in position (1,3) in a given cycle. Each IDP unit generates processing requests for different input feature maps and, since the corresponding weight kernels have 0 weights at different positions in the kernels, the requests generated by the input fetch units may be for processing of different output elements” teaches mapping the eight transformed input feature maps and the corresponding sets of eight weights to the multiply-accumulate units);
outputting a MAC operation value by performing, a MAC operation, for the plurality of feature value groups, with the plurality of weight value groups, respectively (Para [0025] “An array of 16 multiply-accumulate units (MAU) are coupled to the request-assembly unit to process the requests. Each MAU takes as inputs eight sets of input values and a corresponding set of eight weights. In parallel, each of the eight inputs is multiplied by its corresponding weight to generate eight results” teaches outputting MAC operation values by performing the operations on the eight sets of feature values with their corresponding weight values);
generating a transformed output feature map by collecting MAC operation results according to the outputting of the output MAC operation value (Para [0025] “The eight results are all added into 1-of-16 accumulator registers maintained by the IDP unit, each corresponding to one of the 16 elements of the output patch being computed by the MAU. Each of the 16 MAUs compute a different one of the 16 output patches being computed in parallel” and Para [0026] “16 output patches are processed in parallel to generate a portion of one output feature map” teaches generating a transformed output feature map by collecting the 16 output patches computed by the multiply-accumulate units (MAUs));
and performing an inverse Winograd transform on the generated transformed output feature map to thereby generate an output feature map (OFM) useable for performing a convolution operation (Para [0026] “The IDP unit then sends 16 requests to 16 different request-assembly units, each corresponding to a different output feature map. Each request assembly unit feeds a different set of 16 MAUs. Each of the 16 request-assembly units and corresponding 16 MAUs act independently of the others, each generating 16 patches of one output map”; Para [0036] “The subject matter disclosed herein includes multiple IDPs 102a-102n in which the respective elements of a transformed feature-map data patch and the weight values in the corresponding transformed weight kernel are multiplied together to form a convolved matrix in the Winograd domain. The elements in the n convolved matrices are respectively summed and the result, indicated at 103, is inverse Winograd transformed to form a 2×2 output feature map patch” and Para [0029] “each output matrix represents a 2×2 patch in the output map—four elements—and it takes four multiply operations to apply the convolution with the Winograd scheme while skipping 0-valued weights, the overall computation per output element is just one multiply” teaches performing an inverse Winograd transform on the MAU results to generate an output feature map useable for performing a convolution operation).
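As background for the Winograd scheme discussed in the cited paragraphs, the element-wise multiply in the Winograd domain followed by an inverse transform can be sketched as follows (an illustrative sketch using the standard F(2×2, 3×3) transform matrices; the function name and shapes are assumptions for illustration, not part of the cited disclosure):

```python
import numpy as np

# Standard Winograd F(2x2, 3x3) transform matrices.
B_T = np.array([[1, 0, -1, 0],
                [0, 1,  1, 0],
                [0, -1, 1, 0],
                [0, 1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_2x2_patch(d, g):
    """Produce one 2x2 output-feature-map patch from a 4x4 input tile d
    and a 3x3 weight kernel g via the Winograd domain."""
    V = B_T @ d @ B_T.T        # transformed feature-map data patch
    U = G @ g @ G.T            # transformed weight kernel
    M = U * V                  # element-wise multiply in the Winograd domain
    return A_T @ M @ A_T.T     # inverse Winograd transform -> 2x2 patch
```

With only 16 element-wise multiplies per 2×2 patch (four per output element), skipping 0-valued transformed weights reduces the per-element cost further, consistent with Para [0029].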
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Brothers (US 20170344876 A1) in view of Lee (“UNPU: An Energy-Efficient Deep Neural Network Accelerator with Fully Variable Weight Bit Precision”).
Claim 3.
As discussed above, Brothers discloses the electronic device of claim 2,
Brothers does not explicitly teach wherein the transformation data processor is configured to map, based on a pre-generated mapping table, the plurality of feature value groups and the plurality of weight value groups to the at least one type of MAC unit from among the first type MAC unit, the second type MAC unit, and the third type MAC unit included in the plurality of types of MAC units.
However, in the same field, analogous art Lee teaches wherein the transformation data processor is configured to map, based on a pre-generated mapping table, the plurality of feature value groups and the plurality of weight value groups to the at least one type of MAC unit from among the first type MAC unit, the second type MAC unit, and the third type MAC unit included in the plurality of types of MAC units (A. Detail Architecture of the Unified DNN Core & Page 176 “Four LUT bundles are included in the LBPE, and each LUT bundle is used for MAC operations by accessing LUT in it. The LBPE generates partial sums of DNN output feature map by accumulating the partial sums obtained from the four LUT bundles” teaches performing the MAC operations on the feature map values and weight values by accessing a lookup table (LUT) (see Abstract), i.e., a pre-generated mapping table).
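The LUT-based MAC principle that Lee describes can be illustrated by a simplified sketch (hypothetical and reduced to a single 16-entry table over a group of four weights; Lee's LBPE with its four LUT bundles and bit-serial control is more elaborate):

```python
# Simplified sketch of a lookup-table (LUT)-based multiply-accumulate:
# precompute all subset sums of a weight group once, then resolve each
# activation bit-plane with a single table access instead of multipliers.
def build_lut(weights):
    """16-entry LUT: entry k holds the sum of weights whose bit is set in k."""
    return [sum(w for i, w in enumerate(weights) if (k >> i) & 1)
            for k in range(1 << len(weights))]

def lut_mac(activations, weights, bits=8):
    """Dot product of unsigned `bits`-bit integer activations with `weights`,
    computed bit-serially via the LUT (no multiplications)."""
    lut = build_lut(weights)
    acc = 0
    for b in range(bits):
        idx = 0
        for i, a in enumerate(activations):
            idx |= ((a >> b) & 1) << i   # gather bit-plane b across inputs
        acc += lut[idx] << b             # shift-and-accumulate partial sum
    return acc
```

Because the table lookup replaces the fixed-point multipliers, this style of processing element underlies the energy reduction Lee reports relative to a conventional MAC array.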
Brothers and Lee are analogous art because they are both directed to using a hardware architecture to accelerate deep learning through convolutional and multiply accumulate processing.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the limitation(s) above, as taught by Lee, into the disclosed invention of Brothers.
One of ordinary skill in the art would have been motivated to make this modification because the lookup table (LUT)-based bit-serial processing element (LBPE) provides “the energy consumption reduction compared to the conventional fixed-point multiply-and-accumulate (MAC) array” while supporting efficient processing of non-zero transformed weights, as suggested by Lee (Lee, Abstract, Page 173).
Allowable Subject Matter
7. Claims 4-13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. For example, with regard to dependent claims 4-13, the prior art of record does not anticipate, nor render obvious in any reasonable combination to one of ordinary skill in the art at the time of Applicants' invention, the combination of recited limitations of claims 4-13, their base claim, independent claim 1, and their respective intervening claims (i.e., dependent claims 2-3 in the case of claims 4, 7 and 10; claims 2-4 in the case of claim 5; claims 2-5 in the case of claim 6; claims 2-3 and 7 in the case of claim 8; claims 2-3 and 7-8 in the case of claim 9; claims 2-3 and 10 in the case of claim 11; and claims 2-3 and 10-11 in the case of claims 12 and 13).
Conclusion
8. Any inquiry concerning this communication or earlier communications from the examiner should be directed to Lokesha Patel whose telephone number is (571)272-6267. The examiner can normally be reached 8 AM - 4 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar can be reached at (571) 272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LOKESHA PATEL/ Examiner, Art Unit 2125
/KAMRAN AFSHAR/ Supervisory Patent Examiner, Art Unit 2125