Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 12/19/2025 has been entered.
Status of Claims
The present application is being examined based on the claims filed 12/19/2025. The status of the claims is as follows:
Claims 1, 3-5, 8-16, 18-24 are pending.
Claims 1, 4-5, and 19-20 are amended.
Claims 2, 6-7, and 17 are canceled.
Response to Arguments
Regarding 35 U.S.C. § 101 (Remarks pp. 7-12)
Applicant argues that:
claim 1 is “technologically advantageous” because compressing coefficient groups “according to a compression scheme aligned with those groups” reduces memory footprint and memory bandwidth, and that performing the claimed steps “directly leads to an improvement in technology”.
Applicant further argues that:
because coefficients are represented by values, compression necessarily uses numbers, but this does not mean that claim 1 seeks to patent “mathematical operations”.
Examiner response: These arguments are not persuasive.
First, the present claims (including independent claim 1) recite organizing coefficients into groups/subsets using defined cardinalities (n, m), applying sparsity by setting coefficients to zero, and compressing/encoding subsets so that fewer bits are used than in an uncompressed form. Such limitations are directed to mathematical concepts (mathematical relationships/formulas/calculations and data encoding/compression operations) as set forth in MPEP § 2106.04(a)(2)(I), consistent with the prior characterization of the claimed grouping/zeroing/encoding as mathematical processing.
Second, to the extent Applicant relies on the asserted technological advantages (reduced memory footprint/bandwidth), those advantages must be reflected in the claim by additional elements that integrate the judicial exception into a practical application, rather than being supplied by the judicial exception itself. See MPEP § 2106.05(e) (improvement must be reflected in the claim). Here, the claim language does not recite a particular nonconventional machine, memory architecture, storage/retrieval technique, or neural-network execution mechanism that effects an improvement apart from the mathematical compression itself.
Further, Applicant’s prior reliance on “stored to memory” as integrating activity is less persuasive for the present claim set because independent claim 1 no longer requires the previously recited “storing … to memory” step; instead, claim 1 recites a “whereby … used in an implementation of a neural network” clause, which is an intended-use/field-of-use statement and does not, by itself, provide integration into a practical application. As such, Applicant’s asserted improvement remains unclaimed as a specific technological mechanism and is not sufficient to overcome the § 101 rejection.
Accordingly, Applicant’s §101 arguments (Remarks pp. 7-12) do not overcome the rejection(s). Therefore, the rejection under 35 U.S.C. § 101 is maintained.
Regarding 35 U.S.C. § 103 (Remarks pp. 12-15)
Applicant argues, in substance, that:
Sequeira does not disclose applying sparsity to a plurality of groups aligned with a compression scheme;
Whiteman’s grouping is only in the context of compression of already-sparse activation data and does not disclose applying sparsity aligned with those groups “in consideration of a compression scheme” as required by claim 1;
and therefore a POSITA “could not and would not” modify Sequeira in view of Whiteman to meet the “aligned with a compression scheme” limitations.
Applicant also argues claim 23 is nonobvious because Sequeira’s alleged groups are not the same size and Whiteman lacks a link between compression schemes and sparsity application.
Examiner response: These arguments are not persuasive.
Regarding Sequeira and “groups” / applying sparsity:
Sequeira teaches creating zeros in neural-network parameters via masking/pruning (e.g., setting weights of neurons/filters to zero), resulting in neurons/filters whose weights are exactly zero. Such neurons/filters/kernels/channels constitute natural architectural groupings of coefficients having predetermined size by design, and applying masking/pruning across multiple such entities corresponds to applying sparsity to a plurality of groups. The fact that Sequeira’s disclosure may treat multiple architectural entities does not preclude using those entities as “groups” for subsequent compression. This addresses Applicant’s contention that Sequeira lacks group-wise sparsity as claimed.
Regarding Whiteman and “alignment with a compression scheme”:
Whiteman teaches forming neural-network data into groups/sub-groups and encoding those groups/sub-groups using a compression scheme (e.g., hierarchical indicators/masks) to efficiently represent zeros/non-zeros. Thus, Whiteman teaches group/sub-group structures that are inherently aligned to the compression scheme because the scheme operates on that defined grouping. Applicant’s assertion that Whiteman’s grouping is only for compression and not linked to sparsity is not persuasive because the proposed combination applies Sequeira’s sparsification to coefficient groupings selected to match Whiteman’s group/sub-group compression structure, yielding predictable improvements in compression efficiency as zeros increase.
Motivation to combine / reasonable expectation of success / no teaching away:
A POSITA would have been motivated to combine Sequeira’s sparsity creation (increasing zeros in NN coefficients) with Whiteman’s group-aligned compression scheme (efficiently encoding groups/sub-groups with zeros), because the combination predictably reduces model/activation storage and bandwidth – goals expressly addressed in neural-network deployment/compression. Applicant’s “could not and would not” and “teach away” framing is conclusory and does not establish that either reference criticizes, discredits, or discourages the proposed combination; rather, the references are complementary: Sequeira increases sparsity and Whiteman provides an encoding scheme well-suited to exploit sparsity.
Regarding claim 23
Applicant argues claim 23 is nonobvious because Sequeira’s alleged groups are not the same size and Whiteman lacks a sparsity/compression “link”.
Examiner response: this argument is not persuasive.
Whiteman expressly teaches fixed-size grouping of data elements (e.g., tiles/grids), which inherently results in each group having the same predefined number of elements, and the rejection relies on Whiteman for the fixed grouping structure while relying on Sequeira for sparsity creation. Thus, configuring groups such that the predefined number of coefficients in each group is the same is taught by (or at a minimum would have been an obvious design choice consistent with) Whiteman’s fixed-size grouping for compression.
Accordingly, Applicant’s §103 arguments (Remarks pp. 12-15) do not overcome the rejection(s). Therefore, the rejection under 35 U.S.C. § 103 is maintained.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 12/22/2025 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1, 3-5, 8-16, 18-24 are rejected under 35 U.S.C. 101 as being directed to a judicial exception without significantly more.
Regarding claim 1
Claim 1 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 1 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“applying sparsity to a plurality of groups of coefficients of the set of coefficients, where the plurality of groups are defined to be aligned with a compression scheme to be used to perform compression, each group of coefficients comprising a predefined number of coefficients, wherein each group of coefficients comprises one or more subsets of coefficients of the set of coefficients, each group of coefficients comprising n coefficients and each subset of coefficients comprising m coefficients, where m is greater than 1 and n is an integer multiple of m, and where applying sparsity to a group of coefficients of the plurality of groups of coefficients comprises setting each of the coefficients in that group to zero;”
“and compressing the plurality of groups of coefficients to which sparsity has been applied according to the compression scheme aligned with the plurality of groups of coefficients by compressing the one or more subsets of coefficients comprised by each group of coefficients of the plurality of groups of coefficients, each of said subsets of coefficients to be compressed comprising m coefficients that are zero, such that fewer bits are used to encode the m coefficients of each of said subsets of coefficients in a compressed form of each of said subsets of coefficients than are used to encode the m coefficients of each of said subsets of coefficients in their uncompressed form;” – these limitations recite organizing coefficients into groups and subsets defined by the cardinalities n and m, setting coefficients to zero, and encoding subsets using fewer bits, which are mathematical relationships and calculations falling within the mathematical concepts grouping of abstract ideas. See MPEP § 2106.04(a)(2)(I).
Claim 1 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“whereby the compressed groups of coefficients are used in an implementation of a neural network.” – this limitation does not integrate the judicial exception into a practical application because it merely recites intended use / field of use and/or generic consumption of mathematical output by a broader system, without any recited technical implementation or improvement to the neural network or underlying computer technology. See MPEP § 2106.05(h) (field-of-use) and §2106.05(g) (insignificant extra-solution activity).
Claim 1 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“whereby the compressed groups of coefficients are used in an implementation of a neural network.” – this limitation amounts to no more than well-understood, routine, and conventional (WURC) application of the mathematical result and does not add an inventive concept beyond the judicial exception. See MPEP § 2106.05(d).
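For illustration only, and not as part of the claim construction or prior-art mapping, the mathematical character of the limitations quoted above in the Prong One analysis can be seen in a brief sketch (Python; the coefficient values, group size n = 8, and subset size m = 4 are assumed for illustration and are not the claimed values):

```python
import numpy as np

# Assumed, illustrative sizes: n coefficients per group, m per subset, n an integer multiple of m.
n, m = 8, 4
coeffs = np.arange(1, 17, dtype=np.uint8)        # a toy set of 16 non-zero coefficients
groups = coeffs.reshape(-1, n)                   # organize the set into groups of n coefficients

# "Applying sparsity" to a group: every coefficient in that group is set to zero.
groups[1, :] = 0                                 # assume the second group is selected for sparsity

# "Compressing" each subset of m coefficients: an all-zero subset is encoded with a single
# flag bit instead of m raw 8-bit values, so fewer bits are used than in uncompressed form.
def compress_subset(subset):
    if not subset.any():
        return "0"                                            # 1 bit marks an all-zero subset
    return "1" + "".join(f"{int(v):08b}" for v in subset)     # flag bit + raw coefficient bits

bitstream = "".join(compress_subset(s) for s in groups.reshape(-1, m))
print(len(bitstream), "bits compressed vs", coeffs.size * 8, "bits uncompressed")  # 68 vs 128
```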
Regarding claim 3
Claim 3 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 3 – Step 2A Prong One – Abstract Idea Identification:
Claim 3 depends from Claim 1, which has been determined to be directed to an abstract idea, and adds:
“wherein n is greater than m” – this limitation recites a numerical inequality that defines a relationship between variables (n, m). Such mathematical relationships fall within the mathematical concepts grouping of abstract ideas. MPEP §2106.04(a)(2).
“and wherein each group of coefficients is compressed by compressing multiple adjacent or interleaved subsets of coefficients” – this limitation recites how to mathematically encode/process the grouped numerical data (combining adjacent or interleaved subsets during compression). This is a mathematical rule for data encoding and therefore a mathematical concept under MPEP §2106.04(a)(2).
Claim 3 – Step 2A Prong Two:
Claim 3 does not add any element beyond the mathematical constraints and compression rule identified above. There are no additional elements (e.g., specific hardware, particular machine, transformation, or improvement to computer functionality) that meaningfully limit the exception. Accordingly, claim 3 does not integrate the abstract idea into a practical application. MPEP §2106.05(a)-(c), (e)-(h).
Claim 3 – Step 2B:
Because claim 3 adds no elements other than further abstract mathematical rules, there is nothing to supply an inventive concept beyond what has already been found ineligible in claim 1. Thus, claim 3 fails Step 2B. MPEP 2106.05(d), (f), (g).
Regarding claim 4
Claim 4 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 4 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“wherein n is equal to m or 2m.” – this limitation recites a mathematical relationship (a constraint on variables). It does not add any concrete technological implementation detail; it simply narrows the math from “n is an integer multiple of m” to the special cases n = m or n = 2m. See MPEP § 2106.04(a)(2)(I) (mathematical concepts).
Claim 4 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application.
Claim 4 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception.
Regarding claim 5
Claim 5 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 5 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“wherein n is equal to 2m” – this limitation recites a numeric relationship between variables which is a mathematical concept under MPEP § 2106.04(a)(2)(I).
“wherein each group comprises 16 coefficients and each subset comprises 8 coefficients,” – this specifies the sizes of numerical groupings (fixed numeric parameters). It defines the mathematical structure/arrangement of the data and thus is a mathematical concept. See MPEP § 2106.04(a)(2)(I).
“and wherein each group is compressed by compressing two adjacent or interleaved subsets of coefficients.” – this is part of the encoding/compression scheme (a rule/technique for operating on numerical data). As claimed, it is not tied to a specific hardware memory organization or a particular nonconventional data structure implementation; it is a rule about how to select and aggregate subsets for compression and is thus a mathematical concept under MPEP § 2106.04(a)(2)(I).
Claim 5 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application.
Claim 5 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception.
Claim 8 – Step 2A Prong One – Abstract Idea Identification:
Claim 8 depends from Claim 1, which has been determined to be directed to an abstract idea.
Sparsity is applied to the plurality of groups of the coefficients in dependence on a sparsity mask that defines which coefficients of the set of coefficients to which sparsity is to be applied – this recites conditional logic applied to a dataset based on a binary mask. It describes a mathematical rule for selection and is a mathematical concept under MPEP 2106.04(a).
Claim 8 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
Claim 9 – Step 2A Prong One – Abstract Idea Identification:
Claim 9 depends from Claim 8, which has been determined to be directed to an abstract idea.
The set of coefficients is a tensor of coefficients, the sparsity mask is a binary tensor of the same dimensions as the tensor of coefficients, and sparsity is applied by performing an element-wise multiplication of the tensor of coefficients with the sparsity mask tensor – this recites a tensor structure and a linear algebra operation (element-wise multiplication), all of which are mathematical relationships and concepts and therefore fall under MPEP 2106.04(a).
Claim 9 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
Claim 10 – Step 2A Prong One – Abstract Idea Identification:
Claim 10 depends from Claim 9, which has been determined to be directed to an abstract idea.
Generating a reduced tensor having one or more dimensions an integer multiple smaller than the tensor of coefficients, wherein the integer being greater than 1 – this recites a dimensionality reduction step with a numerical constraint on the reduction factor. It is a mathematical transformation and relationship, and constitutes a mathematical concept under MPEP 2106.04(a).
Determining elements of the reduced tensor to which sparsity is to be applied so as to generate a reduced sparsity mask tensor – this recites a rule-based selection operation on numerical data, which is a mathematical concept.
Expanding the reduced sparsity mask tensor so as to generate a sparsity mask tensor of the same dimensions as the tensor of coefficients – this recites an upsampling operation. It is a mathematical calculation and a mathematical concept under MPEP 2106.04(a).
Claim 10 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
Claim 11 – Step 2A Prong One – Abstract Idea Identification:
Claim 11 depends from Claim 10, which has been determined to be directed to an abstract idea.
Dividing the tensor of coefficients into multiple groups of coefficients, such that each coefficient of the set is allocated to only one group and all of the coefficients are allocated to a group – this recites a rule for data partitioning, which is a mathematical concept under MPEP 2106.04(a).
Representing each group of coefficients of the tensor of coefficients by the maximum coefficient value within that group – this recites a mathematical aggregation function (maximum). It is a mathematical calculation and a mathematical concept under MPEP 2106.04(a).
Claim 11 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
Claim 12 – Step 2A Prong One – Abstract Idea Identification:
Claim 12 depends from Claim 10, which has been determined to be directed to an abstract idea.
Expanding the reduced sparsity mask tensor by performing nearest neighbour upsampling such that each value in the reduced sparsity mask tensor is represented by a group comprising a plurality of like values in the sparsity mask tensor - this recites a mathematical interpolation rule (nearest-neighbour upsampling). It is a mathematical operation on numerical data and a mathematical concept under MPEP 2106.04(a).
Claim 12 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
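For illustration only, the chain of operations characterized for claims 9-12 (a binary mask tensor applied by element-wise multiplication, a reduced tensor formed by taking the maximum per group, and nearest-neighbour upsampling of the reduced mask) can be sketched briefly (Python/NumPy; the 4x4 tensor, 2x2 grouping, and median threshold are assumptions for illustration only, not the claimed values):

```python
import numpy as np

# Toy coefficient tensor and a 2x reduction factor (illustrative values only).
coeffs = np.array([[5, 1, 0, 2],
                   [2, 3, 1, 1],
                   [9, 4, 2, 2],
                   [6, 7, 3, 1]], dtype=float)
k = 2  # each dimension of the reduced tensor is an integer multiple (k > 1) smaller

# Reduced tensor: represent each k x k group by its maximum coefficient value (cf. claim 11).
reduced = coeffs.reshape(2, k, 2, k).max(axis=(1, 3))          # shape (2, 2)

# Reduced sparsity mask: mark the groups with the lowest representative values (cf. claim 10).
threshold = np.median(reduced)
reduced_mask = (reduced >= threshold).astype(float)            # 0 marks groups to be pruned

# Expand by nearest-neighbour upsampling: each mask value covers a k x k block (cf. claim 12).
mask = np.kron(reduced_mask, np.ones((k, k)))                  # same dimensions as coeffs

# Apply sparsity by element-wise multiplication with the binary mask tensor (cf. claim 9).
sparse_coeffs = coeffs * mask
print(sparse_coeffs)
```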
Claim 13 – Step 2A Prong One – Abstract Idea Identification:
Claim 13 depends from Claim 1, which has been determined to be directed to an abstract idea, and adds that compressing each subset of coefficients comprises generating header data comprising h-bits and a plurality of body portions each comprising b-bits, wherein each body portion corresponds to a coefficient in the subset, b is fixed within a subset, and the header data for a subset comprises an indication of b – this recites a rule for encoding numerical data into header and body bit fields, which is a mathematical concept under MPEP § 2106.04(a)(2).
Claim 13 – Step 2A Prong Two:
Beyond the abstract mathematical encoding rules above, Claim 13 (through dependence on Claim 1) includes only the additional element of claim 1, the “whereby the compressed groups of coefficients are used in an implementation of a neural network” clause, which does not impose a meaningful limit on the mathematical idea and therefore does not integrate the exception into a practical application. See the rejection of Claim 1; MPEP § 2106.05(f), (g), (h).
Claim 13 – Step 2B:
Evaluating only the additional element beyond the abstract idea (i.e., the “whereby …” clause carried from claim 1), the limitation recites intended use and generic consumption of the mathematical result at a high level of generality and therefore does not amount to “significantly more”. MPEP § 2106.05(d), (g).
Claim 14 – Step 2A Prong One – Abstract Idea Identification:
Claim 14 depends from Claim 13, which has been determined to be directed to an abstract idea.
Identifying a body portion size, b, by locating a bit position of a most significant leading one across all the coefficients in the subset – this recites a step of identifying and locating a bit position, which constitutes a mental process under MPEP 2106.04(a).
Generating the header data comprising a bit sequence encoding the body portion size – this recites an encoding rule for compressed data, which is a mathematical concept under MPEP 2106.04(a).
Generating a body portion comprising b-bits for each of the coefficients in the subset by removing none, one or more leading zeros from each coefficient – this recites a data compaction method based on removal of leading zero bits, which is a mathematical transformation and a mathematical concept under MPEP 2106.04(a).
Claim 14 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
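For illustration only, a brief sketch of header/body encoding of the kind characterized for claims 13-14 (Python; the 4-bit header width and the example coefficient values are assumptions for illustration, not the claimed or disclosed values):

```python
# Illustrative only: encode a subset of coefficients as a header giving the body size b,
# followed by one b-bit body portion per coefficient (leading zeros removed).
def encode_subset(subset, header_bits=4):
    # b = bit position of the most significant leading one across all coefficients in the subset
    b = max(v.bit_length() for v in subset)
    header = format(b, f"0{header_bits}b")                 # header data encoding the body size b
    bodies = "" if b == 0 else "".join(format(v, f"0{b}b") for v in subset)
    return header + bodies

subset = [3, 0, 5, 1]          # largest value 5 -> b = 3, so 4 header bits + 4 * 3 body bits
all_zero = [0, 0, 0, 0]        # b = 0 -> header only, 4 bits for the whole subset
print(encode_subset(subset), encode_subset(all_zero))
```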
Claim 15 – Step 2A Prong One – Abstract Idea Identification:
Claim 15 depends from Claim 1, which has been determined to be directed to an abstract idea.
The number of groups to which sparsity is to be applied is determined in dependence on a sparsity parameter – this recites a step of determining a number based on a parameter, which constitutes a mental process under MPEP 2106.04(a).
Claim 15 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
Claim 16 – Step 2A Prong One – Abstract Idea Identification:
Claim 16 depends from Claim 15, which has been determined to be directed to an abstract idea.
Dividing the set of coefficients into multiple groups of coefficients, such that each coefficient of the set is allocated to only one group and all of the coefficients are allocated to a group – this recites a rule for partitioning numerical data, which is a mathematical operation and a mathematical concept under MPEP 2106.04(a).
Determining a saliency of each group of coefficients – this step involves evaluating numerical groups to assign a significance score, and constitutes a mental process under MPEP 2106.04(a).
Applying sparsity to the plurality of the groups of coefficients having a saliency below a threshold value, the threshold value being determined in dependence on the sparsity parameter – this recites a conditional application of a mathematical operation based on numerical comparison, which is a mathematical rule and calculation and a mathematical concept under MPEP 2106.04(a).
Optionally wherein the threshold value is a maximum absolute coefficient value or an average absolute coefficient value – this recites two forms of statistical aggregation used in the comparison logic, which are both mathematical operations and thus mathematical concepts under MPEP 2106.04(a).
Claim 16 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
None beyond those already identified as abstract.
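For illustration only, a brief sketch of saliency-based group selection under a sparsity parameter, as characterized for claims 15-16 (Python/NumPy; the choice of maximum absolute value as the saliency measure, the group size of 2, and the 50% sparsity parameter are assumptions for illustration):

```python
import numpy as np

coeffs = np.array([0.9, -0.1, 0.05, 1.2, -0.02, 0.03, 0.4, -0.6])
group_size = 2
groups = coeffs.reshape(-1, group_size)               # each coefficient belongs to exactly one group

# Saliency of each group: here, the maximum absolute coefficient value within the group.
saliency = np.abs(groups).max(axis=1)

# The sparsity parameter sets how many groups receive sparsity; the threshold follows from it.
sparsity = 0.5                                        # prune the least salient 50% of groups
threshold = np.quantile(saliency, sparsity)
groups[saliency < threshold, :] = 0                   # apply sparsity to low-saliency groups
print(groups.reshape(-1))
```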
Claim 18 – Step 2A Prong One – Abstract Idea Identification:
Claim 18 depends from Claim 1, which has been determined to be directed to an abstract idea under Step 2A Prong One.
Claim 18 – Step 2A Prong Two and Step 2B Combined Analysis:
Additional Elements Beyond the Abstract Ideas:
Using the compressed groups of coefficients in a neural network – this recites a generic consumption of mathematical output by a broader system without any specified technical implementation or improvement.
The use of the compressed coefficients constitutes insignificant post-solution activity: courts have found that limitations directed to the consumption or use of data after processing, when recited at a high level of generality and not tied to any specific improvement in computer or network functionality, are well-understood, routine, and conventional (e.g., “electronic record keeping”, “receiving or transmitting data over a network”). The recited use in a neural network does not improve the network or the underlying technology. As such, it does not integrate the abstract idea into a practical application or provide an inventive concept. See MPEP § 2106.05(d)(II) and (g).
Regarding claim 19
Claim 19 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 19 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“applying sparsity to a plurality of groups of coefficients of the set of coefficients, where the plurality of groups are defined to be aligned with a compression scheme to be used to perform compression, each group of coefficients comprising a predefined number of coefficients, wherein each group of coefficients comprises one or more subsets of coefficients of the set of coefficients, each group of coefficients comprising n coefficients and each subset of coefficients comprising m coefficients, where m is greater than 1 and n is an integer multiple of m, and where applying sparsity to a group of coefficients of the plurality of groups of coefficients comprises setting each of the coefficients in that group to zero;”
“and compressing the plurality of groups of coefficients to which sparsity has been applied according to the compression scheme aligned with the plurality of groups of coefficients by compressing the one or more subsets of coefficients comprised by each group of coefficients of the plurality of groups of coefficients, each of said subsets of coefficients to be compressed comprising m coefficients that are zero, such that fewer bits are used to encode the m coefficients of each of said subsets of coefficients in a compressed form of each of said subsets of coefficients than are used to encode the m coefficients of each of said subsets of coefficients in their uncompressed form;” – these limitations recite organizing coefficients into groups and subsets defined by the cardinalities n and m, setting coefficients to zero, and encoding subsets using fewer bits, which are mathematical relationships and calculations falling within the mathematical concepts grouping of abstract ideas. See MPEP § 2106.04(a)(2)(I).
Claim 19 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“a data processing system” – this recites a generic computer system on which the mathematical concept is performed; it does not integrate the exception into a practical application. See MPEP § 2106.05(f).
“pruner logic configured to apply sparsity” – this recites a generic functional component, described only at a high level of generality, that carries out the mathematical concept (applying sparsity); it does not improve computer functionality or another technology and does not integrate the exception into a practical application. See MPEP § 2106.05(f).
“a compression engine configured to compress …” – this likewise recites a generic functional component that carries out the mathematical compression and does not integrate the exception into a practical application. See MPEP § 2106.05(f).
“whereby the compressed groups of coefficients are used in an implementation of a neural network.” – this limitation does not integrate the judicial exception into a practical application because it merely recites intended use / field of use and/or generic consumption of mathematical output by a broader system, without any recited technical implementation or improvement to the neural network or underlying computer technology. See MPEP § 2106.05(h) (field-of-use) and §2106.05(g) (insignificant extra-solution activity).
Claim 19 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“a data processing system” - this recites well-understood, routine, and conventional (WURC) computer components recited at a high level of functionality, implementing the mathematical concept. It does not add an inventive concept. See MPEP § 2106.05(d).
“pruner logic configured to apply sparsity” – this recites well-understood, routine, and conventional (WURC) computer components recited at a high level of functionality, implementing the mathematical concept. It does not add an inventive concept. See MPEP § 2106.05(d).
“a compression engine configured to compress …” - this recites well-understood, routine, and conventional (WURC) computer components recited at a high level of functionality, implementing the mathematical concept. It does not add an inventive concept. See MPEP § 2106.05(d).
“whereby the compressed groups of coefficients are used in an implementation of a neural network.” – this limitation amounts to no more than well-understood, routine, and conventional (WURC) application of the mathematical result and does not add an inventive concept beyond the judicial exception. See MPEP § 2106.05(d).
Regarding claim 20
Claim 20 is rejected under 35 U.S.C. § 101 because it is directed to a judicial exception without significantly more. Claim 20 recites mathematical concepts, i.e., mathematical relationships/rules for grouping coefficients into groups/subsets (including the n and m relationships), applying sparsity by setting coefficients to zero, and compressing/encoding subsets such that fewer bits are used to encode subsets in compressed form than in uncompressed form, which are mathematical concepts under MPEP § 2106.04(a)(2)(I). The additional elements (a computer-readable medium and instructions that, when executed, cause performance of the mathematical concepts) amount to generic computer implementation and reflect intended use/field of use, and therefore do not integrate the judicial exception into a practical application (Step 2A, Prong Two) and do not amount to significantly more than the judicial exception (Step 2B), consistent with MPEP §§ 2106.05(d), 2106.05(g), and 2106.05(h).
Claim 21 – Step 2A Prong One – Abstract Idea Identification:
Claim 21 depends from claim 1, which has been determined to be directed to an abstract idea and adds:
“wherein compressing a subset of coefficients comprises identifying a number of bits that is sufficient to encode the largest coefficient value in that subset of coefficients” – identifying a bit-width sufficient to encode the largest value requires (i) determining a maximum over numerical data and (ii) computing a representation size/bit-length, both of which constitute mathematical relationships/calculations. This is a mathematical concept under MPEP § 2106.04(a)(2).
“and encoding each coefficient in that subset of coefficients using that number of bits.” – encoding values using a fixed number of bits is a mathematical encoding rule/algorithm (a numerical representation). This likewise falls within mathematical concepts under MPEP §2106.04(a)(2).
Claim 21 – Step 2A Prong Two:
Beyond the abstract math, Claim 21 adds no additional element of its own. The only additional element remains the “whereby the compressed groups of coefficients are used in an implementation of a neural network” clause carried from claim 1, which is intended-use/field-of-use language that neither imposes a meaningful limit on the exception nor improves computer functionality. MPEP § 2106.05(g), (h).
Claim 21 – Step 2B:
Considering the additional element identified above (the “whereby … neural network” clause carried from claim 1), it amounts to no more than well-understood, routine, and conventional use of the mathematical result and does not supply an inventive concept; accordingly, there are no additional elements that amount to significantly more than the exception. MPEP § 2106.05(d).
Claim 22 – Step 2A Prong One – Abstract Idea Identification:
Claim 22 depends from claim 21, which has been determined to be directed to an abstract idea, and adds:
“The computer implemented method of claim 21, wherein 0 bits are sufficient to encode the largest coefficient value in a subset of coefficients to be compressed comprising, exclusively, m coefficients that are zero” – this recites a mathematical rule/relationship governing representation size (bit-length = 0) for an all-zero subset. It details a numerical encoding convention tied to a mathematical property of the data (largest value = 0), which falls squarely within mathematical concepts under MPEP §2106.04(a)(2) (mathematical relationships/algorithms).
Claim 22 – Step 2A Prong Two:
Beyond the abstract math above, Claim 22 adds no further element of its own. The only additional element remains, through Claim 1, the “whereby the compressed groups of coefficients are used in an implementation of a neural network” clause. As analyzed for the base claim, this is intended-use/field-of-use language, not a technological improvement or meaningful limitation. MPEP § 2106.05(g), (h).
Claim 22 – Step 2B:
Considering only the additional element beyond the abstract math (i.e., the “whereby …” clause carried from Claim 1), the limitation reflects well-understood, routine, and conventional use of the mathematical result recited at a high level of generality and therefore does not amount to “significantly more”. MPEP § 2106.05(d)(II).
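For illustration only, the bit-width selection characterized for claims 21-22 can be sketched briefly (Python; the example values are assumptions): the number of bits sufficient to encode the largest coefficient value in a subset is that value’s bit length, which is 0 when the subset consists exclusively of zero coefficients.

```python
def bits_needed(subset):
    # Number of bits sufficient to encode the largest coefficient value in the subset.
    return max(v.bit_length() for v in subset)

print(bits_needed([6, 2, 0, 1]))   # 3: each coefficient in the subset is encoded in 3 bits
print(bits_needed([0, 0, 0, 0]))   # 0: an all-zero subset of m coefficients needs 0 body bits
```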
Regarding claim 23
Claim 23 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 23 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
“wherein the predefined number of coefficients in each of the plurality of groups of coefficients is the same.” – this limitation is a mathematical concept under MPEP § 2106.04(a)(2)(I) because it specifies a numerical constraint/relationship on the grouping of coefficients (i.e., each group has the same predefined count). It is a rule for organizing numeric data and does not recite any technological implementation.
Claim 23 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application.
Claim 23 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception.
Regarding claim 24
Claim 24 – Step 1 – Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is to a process.
Claim 24 – Step 2A – Prong 1 – Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Yes, the claim recites an abstract idea.
Claim 24 – Step 2A – Prong 2 – Does the claim recite additional elements that integrate the judicial exception into a practical application?
No. There are no additional elements that integrate the judicial exception into a practical application. The additional elements:
“further comprising storing the compressed groups of coefficients to memory for subsequent use in the implementation of the neural network.” – the recited “storing … to memory” is a generic computer operation used to save the result of the mathematical processing. This is insignificant extra-solution activity (i.e., a post-solution step of storing the output), under MPEP § 2106.05(g).
Claim 24 – Step 2B – Does the claim recite additional elements that amount to significantly more than the judicial exception?
No. There are no additional elements that amount to significantly more than the judicial exception. The additional elements are:
“further comprising storing the compressed groups of coefficients to memory for subsequent use in the implementation of the neural network.” – this limitation does not provide an inventive concept because generic “storing to memory” is well-understood, routine, and conventional (WURC) computer activity; and the “subsequent use in a neural network” context is an intended-use/field-of-use limitation. See MPEP § 2106.05(d); § 2106.05(h).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3-5, 8-16, 18-24 are rejected under 35 U.S.C. 103 as being unpatentable over Sequeira et al. (US20220067525A1) in view of Whiteman et al. (US10938411B1).
Regarding claim 1, Sequeira in view of Whiteman teach a computer implemented method of compressing a set of coefficients for subsequent use in a neural network, the set of coefficients comprising a number of coefficients that are non-zero, the method comprising:
“applying sparsity to a plurality of groups of coefficients of the set of coefficients, where the plurality of groups are defined to be aligned with a compression scheme to be used to perform compression,” – Sequeira teaches this limitation in part. Sequeira teaches applying sparsity via masking/pruning in neural nets:
“online masking technique … (e.g., by setting weights … to zero during training)” (Sequeira, p. 2, ¶[0060])
“and where applying sparsity to a group of coefficients of the plurality of groups of coefficients comprises setting each of the coefficients in that group to zero;” – Sequeira teaches this limitation. Sequeira expressly discloses setting weights to zero during masking:
“masks some neurons (e.g., by setting weights … to zero” (Sequeira, p. 2, ¶[0060])
“weights are all exactly zeroes.” (Sequeira, p. 2, ¶[0060])
Sequeira’s masking/pruning teaches the precise sparsity operation: setting coefficients/weights to zero, meeting this limitation.
“whereby the compressed groups of coefficients are used in an implementation of a neural network.” – Sequeira teaches this limitation. Sequeira ties pruned (sparse) network representation to inferencing:
“inferencing operations based … on representation of pruned neural network” (Sequeira, p. 2, ¶[0058])
Sequeira does not teach these limitations:
“where the plurality of groups are defined to be aligned with a compression scheme to be used to perform compression,”
“each group of coefficients comprising a predefined number of coefficients,”
“wherein each group of coefficients comprises one or more subsets of coefficients of the set of coefficients,”
“each group of coefficients comprising n coefficients and each subset of coefficients comprising m coefficients,”
“where m is greater than 1 and n is an integer multiple of m,”
“and compressing the plurality of groups of coefficients to which sparsity has been applied according to [[a]] the compression scheme aligned with the plurality of groups of coefficients”
“by compressing the one or more subsets of coefficients comprised by each group of coefficients of the plurality of groups of coefficients, each of said subsets of coefficients to be compressed comprising m coefficients that are zero,”
“such that fewer bits are used to encode the m coefficients of each of said subsets of coefficients in a compressed form of each of said subsets of coefficients than are used to encode the m coefficients of each of said subsets of coefficients in their uncompressed form;”
Whiteman, however, teaches these limitations:
“where the plurality of groups are defined to be aligned with a compression scheme to be used to perform compression,” – Whiteman teaches this limitation. Whiteman teaches forming NN data into groups specifically for purposes of the compression representation (state indicators + encoding values), i.e., groups aligned with the compression approach:
“The activation data is formed into a plurality of groups … a compressed data set is formed comprising the first state indicators” (Whiteman, § Abstract)
“each group of coefficients comprising a predefined number of coefficients,” – Whiteman teaches this limitation. Whiteman discloses an 8x8 tile that contains a predefined count of elements (64):
“An 8x8 tile of data … has values elem [i][j]” (Whiteman, col. 11, line 16)
“wherein each group of coefficients comprises one or more subsets of coefficients of the set of coefficients,” – Whiteman teaches this limitation:
“A second state indicator indicates, for groups having a non-zero value, whether sub-groups within the group contain a data element” (Whiteman, § Abstract)
“each group of coefficients comprising n coefficients and each subset of coefficients comprising m coefficients,” – Whiteman teaches this limitation. Whiteman teaches a fixed-size group (tile) that is subdivided into sub-groups, satisfying the concept of groups with n elements and subsets with m elements (n and m being fixed sizes for the chosen grouping/sub-grouping):
“An 8x8 tile of data …” (Whiteman, col. 11, line 16)
“sub-groups within the group …” (Whiteman, § Abstract)
“where m is greater than 1 and n is an integer multiple of m,” – Whiteman teaches this limitation. Whiteman teaches fixed-size grouping (e.g., 8x8 tiles) and sub-grouping within each tile; for the chosen sub-group partitioning scheme, the subset size m is greater than 1 and the group size n is an integer multiple of m:
“An 8x8 tile of data …” (Whiteman, col. 11, line 16)
“and compressing the plurality of groups of coefficients to which sparsity has been applied according to [[a]] the compression scheme aligned with the plurality of groups of coefficients” – Whiteman teaches this limitation. Whiteman teaches an identified compression scheme with structured encoded output:
“scheme identifier (4 bits), the initial value for the delta encoding (initval, 8 bits), and the mask encoding” (Whiteman, col. 14, lines 11-13)
“by compressing the one or more subsets of coefficients comprised by each group of coefficients of the plurality of groups of coefficients, each of said subsets of coefficients to be compressed comprising m coefficients that are zero,” – Whiteman teaches this limitation. Whiteman identifies an entire subset (one 4x4 grid) as all-zero:
“bottom right 4x4 grid does not include any non-zero values.” (Whiteman, col. 13, lines 20-21)
“such that fewer bits are used to encode the m coefficients of each of said subsets of coefficients in a compressed form of each of said subsets of coefficients than are used to encode the m coefficients of each of said subsets of coefficients in their uncompressed form;” – Whiteman teaches this limitation. Whiteman teaches encoding zeros via masks (rather than full-value encoding), explicitly describing the bit-level compressed representation:
“mask encoding … encodes the location of the zero values.” (Whiteman, col. 14, lines 12-13)
“scheme identifier (4 bits), the initial value for the delta encoding (initval, 8 bits), and the mask encoding” (Whiteman, col. 14, lines 11-13)
It would have been obvious to a POSITA to apply Sequeira’s masking/sparsity (setting weights/coefficients to zero) to coefficient groups that are then compressed using Whiteman’s grouped/sub-grouped compression representation, because Whiteman expressly addresses efficient compression of neural-network data having zeros/non-zeros using state indicators, and sparsification (Sequeira) predictably increases zeros and thus improves compression efficiency within Whiteman’s scheme.
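For illustration only, the rationale that increased sparsity predictably improves group-aligned compression can be visualized with a loose sketch of a two-level group/sub-group zero-mask encoding (Python/NumPy). This is a generic illustration with assumed 8x8 tiles, 4x4 groups, 2x2 sub-groups, and 8-bit raw values; it is not a reproduction of Whiteman’s actual encoder:

```python
import numpy as np

def compressed_bits(tile):
    """Toy two-level encoding: 1 indicator bit per 4x4 group; for each non-zero group,
    1 indicator bit per 2x2 sub-group; for each non-zero sub-group, 8 raw bits per element."""
    bits = 0
    for gi in range(0, 8, 4):
        for gj in range(0, 8, 4):
            group = tile[gi:gi + 4, gj:gj + 4]
            bits += 1                                      # group-level zero/non-zero indicator
            if group.any():
                for si in range(0, 4, 2):
                    for sj in range(0, 4, 2):
                        sub = group[si:si + 2, sj:sj + 2]
                        bits += 1                          # sub-group indicator
                        if sub.any():
                            bits += sub.size * 8           # raw values only for non-zero sub-groups
    return bits

rng = np.random.default_rng(0)
dense = rng.integers(1, 9, size=(8, 8))                    # no zeros before sparsification
pruned = dense.copy()
pruned[:, 4:] = 0                                          # sparsification increases the zeros
print(compressed_bits(dense), compressed_bits(pruned))     # more zeros -> fewer bits (532 vs 268)
```

In the sketch, increasing the number of zeros reduces the compressed size, which is the predictable benefit relied upon in the combination rationale above.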
Regarding claim 3, Sequeira in view of Whiteman teach the computer implemented method of claim 1, wherein:
n is greater than m – Sequeira teaches the broader method of pruning/compression but does not explicitly teach the relative sizing constraint “n is greater than m”.
However, Whiteman teaches the claimed relationship with an explicit numerical example (n=16, m=4 → n > m):
“the unit of data includes sixty-four data elements, the unit of data is divided into four groups of sixteen elements, and the groups are sub-divided into sub-groups of four data elements.” (Whiteman, col. 23, lines 18-21)
wherein each group of coefficients is compressed by compressing multiple adjacent or interleaved subsets of coefficients – Whiteman teaches forming:
“a plurality of sub-groups within each group”
and producing a compressed data set from group/sub-group indicators plus encoded values; masks are applied:
“in row-major order across the 4x4 grids that were found to have non-zero values.”
“A first mask is generated by considering the 8x8 grid as a collection of groups of data elements in the form of four 4x4 grids: a top-left 4x4 grid, a top-right 4x4 grid, a bottom-left 4x4 grid and a bottom-right 4x4 grid. Each 4x4 grid is examined to see if there are any non-zero values.”
Thus, Whiteman shows that multiple subsets within a group are each evaluated and encoded during compression.
Sequeira’s pruning yields many all-zero blocks and a size-reduced model for deployment (storage/loading by an inference engine). Whiteman provides a group/sub-group-aligned compression that processes multiple sub-groups per group and (when applicable) represents all-zero regions with scheme/indicator bits rather than raw values, reducing bits written “to be sent for storage in the DRAM”. A POSITA would have adopted Whiteman’s group/sub-group layout (n > m) and multi-subset group compression within Sequeira’s pruned-weights framework to further shrink the storage and bandwidth of the already sparse coefficients during deployment, an expected, predictable optimization aligned with both references’ stated aims (reduced footprint; efficient storage/processing).
Regarding claim 4, Sequeira in view of Whiteman teach the computer implemented method of claim 1,
“wherein n is equal to m or 2m.” – Sequeira does not teach this limitation. Whiteman, however, teaches this limitation. Whiteman teaches a compression framework where a “group” may be subdivided into sub-groups within the group (i.e., the number/organization of sub-groups is part of the encoding organization):
“forming a plurality of groups of data elements …” (Whiteman, col. 2, line 27)
“forming a plurality of sub-groups within each group of data elements …” (Whiteman, col. 2, lines 6-7)
Given Whiteman’s explicit teaching that a group can be organized into sub-groups for encoding, a POSITA would have found it obvious to select the subdivision granularity such that (i) the group is treated as a single sub-group (n = m, i.e., one sub-group per group) or (ii) the group is divided into two equal-sized sub-groups (n = 2m, i.e., two sub-groups per group), as a predictable implementation choice to balance metadata overhead (state indicators) versus compression granularity within Whiteman’s established group/sub-group encoding scheme.
Regarding claim 5, Sequeira in view of Whiteman teach the computer implemented method of claim 4
“wherein n is equal to 2m, wherein each group comprises 16 coefficients” – Sequeira does not teach this limitation. Whiteman, however, teaches this limitation. Whiteman teaches:
“first mask is generated by considering the 8x8 grid as a collection of groups of data elements in the form of four 4x4 grids: a top-left 4x4 grid, a top-right 4x4 grid, a bottom-left 4x4 grid and a bottom-right 4x4 grid.” (Whiteman, col. 13, lines 13-16)
A 4x4 grid corresponds to a “group” having a fixed, predefined number of elements (16). Thus, Whiteman teaches a grouping that maps to “each group comprises 16 coefficients”, and provides the basis for selecting n and m values tied to that group/sub-group hierarchy.
“and each subset comprises 8 coefficients,” – Sequeira does not teach this limitation. Whiteman, however, teaches this limitation. Whiteman teaches:
“For each 4x4 grid that includes non-zero values, sub-groups in the form of 2x2 grids are formed: top-left, top-right, bottom-left and bottom-right.” (Whiteman, col. 13, lines 27-29)
“The process is performed in row-major order across the 4x4 grids …” (Whiteman, col. 13, lines 31-32)
Whiteman teaches subdividing each 4x4 “group” into multiple smaller sub-groups (2x2) and processing them in a defined order (row-major), i.e., an arrangement that establishes adjacency and predictable grouping. From this taught framework, selecting a subset size of 8 coefficients (e.g., by combining two neighboring 2x2 sub-groups into a 2x4 region or by selecting two adjacent 2x2 sub-groups as a unit) is a routine design/parameter choice to balance compression granularity versus overhead.
“and wherein each group is compressed by compressing two adjacent or interleaved subsets of coefficients.” – Sequeira does not teach this limitation. Whiteman, however, teaches this limitation. Whiteman teaches:
“The process is performed in row-major order across the 4x4 grids …” (Whiteman, col. 13, lines 31-32)
“This is done in row-major order for each 4x4 grids …” (Whiteman, col. 13, lines 52-53)
“The total mask encoding is a concatenation of the 4x4 mask, the 2x2 mask and the 1x1 mask.” (Whiteman, col. 13, lines 61-62)
Whiteman’s row-major processing of sub-groups provides a clear teaching of treating neighboring sub-regions in an ordered sequence (adjacent units) and combining multiple hierarchical mask components into a single compressed representation (concatenation). In view of this, compressing a group by compressing two adjacent (or “interleaved” via ordered traversal) subsets is an obvious implementation choice within Whiteman’s disclosed group/sub-group traversal/encoding scheme to achieve predictable compression behavior and metadata overhead tradeoffs.
Whiteman expressly teaches defining groups as 4x4 grids within an 8x8 tile and further forming sub-groups within each group and processing them in row-major order for mask-based compression. Therefore, selecting the particular parameterization of n = 2m, a group size of 16, and compression of each group using two adjacent/interleaved subsets is an obvious design choice (routine selection of group/sub-group granularity and traversal/encoding strategy) within Whiteman’s disclosed hierarchical grouping and encoding approach, as applied in the neural-network sparsity/compression context provided by the combination of Sequeira and Whiteman.
Regarding Claim 8, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 1, wherein
sparsity is applied to the plurality of groups of the coefficients in dependence on a sparsity mask that defines which coefficients of the set of coefficients to which sparsity is to be applied – Sequeira does not teach this, however, Whiteman teaches this, stating:
“Looking at the bottom-left 4x4 grid, all the 2x2 grids have zero valued data elements. Accordingly, the 2x2 mask for that 4x4 grid is [1111].” (Whiteman, col. 14, lines 59-61)
This shows mask data (e.g., the 2x2 mask [1111]) that records which regions of coefficients are zero-valued, i.e., a mask that defines the coefficients of the set to which sparsity is applied, as recited.
It would have been obvious to a person of ordinary skill in the art at the time of the claimed invention to use a mask of the kind taught by Whiteman to define which groups of coefficients are zeroed within Sequeira’s sparsification framework, in order to reduce encoding cost and computational complexity.
Regarding Claim 9, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 8, wherein
the set of coefficients is a tensor of coefficients, the sparsity mask is a binary tensor of the same dimensions as the tensor of coefficients, and sparsity is applied by performing an element-wise multiplication of the tensor of coefficients with the sparsity mask tensor – Sequeira does not teach this, however, Whiteman teaches this, stating:
“Each square cell represents a data element of an 8x8 tile of activation data.”
and,
“The masking process provides a way to encode the location of zero values… ”
and,
“The compressed activation data for the Mask Delta GRC schemes consists of… the mask encoding that encodes the location of the zero values.” (Whiteman, col. 13, lines 2-3, 11-12, col. 14, lines 10-13)
These disclosures describe a set of activation data as a multi-dimensional grid (tensor), and a mask of matching dimensions that defines, per element, whether the value is zero or non-zero. This supports the use of a binary mask aligned with the data tensor.
It would have been obvious to a person of skill in the art at the time of the claimed invention to apply such a binary mask to the tensor using element-wise multiplication, in order to apply sparsity in a computationally efficient manner.
Regarding Claim 10, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 9, wherein
the sparsity mask tensor is formed by: generating a reduced tensor having one or more dimensions an integer multiple smaller than the tensor of coefficients, wherein the integer being greater than 1; determining elements of the reduced tensor to which sparsity is to be applied so as to generate a reduced sparsity mask tensor; and expanding the reduced sparsity mask tensor so as to generate a sparsity mask tensor of the same dimensions as the tensor of coefficients – Sequeira does not teach this, however, Whiteman teaches the creation of a reduced sparsity mask followed by expansion:
“Looking at the bottom-left 4x4 grid, all the 2x2 grids have zero valued data elements. Accordingly, the 2x2 mask for that 4x4 grid is [1111]” (Whiteman, col. 14, lines 59-61)
This example shows that a coarser (2x2) resolution binary mask is created for a larger region (4x4), indicating generation of a reduced tensor having dimensions smaller than the original. This corresponds to the first and second steps of the claim. The resulting 2x2 binary mask is then used to construct a higher-resolution binary mask through recursive or hierarchical propagation, as suggested in the remainder of the disclosure, satisfying the final step of generating a full-size sparsity mask tensor via expansion.
Whiteman further teaches expansion mechanisms via decoding rules and chunk assembly:
“The compressed activation data is made up of the scheme identifier (4 bits), the initial value for the delta encoding (8 bits), and the inverse mask described above… The cell is formed by the stitch processor 22 using a set of flow control rules… ” (Whiteman, col. 15-16, lines 27-29, 36-37)
“…the chunk structures 102 to 104 that form the first chunk encoding a new tile… may be followed by remainder data and/or unary data” (Whiteman, col. 16, lines 59-63)
This supports expansion of reduced-resolution masks into full-size tensors during decoding, consistent with “expanding the reduced sparsity mask tensor so as to generate a sparsity mask tensor of the same dimensions as the tensor of coefficients”.
It would have been obvious to a person of skill in the art at the time of the claimed invention to combine Whiteman’s reduced-to-full-resolution sparsity mask structure with the grouped coefficient architecture of Sequeira to achieve efficient storage and computation for sparse neural networks.
Regarding Claim 11, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 10, wherein
generating the reduced tensor comprises: dividing the tensor of coefficients into multiple groups of coefficients, such that each coefficient of the set is allocated to only one group and all of the coefficients are allocated to a group and representing each group of coefficients of the tensor of coefficients by the maximum coefficient value within that group – Sequeira does not teach this, however, Whiteman teaches this, stating:
“Each 4x4 grid is examined to see if there are any non-zero values.” (Whiteman, col. 14, lines 40-41)
This supports the idea of forming a reduced tensor by grouping coefficients and selecting a representative value (e.g., maximum) from each group, as part of an intermediate representation used for further mask generation or sparsity encoding. Whiteman’s examination of each 4x4 grid for non-zero values corresponds to evaluating grouped coefficients and determining a representative value based on magnitude.
It would have been obvious to a person of ordinary skill in the art at the time of the claimed invention to apply Whiteman’s method of scanning coefficient groups and extracting representative values (e.g., maximums) to the grouped-coefficient tensor architecture disclosed by Sequeira. The combination would enable efficient mask generation, compression, or prioritization of high-importance regions in sparse neural processing, yielding a predictable and beneficial result aligned with known optimization goals in neural network encoding.
Regarding Claim 12, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 10, further comprising:
expanding the reduced sparsity mask tensor by performing nearest neighbor upsampling such that each value in the reduced sparsity mask tensor is represented by a group comprising a plurality of like values in the sparsity mask tensor – Sequeira does not teach this; however, Whiteman teaches:
Generating a reduced sparsity mask:
“The compressed activation data for the Mask Delta GRC schemes consists of a scheme identifier (4 bits), the initial value for the delta encoding (interval, 8-bits), and the mask information that encodes the location of the zero values.” (Whiteman, col. 14, lines 10-13)
Expanding the reduced sparsity mask to full size by replicating values across a group of positions:
“Accordingly, a 4x4 mask is formed [1111]… For each 4x4 grid that includes zero values, 2x2 grids are formed… Each of these 2x2 grids are examined… For the top-left 4x4 grid, each 2x2 grid includes a zero-valued element. Accordingly, the 2x2 mask for the grid is [1111].” (Whiteman, col. 14, lines 43-56)
This teaches that a single value in a reduced mask is expanded to multiple coefficients – replicating that value across a fixed region. This replication is functionally equivalent to nearest neighbor upsampling.
A person of ordinary skill in the art would have been motivated to incorporate Whiteman’s technique of replicating reduced mask values across multiple positions (i.e., nearest neighbor upsampling) into the method of Sequeira in order to efficiently align sparsity information to groups of coefficients, yielding the predictable result of expanding a reduced sparsity mask using nearest neighbor upsampling, as required by Claim 12.
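A minimal sketch of the nearest neighbor upsampling addressed by Claim 12 is reproduced below for illustration only; the upsampling factor and the function name are the examiner’s assumptions.

    # Illustrative sketch only; the factor of 2 is an assumption.
    import numpy as np

    def nearest_neighbor_expand(reduced_mask, factor=2):
        # Each value in the reduced sparsity mask is replicated over a (factor x factor)
        # group of like values in the full-size sparsity mask.
        return np.repeat(np.repeat(reduced_mask, factor, axis=0), factor, axis=1)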
Regarding Claim 13, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 1, wherein compressing each subset of coefficients comprises:
generating header data comprising h-bits and a plurality of body portions each comprising b-bits, wherein each of the body portions corresponds to a coefficient in the subset, wherein b is fixed within a subset, and wherein the header data for a subset comprises an indication of b for the body portions of that subset – Sequeira does not teach this limitation.
However, Whiteman teaches:
Header carrying decoding/length information:
“first chunk of a cell must include a header, which provides information on the length of the cell and the length of the unary sub-stream included within the cell. The length of the remainder values within the cell is not included in the header but can be derived” (Whiteman, col. 16, lines 42-46)
“The top two chunk structures … include a header portion of 32-bit length, which is needed at the start of a cell.” (Whiteman, col. 16, lines 57-59)
Body portions with fixed bit-width within a unit (subset) and one body per value:
Whiteman’s Golomb-Rice description explains the fixed-length binary portion for the remainder once the parameter is set (illustrative of b fixed within the encoded unit):
“The second portion of the Golomb Rice code is a fixed length binary portion.” (Whiteman, col. 1, lines 33-35)
Whiteman further encodes the non-zero values and forms a compressed data set from indicators and the encoded values (i.e., body portions that correspond to elements in the region being encoded).
Group/sub-group organization supporting subset-level headers and bodies: Whiteman operates over groups and sub-groups within a tile and assembles a compressed data set from state indicators (header/meta) and encoded non-zero values (b-bit bodies), aligning with a header-and-bodies structure per subset.
Together these disclosures teach (i) generating header data for each encoded unit that conveys decoding/length information, (ii) emitting body portions of fixed bit-width within that unit, and (iii) providing body portions that correspond to the (encoded) coefficients of that subset.
A POSITA implementing Sequeira’s sparse/pruned coefficient pipeline would adopt Whiteman’s header-plus-fixed-width bodies to simplify hardware decode, bound per-subset bit-width (b), and reduce parse overhead, a predictable optimization for storage and bandwidth of neural data. Whiteman explicitly targets storage-oriented compression with headers and fixed-structure bodies; applying that known format to Sequeira’s subsets yields the expected benefits without changing either reference’s principle of operation.
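For illustration only, a header-plus-fixed-width-bodies layout of the kind mapped above could be decoded as in the following sketch; the 4-bit header width and the bit-string packing are the examiner’s assumptions and do not represent Whiteman’s chunk format.

    # Illustrative sketch only; h_bits = 4 and the packing order are assumptions.
    def decode_subset(bits, num_coeffs, h_bits=4):
        b = int(bits[:h_bits], 2)                      # header: the indication of b for this subset
        bodies = bits[h_bits:h_bits + num_coeffs * b]  # one fixed-width b-bit body per coefficient
        return [int(bodies[i * b:(i + 1) * b], 2) for i in range(num_coeffs)]

    decode_subset('0011' + '101' + '000' + '011' + '111', num_coeffs=4)  # returns [5, 0, 3, 7]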
Regarding Claim 14, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 13, the method further comprising:
identifying a body portion size, b, by locating a bit position of a most significant leading one across all the coefficients in the subset; generating the header data comprising a bit sequence encoding the body portion size; and generating a body portion comprising b-bits for each of the coefficients in the subset by removing none, one or more leading zeros from each coefficient – Sequeira does not explicitly teach identifying the body portion size by locating the most significant leading one; however, Whiteman teaches identifying unary code lengths by locating leading ones, and generating headers accordingly:
“In step S152, a difference is taken between the values of each neighboring top-bit location which gives the value of the unary code.” (Whiteman, col. 20, lines 52-54)
and,
“generate the Golomb-Rice codes. Once the values of udelta[i][j] have been converted to Golomb-Rice codes, the compressed data is formed of a scheme value ( 4 bits), an initial value (initial) corresponding to the top-left data element (elem [0][0]), and the Golomb-Rice codes corresponding to udelta[i][j].” (Whiteman, col. 12, lines 55-60)
This teaches that the encoding logic identifies the most significant bit (i.e., leading one) by determining unary length differences between neighboring top-bit locations, and then constructs the compressed body from these lengths using fixed-width formats. This operation enables the removal of leading zeros and size-optimized coefficient storage.
A person of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to use the unary decoding strategy of Whiteman (parsing most significant leading bits and storing the bit length) in combination with Sequeira’s subset-based coefficient grouping, in order to ensure that each subset is encoded using the minimal bit-width b necessary to represent all values in the subset, thereby reducing memory usage and increasing bandwidth efficiency.
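The corresponding encoding step of Claim 14 is sketched below for illustration only; the use of int.bit_length() to locate the most significant leading one and the 4-bit header width are the examiner’s assumptions.

    # Illustrative sketch only; names and the 4-bit header width are assumptions.
    def encode_subset(subset, h_bits=4):
        # Locate the bit position of the most significant leading one across all coefficients
        # in the subset (bit_length() returns 0 for an all-zero subset, so b is floored at 1).
        b = max(v.bit_length() for v in subset) or 1
        header = format(b, f'0{h_bits}b')               # bit sequence encoding the body portion size
        # Each body keeps exactly b bits, i.e. none, one or more leading zeros are removed.
        bodies = [format(v, f'0{b}b') for v in subset]
        return header, bodies

    encode_subset([5, 0, 3, 7])  # b = 3, giving ('0011', ['101', '000', '011', '111'])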
Regarding Claim 15, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 1, wherein:
the number of groups to which sparsity is to be applied is determined in dependence on a sparsity parameter – Sequeira does not explicitly teach this; however, Whiteman teaches:
“forming a plurality of groups of data elements within a unit of activation data, each group including a plurality of data elements; identifying whether there are any data elements within each group that have a non-zero value and forming a first state indicator for each group that indicates whether that group contains data elements having a non-zero value… ” (Whiteman, col. 22, lines 57-64)
“forming a compressed data set comprising the first state indicators, any second state indicators, any sub-group state indicators and the encoded non-zero values.” (Whiteman, col. 23, lines 12-14)
This discloses that groups are selected and encoded based on whether they contain non-zero values. That criterion inherently reflects a sparsity parameter – such as an implicit saliency or significance criterion – and modulates the number of groups encoded accordingly.
A person of ordinary skill in the art before the effective filing date of the claimed invention would recognize that applying a sparsity threshold (such as a minimum saliency or non-zero content level) directly affects how many groups are retained for encoding, i.e., “the number of groups to which sparsity is to be applied… in dependence on a sparsity parameter” as claimed. It therefore would have been obvious to combine Sequeira with Whiteman to yield the subject matter of Claim 15.
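For illustration only, the dependence of the number of pruned groups on a sparsity parameter could be realized as in the following sketch; the interpretation of the parameter as a fraction of groups is the examiner’s assumption.

    # Illustrative sketch only; interpreting the sparsity parameter as a fraction is assumed.
    import numpy as np

    def groups_to_prune(group_saliency, sparsity=0.5):
        # The number of groups to which sparsity is applied depends on the sparsity parameter.
        k = int(round(sparsity * len(group_saliency)))
        order = np.argsort(group_saliency)   # least-salient groups first
        return order[:k]                     # indices of the k groups to be zeroed

    groups_to_prune(np.array([0.9, 0.1, 0.4, 0.7]), sparsity=0.5)  # returns the two least-salient groups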
Regarding Claim 16, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 15, the method further comprising:
dividing the set of coefficients into multiple groups of coefficients, such that each coefficient of the set is allocated to only one group and all of the coefficients are allocated to a group; determining a saliency of each group of coefficients; and applying sparsity to the plurality of the groups of coefficients having a saliency below a threshold value, the threshold value being determined in dependence on the sparsity parameter, optionally wherein the threshold value is a maximum absolute coefficient value or an average absolute coefficient value – Sequeira does not explicitly teach this; however, Whiteman teaches:
“forming a plurality of groups of data elements within a unit of activation data, each group including a plurality of data elements; identifying whether there are any data elements within each group that have a non-zero value and forming a first state indicator… ” (Whiteman, col. 22, lines 57-60)
The grouping and masking procedure of Whiteman effectively encodes a saliency score (non-zero presence) per group. An encoded group is retained only if it exceeds a threshold (i.e., contains a significant, non-zero value), and groups below this saliency threshold are marked for masking (i.e., sparsity applied).
A person of ordinary skill in the art before the effective filing date of the claimed invention would recognize that thresholding based on saliency (e.g., maximum or average magnitude) is a design choice interchangeable with binary presence/absence. The claimed “maximum absolute coefficient value or average absolute coefficient value” merely defines specific thresholds, which are well known in the art of neural compression and would be considered obvious design alternatives.
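A sketch combining the grouping, saliency, and threshold features addressed in Claim 16 follows, for illustration only; the group shape and the quantile-derived threshold are the examiner’s assumptions.

    # Illustrative sketch only; group shape and quantile threshold are assumptions.
    import numpy as np

    def apply_group_sparsity(coeffs, block=2, sparsity=0.5):
        h, w = coeffs.shape
        groups = coeffs.reshape(h // block, block, w // block, block)
        # Saliency of each group, here its maximum absolute coefficient value.
        saliency = np.abs(groups).max(axis=(1, 3))
        # Threshold determined in dependence on the sparsity parameter.
        threshold = np.quantile(saliency, sparsity)
        keep = (saliency >= threshold).astype(coeffs.dtype)
        # Zero every coefficient in a group whose saliency falls below the threshold.
        return (groups * keep[:, None, :, None]).reshape(h, w)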
Regarding Claim 18, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 1, further comprising:
using the compressed groups of coefficients in a neural network – Sequeira does not teach using the compressed groups in a neural network; however, Whiteman teaches:
“The components 2 are components for writing activation values to a DRAM (not shown) external to the NPU. When performing calculations relating to a neural network, calculations may be performed for each layer of the neural network.” (Whiteman, col. 10, lines 9-13)
“A processing element in the form of an encoder 20 is configured to compress the received activation data by converting the activation data into Golomb Rice codes. Further steps, which will be described below, are then performed to make the compressed activation data easier to decode.” (Whiteman, col. 10, lines 39-44)
These teachings show that compressed neural data is used as input to further layers of the neural network, satisfying the claimed use of compressed groups of coefficients within a neural network.
A person of ordinary skill in the art before the effective filing date of the claimed invention would have been motivated to apply Sequeira’s compression scheme to coefficients that are subsequently used by the neural network, just as compressed activations are reused in Whiteman. Doing so supports reduced memory usage and efficient execution – predictable benefits in standard neural network systems.
Regarding claim 19:
Claim 19 is rejected under 35 U.S.C. § 103 as being unpatentable over Sequeira in view of Whiteman. Sequeira teaches applying sparsity to neural network coefficients by masking/pruning (e.g., setting weights/coefficients to zero) in the context of neural-network inferencing. Whiteman teaches forming neural-network data into groups and sub-groups aligned with a compression scheme and compressing subsets containing zeros using indicator/mask-based encoding such that fewer bits are used than in an uncompressed representation. It would have been obvious to implement Sequeira’s sparsification within a data processing system having functional components (e.g., pruner logic and a compression engine) to perform the claimed operations and to compress the resulting sparse coefficient groups using Whiteman’s group/sub-group compression scheme, since increasing zeros predictably improves compression efficiency and reduces storage/bandwidth in neural-network implementations.
Regarding claim 20:
Claim 20 is rejected under 35 U.S.C. § 103 as being unpatentable over Sequeira in view of Whiteman. Sequeira teaches applying sparsity to neural network coefficients by masking/pruning (e.g., setting weights/coefficients to zero) in the context of neural-network inferencing. Whiteman teaches forming neural-network data into groups and sub-groups aligned with a compression scheme and compressing subsets containing zeros using indicator/mask-based encoding such that fewer bits are used than in an uncompressed representation. It would have been obvious to implement the claimed method as computer-executable instructions stored on a non-transitory computer-readable medium to cause performance of the above-described sparsification and compression operations, since doing so is a routine and predictable way to deploy such processing in a computer system for neural-network implementation.
Regarding claims 21-22:
Claim 21 is method-analogous to claim 13 (fixed bit-width per subset determined by the largest value). Claim 22 is method-analogous to claim 14 (the all-zero subset case).
Accordingly, the § 103 rejections (including references and motivation to combine) set forth for claims 13 and 14 apply with equal force to claims 21 and 22, respectively.
Regarding claim 23, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 1,
“wherein the predefined number of coefficients in each of the plurality of groups of coefficients is the same.” – Sequeira does not teach this limitation. Whiteman, however, teaches this limitation:
“An 8x8 tile of data …” (Whiteman, col. 11, line 16)
“each square cell represents a data element of an 8x8 tile …” (Whiteman, col. 13, lines 2-3)
Whiteman’s “8x8 tile” grouping teaches that each group contains the same fixed, predefined number of data elements (i.e., a uniform group size). Therefore, configuring the plurality of groups such that each group has the same predefined number of coefficients is taught by (or at minimum would have been an obvious design choice consistent with) Whiteman’s fixed-size grouping and compression organization.
Regarding claim 24, Sequeira in view of Whiteman teaches the computer-implemented method of Claim 1,
“further comprising storing the compressed groups of coefficients to memory” – Sequeira does not teach this limitation. Whiteman, however, teaches this limitation:
“A method for compressing activation data of a neural network to be written to a storage is provided.” (Whiteman, § Abstract)
“decompressing compressed activation data of a neural network that is read from a storage” (Whiteman, col. 7, lines 48-49)
“for subsequent use in the implementation of the neural network.” – Sequeira teaches this limitation. Sequeira teaches:
“inferencing operations … on representation of pruned neural network …” (Sequeira, p. 2, ¶[0058])
Whiteman expressly teaches writing the compressed data set to storage, which corresponds to storing compressed groups of coefficients to memory. Further, Sequeira teaches that the (pruned/sparse) neural network representation is used for inferencing operations, and Whiteman’s compression is explicitly in the neural-network context, so storing the compressed groups “for subsequent use in the implementation of the neural network” is taught by the combined neural-network storage/processing workflow.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Paul Coleman whose telephone number is (571)272-4687. The examiner can normally be reached Mon-Fri.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, David Yi can be reached at (571) 270-7519. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAUL COLEMAN/ Examiner, Art Unit 2126
/DAVID YI/ Supervisory Patent Examiner, Art Unit 2126