Prosecution Insights
Last updated: April 19, 2026
Application No. 18/557,213

Method and System for Constructing Speech Recognition Model and Speech Processing

Final Rejection §101
Filed: Oct 25, 2023
Examiner: ZHU, RICHARD Z
Art Unit: 2654
Tech Center: 2600 (Communications)
Assignee: Huawei Technologies Co., Ltd.
OA Round: 2 (Final)
Grant Probability: 69% (Favorable)
OA Rounds: 3-4
To Grant: 3y 2m
Grant Probability With Interview: 85%

Examiner Intelligence

Career Allow Rate: 69% (498 granted / 718 resolved), above average, +7.4% vs TC avg
Interview Lift: +15.4% (resolved cases with interview)
Avg Prosecution: 3y 2m
Currently Pending: 32
Total Applications: 750 (across all art units)
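As a quick consistency check on the figures above, the career allow rate can be recomputed from the granted and resolved counts. This is only an illustrative sketch; treating the total application count as resolved plus pending cases is an assumption, not something the page states.

```python
# Figures copied from the examiner statistics above.
granted = 498
resolved = 718
pending = 32

# Career allow rate as a fraction of resolved cases.
allow_rate = granted / resolved

# Assumption: total applications = resolved cases + currently pending cases.
total_applications = resolved + pending

print(f"Career allow rate: {allow_rate:.1%}")
print(f"Total applications: {total_applications}")
```

Under that assumption the recomputed total matches the 750 applications reported, and the rate rounds to the 69% shown.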

Statute-Specific Performance

§101: 16.0% (-24.0% vs TC avg)
§103: 54.5% (+14.5% vs TC avg)
§102: 19.7% (-20.3% vs TC avg)
§112: 4.2% (-35.8% vs TC avg)
The Tech Center average is an estimate. Based on career data from 718 resolved cases.

Office Action

§101
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Acknowledgement
Acknowledgement is made of applicant’s amendment made on 01/21/2026. Applicant’s submission filed has been entered and made of record.
Status of the Claims
Claims 1-4, 6-13, 15-19, and 43-45 are pending.
Response to Applicant’s Arguments
In response to “For example, independent claim 1 generally recites, and independent claims 10 and 43 similarly, recite: Obtaining target keywords and semantically-associated synonym groups. Training a language model based on keywords and synonym groups. Generating decoding graphs with syntax constraint rules. Generating separate subgraphs for different keywords with clustered decoding paths. Determining a speech recognition model by combining subgraphs. None of these steps recite mathematical concepts”, “The claims also require generating a decoding graph indicating a plurality of decoding paths that satisfy syntax constraint rules. The specification describes these decoding graphs as complex computer data structures, such as HCLG frameworks that combine Hidden Markov Model graphs, context-dependency graphs, lexicon graphs, and grammar graphs using weighted finite-state transducers. See paragraphs 200-210 of the application. A human being cannot mentally construct and maintain such complex data structures with multiple weighted paths and transducer states. These are computer-implemented data structures that exist in computer memory, not concepts that can be held in the human mind”, and “Furthermore, the amended claims recite obtaining separate groups of decoding paths from the first decoding graph, generating separate subgraphs for the first and second keywords, and determining the speech recognition model based on the combined subgraphs”.
According to the Supreme Court, a patent may issue for the means or method of producing a certain result, or effect, and not for the result or effect produced. Diamond v. Diehr, 450 U.S. 175, 182 n. 7 (1981). Therefore, the focus is on whether the claim “focus on a specific means or method that improves the relevant technology or are instead directed to a result or effect that itself is the abstract idea and merely invoke generic processes and machinery”. Enfish, L.L.C. v. Microsoft Corp., 822 F.3d 1327, 1336 (Fed. Cir. 2016). For example, in Enfish, the CAFC found it relevant to ask whether claims were directed to an improvement to computer functionality versus being directed to an abstract idea. Enfish, 822 F.3d at 1335. To that extent, the CAFC found that the claims were specifically directed to a self-referential table for a computer database. Id. at 1337. In particular, the claim language required a four step algorithm specifically directed to a self-referential table for a computer database that improved upon prior art information search and retrieval systems by employing a flexible, self-referential table to store data. Id. at 1336-37. Therefore, the focus of the claims was on a specific asserted improvement in computer capabilities (i.e., the self-referential table for a computer database), not on economic or other tasks for which a computer was used in its ordinary capacity. Id. at 1336. See also MPEP 2106.04(d)I (“an improvement in the functioning of a computer or an improvement to other technology or technical field, as discussed in MPEP 2106.04(d)(1) and 2106.05(a)”). 
In the instant application, amended claim 1 recites a method for constructing a speech recognition model, wherein the method comprises: obtaining a target keyword, wherein the target keyword comprises a first keyword and a second keyword; obtaining a synonym group semantically associated with the target keyword, wherein the synonym group comprises a first synonym group semantically associated with the first keyword and a second synonym group semantically associated with the second keyword; training, based on the target keyword and the synonym group, a language model to obtain a target language model; generating, based on the target language model, a first decoding graph indicating a plurality of decoding paths that satisfy a syntax constraint rule that is based on the target keyword and the synonym group; and determining, based on the first decoding graph, the speech recognition model, wherein determining the speech recognition model comprises: (1) obtaining, from the first decoding graph, a first group of decoding paths and a second group of decoding paths, wherein the first group comprises first decoding paths corresponding to the first keyword and the first synonym group, and wherein the second group comprises second decoding paths corresponding to the second keyword and the second synonym group; (2) generating, based on the first group, a first subgraph; (3) generating, based on the second group, a second subgraph; and (4) determining, based on the first subgraph and the second subgraph, the speech recognition model. 
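To make steps (1)-(4) of claim 1 concrete, the following is a minimal sketch of splitting a decoding graph's paths into one subgraph per keyword and then combining the subgraphs. It is not the applicant's implementation: the applicant describes HCLG graphs built from weighted finite-state transducers, whereas this toy uses plain lists. The function names, the data layout, and the second keyword ("turn off light" / "switch off lamp") are hypothetical; only "increase sound" and "raise volume" come from the specification passage quoted in the Office Action.

```python
from collections import defaultdict

# Hypothetical toy decoding graph: each decoding path is (phrase, weight).
decoding_graph = [
    ("increase sound", 1.0),
    ("raise volume", 1.0),
    ("turn off light", 1.0),   # hypothetical second keyword
    ("switch off lamp", 1.0),  # hypothetical synonym of the second keyword
]

# Hypothetical synonym groups, keyed by target keyword.
synonym_groups = {
    "increase sound": ["raise volume"],
    "turn off light": ["switch off lamp"],
}


def split_into_subgraphs(paths, groups):
    """Steps (1)-(3): group decoding paths by the keyword they belong to."""
    owner = {}
    for keyword, synonyms in groups.items():
        owner[keyword] = keyword
        for synonym in synonyms:
            owner[synonym] = keyword
    subgraphs = defaultdict(list)
    for phrase, weight in paths:
        if phrase in owner:
            subgraphs[owner[phrase]].append((phrase, weight))
    return dict(subgraphs)


def build_model(subgraphs):
    """Step (4): combine the subgraphs; here simply an ordered list of them."""
    return [subgraphs[keyword] for keyword in sorted(subgraphs)]


subgraphs = split_into_subgraphs(decoding_graph, synonym_groups)
model = build_model(subgraphs)
print(model)
```

Each keyword and its synonyms end up clustered in their own subgraph, which is the structural feature the examiner relies on to distinguish claims 1 and 10 from claim 43.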
Amended claim 10 recites a method for speech processing comprising: receiving a speech input; obtaining a target keyword, wherein the target keyword comprises a first keyword and a second keyword; obtaining a synonym group semantically associated with the target keyword, wherein the synonym group comprises a first synonym group semantically associated with the first keyword and a second synonym group semantically associated with the second keyword; training, based on the target keyword and the synonym group, a language model to obtain a target language model; generating, based on the target language model, a first decoding graph indicating a plurality of decoding paths that satisfy a syntax constraint rule that is based on the target keyword and the synonym group; determining, based on the first decoding graph, a speech recognition model, wherein determining the speech recognition model comprises: (1) obtaining, from the first decoding graph, a first group of decoding paths and a second group of decoding paths, wherein the first group comprises first decoding paths corresponding to the first keyword and the first synonym group, and wherein the second group comprises second decoding paths corresponding to the second keyword and the second synonym group; (2) generating, based on the first group, a first subgraph; (3) generating, based on the second group, a second subgraph; and (4) determining, based on the first subgraph and the second subgraph, the speech recognition model; and determining, using the speech recognition model, text representation associated with the speech input. 
Applicant argued “These are computer-implemented data structures that exist in computer memory, not concepts that can be held in the human mind” and “Furthermore, the amended claims recite obtaining separate groups of decoding paths from the first decoding graph, generating separate subgraphs for the first and second keywords, and determining the speech recognition model based on the combined subgraphs”. In other words, much like the four-step algorithm specifically directed to a self-referential table for a computer database in Enfish, steps (1)-(4) in amended claims 1 and 10 are specifically directed to a particular speech recognition model data structure. Much like the self-referential table improved upon prior art information search and retrieval, the speech recognition model with the specifically asserted structure described in steps (1)-(4) sets forth a particular speech recognition model for speech recognition. Accordingly, claims 1-4, 6-13, 15-19 are patent eligible. Since no prior art teaches the combination of limitations set forth in claims 1 and 10, claims 1-4, 6-13, 15-19 are allowed.
In response to “With reference to claim 1, the claim recites training a language model based on target keywords and synonym groups. Training a language model is not a mental process. It requires processing large amounts of text data to compute probability distributions over word sequences and adjust model parameters based on training algorithms. A human being cannot mentally process thousands or millions of text samples, calculate statistical relationships between words, and generate a functional language model. This process uses computer processing power, memory storage, and computational algorithms”.
Amended claim 43 recites an electronic device, comprising: a memory configured to store instructions; and a processor coupled to the memory and configured to execute the instructions to cause the electronic device to: obtain a target keyword; obtain a synonym group semantically associated with the target keyword, wherein obtaining the synonym group comprises: (a) determining first semantics of the target keyword; and (b) determining the synonym group based on the first semantics, wherein a first difference between second semantics of each synonym in the synonym group and the first semantics is less than a difference threshold, and wherein determining the synonym group comprises determining the synonym group further based on a first length of the target keyword, and wherein a second difference between a second length of each synonym in the synonym group and the first length is less than a length threshold; train, based on the target keyword and the synonym group, a language model to obtain a target language model; generate, based on the target language model, a first decoding graph indicating a plurality of decoding paths that satisfy a syntax constraint rule that is based on the target keyword and the synonym group; and determine, based on the first decoding graph, a speech recognition model. Unlike Claims 1 and 10, claim 43 does not describe a particular structure for the speech recognition model. Rather, the focus of the claims set forth in (a)-(b) was directed to obtain a synonym group semantically associated with the target keyword. Here, determination steps (a) and (b) are mental steps to form an observation, evaluation, or judgment about semantics / meaning of target keyword and using the semantics / meaning of target keyword to make a determination / observation / judgment about the synonym group. 
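The two threshold tests in steps (a)-(b) of claim 43 can be illustrated with a short sketch. This is a hypothetical reading, not the applicant's method: the claim does not say how semantics or length are measured, so the distance table, the word-count notion of length, the threshold values, and the function name `select_synonyms` are all assumptions made for illustration.

```python
def select_synonyms(keyword, candidates, semantic_distance,
                    difference_threshold, length_threshold):
    """Keep candidates whose semantic and length differences from the
    keyword are both below their thresholds."""
    selected = []
    keyword_length = len(keyword.split())  # assumption: length = word count
    for candidate in candidates:
        close_in_meaning = semantic_distance(keyword, candidate) < difference_threshold
        close_in_length = abs(len(candidate.split()) - keyword_length) < length_threshold
        if close_in_meaning and close_in_length:
            selected.append(candidate)
    return selected


# Hypothetical semantic distances (smaller means closer in meaning).
toy_distance = {
    ("increase sound", "raise volume"): 0.1,
    ("increase sound", "make it a little bit louder please"): 0.2,
    ("increase sound", "turn off light"): 0.9,
}


def distance(a, b):
    return toy_distance[(a, b)]


candidates = [
    "raise volume",
    "make it a little bit louder please",
    "turn off light",
]
synonym_group = select_synonyms("increase sound", candidates, distance,
                                difference_threshold=0.3, length_threshold=2)
print(synonym_group)
```

In this toy run the long paraphrase passes the semantic test but fails the length test, and the unrelated phrase fails the semantic test, so only "raise volume" remains in the synonym group.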
The step “generate, based on the target language model, a first decoding graph indicating a plurality of decoding paths that satisfy a syntax constraint rule that is based on the target keyword and the synonym group” is read in view of the specification, US 2024/0242709 A1 at ¶282: “FIG. 11 is used as an example. A decoding path corresponding to a target keyword “increase sound” has a same weight as a decoding path corresponding to a synonym “raise volume”: [FIG. 11 image omitted]”. In other words, the decoding graph can be as simple as a mental step to mathematically weight “increase sound” and “raise volume” as equal; i.e., making a judgment that “increase sound” and “raise volume” are synonyms. Unlike claims 1 and 10, claim 43 does not describe a particular structure for the decoding graph and therefore no specifically asserted speech recognition model structure.
In response to “The claimed features solve this through a specific architectural approach. In particular, the system obtains target keywords and automatically determines semantically-associated synonym groups. When multiple keywords exist, the system generates separate subgraphs for each keyword, clustering together the decoding paths for that keyword and its synonyms. This subgraph clustering reduces decoding complexity (see paragraphs 283-285 of the application) and enables "faster decoding search...reducing computing overheads and storage overheads." See paragraph 286 of the application. The claims also recite syntax constraint rules based on specific keyword-synonym combinations and filtering using semantic and length difference thresholds. This specific solution enables lite devices to perform semantic keyword recognition locally, i.e., without cloud processing” and “In this way, the claims are analogous to McRO, Inc. v. Bandai Namco Games America Inc., 837 F.3d 1299 (Fed. Cir. 2016), where specific rules that improved automatic lip synchronization were found to be patent-eligible. There, the court held the claims were "directed to a patentable, technological improvement" using "a combined order of specific rules." Id. at 1316. Like McRO, the present claims recite specific rules for achieving an improvement: semantic association thresholds, syntax constraint rules based on keyword-synonym combinations, subgraph clustering, and equal weighting within subgraphs. The claims explain how the improvement is achieved through particular structural and functional implementations, not just that an improvement occurs”.
According to the Supreme Court, a patent may issue for the means or method of producing a certain result, or effect, and not for the result or effect produced. Diamond v. Diehr, 450 U.S. 175, 182 n. 7 (1981). Therefore, the focus is on whether the claims “focus on a specific means or method that improves the relevant technology or are instead directed to a result or effect that itself is the abstract idea and merely invoke generic processes and machinery”. McRO, Inc. v. Bandai Namco Games America, Inc., 837 F.3d 1299, 1314 (Fed. Cir. 2016). For example, in McRO, the CAFC noted that in the prior art, morph weight sets with values between “0” and “1” for computer animation of facial expressions were manually determined. Id. at 1304-5. The claimed improvement in McRO allows computers to produce “accurate and realistic lip synchronization and facial expressions in animated characters” that previously could only be produced by human animators through the automated use of rules, rather than artists, to set the morph weights and transitions between phonemes. Id. at 1313. Specifically, the claims were directed to the incorporation of claimed rules, not the use of the computer, that improved the existing technological process by allowing automation of further tasks that goes beyond merely organizing existing information into a new form.
Id. at 1314-15. In particular, the claimed process used a combined order of specific rules that renders information into a specific format that is then used and applied to create a sequence of synchronized, animated characters that prevent pre-emption of all processes for achieving automated lip-synchronization of 3-D characters. Id. at 1315. Therefore, the CAFC held that the ordered combination of claimed steps, using unconventional rules that relate sub-sequences of phonemes, timing, and morph weight sets is patent eligible. Id. at 1302-3. See also MPEP 2106.04(d)I (“an improvement in the functioning of a computer or an improvement to other technology or technical field, as discussed in MPEP 2106.04(d)(1) and 2106.05(a)”). Unlike amended claims 1 and 10, amended claim 43 does not set forth a specifically asserted improvement in a speech recognition model structure. Here, unlike the patent eligible automation claims in McRO that set forth a specific means of automating animation (i.e., using a combined order of specific rules that renders information into a specific format that is then used and applied to create a sequence of synchronized, animated characters to improve a computer process through the automated use of rules), amended claim 43 recited no combination of specific rules or steps to render or structure the speech recognition model corresponding to a specifically asserted improvement of the speech recognition model. For example, generate the first decoding graph indicating a plurality of decoding paths that satisfy a syntax constraint rule that is based on the target keyword and the synonym group can be as simple as a mental step to mathematically weight “increase sound” and “raise volume” as equal; i.e., making a judgment that “increase sound” and “raise volume” as synonyms. The step of determine the speech recognition model is broadly a step to model “increase sound” and “raise volume” as synonyms. 
To the extent that applicant argued “This subgraph clustering reduces decoding complexity (see paragraphs 283-285 of the application) and enables "faster decoding search...reducing computing overheads and storage overheads."”, claim 43 lacks steps describing a particularly structured speech recognition model, and adding computer functionality to increase the speed or efficiency of the process does not confer patent eligibility on an otherwise abstract idea. Intellectual Ventures I, 792 F.3d at 1367 (citing Bancorp Servs., LLC v. Sun Life Insurance Co. of Can., 687 F.3d 1266, 1278 (Fed. Cir. 2012) (“The fact that the required calculations could be performed more efficiently via a computer does not materially alter the patent eligibility of the claimed subject matter”)). Therefore, amended claim 43 does not satisfy the requirement for patent eligibility.
Claim Rejections - 35 USC § 101
35 U.S.C. §101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim 43 is rejected under 35 USC 101 as being directed to non-statutory subject matter. Claim 43 recites an electronic device comprising a processor and memory (“machine”). To distinguish ineligible claims that merely recite a judicial exception from eligible claims that require an implementation of judicial exception, the Supreme Court uses a two-step framework: Step One (Step 2A), determine whether the claims at issue are directed to one of those patent-ineligible concepts; and Step Two (Step 2B), if so, ask “what else is there in the claims?” to determine whether the additional elements transform the nature of the claim into a patent eligible application. Alice Corp. Pty. Ltd. v. CLS Bank Int’l., 134 S. Ct. 2347, 2355 (2014).
Step One (Step 2A) is a two prong test that requires the determination of whether the claims at issue are directed to an enumerated patent ineligible concept. See MPEP 2106.04. Specifically, Step 2A Prong (1) requires the determination of the specific limitations in the claim under examination (individually or in combination) that the examiner believes recites an abstract idea and determining whether the identified limitations falls within the subject matter groupings of abstract ideas enumerated. See MPEP 2106.04(a). The enumerated patent ineligible concepts comprising: (a) Mathematical Concepts – mathematical relationships, mathematical formulas or equations, mathematical calculations; (b) Certain methods of organizing human activity – fundamental economic principles / practices (including hedging, insurance, mitigating risk); commercial or legal interactions (including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; business relations); managing personal behavior or relationships or interactions between people (including social activities, teaching, and following rules / instructions) and (c) Mental processes – concepts performed in the human mind (including an observation, evaluation, judgment, opinion). See MPEP 2106.04(a). If the claim recites an enumerated patent ineligible concept, then Prong (2) of Step One (Step 2A) requires the determination of whether the claim integrates the patent ineligible concept into a practical application. Individually and in combination, identifying whether there are any additional elements recited in the claim beyond the judicial exceptions and evaluating those additional elements to determine whether they integrate the exception into a practical application, using one or more of the considerations laid out by the Supreme Court and the Federal Circuit. See MPEP 2106.04(d). 
Under Step Two (Step 2B), if the claim does not integrate the ineligible concept into a practical application and therefore directed to a judicial exception, evaluate whether the claim provides an inventive concept by determining whether there are additional elements, individually and in ordered combination, amount to significantly more than the exception itself. See MPEP 2106.04. Step 2A Prong (1) The “directed to” inquiry does not ask whether the claims involve a patent ineligible concept but, considered in light of the specification, whether the claim as a whole is directed to excluded subject matter or directed to an improvement to computer functionality. Enfish L.L.C. v. Microsoft Corp., 822 F.3d 1327, 1335 (Fed. Cir. 2016). Therefore, Prong (1) of Step 2A requires identifying specific limitations in the claims that recites (“describes” or “set forth”) an abstract idea and determine whether the identified limitations falls within the subject matter groupings of abstract ideas enumerated. See MPEP 2106.04 (“Thus, it is sufficient for this analysis for the examiner to identify that the claimed concept (the specific claim limitation(s) that the examiner believes may recite an exception) aligns with at least one judicial exception”). In particular, MPEP 2106.04(a)(2) states “a claim that recites a mathematical calculation, when the claim is given its broadest reasonable interpretation in light of the specification, will be considered as falling within the "mathematical concepts" grouping. A mathematical calculation is a mathematical operation (such as multiplication) or an act of calculating using mathematical methods to determine a variable or number, e.g., performing an arithmetic operation such as exponentiation”. 
Under Prong (1), claim 43 recites an electronic device, comprising: a memory configured to store instructions; and a processor coupled to the memory and configured to execute the instructions to cause the electronic device to: obtain a target keyword; (1) obtain a synonym group semantically associated with the target keyword, wherein obtaining the synonym group comprises: (a) determining first semantics of the target keyword; and (b) determining the synonym group based on the first semantics, wherein a first difference between second semantics of each synonym in the synonym group and the first semantics is less than a difference threshold, and wherein determining the synonym group comprises determining the synonym group further based on a first length of the target keyword, and wherein a second difference between a second length of each synonym in the synonym group and the first length is less than a length threshold; (2) train, based on the target keyword and the synonym group, a language model to obtain a target language model; (3) generate, based on the target language model, a first decoding graph indicating a plurality of decoding paths that satisfy a syntax constraint rule that is based on the target keyword and the synonym group; and (4) determine, based on the first decoding graph, a speech recognition model. Individually, step (1) obtaining target keyword and obtaining a synonym group corresponds to collecting information. Collecting information, including when limited to particular content, is within the realm of abstract ideas. Electric Power Grp., L.L.C. v. Alstom SA, 830 F.3d 1350, 1353 (Fed. Cir. 2016). 
Here, steps (a)-(b) are mental steps to form an observation, evaluation, or judgment about semantics / meaning of target keyword and using the semantics / meaning of target keyword to make a determination / observation / judgment about the synonym group.
Individually, step (2) corresponds to generating a probability / mathematical modelling or mapping of words. In view of the specification US 2024/0242709 A1 at ¶¶269-70: “Specifically, the language model training module 810 may include a feature extraction module 815, configured to: generate an input feature based on the training dataset 805, and provide the input feature for a model training module 820, to obtain a target language model 825. The target language model 825 can indicate a syntax constraint rule determined based on the target keyword and the synonym group. Examples of the target language model 825 include, but are not limited to, an N-gram model based on an N-gram syntax, an RNN-LM model based on a neural network, a JSGF model based on a regular syntax, and the like”. For example, an n-gram language model is generated according to an occurrence probability of training word sequence data where a probability of a word string is a product of the occurrence probabilities of respective words. US 2020/0118545 A1 at ¶87. Under the broadest reasonable interpretation, step (2) can be accomplished by a person performing mental mapping or modelling of word occurrence probability of target keyword and associated synonyms. Analyzing information by steps people go through in their minds, or by mathematical algorithms, without more, are treated as essentially mental processes within the abstract-idea category. Electric Power Grp., 830 F.3d at 1354. Therefore, step (2) is essentially a mental process.
Individually, step (3) corresponds to applying mathematical weights based on target keyword and the synonym group. In view of the specification US 2024/0242709 A1 at ¶282: “FIG. 11 is used as an example. A decoding path corresponding to a target keyword “increase sound” has a same weight as a decoding path corresponding to a synonym “raise volume”: [FIG. 11 image omitted]”. Under the broadest reasonable interpretation, step (3) can be accomplished by a person performing mental / mathematical weighting of target keywords and synonym group; e.g., mental step to mathematically weight “increase sound” and “raise volume” as equal; i.e., making a judgment that “increase sound” and “raise volume” are synonyms. Therefore, step (3) is essentially a mental process.
Individually, step (4) corresponds to mathematically calculating a speech recognition model. In view of the specification US 2024/0242709 A1 at ¶285: “The target language model may be combined with an acoustic model, to obtain the speech recognition model, where the speech recognition model is a decoding graph”. Under the broadest reasonable interpretation, step (4) can be accomplished by a person performing a mathematical combination of two probability models. Therefore, step (4) is essentially a mental step.
In ordered combination, steps (1)-(4) correspond to collecting information limited to particular content (speech input, target keyword, synonym group) and analyzing information by steps people go through in their minds and by mathematical algorithm to mathematically map target keyword and synonym group, mathematically generate weights of the target keyword and the synonym group, and mathematically combine the weights into a speech recognition model. Thus, claim 43 describes patent ineligible subject matter enumerated under category (a) Mathematical Concepts – mathematical relationships, mathematical formulas or equations (target language model, speech recognition model), mathematical calculations and (c) Mental processes – concepts performed in the human mind (including an observation, evaluation, judgment, opinion).
Step 2A Prong (2)
Under Prong (2) of Step 2A, the goal is to determine whether the claim is directed to the recited exception by evaluating whether the claim as a whole integrates the recited judicial exception into a practical application of the exception. See MPEP 2106.04II(A). In particular, evaluating integration into a practical application requires identifying whether there are any additional elements recited in the claim beyond the judicial exception and evaluating those additional elements, individually and in combination, to determine whether they integrate the exception into a practical application, using one or more of the considerations laid out by the Supreme Court and the Federal Circuit (“CAFC”). See MPEP 2106.04(d). The Supreme Court held that when a claim containing a mathematical formula (i.e., an abstract idea) implements or applies that math formula / abstract idea in a structure or process which, when considered as a whole, is performing a function which the patent laws were designed to protect (e. g., transforming or reducing an article to a different state or thing), then the claim satisfies the requirements of §101. Diamond v. Diehr, 450 U.S. 175, 192 (1981); See MPEP 2106.04(d)I (“Implementing a judicial exception with, or using a judicial exception in conjunction with, a particular machine or manufacture that is integral to the claim, as discussed in MPEP 2106.05(b)”). See also Benson, 409 U.S. at 70 (“Transformation and reduction of an article "to a different state or thing" is the clue to the patentability of a process claim that does not include particular machines”). In particular, the Supreme Court looked to how the claims used that equation in a process designed to solve a technological problem in conventional industry practice. McRO, Inc. v. Bandai Namco Games America Inc., 837 F.3d 1299, 1312 (Fed. Cir. 2016). 
In Diehr, the claims involved a method for curing rubber by using Arrhenius equation to constantly measure actual temperature inside a mold and feeding the temperature measurements into a computer to repeatedly recalculate the cure time to open the press. Diehr, 450 U.S. at 178-79. Since the Supreme Court viewed the claims not as an attempt to patent a mathematical formula, but to an industrial process for molding of rubber products, the claims were statutory. Id. at 192-93. The key here, as noted by the CAFC, is that the Supreme Court in Diehr looked to how the claims "used that equation in a process designed to solve a technological problem in `conventional industry practice.'" McRO, 837 F.3d at 1312. When looked at as a whole, "the claims in Diehr were patent eligible because they improved an existing technological process, not because they were implemented on a computer." Id. at 1312-13. In McRO, the CAFC noted that prior art method of generating (i.e., calculating) morph weight set with values between “0” and “1” for computer animation of facial expressions are manually determined. McRO, 837 F.3d at 1304-5. The claimed improvement in McRO allows computers to produce “accurate and realistic lip synchronization and facial expressions in animated characters” that previously could only be produced by human animators through the automated use of rules, rather than artists, to set the morph weights and transitions between phonemes. Id. at 1313. Specifically, the claims are directed to the incorporation of claimed rules, not the use of the computer that improved existing technological process by allowing automation of further tasks that goes beyond merely organizing existing information into a new form. Id. at 1314-15. 
In other words, the claimed process uses a combined order of specific rules that renders information into a specific format that is then used and applied to create a sequence of synchronized, animated characters that prevent pre-emption of all processes for achieving automated lip-synchronization of 3-D characters. Id. at 1315. Therefore, the CAFC held that the ordered combination of claimed steps, using unconventional rules that relate sub-sequences of phonemes, timing, and morph weight sets is patent eligible. Id. at 1302-3.
Further, the USPTO’s Memo on 2024 Updated Guidance on AI and Subject Matter Eligibility, issued July 16, 2024, describes Example 48 on pp. 14-15, which concerns a method to separate speech signals from different sources to recognize human speech command from background noise by using a deep neural network (DNN) to promote separation of the features during clustering. See p. 15, ¶2. Specifically, the DNN learns high level feature representations of the signal x by mapping the feature representations to the embedding space, which comprises the DNN converting feature representations Xt, obtained from spectrograms St and corresponding feature matrices FMt, into multi-dimensional embedding vectors V and assigning the embedding vectors V to TF bins as a global function of the input signal (V = fθ(X), where fθ represents a function of the DNN). See p. 15, ¶5. According to the Memo, under Step 2A, Prong One, a claim comprising a step of using a deep neural network (DNN) to determine embedding vectors V using the formula V = fθ(X), where fθ represents a function of the DNN, describes a mathematical calculation and therefore the claim “set forth” or “describes” a judicial exception. p. 19, ¶5.
Under Step 2A, Prong Two, since there is no detail about a particular DNN or how the DNN operates to derive the embedding vectors other than that it is being used to determine the embedding vectors, the DNN is used to generally apply the abstract idea of performing a mathematical calculation using the recited mathematical equation, without placing any limitation on how the DNN operates to derive the embedding vectors as a function of the input signal. p. 20, ¶2. In particular, the disclosure identifies a technical problem encountered in the field of speech separation and provides an improvement over existing speech separation methods by determining embedding vectors as a function of the input signal, partitioning those vectors into clusters, and synthesizing a reconstructed mixed speech signal based on these clusters. p. 20, ¶3. The claim, however, only requires determining the embedding vectors and therefore does not reflect the improvement discussed in the disclosure. Id. In the instant application, claim 43 sets forth steps (1)-(4) corresponding to collecting information limited to particular content (speech input, target keyword, synonym group) and analyzing information by steps people go through in their minds and by a mathematical algorithm to mathematically map the target keyword and synonym group, mathematically generate weights of the target keyword and the synonym group (a mental step to mathematically weight “increase sound” and “raise volume” as equal; i.e., making a judgment that “increase sound” and “raise volume” are synonyms), and mathematically combine the weights into a speech recognition model. Unlike the particular means or method for applying the Arrhenius equation (i.e., an abstract idea) in a particular industrial process for curing rubber in Diehr, claim 43 does not recite any particular technological or industrial application of the mathematical model / speech recognition model.
In other words, unlike the technical incorporation of particular rules into the generated (i.e., calculated) morph weights to automate the “accurate and realistic lip synchronization and facial expressions in animated characters” in McRO, claim 43 does not describe any combination of specific rules or steps to render or structure the speech recognition model corresponding to a specifically asserted improvement of the speech recognition model. Rather, much like the claims in example 48 that merely used the DNN to calculate embedding vectors using a math formula without limitations on how the DNN is configured to synthesize a reconstructed speech signal, claim 43 focuses on generating the speech recognition model without limitations on how to apply the speech recognition model to determine a text representation associated with speech input. Finally, to the extent that claim 43 recites a computer processor, attendant software (i.e., instructions), and memory, the Supreme Court held that mere recitation of a generic computer cannot transform a patent-ineligible abstract idea into a patent-eligible invention. Alice, 134 S. Ct. at 2358. For example, in Alice, the Supreme Court held that data processing systems with a communications controller, data storage unit, and transmission units were purely functional and generic because nearly every computer will include a "communications controller" and "data storage unit" capable of performing the basic calculation, storage, and transmission functions, and such recitation of hardware failed to offer any meaningful limitation beyond generally linking the use of a method to a particular technological environment. Id. at 2360. See MPEP 2106.04(d)I (“Generally linking the use of a judicial exception to a particular technological environment or field of use, as discussed in MPEP 2106.05(h)”).
Neither stating an abstract idea while adding the words “apply it” nor limiting the use of an abstract idea to a particular technological environment is enough for patent eligibility. Id. at 2350. Much like the data processing systems with a data storage unit performing basic calculations in Alice, the recitation of a computer processor, attendant program instructions, and memory in claim 43 is purely functional and generic and fails to offer any meaningful limitation beyond generally linking the claims to computers. Therefore, claim 43 is directed to collecting information limited to particular content (target keyword, synonym group) and analyzing information by steps people go through in their minds and by a mathematical algorithm to mathematically map the target keyword and synonym group, mathematically generate weights of the target keyword and the synonym group, and mathematically combine the weights into a speech recognition model, all of which are essentially mental processes within the abstract idea category. Step 2B Inventive Concept. The Guideline states that if the additional elements do not integrate the exception into a practical application, then the claim is directed to the recited judicial exception and requires further analysis under Step 2B, where it may still be eligible if it amounts to an “inventive concept”. See MPEP 2106.04IIA and MPEP 2106.05. Further, an inventive concept can be found in the non-conventional and non-generic arrangement of known conventional pieces. BASCOM Global Internet Servs. v. AT&T Mobility, 827 F.3d 1341, 1350 (Fed. Cir. 2016). In BASCOM, the CAFC held that filtering content is an abstract idea because it is a longstanding, well-known method of organizing human behavior similar to concepts previously found to be abstract. BASCOM, 827 F.3d at 1348.
However, the CAFC determined that the claims did not merely recite filtering content along with the requirement to perform it on the internet or on a set of generic computer components, nor did the claims preempt all ways of filtering content on the internet. Id. at 1350. Rather, the inventive concept described and claimed was the installation of a filtering tool at a specific location, remote from the end-users, with customizable filtering features specific to each end user, which gives the filtering tool both the benefits of a filter on a local computer and the benefits of a filter on an internet service provider (ISP) server. Id. By taking a prior art filter solution (a one-size-fits-all filter at the ISP server) and making it more dynamic and efficient (providing individualized filtering at the ISP server), the claimed invention improves the performance of the computer system itself. Id. at 1351. On the other hand, implementation via computers does not offer a meaningful limitation beyond generally linking the use of an abstract idea to a particular technological environment. Alice, 134 S. Ct. at 2360 (“Nearly every computer will include a ‘communications controller’ and ‘data storage unit’ capable of performing the basic calculation, storage, and transmission functions required by the method claims”). Intellectual Ventures I L.L.C. v. Capital One Bank, 792 F.3d 1363, 1370-71 (Fed. Cir. 2015) (“Steps that do nothing more than spell out what it means to ‘apply it on a computer’ cannot confer patent-eligibility”). Similarly, limiting an abstract idea to one field of use does not convert an otherwise ineligible concept into an inventive concept. Intellectual Ventures I L.L.C. v. Erie Indem. Co., 850 F.3d 1315, 1328 (Fed. Cir. 2017). Neither does adding computer functionality to increase the speed or efficiency of the process confer patent eligibility on an otherwise abstract idea.
Intellectual Ventures I, 792 F.3d at 1367 (citing Bancorp Servs., LLC v. Sun Life Insurance Co. of Can., 687 F.3d 1266, 1278 (Fed. Cir. 2012) (“The fact that the required calculations could be performed more efficiently via a computer does not materially alter the patent eligibility of the claimed subject matter”)). Individually, in the instant application, claim 43 sets forth steps (1)-(4), which correspond to collecting information limited to particular content (target keyword, synonym group) and analyzing information by steps people go through in their minds and by a mathematical algorithm to mathematically map the target keyword and synonym group, mathematically generate weights of the target keyword and the synonym group, and mathematically combine the weights into a speech recognition model. Claim 43 further requires a computer processor and memory storing program instructions for implementing steps (1)-(4). Such individual recitation of generic computer components (processor, software / program instructions) is purely functional and generic because nearly every computer will include such a processor and data storage unit capable of performing the basic calculations necessary for steps (1)-(4) to mathematically generate a speech recognition model. As an ordered combination, unlike BASCOM, which describes an unconventional combination of a conventional ISP server with a customized filter specific to each user that is remote from end-users to provide both the benefits of a filter on a conventional local computer and the benefits of a filter on the conventional ISP server, using a speech recognition model, software program instructions, and a processor to determine a text representation associated with speech input does not involve an unconventional combination of conventional pieces, because the combination amounts to “apply it on a computer” limited to the field of speech recognition, which cannot convert an otherwise ineligible concept into an inventive concept.
To the extent that implementing the target language model and speech recognition model in the field of computers results in reduced memory and computational requirements, merely adding computer functionality to increase the speed or efficiency does not confer patent eligibility on an otherwise abstract idea. Therefore, claim 43 is not eligible for a patent. Conclusion THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to examiner Richard Z. Zhu, whose telephone number is 571-270-1587, or the examiner’s supervisor Hai Phan, whose telephone number is 571-272-6338. Examiner Richard Zhu can normally be reached M-Th, 0730-1700. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov.
Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /RICHARD Z ZHU/Primary Examiner, Art Unit 2654 02/07/2026

Prosecution Timeline

Oct 25, 2023
Application Filed
Nov 12, 2025
Non-Final Rejection — §101
Jan 21, 2026
Response Filed
Feb 07, 2026
Final Rejection — §101 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12592228
SPEECH INTERACTION METHOD AND APPARATUS, COMPUTER READABLE STORAGE MEDIUM, AND ELECTRONIC DEVICE
2y 5m to grant Granted Mar 31, 2026
Patent 12592222
APPARATUSES, COMPUTER PROGRAM PRODUCTS, AND COMPUTER-IMPLEMENTED METHODS FOR ADAPTING SPEECH RECOGNITION CONFIDENCE SCORES BASED ON EXPECTED RESPONSE
2y 5m to grant Granted Mar 31, 2026
Patent 12586574
ELECTRONIC DEVICE FOR PROCESSING UTTERANCE, OPERATING METHOD THEREOF, AND STORAGE MEDIUM
2y 5m to grant Granted Mar 24, 2026
Patent 12579978
NETWORKED DEVICES, SYSTEMS, & METHODS FOR INTELLIGENTLY DEACTIVATING WAKE-WORD ENGINES
2y 5m to grant Granted Mar 17, 2026
Patent 12572739
GENERATING MACHINE INTERPRETABLE DECOMPOSABLE MODELS FROM REQUIREMENTS TEXT
2y 5m to grant Granted Mar 10, 2026

Prosecution Projections

3-4
Expected OA Rounds
69%
Grant Probability
85%
With Interview (+15.4%)
3y 2m
Median Time to Grant
Moderate
PTA Risk
Based on 718 resolved cases by this examiner. Grant probability derived from career allow rate.
