Prosecution Insights
Last updated: April 19, 2026
Application No. 18/242,859

METHOD, SYSTEM AND COMPUTER-READABLE STORAGE MEDIUM FOR CROSS-TASK UNSEEN EMOTION CLASS RECOGNITION

Status: Final Rejection (§101)
Filed: Sep 06, 2023
Examiner: LE, THUYKHANH
Art Unit: 2655
Tech Center: 2600 — Communications
Assignee: National Tsing Hua University
OA Round: 2 (Final)

Grant Probability: 78% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 9m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 78% (above average) • 307 granted / 393 resolved • +16.1% vs TC avg
Interview Lift: +37.1% (strong), comparing resolved cases with vs. without an interview
Typical Timeline: 2y 9m average prosecution • 19 applications currently pending
Career History: 412 total applications across all art units

Statute-Specific Performance

§101: 18.6% (-21.4% vs TC avg)
§103: 41.8% (+1.8% vs TC avg)
§102: 20.1% (-19.9% vs TC avg)
§112: 10.1% (-29.9% vs TC avg)
Tech Center averages are estimates • Based on career data from 393 resolved cases

Office Action

§101
DETAILED ACTION

Notice of Pre-AIA or AIA Status

1. The present application, filed on or after March 16, 2013, is being examined under the first-inventor-to-file provisions of the AIA.

Response to Amendments/Arguments

2. With respect to the claim objections, amended claims 4, 11 and 17 now define "GRU"; the claim objections have therefore been withdrawn. With respect to the §101 signal-per-se rejection of claims 14-19, the amendments to claims 14-19 overcome that rejection, which has therefore been withdrawn.

With respect to the §101 abstract-idea rejection, Applicant argues on page 2 of the Remarks that "Applicant has amended the subject matter of Claim 1 into 'the method for unseen emotion class recognition with a speech recognition device', so that the limitations are indeed integrated into a practical application. Claim 1 as amended should no longer be directed to non-statutory subject matter of 'abstract idea' along with dependent Claims 3-7. Applicant has amended the subject matters of Claims 8 and 10-13 from instances of a generic 'system' to those of a specific 'speech recognition device', so that the claims amount to significantly more than abstract idea per se. Claims 8 and 10-13 as amended should no longer be directed to non-statutory subject matter."

In response, the Examiner respectfully notes that the recitation of a "speech recognition device" merely indicates a field of use or technological environment in which the judicial exception is performed. Although the additional element "speech recognition device" limits the identified judicial exception, this type of limitation merely confines the use of the abstract idea to a particular technological environment (speech recognition) and thus fails to add an inventive concept to the claims. See MPEP 2106.05(f). Even when viewed in combination, these additional elements do not integrate the recited judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES).

Step 2B: This part of the eligibility analysis evaluates whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. See MPEP 2106.05. The additional element "speech recognition device" generally links the use of the judicial exception to a particular technological environment/field of use (speech recognition) and thus fails to add an inventive concept to the claim. See MPEP 2106.05(h). Applicant's arguments are not persuasive, and the Examiner therefore respectfully disagrees.

With respect to the §102/§103 rejections, Applicant has amended the independent claims by incorporating claims 2, 9 and 15, which were previously indicated as allowable subject matter. Independent claims 1, 8 and 14 are therefore allowable over the prior art of record.

Claim Rejections - 35 USC § 101

3. 35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

4. Claims 1, 3-8, 10-14 and 16-19 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Claim 1 recites:
"1. (Currently amended) A method for unseen emotion class recognition with a speech recognition device, comprising: receiving, with an emotion recognition model, a speech sample to be tested; calculating, with an encoder, a sample embedding to be tested of the speech sample to be tested; calculating a first distance metric between the sample embedding to be tested and a first registered emotion category representation, and a second distance metric between the sample embedding to be tested and a second registered emotion category representation, wherein the second registered emotion category is not included in a plurality of basic emotion categories; and determining an emotion category of the speech sample to be tested according to the first distance metric and the second distance metric, wherein determining the emotion category of the speech sample to be tested comprises: determining the emotion category as the first registered emotion category according to the first distance metric being smaller than the second distance metric, or determining the emotion category as the second registered emotion category according to the second distance metric being smaller than the first distance metric."

Claims 1, 8 and 14 recite substantially the same concept in the context of a method, a system and a computer-readable non-transitory storage medium, respectively. The limitations recited in the independent claims, as drafted, cover a mathematical concept: calculating a sample embedding, calculating a first distance metric and a second distance metric, and determining an emotion category based on the calculated distances.

The judicial exception is not integrated into a practical application. In particular, claim 8 recites the additional elements of "a memory" and "a processor", and claim 14 recites "a computer-readable non-transitory storage medium". These additional elements, alone or in combination, amount to no more than (i) mere instructions to implement the idea on a computer, and/or (ii) recitation of generic computer structure that serves to perform generic computer functions that are well-understood, routine, and conventional activities previously known to the pertinent industry. Viewed as a whole, these additional claim elements do not provide meaningful limitations that transform the abstract idea into a patent-eligible application such that the claims amount to significantly more than the abstract idea itself. Therefore, the claims are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.

There is, further, no improvement to the computing device other than calculating distance metrics and determining an emotion category. The mere recitation of a memory and a processor and/or the like is akin to adding the words "apply it" and/or "use it" with a computer in conjunction with the abstract idea. Paragraph [0031] of the specification discloses: "An emotion recognition system of another embodiment of the present invention includes a memory and a processor, the memory being used for storing the emotion recognition model and a plurality of instructions that, when executed, configure the processor to perform an emotion recognition method of any one of the foresaid embodiments. Here, the processor can include any appropriate hardware device, such as central processing unit (CPU), microcontroller and application-specific integrated circuit (ASIC) and so on, and the memory can be appropriate storage media such as random-access memory (RAM), flash memory and so on, the present invention is not limited hereto. Furthermore, a computer-readable storage medium of still another embodiment of the present invention includes a computer-readable program that, after being read by a computer, may perform an emotion recognition method of any one of the embodiments as described before." As filed, the specification describes a general-purpose computer used merely as a tool. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.

The claims also recite "an emotion recognition model" and "an encoder". The model and the encoder are recited at a high level of generality; there are no technical details on how the emotion recognition model receives a speech sample or how the encoder calculates a sample embedding. The limitations "with an emotion recognition model" and "with an encoder" provide nothing more than mere instructions to implement an abstract idea on a generic computer. Under the broadest reasonable interpretation, the claimed process is reasonably categorized as an abstract idea in the mathematical-concepts grouping.

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration into a practical application, the additional element of using a computer amounts to a generic computer, and mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claims are not patent eligible.

The dependent claims do not remedy the issues noted above. Claims 3, 10 and 16 recite fetching an acoustic feature from the speech sample; no additional limitations are presented. Claims 4, 11 and 17 transform the acoustic feature into the sample embedding; no additional limitations are presented. Claims 5, 12 and 18 recite receiving a plurality of first registered speech samples and a plurality of second registered speech samples, calculating registered sample embeddings, and calculating averages of the registered sample embeddings; no additional limitations are presented. Claims 6, 13 and 19 recite receiving training samples, calculating embeddings, calculating a sample embedding for each of the basic emotion categories, calculating a center-of-mass, calculating a cosine similarity, and calculating a loss function; no additional limitations are presented. Claim 7 recites a cross-entropy loss; no additional limitations are presented.

For at least the reasons provided above, claims 1, 3-8, 10-14 and 16-19 are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.

Allowable Subject Matter

5. Claims 1, 3-8, 10-14 and 16-19 are allowable over the prior art of record. The claims stand rejected under §101 (abstract idea), and that rejection must be overcome for the application to pass to allowance. Any amendment to overcome the rejection that results in a change in claim scope will require further search and/or consideration to determine its allowability.
The following is the examiner's statement of reasons for allowable subject matter: the prior art fails to teach the following elements in combination with the other recited elements of the claims: "determining an emotion category of the speech sample to be tested according to the first distance metric and the second distance metric, wherein determining the emotion category of the speech sample to be tested comprises: determining the emotion category as the first registered emotion category according to the first distance metric being smaller than the second distance metric, or determining the emotion category as the second registered emotion category according to the second distance metric being smaller than the first distance metric," as recited in Claim 1. Claims 8 and 14 recite similar features. The closest prior art found is as follows.

a. Gokhale et al. (US 2024/0078374 A1). Gokhale et al. disclose determining emotion class(es) of a spoken utterance based on a plurality of distances, at least one of which is a distance between the embedding of the spoken utterance and an emotion class (Gokhale et al. [0045]: an embedding is generated corresponding to a lower-level representation of the spoken utterance, and a plurality of distance metrics is calculated, each a Euclidean distance, cosine similarity, or other distance measure in the embedding space between the embedding of the spoken utterance and a specific emotion class). The Examiner notes that Gokhale et al. disclose multiple disparate emotion classes used to determine the emotion class of the spoken utterance, and at least one of those classes is not included in a plurality of basic emotion categories (e.g., the excitement class of paragraph [0005]); excitement is not a basic emotion class in light of the present specification, and the claim language requires that the second registered emotion category not be included in the plurality of basic emotion categories. Gokhale et al. determine the emotion class of the spoken utterance by comparing the distance between the utterance embedding and the embedding for a given emotion class against a threshold; the emotion class may be basic (e.g., happiness or sadness) or non-basic (e.g., excitement), and if the distance is less than the threshold, the spoken utterance expresses that class. Gokhale et al. do not calculate a first distance between the utterance embedding and a first given emotion class, calculate a second distance between the embedding and a second given emotion class, and compare the first and second distances to determine which class the utterance belongs to. Thus, Gokhale et al. fail to teach and/or suggest the allowable subject matter.

b. Subramanian et al. (US 2007/0208569 A1). Subramanian et al. disclose determining a basic emotion by analyzing speech patterns in the speaker's voice (Subramanian et al. [0027]: basic human emotions can be categorized as surprise, peace (pleasure), acceptance (contentment), courage, pride, disgust, anger, lust (greed) and fear, although other emotion categories are identifiable; these basic emotions can be recognized from the emotional content of human speech by analyzing speech patterns in the speaker's voice, including the pitch, tone, cadence and amplitude characteristics of the speech; generic speech patterns corresponding to specific human emotions can be identified for a particular language, dialect and/or geographic region; emotional speech patterns are often as unique as the individual, and individuals tend to refine their speech patterns for their audiences and borrow emotional speech patterns that accurately convey their emotional state, so if the identity of the speaker is known, the audience can use the speaker's personal emotion voice patterns to more accurately analyze her emotional state). Subramanian et al. determine an emotion category in human speech by analyzing speech patterns in the speaker's voice, including pitch, tone, cadence and amplitude characteristics. Subramanian et al. do not calculate a first distance and a second distance and compare them to determine an emotion category. Thus, Subramanian et al. fail to teach and/or suggest the allowable subject matter.

c. Mazza et al. (US 2022/0366197 A1). Mazza et al. disclose determining a sentiment class of an utterance (Mazza et al. [0011] and claim 4: receiving an annotated validation dataset comprising a plurality of utterances; computing, for each utterance, a confidence score representing a certainty level of classification of the utterance to a sentiment class; sorting the utterances in an ordered list according to the confidence scores; computing a value of classification precision for one or more entries of the ordered list; identifying the index of the list that corresponds to a required classification precision value; and determining the confidence score of the identified index as a prediction threshold for classifying utterances as belonging to the sentiment class). For each sentiment class, Mazza et al. calculate a confidence score of an utterance belonging to that class, and those confidence scores are used to classify the sentiment of the utterances. Mazza et al. do not calculate a first distance and a second distance and compare them to determine an emotion category. Thus, Mazza et al. fail to teach and/or suggest the allowable subject matter.

Conclusion

6. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See PTO-892.
a. Kim (US 2022/0022790 A1) discloses a method and a system for recognizing personal emotion utilizing speech data.
b. Lee et al. (US 2022/0032919 A1) disclose a method and a system for determining a driver emotion.
c. Sato et al. (US 2006/0281064 A1) disclose a method and a system for analyzing voice data to detect an emotional parameter, with a basic emotion ID as set in the emotion movement pattern storage.

7. THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

8. Any inquiry concerning this communication or earlier communications from the examiner should be directed to THUYKHANH LE, whose telephone number is (571) 272-6429. The examiner can normally be reached Mon-Fri, 9am-5pm. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew C. Flanders, can be reached at 571-272-7516. The fax number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center; unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/THUYKHANH LE/
Primary Examiner, Art Unit 2655
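The feature the examiner identifies as allowable over the art in claim 1 amounts to nearest-prototype classification over speech embeddings: compute a distance from the test sample's embedding to each registered emotion category representation and pick the category with the smaller distance. A minimal sketch of that comparison step, assuming cosine distance and hypothetical 128-dimensional prototypes; the claim itself does not fix the metric or the encoder:

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cosine similarity; one possible reading of the claimed 'distance metric'."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def classify_emotion(sample_embedding: np.ndarray,
                     registered: dict[str, np.ndarray]) -> str:
    """Claim-1-style decision: compute a distance to each registered emotion
    category representation and return the category with the smallest distance."""
    distances = {label: cosine_distance(sample_embedding, prototype)
                 for label, prototype in registered.items()}
    return min(distances, key=distances.get)

# Hypothetical usage: "happiness" stands in for a first registered (basic) category,
# "excitement" for a second registered category outside the basic set.
rng = np.random.default_rng(0)
prototypes = {"happiness": rng.normal(size=128), "excitement": rng.normal(size=128)}
test_embedding = rng.normal(size=128)  # in practice this comes from the encoder
print(classify_emotion(test_embedding, prototypes))
```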
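The dependent claims summarized in the rejection (claims 5/12/18, 6/13/19 and 7) describe where those representations come from: registered sample embeddings are averaged into per-category prototypes, and training scores embeddings against per-category centers-of-mass with cosine similarity under a cross-entropy loss. A hedged sketch of both pieces, with made-up shapes, a placeholder encoder, and an assumed softmax temperature, since the office action paraphrases the claims rather than any implementation:

```python
import numpy as np

def register_category(encoder, speech_samples) -> np.ndarray:
    """Claims 5/12/18: embed each registered speech sample and average the
    embeddings into one registered emotion category representation."""
    embeddings = np.stack([encoder(sample) for sample in speech_samples])
    return embeddings.mean(axis=0)

def prototype_cross_entropy(embedding: np.ndarray,
                            centers: np.ndarray,
                            true_class: int) -> float:
    """Claims 6/13/19 and 7: cosine similarity between a training embedding and
    the center-of-mass of each basic emotion category, converted to class
    probabilities and scored with a cross-entropy loss (temperature is assumed)."""
    sims = centers @ embedding / (
        np.linalg.norm(centers, axis=1) * np.linalg.norm(embedding) + 1e-8)
    logits = sims / 0.1                    # assumed temperature, not in the claims
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return float(-np.log(probs[true_class] + 1e-12))

# Hypothetical usage with a stand-in encoder; per claims 4/11/17 a GRU encoder
# would transform the acoustic feature into the embedding in practice.
rng = np.random.default_rng(1)
fake_encoder = lambda sample: rng.normal(size=64)
prototype = register_category(fake_encoder, ["sample_a", "sample_b", "sample_c"])
centers = rng.normal(size=(4, 64))         # one center-of-mass per basic category
loss = prototype_cross_entropy(fake_encoder("sample_d"), centers, true_class=2)
print(round(loss, 3))
```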

Prosecution Timeline

Sep 06, 2023
Application Filed
Aug 23, 2025
Non-Final Rejection — §101
Nov 28, 2025
Response Filed
Jan 06, 2026
Final Rejection — §101 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597413
ELECTRONIC DEVICE AND CONTROL METHOD THEREOF
Granted Apr 07, 2026 • 2y 5m to grant
Patent 12592218
COMMUNICATION DEVICE, COMMUNICATION METHOD, AND NON-TRANSITORY STORAGE MEDIUM
Granted Mar 31, 2026 • 2y 5m to grant
Patent 12592239
ACTIVE VOICE LIVENESS DETECTION SYSTEM
Granted Mar 31, 2026 • 2y 5m to grant
Patent 12586577
AUTOMATIC SPEECH RECOGNITION USING MULTIPLE LANGUAGE MODELS
Granted Mar 24, 2026 • 2y 5m to grant
Patent 12579365
INFORMATION ACQUISITION METHOD AND APPARATUS, DEVICE, AND MEDIUM
Granted Mar 17, 2026 • 2y 5m to grant
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 78%
With Interview: 99% (+37.1%)
Median Time to Grant: 2y 9m
PTA Risk: Moderate
Based on 393 resolved cases by this examiner. Grant probability derived from career allow rate.
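The note above says the grant probability is derived from the career allow rate, and the headline figures are reproducible from the counts in the Examiner Intelligence panel. A small arithmetic check; treating the +37.1% interview lift as an additive adjustment capped at the displayed 99% is an assumption about how the dashboard combines the two numbers:

```python
# Figures taken from the Examiner Intelligence panel above.
granted, resolved = 307, 393
career_allow_rate = granted / resolved      # ~0.781, the 78% shown
interview_lift = 0.371                      # the displayed +37.1% interview lift

# Assumption: the dashboard adds the lift to the base rate and caps the result,
# which reproduces the displayed "With Interview: 99%".
with_interview = min(career_allow_rate + interview_lift, 0.99)

print(f"{career_allow_rate:.1%}")   # 78.1%
print(f"{with_interview:.0%}")      # 99%
```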
