DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-20 are pending and have been examined.
Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119(a)-(d).
Receipt is acknowledged of certified copies of papers required by 37 CFR 1.55.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/03/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Drawings
The drawings are objected to because of the following informalities:
Fig. 6: the ROM is labeled 6240 in the specification but 8240 in the figure, and the RAM is labeled 6250 in the specification but 8250 in the figure.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim(s) 1, 11, and 20, the limitations of receiving, classifying, generating, and determining, as drafted, are processes that, under broadest reasonable interpretation, cover performance of the limitations in the mind and/or with pen and paper but for the recitation of generic computer components. More specifically, the claims recite the mental process of a human reading a transcript written from a conversation between two people, marking each utterance as either having an emotion or not, pairing emotion utterances with utterances that are a possible cause, and looking at the pairs of utterances to determine which have a true cause-and-effect relationship. If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind and/or with pen and paper but for the recitation of generic computer components, then it falls within the --Mental Processes-- grouping of abstract ideas. Accordingly, the claim(s) recite(s) an abstract idea.
This judicial exception is not integrated into a practical application because the recitation of an apparatus in claim 1, a device, memory, and processor in claim 11, and a storage medium and processor in claim 20, reads on generalized computer components, based upon the claim interpretation wherein the structure is interpreted in light of pgs 15-16 of the specification. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim(s) is/are directed to an abstract idea.
The claim(s) do(es) not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to the integration of the abstract idea into a practical application, the additional element of using generalized computer components to receive, classify, generate, and determine amounts to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. The claim(s) is/are not patent eligible.
With respect to claim(s) 2 and 12, the claim(s) recite(s) receiving and determining, which reads on a human reading a transcript where the utterances are transcribed in the order they were stated, and only picking utterances to pair together that were stated within a certain number of conversational turns of each other. No additional limitations are present.
With respect to claim(s) 3 and 13, the claim(s) recite(s) receiving, determining, and generating, which reads on a human using information about the speakers and emotion information to determine which utterances to pair together. No additional limitations are present.
With respect to claim(s) 4, 5, 6, 14, 15, and 16, the claim(s) recite(s) determining a true emotion pair using a mix-of-experts technique with specific features, which reads on a human using different sets of rules to evaluate the accuracy of the emotion cause pair. The recitation of a mix-of-experts technique, gating network, plurality of expert models, and weights, reads on different sets of rules by which a person can evaluate the accuracy of an utterance pair, and then compare the results using a confidence in the accuracy of one set of rules over another to make a final determination.
With respect to claim(s) 7 and 17, the claim(s) recite(s) vectorizing and classifying, which reads on a human writing out the text in a specific format using sets of rules for understanding human language, and using rules regarding an understanding of emotion to associate an emotion with each utterance. The recitation of a natural language processing model and emotion classification model reads on sets of rules.
With respect to claim(s) 8 and 18, the claim(s) recite(s) two generating steps, which read on a human re-writing the text of the transcript in a segmented format and using a specific set of rules to re-write the segments into a different specific format. The recitation of a tokenizer reads on a set of rules for reformatting text, and BERT reads on a set of rules for writing out the reformatted text with additional contextual information.
With respect to claim(s) 9, the claim(s) recite(s) classifying, which reads on a human writing next to each utterance which emotion from a list of emotions is reflected in the utterance. No additional limitations are present.
With respect to claim(s) 10 and 19, the claim(s) recite(s) generating the candidate emotion cause pair, which reads on a human pairing potential emotion and cause utterances based on specific characteristics being met. No additional limitations are present.
These claims further do not remedy the judicial exception being integrated into a practical application and further fail to include additional elements that are sufficient to amount to significantly more than the judicial exception.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claim(s) 1, 2, 11, 12, and 20 is/are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Li et al. (“ECPEC: Emotion-Cause Pair Extraction in Conversations”, IEEE, 21 Oct 2022), hereinafter Li.
Regarding claims 1, 11, and 20, Li teaches
(claim 1) A method for generating an emotion cause pair based on conversation performed by an apparatus using an emotion cause pair prediction model (emotion-cause pair extraction in conversations using models for a system Fig. 3 and caption,(Sec. 1, 2.2)), the method comprising:
(claim 11) A device for generating an emotion cause pair based on conversation (a customer service system for understanding interactions among speakers (Sec. 1, 2.2)), the device comprising:
(claim 11) a memory configured to store an emotion cause pair prediction model and one or more instructions for performing the emotion cause pair prediction model (the ConvECPE dataset and baseline systems are released online, including the model files in folders and the order in which the models are to be run (Sec. 1, 2.2, Github files)); and
(claim 11) a processor configured to execute the one or more instructions stored in the memory, wherein the instructions, when executed by the processor, cause the processor to (the stored files are run in a specific order for the two-step framework to be implemented (Sec. 1, 2.2, Github ReadMe)):
(claim 20) A non-transitory computer readable storage medium storing computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform a method for generating an emotion cause pair based on conversation (the ConvECPE dataset and baseline systems are released online, including the model files in folders and the order in which the models are to be run, and where the stored files are run in a specific order for the two-step framework to be implemented (Sec. 1, 2.2, Github files)), the method comprising:
receiving a plurality of utterance texts converted from a voice conversation between a plurality of speakers (the dataset consists of dialogues with utterances between more than one individual, where each utterance consists of a number of words represented as vectors through GloVe based on the raw text of conversations, i.e. receiving a plurality of utterance texts converted from a…conversation between a plurality of speakers, where the dataset also contains visual and audio features of the dialogues, i.e. converted from a voice conversation Fig. 1, (Sec. 1, 3.3, 4.1, 6, Github ReadMe – Dataset structure));
classifying each of the plurality of utterance texts for each emotion and detecting at least one of emotion utterance texts among the plurality of utterance texts (two networks jointly extract a set of emotion utterances and a set of cause utterances in a conversation, i.e. detecting at least one of emotion utterance…among the plurality of utterance.., by evaluating utterance-level embeddings generated from the word vectors, i.e. each of the plurality of utterance texts, where each utterance is evaluated for emotion detection and cause detection, i.e. classifying each of the plurality of utterance…for each emotion Fig. 3, (Sec. 2.2, 3.1, 3.2, 4 intro, 4.1));
generating candidate emotion cause pairs each including a pair of an emotion utterance text selected from among the at least one of the emotion utterance texts and a cause utterance text corresponding to the selected emotion utterance text (the Joint-EC model takes the emotion utterance vectors and cause utterance vectors as inputs, i.e. an emotion utterance text selected from among the at least one of the emotion utterance texts and a cause utterance text, and a cartesian product is applied to pairing all the possible EC pairs, i.e. generating candidate emotion cause pairs each including a pair of an emotion utterance text…and a cause utterance text corresponding to the selected emotion utterance text Figs. 4 and 7,(Sec. 4.2, 5.1)); and
determining the emotion cause pair from the plurality of generated candidate emotion cause pairs (the Joint-EC model takes the emotion utterance vectors and cause utterance vectors as inputs, where the EC-chunk filter extracts EC-chunk pairs, where a cartesian product is applied to pairing all the possible EC pairs, i.e. plurality of generated candidate emotion cause pairs, and the EC pair filter is then applied using the result of the EC chunk filter, and the final output of the model is the probability that a pair is an EC pair, where correct pairs are identified by the model, i.e. determining the emotion cause pair Figs. 4 and 7, (Sec. 1 para 1, 4.2, 5.1)).
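For illustration only, the candidate-pair generation and filtering mapped above can be sketched in Python as follows. This is a minimal sketch of a cartesian-product pairing followed by a probability filter, not Li's implementation; pair_scorer and the threshold value are hypothetical stand-ins for the EC pair filter described in Li Sec. 4.2.

    # Illustrative sketch only, not Li's implementation. Assumes the indices
    # of detected emotion and cause utterances are already available from
    # upstream classifiers.
    from itertools import product

    def candidate_ec_pairs(emotion_idx, cause_idx):
        # Cartesian product: every detected emotion utterance paired with
        # every detected cause utterance yields the candidate EC pairs.
        return list(product(emotion_idx, cause_idx))

    def determine_ec_pairs(pairs, pair_scorer, threshold=0.5):
        # pair_scorer is a hypothetical stand-in returning the probability
        # that a pair is a true EC pair; pairs above the threshold are kept.
        return [p for p in pairs if pair_scorer(p) > threshold]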
Regarding claims 2 and 12, Li teaches claims 1 and 11, and further teaches
receiving utterance order information of the plurality of utterance texts, and wherein the generating the candidate emotion cause pairs includes determining the cause utterance text of the selected emotion utterance text corresponding to a present or past utterance text within a preset number of times of utterances based on the utterance order information (the utterances are identified as being adjacent or not based on a time step, i.e. receiving utterance order information of the plurality of utterance texts, where the context for each utterance based on the utterances immediately before and after it in time is used for the emotion and cause detection tasks, and an emotional utterance is paired with a cause-chunk defined as a sequence of contiguous cause utterances, i.e. determining the cause utterance text of the selected emotion utterance text corresponding to a present or past utterance text…based on the utterance order information, and where the chunk has a specific set size, i.e. corresponding to a present or past utterance text within a preset number of times of utterances (Sec. 4 intro, 4.1, 5.2.3, 5.5)).
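For illustration, the windowed pairing described in this mapping might be sketched as follows; the window size and indexing convention are assumptions, since the claimed "preset number of times of utterances" is not tied to a particular value.

    # Illustrative sketch only. Utterances are indexed in utterance order;
    # an emotion utterance at index e is paired only with a present or past
    # utterance at index c within `window` conversational turns.
    def windowed_candidates(emotion_idx, cause_idx, window=3):
        return [(e, c) for e in emotion_idx for c in cause_idx
                if 0 <= e - c <= window]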
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 3, 7, 9, 10, 13, 17, and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Wang et al. (“Multimodal Emotion-Cause Pair Extraction in Conversations”, IEEE, 05 December 2022), hereinafter Wang.
Regarding claims 3 and 13, Li teaches claims 1 and 11.
While Li provides that information about speakers is collected and should be taken into account, Li does not specifically teach how the speaker information should be utilized, and thus does not teach
receiving information of the plurality of speakers corresponding to each utterance text, and wherein the generating the emotion cause pair includes determining an emotion cause pair type based on information of each speaker and an emotion type of each speaker, and generating the emotion cause pair based on the emotion cause pair type.
Wang, however, teaches receiving information of the plurality of speakers corresponding to each utterance text, and wherein the generating the emotion cause pair includes determining an emotion cause pair type based on information of each speaker and an emotion type of each speaker, and generating the emotion cause pair based on the emotion cause pair type (features are extracted to obtain independent multimodal representations of each utterance, including the text of the utterance, as well as acoustic features and visual features of the speaker’s voice and facial expressions, i.e. receiving information of the plurality of speakers corresponding to each utterance text, where the emotion extraction identifies the set of emotion utterances based on a multi-class emotion classification to identify the specific emotion category of the utterance, i.e. emotion type of each speaker, based on the multimodal information, i.e. based on information of each speaker, and the emotion cause pairs are determined and categorized with the type of emotion using the determination of the emotion and cause extraction steps, i.e. determining an emotion cause pair type…and generating the emotion cause pair based on the emotion cause pair type Fig. 5, (Intro, Sec. 2, 3.2, 4.1, 4.2.1, 5.3)).
Li and Wang are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction in conversations. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the collecting information about speakers teachings of Li with the use of multimodal input information as taught by Wang. It would have been obvious to combine the references to increase the effectiveness of the MECPE task using multimodal features (Wang Conclusion).
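As a hypothetical illustration only (neither Li nor Wang defines the claimed emotion cause pair types), a pair type keyed on speaker identity and emotion label might look like the following sketch.

    # Hypothetical illustration; the claimed "emotion cause pair type" is an
    # assumption keyed on (a) whether the emotion and cause utterances share
    # a speaker and (b) the classified emotion label of the emotion utterance.
    def pair_type(emotion_utt, cause_utt):
        relation = "self" if emotion_utt["speaker"] == cause_utt["speaker"] else "inter"
        return (relation, emotion_utt["emotion"])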
Regarding claims 7 and 17, Li teaches claims 1 and 11, and further teaches
vectorizing each utterance text including a previous utterance text based on a natural language processing model (the sequence of utterances is processed through an LSTM to generate utterance-level embeddings (Sec. 4.1)).
While Li provides identifying utterances that are an emotion utterance, Li does not specifically teach classifying each utterance into at least one of several emotions based on an emotion classification model, and thus does not teach
classifying each vectorized utterance text into at least one of several emotion based on an emotion classification model.
Wang, however, teaches classifying each vectorized utterance text into at least one of several emotion based on an emotion classification model (each utterance is fed into an encoder to obtain textual features, i.e. each vectorized utterance text, where a trained emotion classifier is used to determine an emotion classification in the first step to determine the emotion, such as surprise or fear, i.e. classifying… into at least one of several emotion based on an emotion classification model Fig. 5,(Sec. 4.1, 4.2.1, 4.2.2, 5.3)).
Li and Wang are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction in conversations. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identifying utterances that are an emotion utterance teachings of Li with the use of an emotion classifier to determine an emotion classification for each utterance as taught by Wang. It would have been obvious to combine the references to increase the effectiveness of the MECPE task using multimodal features (Wang Conclusion).
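For illustration, a minimal sketch of the mapped combination follows: an LSTM-based utterance encoder (in the manner of Li Sec. 4.1) feeding a multi-class emotion classifier (in the manner of Wang Sec. 4.2). The dimensions and the classifier head are assumptions, not taken from either reference.

    # Minimal PyTorch sketch; dimensions and the linear head are assumptions.
    import torch.nn as nn

    class EmotionClassifier(nn.Module):
        def __init__(self, word_dim=300, hidden=128, n_emotions=7):
            super().__init__()
            self.encoder = nn.LSTM(word_dim, hidden, batch_first=True)
            self.head = nn.Linear(hidden, n_emotions)

        def forward(self, word_vectors):          # (batch, words, word_dim)
            _, (h, _) = self.encoder(word_vectors)
            return self.head(h[-1])               # per-emotion logits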
Regarding claim 9, Li teaches claim 1.
While Li provides identifying utterances that are an emotion utterance, Li does not specifically teach classifying each utterance into at least one of several emotions, and thus does not teach
classifying the utterance text as at least one of emotion type among a plurality of emotion types.
Wang, however, teaches classifying the utterance text as at least one of emotion type among a plurality of emotion types (each utterance is fed into an encoder to obtain textual features, i.e. utterance text, where a trained emotion classifier is used to determine an emotion classification in the first step to determine the emotion, such as surprise or fear, i.e. as at least one of emotion type among a plurality of emotion types Fig. 5, (Sec. 4.1, 4.2.1, 4.2.2, 5.3)).
Li and Wang are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction in conversations. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identifying utterances that are an emotion utterance teachings of Li with the use of an emotion classifier to determine an emotion classification for each utterance as taught by Wang. It would have been obvious to combine the references to increase the effectiveness of the MECPE task using multimodal features (Wang Conclusion).
Regarding claims 10 and 19, Li teaches claims 1 and 11, and further teaches
generating the candidate emotion cause pair including the selected emotion utterance text … and ((claim 10) a present or past cause utterance text within the set number of times of utterances in the at least one of the emotion utterance texts)/((claim 19) the cause utterance text)(the utterances are identified as being adjacent or not based on a time step, where the context for each utterance based on the utterances immediately before and after it in time is used for the emotion and cause detection tasks, and an emotional utterance is paired with a cause-chunk defined as a sequence of contiguous cause utterances, i.e. generating the candidate emotion cause pair including the selected emotion utterance text, and where the chunk has a specific set size, i.e. a present or past cause utterance text within the set number of times of utterances in the at least one of the emotion utterance texts/the cause utterance text (Sec. 4 intro, 4.1, 5.2.3, 5.5)).
While Li provides identifying utterances that are an emotion utterance, Li does not specifically teach identifying an emotion type among a plurality of emotion types, and thus does not teach
generating the candidate emotion cause pair including the selected emotion utterance text corresponding to the same emotion type among the plurality of emotion types.
Wang, however, teaches generating the candidate emotion cause pair including the selected emotion utterance text corresponding to the same emotion type among the plurality of emotion types (each utterance is fed into an encoder to obtain textual features, i.e. emotion utterance text, where a trained emotion classifier is used to determine an emotion classification in the first step to determine the emotion, such as surprise or fear, i.e. emotion type among the plurality of emotion types, where the emotion category is applied to the emotion cause pair, i.e. generating the candidate emotion cause pair including the selected emotion utterance text corresponding to the same emotion type Fig. 5,(Sec. 2, 4.1, 4.2.1, 4.2.2,5.3)).
Li and Wang are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction in conversations. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identifying utterances that are an emotion utterance teachings of Li with the classification of specific emotion categories for each emotion-cause pair as taught by Wang. It would have been obvious to combine the references to increase the effectiveness of the MECPE task using multimodal features (Wang Conclusion).
Claim(s) 4-6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Wang, in view of Xu et al. (“An Ensemble Approach for Emotion Cause Detection with Event Extraction and Multi-Kernel SVMs” Tsinghua Science and Technology, December 2017, 22(6): 646–659), hereinafter Xu, and further in view of Shazeer et al. (U.S. PG Pub No. 2019/0251423), as found in the IDS, hereinafter Shazeer.
Regarding claim 4, Li in view of Wang teaches claim 3.
While Li in view of Wang provides identifying correct emotion cause pairs, Li in view of Wang does not specifically teach the use of a mix-of-experts technique, and thus does not teach
determining at least one true emotion cause pair based on a mix-of-experts (MOE) technique using a gating network and a plurality of expert models.
Xu, however, teaches determining at least one true emotion cause pair based on --an ensemble technique-- and a plurality of expert models (for each classifier, the emotion cause is input, and each classifier outputs a result, i.e. an ensemble technique and a plurality of expert models, that is averaged to determine whether the input is an emotion cause event, i.e. determining at least one true emotion cause pair Fig. 6, (Sec. 4 intro, 4.2.3)).
Li, Wang, and Xu are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identifying correct emotion cause pairs teachings of Li, as modified by Wang, with the use of ensemble classifiers to determine a positive emotion cause as taught by Xu. It would have been obvious to combine the references to improve the performance of emotion cause extraction (Xu Sec. 5.2).
While Li in view of Wang and Xu provides the use of an ensemble classifier, Li in view of Wang and Xu does not specifically teach the use of a mix-of-experts technique using a gating network, and thus does not teach
determining at least one --output-- based on a mix-of-experts (MOE) technique using a gating network and a plurality of expert models.
Shazeer, however, teaches determining at least one --output-- based on a mix-of-experts (MOE) technique using a gating network and a plurality of expert models (the input may be a sequence of text or a sequence representing a spoken utterance, which is processed by a first network layer and input to an MoE, which includes multiple expert neural networks, i.e. plurality of expert models, and a gating subsystem, i.e. using a gating network, where the experts generate respective output, i.e. determining at least one output [0017], [0021-22], [0025-7]).
Where Xu specifically teaches that the output of each model is an emotion cause prediction Fig. 6,(Sec. 4 intro, 4.2.3).
Li, Wang, Xu, and Shazeer are analogous art because they are from a similar field of endeavor in processing input with trained models to provide a desired output. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of an ensemble classifier teachings of Li, as modified by Wang and Xu, with the use of an MoE subnetwork to process input as taught by Shazeer. It would have been obvious to combine the references to achieve better results at a reasonable processing time and computational costs for a variety of technical purposes (Shazeer [0010]).
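For illustration, a minimal sketch of an MoE pair scorer in the manner of Shazeer [0027] follows, where a gating network produces per-expert weights and the expert outputs are combined in accordance with those weights. The expert and gate architectures and all dimensions are assumptions, not taken from Shazeer or Xu.

    # Illustrative MoE sketch; architectures and dimensions are assumptions.
    import torch
    import torch.nn as nn

    class MoEPairScorer(nn.Module):
        def __init__(self, pair_dim=256, n_experts=4):
            super().__init__()
            # Plurality of expert models, e.g. one per emotion cause pair type.
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(pair_dim, 64), nn.ReLU(),
                              nn.Linear(64, 1))
                for _ in range(n_experts))
            # Gating network producing one weight per expert.
            self.gate = nn.Linear(pair_dim, n_experts)

        def forward(self, pair_vec):                  # (batch, pair_dim)
            weights = torch.softmax(self.gate(pair_vec), dim=-1)
            preds = torch.cat([e(pair_vec) for e in self.experts], dim=-1)
            # Combine expert predictions in accordance with the gate weights.
            return (weights * preds).sum(dim=-1)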
Regarding claim 5, Li in view of Wang, Xu, and Shazeer teaches claim 4, and Xu further teaches
each expert model is a model pre-trained to predict the true emotion cause pair corresponding to each emotion cause pair type (the classifiers are trained with training data, i.e. each expert model is a model pre-trained, and for each classifier, the emotion cause is input, and each classifier outputs a result that is averaged to determine whether the input is an emotion cause event, i.e. predict the true emotion cause pair Fig. 6, (Sec. 4 intro, 4.2.3)).
Wang further teaches that the emotion cause pair has an emotion category Fig. 5, (Intro, Sec. 2, 3.2, 4.1, 4.2.1, 5.3).
Shazeer further teaches and wherein the gating network is configured to determine a weight for a prediction result of each expert model (the gating subsystem determines a weight for each expert neural network, and combines the expert outputs generated in accordance with the weights for the expert neural networks to generate a MoE output [0027]).
And where the motivation to combine is the same as previously presented.
Regarding claim 6, Li in view of Wang, Xu, and Shazeer teaches claim 5, and Xu further teaches
inputting a first candidate emotion cause pair among the candidate emotion cause pairs into each expert model, and determining whether the first candidate emotion cause pair is a true emotion cause pair corresponding to any of the emotion cause pair types based on the prediction result of each expert model and the weight (for each classifier, the emotion cause is input, i.e. inputting a first candidate emotion cause pair among the candidate emotion cause pairs into each expert model, and each classifier outputs a result that is averaged to determine whether the input is an emotion cause event, i.e. determining whether the first candidate emotion cause pair is a true emotion cause pair corresponding to any of the emotion cause pair types based on the prediction result of each expert model and the weight Fig. 6, (Sec. 4 intro, 4.2.3)).
Wang further teaches that the emotion cause pair has an emotion category Fig. 5, (Intro, Sec. 2, 3.2, 4.1, 4.2.1, 5.3).
Shazeer further teaches that the weights are determined for each expert neural network [0027].
And where the motivation to combine is the same as previously presented.
Claim(s) 8 and 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Wang, and further in view of Acheampong et al. (“Transformer models for text‑based emotion detection: a review of BERT‑based approaches”, Artificial Intelligence Review (2021) 54:5789–5829), hereinafter Acheampong.
Regarding claims 8 and 18, Li in view of Wang teaches claims 7 and 17, and Wang further teaches
… a token sequence from the plurality of utterance texts … and generating a token sequence representation from the token sequence based on BERT to generate each utterance text (the text of each utterance is a sequence of tokens, i.e. a token sequence from the plurality of utterance texts, that is fed into a pre-trained BERT to obtain the textual features of each utterance, i.e. generating a token sequence representation from the token sequence based on BERT to generate each utterance text Fig. 5, (Sec 4.2.1)).
While Li in view of Wang provides processing tokens with BERT, Li in view of Wang does not specifically teach generating the token sequence with a tokenizer, and thus does not teach
generating a token sequence from the plurality of utterance texts based on a tokenizer.
Acheampong, however, teaches generating a token sequence from the plurality of utterance texts based on a tokenizer (input sentences were converted to lower case and tokenized using the WordPiece tokenizer, which was then fine-tuned using BERT (pg 5816-5817)).
Li, Wang, and Acheampong are analogous art because they are from a similar field of endeavor in identifying emotions in text. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the processing tokens with BERT teachings of Li, as modified by Wang, with the use of a tokenizer to prepare the input sentences as taught by Acheampong. It would have been obvious to combine the references to enable significant improvement in emotion recognition from text based on the extraction of context (Acheampong Abstract).
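For illustration, the tokenizer-then-BERT pipeline mapped above could be sketched as follows. The use of the Hugging Face transformers library and the bert-base-uncased checkpoint are implementation assumptions; the references describe WordPiece tokenization and a pre-trained BERT only generically.

    # Illustrative sketch; library and checkpoint choices are assumptions.
    from transformers import BertTokenizer, BertModel

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # WordPiece
    bert = BertModel.from_pretrained("bert-base-uncased")

    utterances = ["I can't believe we won!", "You worked so hard for it."]
    # Token sequence from the plurality of utterance texts.
    batch = tokenizer(utterances, padding=True, return_tensors="pt")
    # Token sequence representation based on BERT.
    reps = bert(**batch).last_hidden_state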
Claim(s) 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Xu, and further in view of Shazeer.
Regarding claim 14, Li teaches claim 11.
While Li provides identifying correct emotion cause pairs, Li does not specifically teach the use of a mix-of-experts technique, and thus does not teach
determine at least one true emotion cause pair based on a mix-of-experts (MOE) technique using a gating network and a plurality of expert models.
Xu, however, teaches determine at least one true emotion cause pair based on --an ensemble technique-- and a plurality of expert models (for each classifier, the emotion cause is input, and each classifier outputs a result, i.e. an ensemble technique and a plurality of expert models, that is averaged to determine whether the input is an emotion cause event, i.e. determine at least one true emotion cause pair Fig. 6, (Sec. 4 intro, 4.2.3)).
Li and Xu are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identifying correct emotion cause pairs teachings of Li, with the use of ensemble classifiers to determine a positive emotion cause as taught by Xu. It would have been obvious to combine the references to improve the performance of emotion cause extraction (Xu Sec. 5.2).
While Li in view of Xu provides the use of an ensemble classifier, Li does not specifically teach the use of a mix-of-experts technique using a gating network, and thus does not teach
determine at least one --output-- based on a mix-of-experts (MOE) technique using a gating network and a plurality of expert models.
Shazeer, however, teaches determine at least one --output-- based on a mix-of-experts (MOE) technique using a gating network and a plurality of expert models (the input may be a sequence of text or a sequence representing a spoken utterance, which is processed by a first network layer and input to an MoE, which includes multiple expert neural networks, i.e. plurality of expert models, and a gating subsystem, i.e. using a gating network, where the experts generate respective output, i.e. determine at least one output [0017], [0021-22], [0025-7]).
Where Xu specifically teaches that the output of each model is an emotion cause prediction Fig. 6,(Sec. 4 intro, 4.2.3).
Li, Xu, and Shazeer are analogous art because they are from a similar field of endeavor in processing input with trained models to provide a desired output. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the use of an ensemble classifier teachings of Li, as modified by Xu, with the use of an MoE subnetwork to process input as taught by Shazeer. It would have been obvious to combine the references to achieve better results at a reasonable processing time and computational costs for a variety of technical purposes (Shazeer [0010]).
Claim(s) 15 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li, in view of Xu, in view of Shazeer, and further in view of Wang.
Regarding claim 15, Li in view of Xu and Shazeer teaches claim 14, and Xu further teaches
each expert model is a model pre-trained to predict the true emotion cause pair… (the classifiers are trained with training data, i.e. each expert model is a model pre-trained, and for each classifier, the emotion cause is input, and each classifier outputs a result that is averaged to determine whether the input is an emotion cause event, i.e. predict the true emotion cause pair Fig. 6, (Sec. 4 intro, 4.2.3)).
Shazeer further teaches and wherein the gating network is configured to determine a weight for a prediction result of each expert model (the gating subsystem determines a weight for each expert neural network, and combines the expert outputs generated in accordance with the weights for the expert neural networks to generate a MoE output [0027]).
While Li provides identifying utterances that are an emotion utterance, Li does not specifically teach identifying an emotion type among a plurality of emotion types, and thus does not teach
corresponding to each emotion cause pair type.
Wang, however, teaches that the emotion cause pair has an emotion category Fig. 5, (Intro, Sec. 2, 3.2, 4.1, 4.2.1, 5.3).
Li, Xu, Shazeer, and Wang are analogous art because they are from a similar field of endeavor in performing emotion-cause pair extraction in conversations. Thus, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the identifying utterances that are an emotion utterance teachings of Li, as modified by Xu and Shazeer, with the classification of specific emotion categories for each emotion-cause pair as taught by Wang. It would have been obvious to combine the references to increase the effectiveness of the MECPE task using multimodal features (Wang Conclusion).
Regarding claim 16, Li in view of Xu, Shazeer, and Wang teaches claim 15, and Xu further teaches
input a first candidate emotion cause pair among the candidate emotion cause pairs into each expert model, and determine whether the first candidate emotion cause pair is a true emotion cause pair corresponding to any of the emotion cause pair types based on the prediction result of each expert model and the weight (for each classifier, the emotion cause is input, i.e. inputting a first candidate emotion cause pair among the candidate emotion cause pairs into each expert model, and each classifier outputs a result that is averaged to determine whether the input is an emotion cause event, i.e. determining whether the first candidate emotion cause pair is a true emotion cause pair corresponding to any of the emotion cause pair types based on the prediction result of each expert model and the weight Fig. 6, (Sec. 4 intro, 4.2.3)).
Wang further teaches that the emotion cause pair has an emotion category Fig. 5, (Intro, Sec. 2, 3.2, 4.1, 4.2.1, 5.3).
Shazeer further teaches that the weights are determined for each expert neural network [0027].
And where the motivation to combine is the same as previously presented.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NICOLE A K SCHMIEDER whose telephone number is (571)270-1474. The examiner can normally be reached 8:00 - 5:00 M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Pierre-Louis Desir can be reached at (571) 272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NICOLE A K SCHMIEDER/Primary Examiner, Art Unit 2659