Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-9 are pending.
Response to Arguments
Applicant's arguments filed with respect to the rejection of Claims 1, 8 and 9 under 35 U.S.C. 102 have been fully considered, but they are not persuasive. Applicant argues that the cited prior art, Provost et al. (US 11,545,173 B1), fails to disclose the limitation “calculate an emotion feature based on the extracted medium features and an emotion feature extraction model associated with a target user, the emotion feature extraction model denoting a standard for the medium features of the target user”.
In response, the examiner points out that Provost teaches utilizing a trained machine learning model to predict a mood of a particular user based on extracted acoustic features (see Fig.1 (122), Fig.6 (606,608) and Col.25, Line 65 – Col.26, Line 20, a set of acoustic features are applied to a machine learning model associated with a particular user). Provost further teaches that the trained machine learning model could be an individual-specific model used to detect mood irregularities for a particular user or clinical patient (see Fig.1 (122) and Col.28, Line 13-23, linguistic and acoustic analysis could be applied with a trained machine learning model that is individual-specific for detecting mood irregularities and depression). Therefore, Provost suggests an apparatus configured to “calculate an emotion feature based on the extracted medium features and an emotion feature extraction model associated with a target user, the emotion feature extraction model denoting a standard for the medium features of the target user”.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1 and 6-9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Provost et al. (US 11,545,173 B1).
Regarding Claims 1, 8 and 9, Provost teaches an emotion estimation apparatus (see Fig.1 (120), Fig.5, Col.6, Line 26-29 and Col.23, Line 13-20), comprising processing circuitry (see Fig.5 (506) and Col.23, Line 13-24) configured to:
acquire medium data of a target user in which a medium of the target user is recorded (see Fig.1 (102,108), Fig.6 (602) and Col.25, Line 20-25, recorded audio data or conversation with a user);
extract medium features from the medium data of the target user (see Fig.6 (604) and Col.25, Line 44-51);
calculate an emotion feature based on the extracted medium features and an emotion feature extraction model associated with a target user (see Fig.1 (122), Fig.6 (606,608) and Col.25, Line 65 – Col.26, Line 20, a set of acoustic features are applied to a machine learning model associated with a particular user), the emotion feature extraction model denoting a standard for the medium features of the target user (see Fig.1 (122) and Col.28, Line 13-23, linguistic and acoustic analysis could be applied with a trained machine learning model that is individual-specific for detecting mood irregularities and depression);
and estimate an emotion of the target user based on the calculated emotion feature (see Fig.1 (120), Fig.6 (608) and Col.26, Line 7-13).
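For illustration of the claim mapping only, the following minimal Python sketch shows one way the steps recited in Claim 1 could be arranged. Every name, interface, and threshold below is hypothetical; it is not drawn from Provost's disclosure or from Applicant's specification.

```python
# Hypothetical sketch of the Claim 1 pipeline; not Provost's or Applicant's code.
import numpy as np

def extract_medium_features(audio: np.ndarray, frame_len: int = 400) -> np.ndarray:
    """Extract simple per-frame acoustic features (frame energy, zero-crossing rate)."""
    frames = [audio[i:i + frame_len] for i in range(0, len(audio) - frame_len, frame_len)]
    energy = np.array([np.mean(f ** 2) for f in frames])
    zcr = np.array([np.mean(np.abs(np.diff(np.sign(f)))) / 2 for f in frames])
    return np.stack([energy, zcr], axis=1)  # shape: (n_frames, 2)

class UserEmotionModel:
    """Hypothetical user-specific model holding that user's baseline feature statistics,
    i.e., the 'standard for the medium features of the target user'."""
    def __init__(self, baseline_mean: np.ndarray, baseline_std: np.ndarray):
        self.mean = baseline_mean
        self.std = baseline_std

    def emotion_feature(self, features: np.ndarray) -> np.ndarray:
        # Normalize the observed features against this user's own baseline.
        return (features.mean(axis=0) - self.mean) / self.std

def estimate_emotion(emotion_feature: np.ndarray) -> str:
    # Toy threshold-based decision standing in for a trained classifier.
    return "irregular" if np.linalg.norm(emotion_feature) > 2.0 else "neutral"
```

In this sketch, the normalization against a per-user baseline is what plays the role of the recited "standard for the medium features of the target user"; a deviation from that standard is then the input to the emotion estimate.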
Regarding Claim 6, Provost further teaches wherein the processing circuitry receives medium data acquired by the medium data acquisition device and target user information for specifying the target user (see Fig.5 (502,506), Col.12, Line 31-37, Col.24, Line 8-15 and Col.24, Line 29-46, remote device 506 acquires recorded audio data and a trained user ID machine learning model), and extracts the medium data of the target user from the received medium data based on the received target user information (see Fig.5 (502,506), Col.24, Line 8-15 and Col.24, Line 29-46, extracting data for user identification).
Regarding Claim 7, Provost further teaches wherein the medium includes speech (see Fig.6 (602) and Col.25, Line 20-27, recorded telephone call).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2-4 are rejected under 35 U.S.C. 103 as being unpatentable over Provost et al. (US 11,545,173 B1) in view of Ostrand et al. (US 12,148,443 B1).
Regarding Claim 2, Provost teaches the apparatus of Claim 1 but fails to teach acquiring the emotion feature extraction model associated with a target user from a database storing one or more emotion feature extraction models associated with one or more users.
Provost, however, teaches acquiring an emotion feature extraction model associated with a particular user for predicting a mood of the user based on extracted acoustic features (see Fig.1 (122), Fig.6 (606,608) and Col.25, Line 65 – Col.26, Line 20).
Ostrand teaches querying a database for a speaker-specific acoustic model and retrieving the acoustic model from the database (see Fig.4 (452), Fig.7 (710,711,713) and Col.14, Line 4-23).
It would have been obvious to one skilled in the art, before the effective filing date of the application, to configure Provost’s apparatus to acquire an emotion feature extraction model associated with a target user from a database storing one or more emotion feature extraction models associated with one or more users. The motivation would be to acquire an individual-specific emotion feature extraction model from a storage data structure that stores the emotion feature extraction models for multiple individuals.
Regarding Claim 3, Provost teaches training the emotion feature extraction model based on the medium data of a target user (see Fig.5 (502B), Col.11, Line 63-67 and Col.21, Line 54-59), but fails to teach determining whether or not the emotion feature extraction model exists in the model database; acquiring the emotion feature extraction model from the model database if the emotion feature extraction model exists in the model database; and registering a trained emotion feature extraction model in the database.
Ostrand, however, teaches querying a database for a speaker-specific acoustic model and retrieving the acoustic model from the database (see Fig.4 (452), Fig.7 (710,711,713) and Col.14, Line 4-23); and creating a profile of a speaker and storing the profile in the database if a speaker-specific acoustic model of a particular user does not exist (see Fig.7 (710,711,712) and Col.14, Line 4-23).
It would have been obvious to one skilled in the art, before the effective filing date of the application, to configure the emotion estimation apparatus to determine whether or not the emotion feature extraction model exists in the model database; acquire the emotion feature extraction model from the model database if the emotion feature extraction model exists in the model database; and register a trained emotion feature extraction model in the database. The motivation would be to train and store new emotion feature extraction models for new target users.
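For illustration only, the following minimal sketch shows the get-or-train behavior at issue in Claims 2 and 3: look up a user-specific model in a database, and train and register a new one if none exists. The ModelDatabase class and get_or_train_model helper are hypothetical stand-ins, not an API from Provost or Ostrand.

```python
# Hypothetical sketch of the Claim 2-3 model-database behavior.
class ModelDatabase:
    """Toy in-memory store mapping user IDs to emotion feature extraction models."""
    def __init__(self):
        self._models = {}

    def exists(self, user_id: str) -> bool:
        return user_id in self._models

    def acquire(self, user_id: str):
        return self._models[user_id]

    def register(self, user_id: str, model) -> None:
        self._models[user_id] = model

def get_or_train_model(db: ModelDatabase, user_id: str, medium_data, train_fn):
    # Acquire the stored model if one exists for this user;
    # otherwise train a new model on the user's medium data and register it.
    if db.exists(user_id):
        return db.acquire(user_id)
    model = train_fn(medium_data)
    db.register(user_id, model)
    return model
```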
Regarding Claim 4, Provost further teaches wherein the processing circuitry generates the emotion feature extraction model by calculating a statistic of medium features extracted from at least a part of the medium data of the target user (see Fig.6 (606), Col.25, Line 65 – Col.26, Line 20 and Col.26, Line 32-44).
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Provost et al. (US 11,545,173 B1) in view of Sasaki et al. (US 11,422,527 B2).
Regarding Claim 5, Provost teaches the processing circuitry is configured to: calculate, for each processing unit in which emotion estimation is to be performed, a statistic of the extracted medium features (see Fig.6 (606), Col.25, Line 65 – Col.26, Line 20 and Col.26, Line 32-44), but fails to teach calculating, for each processing unit, the emotion feature based on a difference between the calculated statistic of the extracted medium features and the emotion feature extraction model.
Sasaki, however, teaches calculating a degree of difference between the extracted features from a feature extraction unit and the learning results of normal sounds to determine whether a sound input is normal or abnormal (see Fig.1 (17) and Col.5, Line 17-34).
It would have been obvious to one skilled in the art, before the effective filing date of the application, to configure the emotion estimation apparatus to calculate, for each processing unit, the emotion feature based on a difference between the calculated statistic of the extracted medium features and the emotion feature extraction model. The motivation would be to perform classification on the medium data based on the difference between the extracted features and the learned training data for those features.
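For illustration only, the following minimal sketch shows the per-processing-unit calculation at issue in Claim 5: a statistic is computed for each unit (as in Provost) and the emotion feature is taken as its difference from the model's baseline, in the manner of Sasaki's degree-of-difference comparison. All names and shapes are hypothetical.

```python
# Hypothetical sketch of the Claim 5 per-unit difference calculation.
import numpy as np

def per_unit_emotion_features(features: np.ndarray, unit_len: int,
                              model_mean: np.ndarray) -> np.ndarray:
    """For each processing unit, compute a statistic of the extracted medium
    features and the emotion feature as its difference from the user's baseline.

    features:   (n_frames, d) array of extracted medium features
    unit_len:   number of frames per processing unit
    model_mean: (d,) baseline statistic held by the emotion feature extraction model
    """
    out = []
    for start in range(0, len(features) - unit_len + 1, unit_len):
        unit = features[start:start + unit_len]
        stat = unit.mean(axis=0)       # statistic for this processing unit
        out.append(stat - model_mean)  # difference from the model's standard
    return np.array(out)
```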
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VU B HANG whose telephone number is (571)272-0582.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hai Phan, can be reached at (571)272-6338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/VU B HANG/Primary Examiner, Art Unit 2654