Prosecution Insights
Last updated: April 18, 2026
Application No. 18/300,989

MONITORING OF FACIAL CHARACTERISTICS

Final Rejection (§103)
Filed: Apr 14, 2023
Examiner: JONES, CARISSA ANNE
Art Unit: 2691
Tech Center: 2600 — Communications
Assignee: Nokia Technologies Oy
OA Round: 4 (Final)
Grant Probability: 83% (Favorable)
Expected OA Rounds: 5-6
Time to Grant: 2y 10m
Grant Probability with Interview: 99%

Examiner Intelligence

Career Allow Rate: 83% (20 granted / 24 resolved), above average and +21.3% vs TC avg
Interview Lift: strong, +25.0% higher allowance in resolved cases with an interview than without
Avg Prosecution: 2y 10m typical timeline, with 30 applications currently pending
Total Applications: 54, across all art units
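These headline figures are plain ratios over the examiner's resolved docket. The Python sketch below reproduces the arithmetic; the per-case interview split is hypothetical (chosen so it happens to match the displayed +25.0% lift), and capping the blended figure at 99% is an assumption about how the dashboard clamps it:

```python
# Sketch of the examiner-metric arithmetic shown above.
# Hypothetical inputs; the dashboard's actual data pipeline is not shown.

granted, resolved = 20, 24            # from "20 granted / 24 resolved"
allow_rate = granted / resolved       # 0.833... -> displayed as 83%

# Interview lift: allow rate among resolved cases that had an examiner
# interview minus the rate among those that did not. The split below is
# a hypothetical that is consistent with the displayed +25.0%.
iv_granted, iv_resolved = 8, 8        # with interview (hypothetical)
no_granted, no_resolved = 12, 16      # without interview (hypothetical)
lift = iv_granted / iv_resolved - no_granted / no_resolved   # 0.25

# Blended "with interview" probability, assuming the UI caps at 99%.
with_interview = min(allow_rate + lift, 0.99)

print(f"Career allow rate: {allow_rate:.0%}")     # 83%
print(f"Interview lift:    {lift:+.1%}")          # +25.0%
print(f"With interview:    {with_interview:.0%}") # 99%
```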

Statute-Specific Performance

Statute   This examiner   vs TC avg
§101       3.1%           -36.9%
§103      76.0%           +36.0%
§102      11.6%           -28.4%
§112       4.9%           -35.1%

The chart's black line (Tech Center average) is an estimate. Based on career data from 24 resolved cases.
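A quick sanity check on the table: subtracting each displayed delta from the examiner's share recovers the implied Tech Center baseline. The sketch below does exactly that; every statute implies the same 40.0% figure, which suggests (though the dashboard does not say so) that the black line is a single flat estimate rather than a per-statute average:

```python
# Recover the implied Tech Center baseline from the table above.
# Inputs are the displayed figures; "TC avg" semantics are assumed to be
# simply (examiner share) - (delta).
shares = {"§101": 3.1, "§103": 76.0, "§102": 11.6, "§112": 4.9}
deltas = {"§101": -36.9, "§103": +36.0, "§102": -28.4, "§112": -35.1}

for statute, share in shares.items():
    tc_avg = share - deltas[statute]
    print(f"{statute}: examiner {share:.1f}%, implied TC avg {tc_avg:.1f}%")
# Every row yields 40.0%, i.e. the baseline appears to be flat.
```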

Office Action (§103, Final)

DETAILED ACTION

This action is in response to the remarks filed 03/05/2026. Claims 16-25, 27, and 29-35 are pending. Claims 1-15, 26, and 28 are cancelled.

Response to Arguments

Applicant's arguments with respect to claims 16-25, 27, and 29-35 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Response to Amendment

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 16, 17, 19, 21-24, 27, and 30-35 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (E.P. Pub. No. 3402186, hereinafter "Lee") in view of Shimizu et al. (U.S. Patent No. 11,627,007, hereinafter "Shimizu"), Jung et al. (E.P. Pub. No. 2753076, hereinafter "Jung"), Chin et al. (U.S. Pub. No. 2019/0043064, hereinafter "Chin"), Garcia (U.S. Pub. No. 2016/0055370), and Zhao (C.N. Pub. No. 110502609).

Regarding Claim 16, Lee teaches An apparatus (see Lee Paragraph [0001], apparatus) comprising: at least one processor (see Lee Paragraph [0040], processor); and at least one memory storing instructions that, when executed by the at least one processor (see Lee Paragraph [0040], the memory 130 may include a computer readable recording medium having a program recorded thereon to execute the method according to various example embodiments in the processor 120), cause the apparatus at least to: determine that a communications device is positioned close to an ear of a user of the communications device during a communication session (see Lee Paragraph [0920], in operation 6605, the controller 580 may detect proximity or acceleration of the electronic device 500 during the voice call; according to various embodiments, the controller 580 may determine a distance with an object (e.g., a user's face or a user's ear) or a change in the distance based on the proximity or acceleration), wherein the communications device comprises at least one display (see Lee Paragraph [0016], there is provided an electronic device including a display, and Figure 14A, a display on the electronic device); use one or more sensors in or under the display of the communications device (see Lee Paragraph [0052], the electronic device 201 may include a sensor module 240) to monitor one or more facial characteristics of the user (see Lee Paragraph [0249], the image process module 844 may perform biometric identification (e.g., iris, fingerprint, face, pose-based activity recognition, etc.), character recognition, handwriting recognition, image code recognition (e.g., barcode, quick response (QR) code, PDF-417, color code, etc.), machine vision such as object recognition, artificial intelligence, machine learning, and decision tree function; the image process module 844 may be interlocked with a database or learning engine of an external device in connection with performing various functions) while the communications device is close to the user's ear (see Lee Paragraph [0920], as quoted above).

Lee does not expressly teach: such that an imaging device of the communications device can no longer capture images of the user's face; identify an emotional context of the user based on the monitored one or more facial characteristics of the user, wherein a machine learning program is used to identify the emotional context based upon the one or more facial characteristics of the user monitored by the one or more sensors; detect a user input indicating whether or not the identified emotional context is correct; and train or update the machine learning program based upon the user input indicating whether or not the identified emotional context is correct.

However, Shimizu teaches such that an imaging device of the communications device can no longer capture images of the user's face (see Shimizu Column 17, lines 30-43, for example, when the face region cannot be detected from the wide angle image at a certain moment in the face detection processing in the step S3, the face detection processing may be retried by using an image at a different moment in accordance with the number of times of the retry that has been previously set; alternatively, when the face region cannot be detected or tracked, the latest detected face image or the registration image D10 may be used as the replacement image; the mobile information terminal 1 may display a message or others indicating that, for example, the face cannot be detected, inside the display screen DP, and handle this case by asking the user A to confirm whether or not the registration image D10 is to be used as the replacement transmission image D12; and Column 34, lines 23-27, when a face(s) of one (or some) of the users cannot be handled on time, the mobile information terminal 1 may use the registration image D10 as the replacement image, or an image of another icon, scenery or others as the replacement image).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that uses one or more sensors of the communications device to monitor facial characteristics of the user and determines proximity to the user (as taught in Lee) with an imaging device of a communications device no longer being able to capture images of the user's face (as taught in Shimizu), the motivation being to provide accurate user authentication and security even when a face cannot be seen by a device's camera, such as during a phone call (see Shimizu Column 24, lines 19-21, Column 17, lines 30-43, and Column 34, lines 23-27).

Lee in view of Shimizu does not expressly teach: identify an emotional context of the user based on the monitored one or more facial characteristics of the user, wherein a machine learning program is used to identify the emotional context based upon the one or more facial characteristics of the user monitored by the one or more sensors; detect a user input indicating whether or not the identified emotional context is correct; and train or update the machine learning program based upon the user input indicating whether or not the identified emotional context is correct.

However, Jung teaches identify an emotional context of the user based on the monitored one or more facial characteristics of the user (see Jung Abstract and Figure 4, in response to detecting a face in the captured image, the facial image data is analyzed to identify an emotional characteristic of the face by identifying a facial feature and comparing the identified feature with a predetermined feature associated with an emotion).

It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that uses one or more sensors of the communications device to monitor facial characteristics of the user and determines proximity to the user (as taught in Lee in view of Shimizu) with identifying an emotional context of a user based on monitored facial characteristics of the user (as taught in Jung), the motivation being to provide users the emotional context of other users to improve social awareness of communication (see Jung Paragraph [0053]).

Lee in view of Shimizu and Jung does not expressly teach: wherein a machine learning program is used to identify the emotional context based upon the one or more facial characteristics of the user monitored by the one or more sensors; detect a user input indicating whether or not the identified emotional context is correct; and train or update the machine learning program based upon the user input indicating whether or not the identified emotional context is correct.

However, Chin teaches wherein a machine learning program is used to identify the emotional context based upon the one or more facial characteristics of the user monitored by the one or more sensors (see Chin Paragraph [0022], by receiving these and other sensor data, it will be appreciated sensor data may, individually or in combination, allow an analytics tools to directly or indirectly with analysis, determine various desirable features 118 within a particular environment; for example, the sensor(s) data may be used to detect 120 people within the environment, detect 122 (and hence then classify) gestures, perceive 124 emotional context for a person, and the like; multiple different sensors may be used to analyze a person and the various sensor data may be combined to make, for example, a likely determination of emotional state in a given context, all of the various sensor and derivable context inputs may be analyzed to identify emotional context and other analytic results; Paragraph [0026], reviews are determined 136 with a tool or tools that are provided analytics 116 data 120-134 along with other data to assist with automatic review determination and creation, tool(s) generating the review may use an artificial intelligence (AI) component, e.g., neural net, deep neural network, expert system, rules system, etc., that may be trained on a variety of training models 140; and Paragraph [0020], vision sensor data may be used to detect emotional state based at least in part on analyzing body language, gait, movement, facial expression, etc.).

It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that uses one or more sensors of the communications device to monitor facial characteristics of the user and determines proximity to the user to identify an emotional context of a user based on monitored facial characteristics (as taught in Lee in view of Shimizu and Jung) with a machine learning program used to identify emotion based upon characteristics of a user monitored by sensors (as taught in Chin), the motivation being to implement machine learning to address the demand of performing data-intensive tasks (see Chin Paragraph [0072]).

Lee in view of Shimizu, Jung, and Chin does not expressly teach: detect a user input indicating whether or not the identified emotional context is correct; and train or update the machine learning program based upon the user input indicating whether or not the identified emotional context is correct.

However, Garcia teaches detect a user input indicating whether or not the identified emotional context is correct (see Garcia Paragraph [0030], the facial expression recognition algorithm is further trained using existing user face images for each emotion category; in an embodiment, the automatic emotion recognition function operation may also include prompting the user to confirm the result of the analysis; the user face image is hence added to an emotion category if approved by the user).

It would have been additionally further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that uses one or more sensors of the communications device to monitor facial characteristics of the user and determines proximity to the user to identify an emotional context of a user using machine learning based on monitored facial characteristics (as taught in Lee in view of Shimizu, Jung, and Chin) with detecting a user input indicating whether or not the identified emotional context is correct (as taught in Garcia), the motivation being to add an additional step of confirmation in order to represent true and accurate representations of personal emotions (see Garcia Abstract).

Lee in view of Shimizu, Jung, Chin, and Garcia does not expressly teach: train or update the machine learning program based upon the user input indicating whether or not the identified emotional context is correct.

However, Zhao teaches train or update the machine learning program based upon the user input indicating whether or not the identified emotional context is correct (see Zhao Page 4, after the problem externalization model externalizes the user's "question story", the robot voice outputs the externalization result, and the built-in camera is turned on to record the user's facial expression at this time for deep analysis of expression; if the user's emotion is confirmed to be happy, joyful, relaxed, peaceful, etc., a flashing blue background light reminds the user that the emotion adjustment of the round is successful; among them, if the topic naming is stored in the topic thesaurus in step S3, the successful externalized language and topic naming are stored in the problem externalization model for direct recall in the future).

It would have been further obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that uses one or more sensors of the communications device to monitor facial characteristics of the user and determines proximity to the user to identify an emotional context of a user based on monitored facial characteristics using machine learning and detect user input to confirm the emotional context (as taught in Lee in view of Shimizu, Jung, Chin, and Garcia) with training a machine learning program based upon the user input indicating whether or not the identified emotional context is correct (as taught in Zhao), the motivation being to apply artificial intelligence in order to concisely and accurately classify data and perform analysis, more specifically regarding emotional context (see Zhao Page 3).

Regarding Claim 17, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 16 further configured to transmit information indicative of the identified emotional context to one or more other participants in the communication session (see Jung Figure 6 and Paragraph [0052], as indicated by image 601, a first emotion indicative image 71 of "Lee Sook" may be output in a first emotion indicative image region 41 on the display unit of the terminal of "Yoon Hee"; as described before, the first emotion indicative image 71 may be updated when an emotional state change is detected through emotional analysis based on a facial image of "Lee Sook"; in response to settings, the facial image of "Lee Sook" may be updated in real time or at regular intervals when an emotional state change is detected or not detected; the first emotion indicative image 71 may be composed of an emotion indicative image and textual emotional state information such as "Happiness" or "Neutral"; as an emotion indicative image may be output in the image region 51 or 52, the mobile terminal 100 may set the emotion indicative image size to the size of a thumbnail usable in instant messaging at the time of emotion indicative image generation).

Regarding Claim 19, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 17 wherein the information indicative of the identified emotional context is usable to generate a visual representation of the identified emotional context (see Jung Figure 9 and Paragraph [0059], visual effects applied to display of facial emotion indicative images (see images 71, 72, and 73)).

Regarding Claim 21, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 16 wherein the one or more sensors in or under the display of the communications device comprise one or more optical sensors (see Lee Paragraph [0285], the sensors 890 may include an optical sensor).

Regarding Claim 22, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 21 wherein the one or more optical sensors are configured to detect infrared light that is reflected by the user's face (see Lee Paragraph [0061], the sensor module 240, for example, may measure a physical quantity or detect an operation state of the electronic device 201, and may convert the measured or detected information into an electrical signal; the sensor module 240 may include, for example, an Infrared (IR) sensor).

Regarding Claim 23, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 16 wherein the one or more facial characteristics comprise at least one of: the position of one or more facial features, relative distances between two or more facial features, movement of one or more facial features, or skin tone (see Jung Paragraph [0017], for emotion analysis, the sender terminal 101 supports emotion classification according to changes in feature elements of the face, such as eyes, nose, ears, forehead, cheekbones, chin, cheeks and facial appearance; for example, the sender terminal 101 may be equipped with an emotion classification database to support identification of an emotional state of the user on the basis of eye shape change, mouth openness or corner change, ear change, forehead crease change, chin position change, cheek shape change, face shadow change for example).

Regarding Claim 24, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 16 further configured to control the communications device to provide an output to the user to indicate the identified emotional context (see Jung Figure 9 and Paragraph [0059], visual effects applied to display of facial emotion indicative images (see images 71, 72, and 73)).

Regarding Claim 27, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 16 wherein the communications device comprises one or more microphones and wherein the at least one memory and the instructions stored therein are configured to, with the at least one processor, further cause the apparatus to: analyze microphone output signals to determine when the user is speaking and control the monitoring of the one or more facial characteristics to monitor the one or more facial characteristics of the user while they are speaking (see Jung Paragraph [0003], the system provides a user interface based on face recognition supporting bidirectional communication of still images in response to detected change in facial expression of a speaking user; and Paragraph [0075], that is, when a voice signal generated by the speaking user is acquired by the microphone MIC, the mobile terminal 100 may capture an image using the camera module 170 and perform emotional analysis using the captured image).

Regarding Claim 30, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches An apparatus as claimed in claim 16 wherein the apparatus is at least one of: a telephone, a camera, a computing device, a teleconferencing device, a virtual reality device, or an augmented reality device (see Jung Paragraph [0004], a mobile device user interface method is provided, that activates a camera module to support a video chat function and acquires an image of a target object using the camera module).

Regarding Claim 31, it has been rejected similarly to Claim 16. The method can be found in Lee (Paragraph [0001], method).
Regarding Claim 32, it has been rejected similarly to Claim 17. The method can be found in Lee (Paragraph [0001], method).
Regarding Claim 33, it has been rejected similarly to Claim 21. The method can be found in Lee (Paragraph [0001], method).
Regarding Claim 34, it has been rejected similarly to Claim 16. The non-transitory computer readable medium can be found in Lee (Paragraph [0019], computer-readable recording medium).
Regarding Claim 35, it has been rejected similarly to Claim 21. The non-transitory computer readable medium can be found in Lee (Paragraph [0019], computer-readable recording medium).

Claims 18, 20, and 29 are rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (E.P. Pub. No. 3402186, hereinafter "Lee") in view of Shimizu et al. (U.S. Patent No. 11,627,007, hereinafter "Shimizu"), Jung et al. (E.P. Pub. No. 2753076, hereinafter "Jung"), Chin et al. (U.S. Pub. No. 2019/0043064, hereinafter "Chin"), Garcia (U.S. Pub. No. 2016/0055370), Zhao (C.N. Pub. No. 110502609), and Tong et al. (U.S. Pub. No. 2014/0198121, hereinafter "Tong").

Regarding Claim 18, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches all the limitations of Claim 17, but does not expressly teach An apparatus as claimed in claim 17 wherein the at least one memory and the instructions stored therein are configured to, with the at least one processor, further cause the apparatus to: detect a user input indicating whether or not the user of the communications device wants to transmit information indicative of the identified emotional context to one or more other participants in the communication session; and in response to the user input, determine whether or not to transmit the information indicative of the identified emotional context to the one or more other participants in the communication session.

However, Tong, in a similar field of invention, teaches detect a user input indicating whether or not the user of the communications device wants to transmit information indicative of the identified emotional context to one or more other participants in the communication session (see Tong Paragraph [0039], in one embodiment the general expression of the detected face may be converted into one or more parameters that cause the avatar to exhibit the same expression; and Paragraph [0044], the feedback avatar 214 represents how the selected avatar appears on the remote device, in a virtual place, etc.; in particular, the feedback avatar 214 appears as the avatar selected by the user and may be animated using the same parameters generated by avatar control module 210; in this way the user may confirm what the remote user is seeing during their interaction); and in response to the user input, determine whether or not to transmit the information indicative of the identified emotional context to the one or more other participants in the communication session (see Tong Paragraphs [0039] and [0044], as quoted above).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that determines that a communication device is in proximity to a user during a communication session and uses one or more sensors in or under the display of the communications device to monitor and identify emotional context based on facial characteristics of the user using machine learning, allows the user to confirm the emotional context, and trains a machine learning program based on the user input as to whether or not the emotional context is correct (as taught in Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao) with a user input indicating whether or not a user wants to transmit emotional context information to other participants in the communication session (as taught in Tong), the motivation being to allow the user to choose not to transmit emotional context if it is incorrect, or if they wish to have privacy and do not desire one or more participants to view their avatar's emotional context (see Tong Paragraph [0039]).

Regarding Claim 20, Lee in view of Shimizu, Jung, Chin, Garcia, Zhao, and Tong teaches An apparatus as claimed in claim 19 wherein the communication session comprises a video call and wherein the at least one memory and the instructions stored therein are configured to, with the at least one processor, further cause the apparatus to: use the visual representation of the identified emotional context to replace images from the imaging device of the communications device in the video call (see Tong Paragraph [0039], in one embodiment the general expression of the detected face may be converted into one or more parameters that cause the avatar to exhibit the same expression; and Figure 2, an avatar is generated to represent the user in a video call (see Figures 4A-4C showing avatar generation)).

Regarding Claim 29, Lee in view of Shimizu, Jung, Chin, Garcia, Zhao, and Tong teaches An apparatus as claimed in claim 19 wherein the visual representation comprises at least one of: an avatar, an animated Emoji®, or an image of the user of the communications device (see Tong Paragraph [0039] and Figure 2, as quoted above).

Claim 25 is rejected under 35 U.S.C. 103 as being unpatentable over Lee et al. (E.P. Pub. No. 3402186, hereinafter "Lee") in view of Shimizu et al. (U.S. Patent No. 11,627,007, hereinafter "Shimizu"), Jung et al. (E.P. Pub. No. 2753076, hereinafter "Jung"), Chin et al. (U.S. Pub. No. 2019/0043064, hereinafter "Chin"), Garcia (U.S. Pub. No. 2016/0055370), Zhao (C.N. Pub. No. 110502609), and Tartz et al. (C.N. Pub. No. 103782253, hereinafter "Tartz").

Regarding Claim 25, Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao teaches all the limitations of Claim 24, but does not expressly teach An apparatus as claimed in claim 24 wherein the communications device comprises a tactile output device and wherein the at least one memory and the instructions stored therein are configured to, with the at least one processor, further cause the apparatus to: control the tactile output device to provide the output indicating the identified emotional context.

However, Tartz, in a similar field of invention, teaches An apparatus as claimed in claim 24 wherein the communications device comprises a tactile output device and wherein the at least one memory and the instructions stored therein are configured to, with the at least one processor, further cause the apparatus to: control the tactile output device to provide the output indicating the identified emotional context (see Tartz Page 7 of Description, the second device 115-b-2 of analysis module 225-a may be analyzed to determine the emotional state of the first user; for example, it can, through tactile response, provide a response form of heat using the Peltier element to reflect first user anxiety or agitation, a vibration burst, an audible indicator, a specific frequency, sound intensity or duration, a form of cooling to reflect silence, or a shorter sound to indicate the change in the emotional state, so as to transmit emotional context).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of an apparatus that determines that a communication device is in proximity to a user during a communication session and uses one or more sensors in or under the display of the communications device to monitor and identify emotional context based on facial characteristics of the user using machine learning, allows the user to confirm the emotional context, and trains a machine learning program based on the user input as to whether or not the emotional context is correct (as taught in Lee in view of Shimizu, Jung, Chin, Garcia, and Zhao) with a tactile output device that provides an output indicating emotional context (as taught in Tartz), the motivation being to provide a communication session that can deliver a more realistic (face-to-face) level of interaction (see Tartz Page 1 of Description).

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Refer to PTO-892, Notice of References Cited, for a listing of analogous art.

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARISSA A JONES, whose telephone number is (703) 756-1677. The examiner can normally be reached via telework, M-F 6:30 AM - 4:00 PM CT. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Duc Nguyen, can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/CARISSA A JONES/
Examiner, Art Unit 2691

/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2691
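The disputed claim 16 is easier to follow as code. Below is a loose, illustrative Python paraphrase of the claimed five-step loop (proximity gate, under-display sensing, ML classification, user confirmation, model update). Every function name, sensor field, and threshold is hypothetical; this is not Nokia's implementation nor code from any cited reference:

```python
"""Illustrative paraphrase of claim 16's feedback loop.
All names are hypothetical; a sketch, not any party's implementation."""
import random

def at_ear(proximity_cm: float) -> bool:
    # Step 1: device is positioned close to the user's ear, so the
    # front camera can no longer capture images of the face.
    return proximity_cm < 3.0

def read_under_display_sensors() -> dict:
    # Step 2: sensors in or under the display (e.g. IR reflectance)
    # stand in for the camera while the phone is at the ear. Stubbed.
    return {"ir_reflectance": random.random(),
            "feature_motion": random.random()}

def classify_emotion(features: dict, model: dict) -> str:
    # Step 3: a machine-learning program maps facial characteristics
    # to an emotional context. A toy threshold model stands in here.
    return "happy" if features["feature_motion"] > model["threshold"] else "neutral"

def train_on_feedback(model: dict, correct: bool) -> None:
    # Steps 4-5: detect user input confirming or denying the label,
    # then train/update the model (here, a crude threshold nudge).
    model["threshold"] += -0.05 if correct else +0.05

model = {"threshold": 0.5}
if at_ear(proximity_cm=1.2):
    label = classify_emotion(read_under_display_sensors(), model)
    user_says_correct = True  # claim: input indicating whether the label is correct
    train_on_feedback(model, user_says_correct)
    print(f"identified emotional context: {label}")
```

The rejection maps each of these steps to a different reference (Lee for steps 1-2, Jung/Chin for step 3, Garcia for step 4, Zhao for step 5), which is why the obviousness chain runs through six references.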

Prosecution Timeline

Apr 14, 2023: Application Filed
Mar 13, 2025: Non-Final Rejection — §103
Jun 12, 2025: Response Filed
Jun 25, 2025: Final Rejection — §103
Oct 22, 2025: Response after Non-Final Action
Nov 13, 2025: Request for Continued Examination
Nov 21, 2025: Response after Non-Final Action
Dec 01, 2025: Non-Final Rejection — §103
Mar 05, 2026: Response Filed
Apr 07, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this examiner involving similar technology

Patent 12598267: IMAGE CAPTURE APPARATUS AND CONTROL METHOD
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12598354: INFORMATION PROCESSING SERVER, RECORD CREATION SYSTEM, DISPLAY CONTROL METHOD, AND NON-TRANSITORY RECORDING MEDIUM
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12593004: DISPLAY METHOD, DISPLAY SYSTEM, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM STORING PROGRAM
Granted Mar 31, 2026 (2y 5m to grant)

Patent 12556468: QUALITY TESTING OF COMMUNICATIONS FOR CONFERENCE CALL ENDPOINTS
Granted Feb 17, 2026 (2y 5m to grant)

Patent 12556655: Efficient Detection of Co-Located Participant Devices in Teleconferencing Sessions
Granted Feb 17, 2026 (2y 5m to grant)
Based on this examiner's 5 most recent grants; study what changed in each case to get past this examiner.


Prosecution Projections

Expected OA Rounds: 5-6
Grant Probability: 83%
With Interview: 99% (+25.0%)
Median Time to Grant: 2y 10m
PTA Risk: High
Based on 24 resolved cases by this examiner. Grant probability derived from career allow rate.
