DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 21 August 2025 has been entered.
Response to Amendment
Claims 1-3, 5-9, 12-17, & 19-24 were previously pending in this application. The amendment filed 21 August 2025 has been entered and the following has occurred: Claims 1, 8-9, 14, 16-17, 21, & 24 have been amended. No claims have been added. No claims have been cancelled.
Claims 1-3, 5-9, 12-17, & 19-24 remain pending in the application.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5-9, 12-17, & 19-24 are rejected under 35 U.S.C. 103 as being unpatentable over Quy et al. (U.S. Patent Publication No. 2021/0118323), hereinafter “Quy”, in view of Amble et al. (U.S. Patent Publication No. 2020/0221951), hereinafter “Amble”.
Claim 1 –
Regarding Claim 1, Quy discloses a method for providing a user characteristic to a service provider for a virtual conference with a user, the method comprising:
collecting, by a user computing device (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, or personal computer), raw media data associated with the user during the virtual conference between the user and the service provider (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist), wherein
the raw media data comprises video data (See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting);
extracting, by the user computing device during the virtual conference (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist), intermediate user data from the raw media data by performing a first processing of the video data (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e.
video data, such as posture, head movements, or fidgeting), wherein the intermediate user data comprises one or more of a physiological signal and a behavioral signal associated with the user (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses voice features extracted from a video call and various body language features, i.e. behavioral signals, being identified and extracted for analysis from the video signal, such as posture, head movements, or fidgeting, and further discloses known techniques classifying these features to obtain emotion data, i.e. a second processing);
encoding, by the user computing device, the video data into encoded media data (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, or personal computer, and the EMD processing/outputting results related to a user’s emotion data such as by displaying said results, constituting “transformed” media data under broadest reasonable interpretation (BRI); See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “encoded” media data under BRI; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g. camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e.
effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data, albeit not recited explicitly for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection), the encoded media data being smaller data packets of the video data (See Quy Par [0016] which discloses received signals possibly being further processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, i.e. an initial processing of the intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting; it is further understood that signal separation methods and/or removal of artifacts could constitute the transformation of a raw, larger data signal into a “smaller data packet” of raw media data, under BRI, since these extraneous portions of data are effectively being removed, albeit not explicitly recited for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection); and
transmitting, by the user computing device (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, or personal computer, and the EMD processing/outputting results related to a user’s emotion data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network), the encoded media data with the intermediate user data to a remote device for second processing of the intermediate user data (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network, albeit not explicitly reciting “encoded” media data per se, which is instead met by Amble as reasoned further below in the rejection), wherein the second processing of the intermediate user data includes extracting one or more of a physiological characteristic and a behavioral characteristic based at least in part upon the intermediate user data (See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “transformed” media data under BRI; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g.
camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e. effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network; See Quy Par [0038] which discloses the already-processed emotion data being processed, analyzed, and outputted including sharing emotion data among the network of users, constituting a second processing of the intermediate user data; See Quy Par [0039] which discloses the server deriving emotion data from the signals, such as by the extraction techniques described in Quy Par [0017]-[0019] which disclose extracting a physiological characteristic from the intermediate user data, e.g. heart rate activity, EEG, etc., and Quy Par [0019]-[0021] which disclose extracting a behavioral characteristic from the intermediate user data, such as facial expressions, head movements, posture, or fidgeting, i.e. in Quy Par [0021], an audio and/or video signal is received, an algorithm extracts voice features from the audio signal, and a second processing, i.e. a classifying technique, classifies the voice features to obtain emotion data).
While Quy generally discloses “transforming” video data into “transformed” media data, it is understood by Examiner that “encoding” has a more specific meaning than simply “transforming” data. As such, Quy does not explicitly disclose the following claim limitations that recite “encoding” raw media data and/or “encoded” media data:
encoding, by the user computing device, the video data into encoded media data,
the encoded media data being smaller data packets of the video data;
transmitting, by the user computing device, the encoded media data with the intermediate user data to a remote device for second processing of the intermediate user data;
However, Amble specifically discloses “encoding” said video data as shown below:
encoding, by the user computing device, the video data into encoded media data (It is further understood by Examiner that “compression” comprises encoding data into fewer bits/data packets than the original data (While not relied upon, see Wikipedia “Data Compression” NPL for further evidence of compression comprising “encoding” efforts, i.e. encoding information using fewer bits than the original data file/stream), therefore see Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams; See Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, and further specifically discloses at Amble Par [0069] that the communication between a client and server can occur in the form of data packets), the encoded media data being smaller data packets of the video data (See Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams; See Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, such as based on bandwidth requirements for various telemetry communications, can be performed, and further specifically discloses at Amble Par [0069] that the communication between a client and server can occur in the form of data packets, such that if said audio-video conference data streams are compressed, they would thereby encode data packets using fewer bits than the original audio-video data stream);
transmitting, by the user computing device, the encoded media data with the intermediate user data to a remote device for second processing of the intermediate user data (see Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams, and specifically mentions a device integration layer (DIL), such that any third-party medical (digital) device, regardless of output format, can integrate with the DIL and communicate with remote entities, i.e. devices, in a tele-health session, such as by transmitting data in the correct method/frequency/security to the cloud servers for distribution to remote entities; See Amble Par [0086] which discloses sending digital information to a system portal during an n-way telemedicine session, such that processing or further analysis of the data can be performed as described in Amble Par [0087]-[0093]).
The disclosure of Amble is directly applicable to the disclosure of Quy because both disclosures are directed towards providing and monitoring tele-health sessions over a tele-health platform.
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Quy, which already generally discloses “transforming” video data into transformed media data, for the “transforming” to specifically comprise “encoding” and/or compressing of said data, as disclosed by Amble, because this allows for bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, i.e., communications based on bandwidth requirements for various telemetry communications, can be performed and further optimized in the system (See Amble Par [0035]).
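For illustration only (outside the record of this application), the principle relied upon above, that compression “encodes” a data stream into fewer bits, and thus smaller data packets, than the original, can be sketched with a general-purpose compressor; the byte stream below is hypothetical and stands in for a raw audio-video stream:

```python
import zlib

# Hypothetical "raw media" byte stream; video frames are typically
# highly redundant, which is what makes compression effective.
raw_media = b"frame" * 10_000

# Encoding the raw stream yields encoded media data occupying fewer bytes,
# i.e. smaller data packets than the original stream.
encoded = zlib.compress(raw_media)

assert len(encoded) < len(raw_media)
```

Any lossless or lossy codec would serve the same illustrative point; the mapped limitation only requires that the encoded media data be smaller data packets of the video data.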
Claim 2 –
Regarding Claim 2, Quy and Amble disclose the method of claim 1 in its entirety. Quy further discloses a method, wherein:
the one or more of a physiological signal and a behavioral signal is a waveform (See Quy Par [0015] which discloses the use of biosensors recording physiological signals such that electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG), and photoplethysmography (PPG) can be employed, which are all understood by one of ordinary skill in the art to constitute waveform signals, and further discloses PPG including recording heart pulse rate and blood volume pulse such that heart rate variability (HRV) is determined) corresponding to one or more of a frequency (See Quy Par [0018] which discloses valence being associated with HRV, in particular the ratio of low frequency to high frequency (LF/HF) heart rate activity) associated with the user characteristic (See Quy Par [0018] which discloses if LF/HF is low (calibrated for that user) and/or the heart rate range is low (calibrated for that user) this indicates a negative emotional state, i.e. associated with a user characteristic).
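For illustration only (outside the record), the LF/HF ratio cited from Quy Par [0018] is simply the ratio of low-frequency to high-frequency spectral power of the heart rate variability signal; a minimal sketch, with hypothetical power values and a hypothetical per-user calibrated threshold, might look like:

```python
def lf_hf_ratio(lf_power: float, hf_power: float) -> float:
    """Ratio of low-frequency to high-frequency HRV spectral power."""
    return lf_power / hf_power

# Hypothetical spectral powers (arbitrary units) and a hypothetical
# per-user threshold (Quy describes calibration per user).
ratio = lf_hf_ratio(lf_power=0.8, hf_power=2.0)
USER_THRESHOLD = 1.0
negative_state_indicated = ratio < USER_THRESHOLD  # low LF/HF -> negative state
```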
Claim 3 –
Regarding Claim 3, Quy and Amble disclose the method of claim 1 in its entirety. Quy further discloses a method, wherein:
collecting the raw media data associated with the user further comprises collecting, by the computing device (See Quy Par [0016] which discloses received signals possibly being further processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, i.e. an initial processing of the intermediate user data; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD)), raw media data using a sensor (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist) communicatively coupled to the user computing device (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist).
Claim 5 –
Regarding Claim 5, Quy and Amble disclose the method of claim 1 in its entirety. Quy and Amble further disclose a method, wherein:
the one or more of a physiological characteristic and a behavioral characteristic comprises a plurality of user characteristics (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0021] which discloses voice features extracted from a video call and various body language features, i.e. behavioral signals, being identified and extracted for analysis from the video signal, such as posture, head movements, or fidgeting);
assigning, by the computing device, an authorization level to the service provider (See Amble Par [0037] which defines a role-based access schema for tele-health delivery that factors in pre-defined sets of views for any given role and Amble Par [0076] which discloses the module being used for identification of user roles (such as Physician, Clinician, patient/resident, HIPAA-authorized member, non-authorized member, system admin, etc.)), indicating which of the plurality of user characteristics are accessible by the service provider (See Amble Par [0037] which discloses the priority configuration being developed based on analytics using information such as available views can be presented to the provider based on the security role and the access; See Amble Par [0045] which discloses the varying user characteristics that can be captured and presented, such as captured digital device information; ECG/EKG; NIBP; respiratory rate; heart rate; oxygen saturation; body temperature; blood glucose; image analytics; etc.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined disclosure of Quy and Amble, which already discloses a service provider having access to the user characteristics, to assign an authorization level to the service provider indicating which of the plurality of user characteristics are accessible by the service provider, as taught by Amble, to allow for viewing of all the above data in multiple widgets in a portal screen based on the access rights of the physician with the patient and facility information (See Amble Par [0037]).
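For illustration only (outside the record), the role-based access schema described in Amble Par [0037] and [0076] can be sketched as a mapping from roles to the set of user characteristics each role is authorized to view; the role names and characteristic keys below are hypothetical:

```python
# Hypothetical role-based access schema: each role maps to the set of
# captured user characteristics (cf. Amble Par [0045]) it may view.
ACCESS_SCHEMA = {
    "physician": {"ecg", "heart_rate", "respiratory_rate", "blood_glucose"},
    "fitness_trainer": {"heart_rate", "respiratory_rate"},
}

def accessible_characteristics(role, characteristics):
    """Filter captured characteristics down to those the role may access."""
    allowed = ACCESS_SCHEMA.get(role, set())
    return {name: value for name, value in characteristics.items() if name in allowed}

captured = {"heart_rate": 72, "ecg": "waveform-trace", "blood_glucose": 98}
view = accessible_characteristics("fitness_trainer", captured)
# view contains only heart_rate; ecg and blood_glucose are withheld
```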
Claim 6 –
Regarding Claim 6, Quy and Amble disclose the method of claim 1 in its entirety. Quy further discloses a method, wherein:
the service provider includes at least one of a healthcare provider or a fitness trainer (See Quy Par [0010] which discloses monitoring emotion data during an online interaction with a therapist, i.e. service provider; See Quy Par [0034] which discloses a therapist, clinical psychologist, psychiatrist, clinician, behavioral therapist, caregiver, counsellor, facilitator, or other healthcare provider).
Claim 7 –
Regarding Claim 7, Quy and Amble disclose the method of claim 1 in its entirety. Quy further discloses a method, wherein:
the computing device is a game controller (See Quy Par [0015] which discloses sensors being integrated into the casing of a mobile phone or game controller), and wherein collecting the raw media data associated with the user during the virtual conference between the user and the service provider further comprises collecting raw media data associated with the user during a game played by the user on a game server (See Quy Par [0015] which discloses sensors being integrated into the casing of a mobile phone or game controller; See Quy Par [0024] which discloses the emotion data can be monitored or implemented via various means such as group activities, multiplayer games, etc., to enhance telepresence in video conferencing between remote participants by monitoring and sharing their emotional responses, i.e. participants of a multiplayer game; See Quy Par [0025] which discloses a server application program allowing the interaction with the users, sharing emotion data among multiple users in real time or later as required, such that the interaction could be group activities or multiplayer games as in Quy Par [0024]).
Claim 8 –
Regarding Claim 8, Quy and Amble disclose the method of claim 1 in its entirety. Quy further discloses a method, wherein:
the one or more of a physiological signal and a behavioral signal comprise one or more of peripheral blood flow, heart rate, heart rate variability, respiration, blood oxygenation (See Quy Par [0015] which discloses the use of biosensors recording physiological signals such that electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG), and photoplethysmography (PPG) can be employed, which are all understood by one of ordinary skill in the art to constitute waveform signals, and further discloses PPG including recording heart pulse rate and blood volume pulse such that HRV is determined).
Claim 9 –
Regarding Claim 9, Quy discloses a method for providing a user characteristic to a service provider for a virtual conference with a user, the method comprising:
collecting, by a user computing device (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, or personal computer), raw media data associated with the user (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist), wherein
the raw media data comprises video data (See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting);
extracting, by the user computing device (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist), intermediate user data from the raw media data by performing a first processing of the video data (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e.
video data, such as posture, head movements, or fidgeting), wherein the intermediate user data comprises one or more of a physiological signal and a behavioral signal associated with the user (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses voice features extracted from a video call and various body language features, i.e. behavioral signals, being identified and extracted for analysis from the video signal, such as posture, head movements, or fidgeting, and further discloses known techniques classifying these features to obtain emotion data, i.e. a second processing);
encoding, by the computing device, the video data into encoded media data (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, or personal computer, and the EMD processing/outputting results related to a user’s emotion data such as by displaying said results, constituting “transformed” media data under broadest reasonable interpretation (BRI); See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “encoded” media data under BRI; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g. camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e.
effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data, albeit not recited explicitly for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection), the encoded media data being smaller data packets of the video data (See Quy Par [0016] which discloses received signals possibly being further processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, i.e. an initial processing of the intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting; it is further understood that signal separation methods and/or removal of artifacts could constitute the transformation of a raw, larger data signal into a “smaller data packet” of raw media data, under BRI, since these extraneous portions of data are effectively being removed, albeit not explicitly recited for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection);
transmitting, by the user computing device (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, personal computer, and the EMD processing/outputting results related to a user’s emotion data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network), the encoded media data with the intermediate user data to a server (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network, albeit not explicitly recited for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection);
performing, by the server, a second processing of the encoded media data and the intermediate user data to generate the user characteristic, including one or more of a physiological characteristic and a behavioral characteristic, wherein the second processing of the intermediate user data includes extracting the user characteristics based at least in part upon the intermediate user data (See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “transformed” media data under BRI; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g. camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e. effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. 
intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network; See Quy Par [0038] which discloses the already-processed emotion data being processed, analyzed, and outputted including sharing emotion data among the network of users, constituting a second processing of the intermediate user data; See Quy Par [0039] which discloses the server deriving emotion data from the signals, such as by the extraction techniques described in Quy Par [0017]-[0019] which disclose extracting a physiological characteristic from the intermediate user data, e.g. heart rate activity, EEG, etc., and Quy Par [0019]-[0021] which disclose extracting a behavioral characteristic from the intermediate user data, such as facial expressions, head movements, posture, or fidgeting, i.e. in Quy Par [0021], audio and/or video signal received, algorithm extracts voice features from the audio signal, and second processing, i.e. classifying technique, classifies the voice features to obtain emotion data); and
providing, by the server, the user characteristic to the service provider for use in the virtual conference (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist; See Quy Par [0046] which discloses transmitting the data to the EMD and the EMD displaying the emotion data to the user(s)).
While Quy generally discloses “transforming” video data into “transformed” media data, it is understood by Examiner that “encoding” has a more specific meaning than simply “transforming” data, i.e., a specific operation performed on the data. As such, Quy does not appear to explicitly disclose the following claim limitations that recite “encoding” raw media data and/or “encoded” media data:
encoding, by the computing device, the video data into encoded media data,
the encoded media data being smaller data packets of the video data;
transmitting, by the user computing device, the encoded media data with the intermediate user data to a server;
performing, by the server, a second processing of the encoded media data.
However, Amble specifically discloses “encoding” said video data as shown below:
encoding, by the user computing device, the video data into encoded media data (It is further understood by Examiner that “compression” comprises encoding data into fewer bits/data packets than the original data (While not relied upon, see Wikipedia “Data Compression” NPL for further evidence of compression comprising “encoding” efforts, i.e. encoding information using fewer bits than the original data file/stream), therefore see Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams; See Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, and further specifically discloses at Amble Par [0069] that the communication between a client and server can occur in the form of data packets), the encoded media data being smaller data packets of the video data (See Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams; See Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, such as based on bandwidth requirements for various telemetry communications, can be performed, and further specifically discloses at Amble Par [0069] that the communication between a client and server can occur in the form of data packets, such that if said audio-video conference data streams are compressed, they would thereby encode data packets using fewer bits than the original audio-video data stream);
transmitting, by the user computing device, the encoded media data with the intermediate user data to a server (see Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams, and specifically mentions a device integration layer (DIL), such that any third-party medical (digital) device, regardless of the output format, can integrate with the DIL and communicate with remote entities, i.e. devices, in a tele-health session, such as by transmitting data in the correct method/frequency/security to the cloud servers for distribution to remote entities; See Amble Par [0086] which discloses sending digital information to a system portal during an n-way telemedicine session, such that processing or further analysis of the data can be performed as described in Amble Par [0087]-[0093]);
performing, by the server, a second processing of the encoded media data (see Amble Par [0086] which discloses sending digital information to a system portal during an n-way telemedicine session, such that processing or further analysis of the data can be performed as described in Amble Par [0087]-[0093]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Quy, which already generally discloses “transforming” video data into transformed media data, for the “transforming” to specifically comprise “encoding” and/or compressing of said data, as disclosed by Amble, because this allows for bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, i.e., communications based on bandwidth requirements for various telemetry communications, can be performed and further optimized in the system (See Amble Par [0035]).
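For illustration only (this sketch is Examiner-provided and is not drawn from Quy, Amble, or the record; the byte stream, packet size, and use of Python's standard zlib module are hypothetical stand-ins), the principle underlying the combination, namely that compression "encodes" data using fewer bits than the original stream and thereby yields smaller data packets for transmission, can be sketched as follows:

```python
import zlib

# Hypothetical raw media payload: a repetitive byte stream standing in
# for video frame data (real video is far more complex; this is a toy).
raw_media = b"frame-header" + b"\x00\x7f" * 5000

# "Encoding" here is lossless compression: the output represents the
# same information using fewer bits than the original stream.
encoded = zlib.compress(raw_media, level=6)

# Packetize the encoded stream for transmission to a server; the total
# transmitted size is smaller than the raw stream.
PACKET_SIZE = 1024
packets = [encoded[i:i + PACKET_SIZE] for i in range(0, len(encoded), PACKET_SIZE)]

# A receiver (e.g. the server) can decode losslessly before a second
# processing step.
assert zlib.decompress(encoded) == raw_media
print(len(raw_media), len(encoded), len(packets))
```

Because the compressed stream is a faithful encoding of the raw stream, a downstream server can decode it and perform further processing, consistent with the mapping of the "second processing" limitation above.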
Claim 12 –
Regarding Claim 12, Quy and Amble disclose the method of claim 9 in its entirety. Quy further discloses a method, wherein:
one or more of a physiological signal and a behavioral signal associated with the user comprises a waveform corresponding to the user characteristic (See Quy Par [0015] which discloses the use of biosensors recording physiological signals such that electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG), and photoplethysmography (PPG) can be employed, which are all understood by one of ordinary skill in the art to constitute waveform signals, and further discloses PPG including recording heart pulse rate and blood volume pulse such that heart rate variability (HRV) is determined; See Quy Par [0018] which discloses if LF/HF is low (calibrated for that user) and/or the heart rate range is low (calibrated for that user) this indicates a negative emotional state, i.e. associated/corresponding with a user characteristic).
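By way of a hypothetical illustration only (this code is Examiner-provided and not from the cited references; the RMSSD metric and the sample beat-to-beat intervals below are assumptions, not Quy's disclosed LF/HF computation), one of ordinary skill understands that HRV is derived from a beat-to-beat waveform, e.g. a time-domain metric over RR intervals:

```python
import math

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences (RMSSD): a standard
    time-domain HRV metric computed from consecutive RR (beat-to-beat)
    intervals, here given in milliseconds."""
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical RR intervals (ms) derived from an ECG/PPG waveform.
rr = [812, 790, 805, 822, 798, 810, 795]
print(round(rmssd(rr), 1))
```

The point of the sketch is only that the underlying signal is a waveform from which a scalar HRV value (here RMSSD; in Quy, an LF/HF ratio) corresponding to a user characteristic is derived.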
Claim 13 –
Regarding Claim 13, Quy and Amble disclose the method of claim 9 in its entirety. Amble further discloses a method, further comprising:
generating, by the server, a report based at least in part on the extracted one or more of a physiological characteristic and a behavioral characteristic (See Amble Par [0081] which discloses an evaluation module that can collect and monitor aspects of a healthcare organization and produce reports in support of ACOs that allows all participants, i.e. user and service provider, to rate/evaluate the system and the other participants directly involved with a healthcare session; See Amble Par [0044] which discloses patient data being collected digitally at every encounter of the system for participants within the system and is available for analytics; See Amble Par [0045] which discloses the varying user characteristics that can be captured and presented, such as captured digital device information; ECG/EKG; NIBP; respiratory rate; heart rate; oxygen saturation; body temperature; blood glucose; image analytics; etc., and further states that methods can generate statistics for each area and present to the actors of the system for all aspects), the report being accessible by the user and the service provider (See Amble Par [0081] which discloses an evaluation module that can collect and monitor aspects of a healthcare organization and produce reports in support of ACOs that allows all participants, i.e. 
user and service provider, to rate/evaluate the system and the other participants directly involved with a healthcare session; See Amble Par [0044] which discloses patient data being collected digitally at every encounter of the system for participants within the system and is available for analytics; See Amble Par [0045] which discloses the varying user characteristics that can be captured and presented, such as captured digital device information; ECG/EKG; NIBP; respiratory rate; heart rate; oxygen saturation; body temperature; blood glucose; image analytics; etc., and further states that methods can generate statistics for each area and present to the actors of the system for all aspects).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined method of Quy and Amble, which already discloses presenting one or more of a physiological characteristic and a behavioral characteristic, to specifically disclose generating a report based on the extracted physiological and behavioral characteristic to be accessible by the user and the service provider, as taught by Amble, because this allows patient data to be collected, analyzed, and reviewed digitally at every encounter of the system for participants within the system (See Amble Par [0044]-[0045]).
Claim 14 –
Regarding Claim 14, Quy and Amble disclose the method of claim 9 in its entirety. Quy further discloses a method, wherein:
collecting the raw media data associated with the user comprises collecting raw media data asynchronously with the virtual conference (See Quy Par [0016] which discloses received signals possibly being further processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, i.e. an initial processing of the intermediate user data; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD); See Quy Par [0043] which discloses that the video communication, biometric data, emotion data, and alliance may also be recorded and stored for asynchronous review and analysis).
Claim 15 –
Regarding Claim 15, Quy and Amble disclose the method of claim 9 in its entirety. Quy further discloses a method, wherein:
the computing device is a game controller (See Quy Par [0015] which discloses sensors being integrated into the casing of a mobile phone, game controller), and wherein collecting the media data associated with the user during the virtual conference between the user and the service provider further comprises collecting media data associated with the user during a game played by the user on a game server (See Quy Par [0015] which discloses sensors being integrated into the casing of a mobile phone, game controller; See Quy Par [0024] which discloses the emotion data can be monitored or implemented via various means such as group activities, multiplayer games, etc., to enhance telepresence in video conferencing between remote participants by monitoring and sharing their emotional responses, i.e. participants of a multiplayer game; See Quy Par [0025] which discloses a server application program allowing the interaction with the users, sharing emotion data among multiple users in real time or later as required, such that the interaction could be group activities, multiplayer games as in Quy Par [0024]).
Claim 16 –
Regarding Claim 16, Quy and Amble disclose the method of claim 9 in its entirety. Quy further discloses a method, wherein:
the service provider comprises at least one of a healthcare provider and a fitness trainer (See Quy Par [0010] which discloses monitoring emotion data during an online interaction with a therapist, i.e. service provider; See Quy Par [0034] which discloses a therapist, clinical psychologist, psychiatrist, clinician, behavioral therapist, caregiver, counsellor, facilitator, or other healthcare provider), and
wherein the one or more of a physiological signal and a behavioral signal comprises peripheral blood flow, heart rate, heart rate variability, respiration, blood oxygenation, blood pressure, facial actions, blushing, blinking, and/or body and vocal vibrations (See Quy Par [0015] which discloses the use of biosensors recording physiological signals such that electrocardiogram (ECG), electromyogram (EMG), electroencephalogram (EEG), and photoplethysmography (PPG) can be employed, which are all understood by one of ordinary skill in the art to constitute waveform signals, and further discloses PPG including recording heart pulse rate and blood volume pulse such that HRV is determined).
Claim 17 –
Regarding Claim 17, Quy discloses a system for providing a user characteristic to a service provider for a virtual conference with a user, the system comprising:
a computing device configured to (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as, a user’s tablet computer, smart display, netbook computer, laptop, personal computer):
collect raw media data associated with the user (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist), wherein
the raw media data comprises video data (See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting);
extract intermediate user data from the raw media data by performing a first processing of the video data (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. 
video data, such as posture, head movements, or fidgeting), the intermediate user data comprising one or more of a physiological signal and a behavioral signal associated with the user (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0021] which discloses voice features extracted from a video call and various body language features, i.e. behavioral signals, and further known techniques classify these features to obtain emotion data, i.e. second processing, being identified and extracted for analysis from the video signal, such as posture, head movements, or fidgeting);
encode the video data into encoded media data (See Quy Par [0017] which discloses an emotion monitoring device (EMD) being implemented on a number of devices, such as a user’s tablet computer, smart display, netbook computer, laptop, personal computer, and the EMD processing/outputting results related to a user’s emotion data such as by displaying said results, constituting “transformed” media data under broadest reasonable interpretation (BRI); See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “encoded” media data under BRI; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g. camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e. 
effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data, albeit not recited explicitly for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection), the encoded media data being smaller data packets of the video data (See Quy Par [0016] which discloses received signals possibly being further processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, i.e. an initial processing of the intermediate user data; See Quy Par [0020] which discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting; it is further understood that signal separation methods and/or removal of artifacts could constitute the transformation of a raw, larger data signal into a “smaller data packet” of raw media data, under BRI, since these extraneous portions of data are effectively being removed, albeit not explicitly recited for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection);
transmit the encoded media data with the intermediate user data to a server (See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network, albeit not explicitly recited for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection); and the server configured to:
perform a second processing of the encoded media data and the intermediate user data to generate the user characteristic including one or more of a physiological characteristic and a behavioral characteristic (See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “transformed” media data under BRI, albeit not explicitly recited for “encoding” said media data per se, which is instead met by Amble as reasoned further below in the rejection; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g. camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e. effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. 
intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network; See Quy Par [0038] which discloses the already-processed emotion data being processed, analyzed, and outputted including sharing emotion data among the network of users, constituting a second processing of the intermediate user data; See Quy Par [0039] which discloses the server deriving emotion data from the signals, such as by the extraction techniques described in Quy Par [0017]-[0019] which disclose extracting a physiological characteristic from the intermediate user data, e.g. heart rate activity, EEG, etc., and Quy Par [0019]-[0021] which disclose extracting a behavioral characteristic from the intermediate user data, such as facial expressions, head movements, posture, or fidgeting, i.e. in Quy Par [0021], audio and/or video signal received, algorithm extracts voice features from the audio signal, and second processing, i.e. classifying technique, classifies the voice features to obtain emotion data), wherein the second processing of the intermediate user data includes extracting the user characteristic based at least in part on the intermediate user data (See Quy Par [0016] which discloses received signals possibly being processed to enhance signal detection and remove artifacts based on blind signal separation methods or machine learning techniques, also constituting “transformed” media data under BRI; See Quy Par [0046] which discloses “The biosensor sends the signal to a signal processing unit (SPU) which amplifies the signal and reduces artifact and noise in the signal, but for some types of biosensors, e.g. camera or microphone, this step may be omitted and the signals are processed in a later step to remove artifact and noise, e.g., by discarding signal epochs with poor image or audio quality, and data outliers”, i.e. 
effectively describing that in some instances a separate step of transforming the “raw media data” into “transformed raw media data” occurs, and that separate step is outside of the process of receiving raw media data, processing said raw media data into intermediate user data, and extracting physiological characteristics and/or behavioral characteristics of the user from said intermediate user data; See Quy Par [0017] which discloses the physiological signals being transmitted to an emotion monitoring device (EMD) and further discloses the EMD processing the physiological signals to derive and display emotion data, such as arousal and valence components, thereby constituting processed raw media data, i.e. intermediate user data; See Quy Par [0025] which discloses the biometric and emotion data being transmitted to an internet server, or a cloud infrastructure, via a wired or wireless telecommunication network; See Quy Par [0038] which discloses the already-processed emotion data being processed, analyzed, and outputted including sharing emotion data among the network of users, constituting a second processing of the intermediate user data; See Quy Par [0039] which discloses the server deriving emotion data from the signals, such as by the extraction techniques described in Quy Par [0017]-[0019] which disclose extracting a physiological characteristic from the intermediate user data, e.g. heart rate activity, EEG, etc., and Quy Par [0019]-[0021] which disclose extracting a behavioral characteristic from the intermediate user data, such as facial expressions, head movements, posture, or fidgeting, i.e. in Quy Par [0021], audio and/or video signal received, algorithm extracts voice features from the audio signal, and second processing, i.e. classifying technique, classifies the voice features to obtain emotion data); and
provide the user characteristic to the service provider for use in the virtual conference (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy; See Quy Par [0040] which discloses the emotion data displayed for review by a therapist such that the virtual conference would be between the user and the service provider, i.e. therapist; See Quy Par [0046] which discloses transmitting the data to the EMD and the EMD displaying the emotion data to the user(s)).
While Quy generally discloses “transforming” video data into “transformed” media data, it is understood by Examiner that “encoding” has a more specific meaning than simply “transforming” data, i.e. a specific action performed on the data. As such, Quy does not appear to disclose the following claim limitations that recite “encoding” raw media data and/or “encoded” media data:
encode the video data into encoded media data,
the encoded media data being smaller data packets of the video data;
transmit the encoded media data with the intermediate user data to a server;
perform a second processing of the encoded media data.
However, Amble specifically discloses “encoding” said video data as shown below:
encode the video data into encoded media data (It is further understood by Examiner that “compression” comprises encoding data into fewer bits/data packets than the original data (While not relied upon, see Wikipedia “Data Compression” NPL for further evidence of compression comprising “encoding” efforts, i.e. encoding information using fewer bits than the original data file/stream), therefore see Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams; See Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, and further specifically discloses at Amble Par [0069] that the communication between a client and server can occur in the form of data packets), the encoded media data being smaller data packets of the video data (See Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams; See Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, such as based on bandwidth requirements for various telemetry communications, can be performed, and further specifically discloses at Amble Par [0069] that the communication between a client and server can occur in the form of data packets, such that if said audio-video conference data streams are compressed, they would thereby encode data packets using fewer bits than the original audio-video data stream);
transmit the encoded media data with the intermediate user data to a server (see Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams, and specifically mentions a device integration layer (DIL), such that any third-party medical (digital) device, regardless of output format, can integrate with the DIL and communicate with remote entities, i.e. devices, in a tele-health session, such as by transmitting data in the correct method/frequency/security to the cloud servers for distribution to remote entities; See Amble Par [0086] which discloses sending digital information to a system portal during an n-way telemedicine session, such that processing or further analysis of the data can be performed as described in Amble Par [0087]-[0093]);
perform a second processing of the encoded media data (see Amble Par [0086] which discloses sending digital information to a system portal during an n-way telemedicine session, such that processing or further analysis of the data can be performed as described in Amble Par [0087]-[0093]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the disclosure of Quy, which already generally discloses “transforming” video data into transformed media data, for the “transforming” to specifically comprise “encoding” and/or compressing of said data, as disclosed by Amble, because this allows for bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, i.e., communications based on bandwidth requirements for various telemetry communications, can be performed and further optimized in the system (See Amble Par [0035]).
Claim 19 –
Regarding Claim 19, Quy and Amble disclose the system of claim 17 in its entirety. Amble further discloses a system, wherein:
the server is further configured to:
generate a report based at least in part on the extracted one or more of a physiological characteristic and a behavioral characteristic (See Amble Par [0081] which discloses an evaluation module that can collect and monitor aspects of a healthcare organization and produce reports in support of ACOs that allow all participants, i.e. user and service provider, to rate/evaluate the system and the other participants directly involved with a healthcare session; See Amble Par [0044] which discloses patient data being collected digitally at every encounter of the system for participants within the system and is available for analytics; See Amble Par [0045] which discloses the varying user characteristics that can be captured and presented, such as captured digital device information; ECG/EKG; NIBP; respiratory rate; heart rate; oxygen saturation; body temperature; blood glucose; image analytics; etc., and further states that methods can generate statistics for each area and present to the actors of the system for all aspects), the report being accessible by the user and the service provider (See Amble Par [0081] which discloses an evaluation module that can collect and monitor aspects of a healthcare organization and produce reports in support of ACOs that allow all participants, i.e. user and service provider, to rate/evaluate the system and the other participants directly involved with a healthcare session; See Amble Par [0044] which discloses patient data being collected digitally at every encounter of the system for participants within the system and is available for analytics; See Amble Par [0045] which discloses the varying user characteristics that can be captured and presented, such as captured digital device information; ECG/EKG; NIBP; respiratory rate; heart rate; oxygen saturation; body temperature; blood glucose; image analytics; etc., and further states that methods can generate statistics for each area and present to the actors of the system for all aspects).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined method of Quy and Amble, which already discloses presenting one or more of a physiological characteristic and a behavioral characteristic, to specifically disclose generating a report based on the extracted physiological and behavioral characteristic to be accessible by the user and the service provider, as taught by Amble, because this allows patient data to be collected, analyzed, and reviewed digitally at every encounter of the system for participants within the system (See Amble Par [0044]-[0045]).
Claim 20 –
Regarding Claim 20, Quy and Amble disclose the system of claim 17 in its entirety. Amble further discloses a system, wherein:
the computing device is configured to assign an authorization level to the service provider (See Amble Par [0037] which defines a role-based access schema for tele-health delivery that factors in pre-defined sets of views for any given role and Amble Par [0076] which discloses the module being used for identification of user roles (such as Physician, Clinician, patient/resident, HIPAA-authorized member, non-authorized member, system admin, etc.)) indicating which one or more of a physiological characteristic and a behavioral characteristic are accessible by the service provider (See Amble Par [0037] which discloses the priority configuration being developed based on analytics, such that available views can be presented to the provider based on the security role and the access; See Amble Par [0045] which discloses the varying user characteristics that can be captured and presented, such as captured digital device information; ECG/EKG; NIBP; respiratory rate; heart rate; oxygen saturation; body temperature; blood glucose; image analytics; etc.).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the combined disclosure of Quy and Amble, which already discloses a service provider having access to the user characteristics, to assign an authorization level to the service provider indicating which of the plurality of user characteristics are accessible by the service provider, as taught by Amble, to allow for viewing of all the above data in multiple widgets in a portal screen based on the access rights of the physician with the patient and facility information (See Amble Par [0037]).
Claim 21 –
Regarding Claim 21, Quy and Amble disclose the method of claim 9 in its entirety. Quy further discloses a method, wherein:
collecting raw media data associated with the user further comprises collecting, by the user computing device, raw media data using a sensor communicatively coupled to the user computing device (See Quy Par [0015] which discloses a biosensor that records physiological signals which are integrated into an augmented or virtual reality device; See Quy Par [0013]-[0014] which discloses emotion monitors being connected to a network to share emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy).
Claim 22 –
Regarding Claim 22, Quy and Amble disclose the method of claim 9 in its entirety. Quy further discloses a method, wherein:
the service provider includes at least one of a healthcare provider or a fitness trainer (See Quy Par [0040] which discloses the emotion data displayed for review by one or more entities, including a group facilitator, counselor, therapist, etc., a therapist understood to constitute a “healthcare provider”; See claim 24 which discloses transmitting and displaying emotion data captured and derived from physiological signals, such that varying entities can review said data, including a marriage counselor, couples therapist, behavioral therapist, psychiatrist, clinician coach, or a virtual counselor, coach or therapist).
Claim 23 –
Regarding Claim 23, Quy and Amble disclose the system of claim 17 in its entirety. Quy further discloses a system, wherein:
one or more of a physiological signal and a behavioral signal associated with the user comprises a waveform corresponding to the user characteristic (See Quy Par [0015] which discloses biosensors recording one or more physiological signals, including various biosignals that present as waveforms, such as electrocardiogram, electromyogram (EMG), and electroencephalogram; See Quy Par [0017] which discloses distinguishing between emotional states by the analysis of EMG signals, body heat signatures (i.e. physiological signals that can be a waveform) and voice features, body language, or encoding of facial micro-expressions, i.e. behavioral signals that can be a waveform; See Quy Par [0018]-[0019] which discloses utilizing algorithms to provide a map of both emotional arousal and valence states, i.e. user characteristics, from said physiological signals, e.g. calculated from measured changes in skin conductance level (SCL) and changes in heart rate (HR), in particular the beat-to-beat heart rate variability (HRV)).
Claim 24 –
Regarding Claim 24, Quy and Amble disclose the system of claim 17 in its entirety. Quy further discloses a system, wherein:
collecting the media data associated with the user comprises collecting raw media data asynchronously with the virtual conference (See Quy Par [0040] which discloses the emotion data displayed for asynchronous review by one or more entities, including a group facilitator, counselor, therapist, and/or virtual therapist, such as for the purposes described in Quy Par [0013]-[0014], i.e. sharing emotion data such as during video calls for social and business interactions and/or teletherapy, e.g. virtual group therapy, which is understood to constitute a virtual conference).
Response to Arguments
Applicant's arguments filed 21 August 2025 have been fully considered but they are not persuasive:
Regarding 35 U.S.C. 102 rejections of Claims 1-3, 6-9, 12, & 14-17, Applicant argues on p. 7-8 of Arguments/Remarks that Quy does not anticipate the amended limitations as found in independent claims 1, 9, & 17. More specifically, Applicant argues that Quy does not anticipate the raw media data comprising video data, extracting intermediate user data from the raw media data by performing a first processing of the video data, and/or encoding, by the user computing device, the video data into encoded media data, as recited in independent claims 1, 9, & 17, because Quy focuses on voice features (i.e. which are derived from audio data), rather than video data. Examiner agrees with Applicant’s arguments. Therefore, the previous 35 U.S.C. 102 rejections have been withdrawn. However, a new ground of rejection has been made under 35 U.S.C. 103 over Quy, in view of Amble. More specifically, Examiner notes that Quy generally discloses receiving and analyzing video data and “transforming” said video data into transformed media data, but does not explicitly disclose said transforming comprising “encoding” and/or compressing of said data. Rather, said encoding of raw media data and transmission of said encoded media data are instead met by newly cited portions of Amble, i.e. 
Amble Par [0026]-[0027] which specifically discloses a system module performing the optimization, compression, and/or transmission of various tele-health data, including audio-video conference streams, Amble Par [0035] which specifically mentions bandwidth management through application-specific compression algorithms, such that real-time auto-configuring for data transmission, such as based on bandwidth requirements for various telemetry communications, can be performed, Amble Par [0069] which discloses the communication between a client and server can occur in the form of data packets, such that if said audio-video conference data streams are compressed, they would thereby encode data packets using fewer bits than the original audio-video data stream, and Amble Par [0086]-[0093] which discloses sending digital information to a system portal during an n-way telemedicine session, such that processing or further analysis of the data can be performed.
Separately, regarding Applicant’s arguments of Quy focusing on voice features (i.e. which are derived from audio data), rather than video data, Examiner points to Quy Par [0016] which discloses that the raw media data “may be further processed to enhance signal detection and remove artifacts using algorithms based on blind signal separation methods or machine learning technique”, i.e., before the raw audio/video signal, i.e. raw media data, is processed to obtain voice features, this further-processed raw media data has had blind signal separation and/or artifact removal applied. Further, Quy Par [0020] discloses the use of one or more of a camera image, webcam, etc. for capturing raw media data and categorizing behavioral data from said captured raw media data; See Quy Par [0021] which discloses a voice, i.e. audio, or video call such that the algorithm extracts voice features from the audio signal associated with the video call, and further discloses various body language features can be identified and extracted for analysis from the video signal, i.e. video data, such as posture, head movements, or fidgeting, and it is therefore understood by Examiner that this would constitute transformed “video data”, albeit not “encoded” video data, which is instead met by Amble, as explained above. As such, independent claims 1, 9, & 17 and claims dependent therefrom remain rejected under 35 U.S.C. 103 over Quy, in view of Amble.
Regarding 35 U.S.C. 102 rejections of Claims 1-3, 6-9, 12, & 14-17, Applicant argues on p. 8-9 of Arguments/Remarks that because independent claims 9 & 17 are substantially similar to independent claim 1, independent claims 9 & 17 are also purportedly allowable over the prior art. Examiner respectfully disagrees with Applicant’s arguments. As discussed above, independent claim 1 remains rejected under 35 U.S.C. 103. Therefore, Applicant’s arguments regarding independent claim 1 being purportedly allowable over the prior art are rendered moot, and independent claims 9 & 17, which are substantially similar to independent claim 1, also remain rejected under 35 U.S.C. 103 over Quy, in view of Amble. As such, independent claims 1, 9, & 17 and claims dependent therefrom remain rejected under 35 U.S.C. 103 over Quy, in view of Amble.
Regarding 35 U.S.C. 102 rejections of Claims 1-3, 6-9, 12, & 14-17, Applicant argues on p. 9 of Arguments/Remarks that because dependent claims 2-3, 6-8, 12, & 14-16 are dependent from purportedly allowable independent claims 1 & 9, claims 2-3, 6-8, 12, & 14-16 are also allowable by virtue of dependency from independent claims 1 & 9. Examiner respectfully disagrees with Applicant’s arguments. As discussed above, independent claims 1, 9, & 17 remain rejected under 35 U.S.C. 103. Therefore, Applicant’s arguments regarding independent claims 1 & 9 being purportedly allowable over the prior art are rendered moot, and therefore dependent claims 2-3, 6-8, 12, 14-16, & 19-24 also remain rejected under 35 U.S.C. 103 over Quy, in view of Amble. As such, independent claims 1, 9, & 17 and claims dependent therefrom remain rejected under 35 U.S.C. 103 over Quy, in view of Amble.
Regarding 35 U.S.C. 103 rejections of Claims 5, 13, & 19-20, Applicant argues on p. 9 of Arguments/Remarks that because claim 5 is dependent from claim 1, claim 13 is dependent from claim 9, and claims 19-20 are dependent from claim 17, the 35 U.S.C. 103 rejections should be withdrawn from claims 5, 13, & 19-20 by virtue of dependency from purportedly allowable independent claims 1, 9, & 17. Examiner respectfully disagrees with Applicant’s arguments. As discussed above, independent claims 1, 9, & 17 remain rejected under 35 U.S.C. 103. Therefore, Applicant’s arguments regarding independent claims 1, 9, & 17 being purportedly allowable over the prior art are rendered moot, and therefore dependent claims 5, 13, & 19-20 also remain rejected under 35 U.S.C. 103 over Quy, in view of Amble. As such, independent claims 1, 9, & 17 and claims dependent therefrom remain rejected under 35 U.S.C. 103 over Quy, in view of Amble.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Shipon et al. (U.S. Patent Publication No. 2007/0118389) discloses an integrated teleconferencing system in which medical annotations can be coded, stored, and recalled, and converted to a compressed symbolic representation particularly suitable for immediate analysis for medical diagnosis and treatment, and real-time interactive video or multimedia presentation and/or transmission;
Peters et al. (U.S. Patent Publication No. 2021/0176429) discloses a video conference system that allows for media streams to be transmitted, and for example, size or resolution of video can be changed, such that bandwidth of the conference is reduced by increasing a compression level, changing a compression codec, reducing a frame rate, or stopping transmission of a media stream;
Peters et al. (U.S. Patent Publication No. 2021/0185276) discloses an endpoint device captures video data during a network-based communication session, and further enables analysis of video data at the video source, such that at the source of the video stream, the full quality video is available, e.g., full resolution, full frame rate, full color bit-depth, etc. and minimal compression or no compression, but when transmitted, to use network bandwidth efficiently, the video streams sent to the server are often of reduced quality, e.g., reduced size, reduced frame rate, reduced color bit-depth, with significant compression applied;
Aman et al. (U.S. Patent Publication No. 2007/0279494) discloses a system for analyzing video content and video data for events, such that algorithms can be used for encoding and decoding highly segmented and compressed video and it is possible to create a sophisticated automatic filming system.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUNTER J RASNIC whose telephone number is (571)270-5801. The examiner can normally be reached M-F 8am-5:30pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Shahid Merchant can be reached on (571) 270-1360. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/H.R./Examiner, Art Unit 3684
/Shahid Merchant/Supervisory Patent Examiner, Art Unit 3684