DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
1. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/6/2026 has been entered.
Claim Rejections - 35 USC § 103
2. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
3. Claim(s) 1, 9, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), and further in view of Jennings et al. (US 10,593,175).
Regarding claim 1, Ramamoorthy teaches a method, comprising: downloading, to a client device, pre-trained model configuration data for identifying video frames having a specified feature (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
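For purposes of illustration only, the following minimal sketch (Python; the URL, configuration schema, and class names are hypothetical stand-ins and are not taken from Ramamoorthy or any other cited reference) shows the general mechanic of downloading pre-trained model configuration data to a client device and configuring an expression classifier from it:

```python
# Illustrative sketch only; all names and the config schema are hypothetical.
import json
import urllib.request

EMOTIONS = ["happy", "sad", "angry", "surprised", "confused", "neutral",
            "disgusted", "interested", "helpful", "relieved", "fearful"]

def fetch_model_config(url: str) -> dict:
    """Download pre-trained model configuration data over the network."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())

class ExpressionClassifier:
    """Stub classifier configured from the downloaded configuration data."""
    def __init__(self, config: dict):
        self.labels = config.get("labels", EMOTIONS)
        self.threshold = float(config.get("threshold", 0.5))

    def classify(self, frame) -> str:
        # A real implementation would run the trained model on the frame;
        # this stub only returns a default label to keep the sketch runnable.
        return "neutral"
```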
Ramamoorthy discloses downloading to a device applications that include facial expression classification. Ramamoorthy does not disclose identifying, by the client device and using an image selection engine configured according to the pre-trained model configuration data, a video frame having the specified feature during an online video conference to which the client device is connected; and transmitting, by the client device, to a server, without transmitting image data of the video frame, an identifier of the video frame that identifies the video frame within a camera-generated video stream of the client device, for storage in connection with a recording of the online video conference, the identifier comprising at least one of a timestamp or a frame identification number.
Mireles teaches identifying, by the client device and using an image selection engine configured according to the pre-trained model configuration data, a video frame having the specified feature during an online video conference to which the client device is connected (see ¶ 0014, 0026-0027, 0062-0063, 0126-0127. The device identifies a facial expression in correlation with a face model. A device can extract facial landmark containers from a video feed captured at the device, encode muscle actions (and their intensities) detected in the video feed in separate facial expression containers (or append values representing these muscle actions and intensities to concurrent facial landmark containers), and transmit a video feed of facial landmark containers and/or a video feed of facial expression containers to the receiving device.).
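For illustration only, a minimal sketch of the kind of per-frame facial landmark container Mireles describes; the data structure and names below are hypothetical stand-ins, not Mireles' actual implementation:

```python
# Illustrative sketch only; names and structure are hypothetical.
from dataclasses import dataclass, field

@dataclass
class FacialLandmarkContainer:
    frame_id: int
    landmarks: list                                      # (x, y) landmark points
    muscle_actions: dict = field(default_factory=dict)   # action name -> intensity

def has_specified_feature(container: FacialLandmarkContainer,
                          action: str, min_intensity: float) -> bool:
    """True if the encoded muscle action meets the intensity threshold."""
    return container.muscle_actions.get(action, 0.0) >= min_intensity
```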
The combination of Mireles with Ramamoorthy provides models for facial expression, wherein Ramamoorthy provides a downloaded version to a device.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy to incorporate facial muscle actions detected in frames of the video feed, represented in a feed of facial landmark containers. The modification provides for capturing facial expressions and using a model to detect changes in facial expression.
Jennings teaches transmitting, by the client device, to a server, without transmitting image data of the video frame, an identifier of the video frame that identifies the video frame within a camera-generated video stream of the client device, for storage in connection with a recording of the online video conference, the identifier comprising at least one of a timestamp or a frame identification number (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
The combination of Jennings with Mireles and Ramamoorthy provides transmitting timestamp data to a server in order to retrieve an image that correlates with the timestamp data from the server. Combining the conferencing functions of Ramamoorthy and Mireles with Jennings' feature of retrieving an image correlated with a timestamp entry would be a plausible implementation of the function taught by Jennings.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting timestamp data from a user device to a server in order to retrieve image data associated with the timestamped time frame. The modification provides for sending timestamp data of a video recording and retrieving an image of the video data with timestamps from a user device.
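For illustration only, a minimal sketch of the claimed transmission step under the combination: the client sends only a frame identifier (a timestamp or frame number), never the pixel data. The endpoint and payload schema are hypothetical:

```python
# Illustrative sketch only; endpoint and payload fields are hypothetical.
import json
import urllib.request

def send_frame_identifier(server_url: str, meeting_id: str,
                          timestamp_ms: int, frame_number: int) -> None:
    payload = json.dumps({
        "meeting_id": meeting_id,
        "timestamp_ms": timestamp_ms,   # identifies the frame within the recording
        "frame_number": frame_number,   # alternative frame identifier
    }).encode("utf-8")
    req = urllib.request.Request(server_url, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)         # note: no image data is transmitted
```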
Regarding claim 9, Ramamoorthy teaches a non-transitory computer readable medium storing instructions operable to cause one or more processors to perform operations comprising: downloading, by a client device, pre-trained model configuration data for identifying video frames having a specified feature (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
Ramamoorthy discloses downloading to a device applications that include facial expression classification. Ramamoorthy does not disclose identifying, by the client device and using an image selection engine configured according to the pre-trained model configuration data, a video frame having the specified feature during an online video conference to which the client device is connected; and transmitting, by the client device, to a server, without transmitting image data of the video frame, an identifier of the video frame that identifies the video frame within a camera-generated video stream of the client device, for storage in connection with a recording of the online video conference, the identifier comprising at least one of a timestamp or a frame identification number.
Mireles teaches identifying, by the client device and using an image selection engine configured according to the pre-trained model configuration data, a video frame having the specified feature during an online video conference to which the client device is connected (see ¶ 0014, 0026-0027, 0062-0063, 0126-0127. The device identifies a facial expression in correlation with a face model. A device can extract facial landmark containers from a video feed captured at the device, encode muscle actions (and their intensities) detected in the video feed in separate facial expression containers (or append values representing these muscle actions and intensities to concurrent facial landmark containers), and transmit a video feed of facial landmark containers and/or a video feed of facial expression containers to the receiving device.).
The combination of Mireles with Ramamoorthy provides models for facial expression, wherein Ramamoorthy provides a downloaded version to a device.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy to incorporate facial muscle actions detected in frames of the video feed, represented in a feed of facial landmark containers. The modification provides for capturing facial expressions and using a model to detect changes in facial expression.
Jennings teaches transmitting, by the client device, to a server, without transmitting image data of the video frame, an identifier of the video frame that identifies the video frame within a camera-generated video stream of the client device, for storage in connection with a recording of the online video conference, the identifier comprising at least one of a timestamp or a frame identification number (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
The combination of Jennings with Mireles and Ramamoorthy provides transmitting timestamp data to a server in order to retrieve an image that correlates with the timestamp data from the server. Combining the conferencing functions of Ramamoorthy and Mireles with Jennings' feature of retrieving an image correlated with a timestamp entry would be a plausible implementation of the function taught by Jennings.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting timestamp data from a user device to a server in order to retrieve image data associated with the timestamped time frame. The modification provides for sending timestamp data of a video recording and retrieving an image of the video data with timestamps from a user device.
Regarding claim 17, Ramamoorthy teaches a system, comprising: a memory subsystem; and processing circuitry configured to execute instructions stored in the memory subsystem to: download, by a client device, pre-trained model configuration data for identifying video frames having a specified feature (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
Ramamoorthy discloses downloading to a device applications that include facial expression classification. Ramamoorthy does not disclose identify, by the client device and using an image selection engine configured according to the pre-trained model configuration data, a video frame having the specified feature during an online video conference to which the client device is connected; and transmit, by the client device, to a server, without transmitting image data of the video frame, an identifier of the video frame that identifies the video frame within a camera-generated video stream of the client device, for storage in connection with a recording of the online video conference, the identifier comprising at least one of a timestamp or a frame identification number.
Mireles teaches identify, by the client device and using an image selection engine configured according to the pre-trained model configuration data, a video frame having the specified feature during an online video conference to which the client device is connected (see ¶ 0014, 0026-0027, 0062-0063, 0126-0127. The device identifies a facial expression in correlation with a face model. A device can extract facial landmark containers from a video feed captured at the device, encode muscle actions (and their intensities) detected in the video feed in separate facial expression containers (or append values representing these muscle actions and intensities to concurrent facial landmark containers), and transmit a video feed of facial landmark containers and/or a video feed of facial expression containers to the receiving device.).
The combination of Mireles with Ramamoorthy provides models for facial expression, wherein Ramamoorthy provides a downloaded version to a device.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy to incorporate facial muscle actions detected in frames of the video feed, represented in a feed of facial landmark containers. The modification provides for capturing facial expressions and using a model to detect changes in facial expression.
Jennings teaches transmit, by the client device, to a server, without transmitting image data of the video frame, an identifier of the video frame that identifies the video frame within a camera-generated video stream of the client device, for storage in connection with a recording of the online video conference, the identifier comprising at least one of a timestamp or a frame identification number (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
The combination of Jennings with Mireles and Ramamoorthy provides transmitting timestamp data to a server in order to retrieve an image that correlates with the timestamp data from the server. Combining the conferencing functions of Ramamoorthy and Mireles with Jennings' feature of retrieving an image correlated with a timestamp entry would be a plausible implementation of the function taught by Jennings.
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting timestamp data from a user device to a server in order to retrieve image data associated with the timestamped time frame. The modification provides for sending timestamp data of a video recording and retrieving an image of the video data with timestamps from a user device.
4. Claim(s) 2, 3, 6, 18 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), and further in view of Jennings et al. (US 10,593,175).
Regarding claim 2, Ramamoorthy and Mireles do not teach the method of claim 1, wherein the identifier comprises a timestamp.
Jennings teaches wherein the identifier comprises a timestamp (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting video data from a local device to a server with a timestamp. The modification provides for sending recorded video data with timestamps of the camera-generated video stream from the device.
Regarding claim 3, Ramamoorthy and Mireles do not teach the method of claim 1, wherein transmitting the identifier to the server comprises: transmitting the identifier to the server to cause the server to generate, based on the video frame, an image in a standalone image file format separate from a video recording format.
Jennings teaches wherein transmitting the identifier to the server comprises: transmitting the identifier to the server to cause the server to generate, based on the video frame, an image in a standalone image file format separate from a video recording format (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting timestamp data from a user device to a server in order to retrieve image data associated with the timestamped time frame. The modification provides for sending timestamp data of a video recording and retrieving an image of the video data with timestamps from a user device.
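For illustration only, a sketch of the server-side generation step addressed by claim 3: given a frame identifier, the server exports the frame from the recording as a standalone image file separate from the video format. This assumes OpenCV is available; the paths and function name are hypothetical:

```python
# Illustrative sketch only; assumes OpenCV (cv2); paths are hypothetical.
import cv2

def export_frame_as_image(recording_path: str, frame_number: int,
                          out_path: str) -> bool:
    cap = cv2.VideoCapture(recording_path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_number)  # seek by frame identifier
    ok, frame = cap.read()
    cap.release()
    if ok:
        cv2.imwrite(out_path, frame)                # standalone image file (e.g., PNG)
    return ok
```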
Regarding claim 6, Ramamoorthy teaches the method of claim 1, comprising: downloading the pre-trained model configuration data in response to receiving, over a network, an indication of the specified feature (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
Regarding claim 18, Ramamoorthy does not teach the system of claim 17, wherein the identifier comprises at least one of a timestamp or a frame identification number.
Jennings teaches wherein the identifier comprises at least one of a timestamp or a frame identification number (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting video data from a local device to a server with a timestamp. The modification provides for sending recorded video data with timestamps of the camera-generated video stream from the device.
5. Claim(s) 4, 12, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Back et al. (US 2021/0382672).
Regarding claim 4, Ramamoorthy, Mireles and Jennings do not teach the method of claim 1, comprising: generating a video stream by a camera of the client device for transmission to the video conference; and obtaining the video frame from the video stream.
Back teaches generating a video stream by a camera of the client device for transmission to the video conference; and obtaining the video frame from the video stream (see fig. 1A, ¶ 0049. An active video conferencing session wherein the video feed is interrupted, causing display content to reflect a freeze frame or default picture.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate a freeze frame of a video conferencing session. The modification provides an image from the video conference that is captured during the conferencing session.
Regarding claim 12, Ramamoorthy, Mireles and Jennings do not teach the non-transitory computer readable medium of claim 9, the operations comprising: generating, by the client device, a video stream by a camera of the client device for transmission to the video conference; and obtaining the video frame from the video stream.
Back teaches generating, by the client device, a video stream by a camera of the client device for transmission to the video conference; and obtaining the video frame from the video stream (see fig. 1A, ¶ 0049. An active video conferencing session wherein the video feed is interrupted, causing display content to reflect a freeze frame or default picture.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate a freeze frame of a video conferencing session. The modification provides an image from the video conference that is captured during the conferencing session.
Regarding claim 20, Ramamoorthy, Mireles and Jennings do not teach the system of claim 17, the processing circuitry configured to execute the instructions stored in the memory subsystem to: obtain the video frame from a video stream generated for transmission to the video conference.
Back teaches obtain the video frame from a video stream generated for transmission to the video conference (see fig. 1A, ¶ 0049. An active video conferencing session wherein the video feed is interrupted, causing display content to reflect a freeze frame or default picture.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate a freeze frame of a video conferencing session. The modification provides an image from the video conference that is captured during the conferencing session.
6. Claim(s) 5, 13 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Yamazaki (US 2023/0397868).
Regarding claim 5, Ramamoorthy, Mireles and Jennings do not teach the method of claim 1, comprising: identifying the video frame in real-time after generating the video frame by a camera of the client device.
Yamazaki teaches identifying the video frame in real-time after generating the video frame by a camera of the client device (see fig. 2-4, 9, ¶ 0049, 0076. Conference information is recorded in a conference information apparatus. The conference is recorded along with facial expressions (specified features), wherein a change in the facial expression is captured and timestamped at specific locations in the recorded stream. The timestamp associated with the video being captured is applied in real time.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate recording facial expressions (specified features), wherein a change in the facial expression is captured and timestamped at specific locations in the recorded stream. The modification provides for capturing facial expressions that are recorded and provided in a recorded time summary.
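For illustration only, a sketch of real-time identification and timestamping of expression changes of the kind relied on here; all names are hypothetical, and the classifier is an assumed callable, not Yamazaki's implementation:

```python
# Illustrative sketch only; frame_source and classify are hypothetical inputs.
import time

def monitor_expressions(frame_source, classify):
    """Yield a timestamped record each time the detected expression changes."""
    previous = None
    for frame_number, frame in enumerate(frame_source):
        label = classify(frame)
        if label != previous:                 # expression change detected
            yield {"frame_number": frame_number,
                   "timestamp": time.time(),  # applied in real time as frames arrive
                   "expression": label}
            previous = label
```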
Regarding claim 13, Ramamoorthy, Mireles and Jennings do not teach the non-transitory computer readable medium of claim 9, the operations comprising: identifying the video frame in real-time after generating the video frame by the client device.
Yamazaki teaches identifying the video frame in real-time after generating the video frame by the client device (see fig. 2-4, 9, ¶ 0049, 0076. Conference information is recorded in a conference information apparatus. The conference is recorded along with facial expressions (specified features), wherein a change in the facial expression is captured and timestamped at specific locations in the recorded stream. The timestamp associated with the video being captured is applied in real time.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate recording facial expressions (specified features), wherein a change in the facial expression is captured and timestamped at specific locations in the recorded stream. The modification provides for capturing facial expressions that are recorded and provided in a recorded time summary.
7. Claim(s) 7, 14, 15 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Kretz et al. (US 2008/0292299).
Regarding claim 7, Mireles and Jennings do not teach the method of claim 1, comprising: downloading the pre-trained model configuration data in response to receiving, via a graphical user interface of the client device, a user input representing the specified feature.
Ramamoorthy teaches downloading the pre-trained model (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
Ramamoorthy does not disclose downloading the configuration data in response to receiving, via a graphical user interface of the client device, a user input representing the specified feature.
Kretz teaches configuration data in response to receiving, via a graphical user interface of the client device, a user input representing the specified feature (see fig. 8, ¶ 0078-0080. Configuration data for an application can be implemented for a user to capture images. This would be done via the user device interface.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate configuration data for an application that can be implemented for a user to capture images. The modification provides for a user to access the application via the interface to interact with the features of the application.
Regarding claim 14, Mireles and Jennings do not teach the non-transitory computer readable medium of claim 9, the operations comprising: downloading the pre-trained model configuration data in response to receiving an indication of the specified feature.
Ramamoorthy teaches downloading the pre-trained model (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
Ramamoorthy does not disclose configuration data in response to receiving an indication of the specified feature.
Kretz teaches configuration data in response to receiving an indication of the specified feature (see fig. 8, ¶ 0078-0080. Configuration data for an application can be implemented for a user to capture images. This would be done via the user device interface.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate configuration data for an application that can be implemented for a user to capture images. The modification provides for a user to access the application via the interface to interact with the features of the application.
Regarding claim 15, Mireles and Jennings do not teach the non-transitory computer readable medium of claim 9, the operations comprising: downloading the pre-trained model configuration data in response to receiving, via a user interface of the client device, a user input representing the specified feature.
Ramamoorthy teaches downloading the pre-trained model (see fig. 2, ¶ 0024, 0036-0037. The facial expression classification module is a trained machine learning model configured to analyze facial images or videos and determine the emotion or facial expression displayed by the subject. The module continuously monitors the facial images of the participants of a video conference and continually outputs an emotion displayed by the participants. The emotion can include one of the following: happy; sad; angry; surprised; confused; neutral; disgusted; interested; helpful; relieved; and fearful. The participant device can include applications that include facial expression classification. Programs can be downloaded to the computer (client device) from an external computer or external storage device through a network adapter card or network interface included in the network module. Facial expression classification in a video conference would be a specified feature to be identified in video frames of the participants; thus, expressive emotions are captured and identified as a specified feature for a trained model to identify.).
Ramamoorthy does not disclose configuration data in response to receiving, via a user interface of the client device, a user input representing the specified feature.
Kretz teaches configuration data in response to receiving, via a user interface of the client device, a user input representing the specified feature (see fig. 8, ¶ 0078-0080. Configuration data for an application can be implemented for a user to capture images. This would be done via the user device interface.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate configuration data for an application that can be implemented for a user to capture images. The modification provides for a user to access the application via the interface to interact with the features of the application.
8. Claim(s) 8, 16 are rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Pauli et al. (US 2023/0154166).
Regarding claim 8, Ramamoorthy, Mireles and Jennings do not teach the method of claim 1, comprising: configuring an artificial neural network of the image selection engine according to weights provided in the pre-trained model configuration data.
Pauli teaches configuring an artificial neural network of the image selection engine according to weights provided in the pre-trained model configuration data (see ¶ 0040-0041. Specific classification performance score(s) are determined for different item(s) of interest. The image selector may utilize a weighting scheme in which images comprising such items of interest are weighted more heavily, thereby increasing the likelihood that such images are selected for training.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate an image selector that utilizes a weighting scheme, wherein images of interest are weighted more heavily, thereby increasing the likelihood that such images are selected for training. The modification provides for weighting images for image selection.
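For illustration only, a sketch of configuring an artificial neural network from weights provided in pre-trained model configuration data, as addressed by claim 8. This assumes PyTorch is available; the architecture, dimensions, and file name are hypothetical, not Pauli's implementation:

```python
# Illustrative sketch only; assumes PyTorch; architecture and file are hypothetical.
import torch
import torch.nn as nn

class FrameScorer(nn.Module):
    """Minimal scoring head for frame embeddings."""
    def __init__(self, in_features: int = 512):
        super().__init__()
        self.head = nn.Linear(in_features, 1)   # scores a frame embedding

    def forward(self, x):
        return torch.sigmoid(self.head(x))

scorer = FrameScorer()
# The file is assumed to hold a state dict matching FrameScorer's parameters.
state = torch.load("pretrained_config.pt")
scorer.load_state_dict(state)                    # configure the network from weights
```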
Regarding claim 16, Ramamoorthy, Mireles and Jennings do not teach the non-transitory computer readable medium of claim 9, the operations comprising: configuring an artificial neural network of the image selection engine based on weights provided in the pre-trained model configuration data.
Pauli teaches configuring an artificial neural network of the image selection engine based on weights provided in the pre-trained model configuration data (see ¶ 0040-0041. Specific classification performance score(s) are determined for different item(s) of interest. The image selector may utilize a weighting scheme in which images comprising such items of interest are weighted more heavily, thereby increasing the likelihood that such images are selected for training.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate an image selector that utilizes a weighting scheme, wherein images of interest are weighted more heavily, thereby increasing the likelihood that such images are selected for training. The modification provides for weighting images for image selection.
9. Claim(s) 10 is rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Vu et al. (US 2021/0295096).
Regarding claim 10, Ramamoorthy, Mireles and Jennings do not teach the non-transitory computer readable medium of claim 9, wherein the identifier comprises a frame identification number.
Vu teaches wherein the identifier comprises a frame identification number (see fig. 11, ¶ 0056. Video data (e.g., the sample videos) is extracted into video frames, and each video frame is labeled, for example, with a frame identification number (ID) in text.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate a video frame with a frame ID number. The modification provides a frame ID number for each extracted frame.
10. Claim(s) 11 is rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of North et al. (US 2013/0342629).
Regarding claim 11, Ramamoorthy and Mireles do not teach the non-transitory computer readable medium of claim 9, wherein transmitting the identifier to the server comprises: transmitting the identifier to the server to prompt the server to generate an image corresponding to the video frame, the identifier causing the server to generate the image by identifying a foreground of the video frame and generating the image to include the foreground and a preset background different from a background of the determined frame.
Jennings teaches wherein transmitting the identifier to the server comprises: transmitting the identifier to the server to prompt the server to generate an image corresponding to the video frame, the identifier causing the server to generate the image (see fig. 1, 6, col. 7, lines 13-25. A user device wherein a user can request retrieval of an image. The user input specifies a time period of interest, which would be a timestamp identifying the time period of interest. Upon the request, the system retrieves the images from a server and provides the images to the user device.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy and Mireles to incorporate transmitting timestamp data from a user device to a server in order to retrieve image data associated with the timestamped time frame. The modification provides for sending timestamp data of a video recording and retrieving an image of the video data with timestamps from a user device.
North teaches the identifier causing the server to generate the image by identifying a foreground of the video frame and generating the image to include the foreground and a preset background different from a background of the determined frame (see ¶ 0015. The modification server can segment video images of the received video stream into foreground images and background images according to the video modification plan. The modification server can modify background images and/or foreground images according to the video modification plan to generate a plurality of modified background images and/or a plurality of modified foreground images. In turn, the modification server can replace background images and/or foreground images with the modified background images and/or modified foreground images to generate a modified video stream. The modification server can then transmit the modified video stream to another communication device associated with the video call session.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate modifying the background video image for transmission to another remote device in a conference. The modification provides for generating and modifying background images from the video data.
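For illustration only, a sketch of the foreground/preset-background compositing addressed by claim 11; the segmentation mask is a stand-in for the output of a real segmentation model, and all names are hypothetical rather than North's implementation:

```python
# Illustrative sketch only; assumes NumPy; the mask stands in for a real
# segmentation model's output (1 = foreground pixel, 0 = background pixel).
import numpy as np

def replace_background(frame: np.ndarray, mask: np.ndarray,
                       preset_bg: np.ndarray) -> np.ndarray:
    """Keep the frame's foreground and composite it over a preset background."""
    mask3 = mask[..., None].astype(frame.dtype)      # broadcast mask over channels
    return frame * mask3 + preset_bg * (1 - mask3)   # foreground kept, background swapped
```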
11. Claim(s) 19 is rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Madisetti et al. (US 2022/0114142).
Regarding claim 19, Ramamoorthy, Mireles and Jennings do not teach the system of claim 17, wherein transmitting the identifier to the server comprises: transmitting the identifier to the server to cause the server to generate an image file.
Madisetti teaches wherein transmitting the identifier to the server comprises: transmitting the identifier to the server to cause the server to generate an image file (see claim 12. The item file comprises a timestamp indicating a time in at least one of the audio recording file and the video recording file (which will include an image) with which the item file is associated, defining a related item time; and the network communication device is further operable to: receive a replay request from a user device, the replay request comprising a link to the package file; transmit at least one of the audio recording file and the video recording file for playback on the user device; and transmit the item file to the user device such that the derivative conference file is represented on the user device at the related item time.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate a timestamp (identifier) of a recording to receive a replay at that specific time frame. The modification provides for the timestamp identifier to be requested and the captured time frame of the video to be presented to the user.
12. Claim(s) 21 is rejected under 35 U.S.C. 103 as being unpatentable over Ramamoorthy et al. (US 2025/0055957) in view of Mireles et al. (US 2022/0370733), further in view of Jennings et al. (US 10,593,175), and further in view of Xu et al. (US 2025/0077765).
Regarding claim 21, Ramamoorthy, Mireles and Jennings do not teach the method of claim 1, wherein the pre-trained model configuration data identifies weights for configuring the image selection engine to score and select individual video frames.
Xu teaches wherein the pre-trained model configuration data identifies weights for configuring the image selection engine to score and select individual video frames (see fig. 11, ¶ 0051. An image search incorporates a search engine that ranks images based on an image search similarity score and image scores. These components are combined using a weighting calculation to determine the best image selection.).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify Ramamoorthy, Mireles and Jennings to incorporate image selection based on a weighted score for the image being searched. The modification provides for image retrieval based on the weighted score of the image being searched.
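For illustration only, a sketch of weighted frame scoring and selection as addressed by claim 21; the weights and scoring functions below are hypothetical, not Xu's implementation:

```python
# Illustrative sketch only; similarity and quality are hypothetical scoring
# callables, and the weights are arbitrary configured values.
def select_best_frame(frames, similarity, quality, w_sim=0.7, w_q=0.3):
    """Score each frame with a weighted combination and return the best one."""
    def score(frame):
        return w_sim * similarity(frame) + w_q * quality(frame)
    return max(frames, key=score)
```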
Conclusion
13. Any inquiry concerning this communication or earlier communications from the examiner should be directed to ASSAD MOHAMMED whose telephone number is (571)270-7253. The examiner can normally be reached 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ASSAD MOHAMMED/ Examiner, Art Unit 2691
/DUC NGUYEN/ Supervisory Patent Examiner, Art Unit 2691