DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
1. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
2. Claim(s) 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 15, 16, 18, 19, 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Zhu et al. (US 2023/0421719).
Regarding claim 1, Zhu teaches a method, comprising: determining, at a first device of a contact center agent, to filter visual content of a video stream of the contact center agent for a contact center engagement with a contact center user; obtaining, at the first device, filtered content corresponding to the determination to filter the visual content (see fig. 2, ¶ 0023-0024, 0027, 0038-0039, 0053-0054. The smart video conferencing system receives a video from a camera of the device and a second video from a camera of a second device. The conferencing system can correspond to a financial institution, and a participant can be a customer service agent. A content determination component analyzes the context data and determines the content presented by the smart video conferencing system. The content determination component can retrieve context data from the database and determine whether or not the information should be presented as part of a virtual background. The background can be replaced by a virtual background determined based on one or more factors, including context data acquired from a calendar, content based on the context data, and transaction history.); generating, at the first device, an updated video stream by replacing the visual content with the filtered content (see ¶ 0044, 0053-0054, 0056-0057. A virtual background is determined based on the context. The virtual background can be selected from several predetermined backgrounds in a data store; alternatively, determination of the virtual background can comprise generating a background. A video of a meeting participant can be segmented from its actual background, and the virtual environment comprising a virtual background and content can then replace the actual background. The factors for determining the virtual background can include context data acquired from a calendar, content based on the context data, and transaction history.); and outputting, in place of the video stream, the updated video stream for rendering at a second device of the contact center user during the contact center engagement (see ¶ 0005, 0020, 0026, 0070. The virtual background is updated dynamically in the smart video conferencing system, and the conferencing session is between the customer and the agent.).
Regarding claim 2, Zhu teaches the method of claim 1, wherein determining to filter the visual content of the video stream of the contact center agent for the contact center engagement with the contact center user comprises: determining that the filtered content corresponds to the contact center user (see ¶ 0028. The smart video conference system can identify a scheduled video call between a customer and a service agent of a financial institution from a calendar of the customer, the service agent, or both. Meeting data can also be acquired from the calendar, such as a description or title of the meeting, the time, and the participants. Additional content can be determined from sensors or derived from other data sources, such as a financial institution logo, location, and current weather. The system can generate a series of virtual backgrounds or environments (1, 2, . . . n, where n is an integer greater than two) over time. If the balance satisfies the threshold, the background can be updated to include one or more potential intervention tools to nudge the user toward better financial habits, such as financial education, automatic saving, and credit building information.).
Regarding claim 4, Zhu teaches the method of claim 1, wherein determining to filter the visual content of the video stream of the contact center agent for the contact center engagement with the contact center user comprises: determining that a relevance score associated with the visual content meets a threshold (see ¶ 0057. The video or image frames of a meeting participant are overlaid on the virtual environment. For example, the machine learning model can infer a probability that a pixel belongs to a human participant or not. If the probability satisfies a threshold probability, the pixel can be classified as a human participant rather than a background. Zhu thus determines that a score associated with the visual content meets a threshold, where the score is related to the threshold level.).
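For illustration only, the per-pixel thresholding that Zhu's ¶ 0057 describes can be sketched as follows; the function name, the toy probability map, and the 0.5 threshold are hypothetical, since Zhu does not specify concrete values:

```python
import numpy as np

# Hypothetical sketch of the pixel classification described in Zhu ¶ 0057:
# a model outputs, per pixel, the probability that the pixel belongs to a
# human participant; pixels meeting the threshold are classified as the
# participant rather than the background.
def segment_participant(prob_map: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Return a boolean mask: True = participant pixel, False = background."""
    return prob_map >= threshold

# Toy 2x3 probability map standing in for a segmentation model's output.
probs = np.array([[0.9, 0.2, 0.7],
                  [0.1, 0.8, 0.4]])
mask = segment_participant(probs)
print(mask.tolist())  # [[True, False, True], [False, True, False]]
```

The same mask can then drive the background replacement discussed for the other claims.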
Regarding claim 5, Zhu teaches the method of claim 1, wherein the visual content corresponds to a background of the video stream and obtaining the filtered content corresponding to the determination to filter the visual content comprises: obtaining, as the filtered content, a virtual background from a library accessible to the first device (see ¶ 0038-0039. The content determination component can retrieve context data from the database and determine whether or not the information should be presented as part of a virtual background.).
Regarding claim 6, Zhu teaches the method of claim 1, wherein the visual content corresponds to a background of the video stream and obtaining the filtered content corresponding to the determination to filter the visual content comprises: generating, as the filtered content, a virtual background (see ¶ 0028. The smart video conference system can identify a scheduled video call between a customer and a service agent of a financial institution from a calendar of the customer, the service agent, or both. Meeting data can also be acquired from the calendar, such as a description or title of the meeting, the time, and the participants. Additional content can be determined from sensors or derived from other data sources, such as a financial institution logo, location, and current weather. The system can generate a series of virtual backgrounds or environments (1, 2, . . . n, where n is an integer greater than two) over time. If the balance satisfies the threshold, the background can be updated to include one or more potential intervention tools to nudge the user toward better financial habits, such as financial education, automatic saving, and credit building information.).
Regarding claim 7, Zhu teaches the method of claim 1, wherein the visual content corresponds to a foreground of the video stream and generating the updated video stream by replacing the visual content with the filtered content comprises: asserting a filter against the foreground of the video stream to replace the visual content with the filtered content (see ¶ 0057, 0064. The foreground imagery is overlaid on the human figure and virtual background. For example, a participant's name and position can be displayed at the bottom of the screen and over an image of the participant. In another instance, a news ticker can scroll across the bottom of the screen that includes relevant content or news. Still further yet, content can be displayed in free spaces such as over and adjacent to a participant's shoulder in a window similar to a newscast report.).
Regarding claim 8, Zhu teaches the method of claim 1, wherein the visual content corresponds to a background of the video stream and generating the updated video stream by replacing the visual content with the filtered content comprises: combining a foreground of the video stream and a virtual background, as the filtered content, to generate the updated video stream (see ¶ 0057, 0064. A video of a meeting participant can be segmented from its actual background. Subsequently, the virtual environment comprising a virtual background and content can replace the actual background. In other words, the video or image frames of a meeting participant are overlaid on the virtual environment.).
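As a hypothetical sketch of the compositing described in Zhu's ¶ 0057 and 0064 (combining the segmented foreground with a virtual background), assuming a boolean participant mask is already available; the function name and toy images are illustrative only:

```python
import numpy as np

# Hypothetical sketch of replacing an actual background with a virtual one,
# per Zhu ¶ 0057/0064: keep participant pixels from the camera frame and
# take all other pixels from the virtual background.
def replace_background(frame: np.ndarray, virtual_bg: np.ndarray,
                       mask: np.ndarray) -> np.ndarray:
    """frame, virtual_bg: (H, W, 3) images; mask: (H, W) participant mask."""
    # Broadcast the (H, W) mask across the color channels.
    return np.where(mask[..., None], frame, virtual_bg)

frame = np.full((2, 2, 3), 200, dtype=np.uint8)      # toy camera frame
virtual_bg = np.full((2, 2, 3), 30, dtype=np.uint8)  # toy virtual background
mask = np.array([[True, False],
                 [False, True]])
out = replace_background(frame, virtual_bg, mask)
print(out[0, 0, 0], out[0, 1, 0])  # 200 30
```

Participant pixels survive from the live frame; everything else comes from the virtual background, matching the overlay described in the cited paragraphs.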
Regarding claim 9, Zhu teaches a non-transitory computer readable medium storing instructions operable to cause one or more processors to perform operations comprising: determining, at a first device of a contact center agent, to filter visual content of a video stream of the contact center agent for a contact center engagement with a contact center user; obtaining, at the first device, filtered content corresponding to the determination to filter the visual content (see fig. 2, ¶ 0023-0024, 0027, 0038-0039, 0053-0054. The smart video conferencing system receives a video from a camera of the device and a second video from a camera of a second device. The conferencing system can correspond to a financial institution, and a participant can be a customer service agent. A content determination component analyzes the context data and determines the content presented by the smart video conferencing system. The content determination component can retrieve context data from the database and determine whether or not the information should be presented as part of a virtual background. The background can be replaced by a virtual background determined based on one or more factors, including context data acquired from a calendar, content based on the context data, and transaction history.); generating, at the first device, an updated video stream by replacing the visual content with the filtered content (see ¶ 0044, 0053-0054, 0056-0057. A virtual background is determined based on the context. The virtual background can be selected from several predetermined backgrounds in a data store; alternatively, determination of the virtual background can comprise generating a background. A video of a meeting participant can be segmented from its actual background, and the virtual environment comprising a virtual background and content can then replace the actual background. The factors for determining the virtual background can include context data acquired from a calendar, content based on the context data, and transaction history.); and outputting, in place of the video stream, the updated video stream for rendering at a second device of the contact center user during the contact center engagement (see ¶ 0005, 0020, 0026, 0070. The virtual background is updated dynamically in the smart video conferencing system, and the conferencing session is between the customer and the agent.).
Regarding claim 10, Zhu teaches the non-transitory computer readable medium of claim 9, wherein the visual content corresponds to a background of the video stream (see ¶ 0057, 0064. A video of a meeting participant can be segmented from its actual background. Subsequently, the virtual environment comprising a virtual background and content can replace the actual background. In other words, the video or image frames of a meeting participant are overlaid on the virtual environment.).
Regarding claim 11, Zhu teaches the non-transitory computer readable medium of claim 9, wherein the visual content corresponds to a foreground of the video stream (see ¶ 0057, 0064. The foreground imagery is overlaid on the human figure and virtual background. For example, a participant's name and position can be displayed at the bottom of the screen and over an image of the participant. In another instance, a news ticker can scroll across the bottom of the screen that includes relevant content or news. Still further yet, content can be displayed in free spaces such as over and adjacent to a participant's shoulder in a window similar to a newscast report.).
Regarding claim 12, Zhu teaches the non-transitory computer readable medium of claim 9, wherein the determination to filter the visual content is based on a relevance score determined for the visual content meeting a threshold (see ¶ 0057. The video or image frames of a meeting participant are overlaid on the virtual environment. For example, the machine learning model can infer a probability that a pixel belongs to a human participant or not. If the probability satisfies a threshold probability, the pixel can be classified as a human participant rather than a background. Zhu thus determines that a score associated with the visual content meets a threshold, where the score is related to the threshold level.).
Regarding claim 15, Zhu teaches a system, comprising: a memory subsystem; and processing circuitry configured to execute instructions stored in the memory subsystem to: determine, at a first device of a contact center agent, to filter visual content of a video stream of the contact center agent for a contact center engagement with a contact center user; obtain, at the first device, filtered content corresponding to the determination to filter the visual content (see fig. 2, ¶ 0023-0024, 0027, 0038-0039, 0053-0054. The smart video conferencing system receives a video from a camera of the device and a second video from a camera of a second device. The conferencing system can correspond to a financial institution, and a participant can be a customer service agent. A content determination component analyzes the context data and determines the content presented by the smart video conferencing system. The content determination component can retrieve context data from the database and determine whether or not the information should be presented as part of a virtual background. The background can be replaced by a virtual background determined based on one or more factors, including context data acquired from a calendar, content based on the context data, and transaction history.); generate, at the first device, an updated video stream by replacing the visual content with the filtered content (see ¶ 0044, 0053-0054, 0056-0057. A virtual background is determined based on the context. The virtual background can be selected from several predetermined backgrounds in a data store; alternatively, determination of the virtual background can comprise generating a background. A video of a meeting participant can be segmented from its actual background, and the virtual environment comprising a virtual background and content can then replace the actual background. The factors for determining the virtual background can include context data acquired from a calendar, content based on the context data, and transaction history.); and output, in place of the video stream, the updated video stream for rendering at a second device of the contact center user during the contact center engagement (see ¶ 0005, 0020, 0026, 0070. The virtual background is updated dynamically in the smart video conferencing system, and the conferencing session is between the customer and the agent.).
Regarding claim 16, Zhu teaches the system of claim 15, wherein, to determine to filter the visual content of the video stream of the contact center agent for the contact center engagement with the contact center user, the processing circuitry is configured to execute the instructions to: determine a relevance score for the visual content; and determine that the relevance score meets a threshold (see ¶ 0057. The video or image frames of a meeting participant are overlaid on the virtual environment. For example, the machine learning model can infer a probability that a pixel belongs to a human participant or not. If the probability satisfies a threshold probability, the pixel can be classified as a human participant rather than a background. Zhu thus determines that a score associated with the visual content meets a threshold, where the score is related to the threshold level.).
Regarding claim 18, Zhu teaches the system of claim 15, wherein the visual content corresponds to a background of the video stream and, to determine to filter the visual content of the video stream of the contact center agent for the contact center engagement with the contact center user, the processing circuitry is configured to execute the instructions to: determine that a virtual background, as the filtered content, corresponds to an organization with which the contact center user is associated (see ¶ 0028. The smart video conference system can identify a scheduled video call between a customer and a service agent of a financial institution from a calendar of the customer, the service agent, or both. Meeting data can also be acquired from the calendar, such as a description or title of the meeting, the time, and the participants. Additional content can be determined from sensors or derived from other data sources, such as a financial institution logo, location, and current weather. The system can generate a series of virtual backgrounds or environments (1, 2, . . . n, where n is an integer greater than two) over time. If the balance satisfies the threshold, the background can be updated to include one or more potential intervention tools to nudge the user toward better financial habits, such as financial education, automatic saving, and credit building information.).
Regarding claim 19, Zhu teaches the system of claim 15, wherein the processing circuitry is configured to execute the instructions to: obtain input from the first device indicating to update the video stream according to the filtered content (see fig. 8-9, ¶ 0065, 0070. The update is presented in real time, which serves as the indication upon the change of the background.).
Regarding claim 20, Zhu teaches the system of claim 15, wherein the contact center engagement is facilitated over a video conferencing modality (see ¶ 0029. The smart video conference system can be implemented by a communication platform that includes video conferencing (e.g., Zoom, Teams, WebEx).).
Claim Rejections - 35 USC § 103
3. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
4. Claim(s) 3, 13, 17 are rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US 2023/0421719) in view of Agrawal et al. (US 2024/0275911).
Regarding claim 3, Zhu does not teach the method of claim 1, wherein determining to filter the visual content of the video stream of the contact center agent for the contact center engagement with the contact center user comprises: determining that a first relevance score associated with the visual content is lower than a second relevance score associated with the filtered content.
Agrawal teaches determining that a first relevance score associated with the visual content is lower than a second relevance score associated with the filtered content (see ¶ 0098. In response to a detected change in the foreground image being less than the foreground image change threshold, and/or in response to neither the background nor the foreground image presenting a change that is greater than their respective change thresholds, the unmodified live video feed is presented to the VCS.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zhu to incorporate Agrawal's teaching of changing the virtual image only when the detected change satisfies a threshold. The motivation for the modification is to update the changes based on a threshold level.
Regarding claim 13, Zhu does not teach the non-transitory computer readable medium of claim 9, wherein the determination to filter the visual content is based on a first relevance score determined for the visual content being lower than a second relevance score determined for the filtered content.
Agrawal teaches wherein the determination to filter the visual content is based on a first relevance score determined for the visual content being lower than a second relevance score determined for the filtered content (see ¶ 0098. In response to a detected change in the foreground image being less than the foreground image change threshold, and/or in response to neither the background nor the foreground image presenting a change that is greater than their respective change thresholds, the unmodified live video feed is presented to the VCS.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zhu to incorporate Agrawal's teaching of changing the virtual image only when the detected change satisfies a threshold. The motivation for the modification is to update the changes based on a threshold level.
Regarding claim 17, Zhu does not teach the system of claim 15, wherein, to determine to filter the visual content of the video stream of the contact center agent for the contact center engagement with the contact center user, the processing circuitry is configured to execute the instructions to: determine a first relevance score for the visual content; determine a second relevance score for the filtered content; and determine that the first relevance score is lower than the second relevance score.
Agrawal teaches determine a first relevance score for the visual content; determine a second relevance score for the filtered content; and determine that the first relevance score is lower than the second relevance score (see ¶ 0098. In response to a detected change in the foreground image being less than the foreground image change threshold, and/or in response to neither the background nor the foreground image presenting a change that is greater than their respective change thresholds, the unmodified live video feed is presented to the VCS.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zhu to incorporate Agrawal's teaching of changing the virtual image only when the detected change satisfies a threshold. The motivation for the modification is to update the changes based on a threshold level.
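For illustration, the score comparison recited in claims 3, 13, and 17 (filtering only when the visual content's relevance score is lower than that of the filtered content) can be sketched as follows; the function name and the score values are hypothetical and not drawn from either reference:

```python
def should_filter(visual_score: float, filtered_score: float) -> bool:
    """Filter the visual content only when its relevance score is lower
    than the relevance score of the candidate filtered content."""
    return visual_score < filtered_score

# Toy scores: the candidate virtual background is more relevant than the
# live background, so the live background would be filtered out.
print(should_filter(visual_score=0.3, filtered_score=0.8))  # True
print(should_filter(visual_score=0.9, filtered_score=0.8))  # False
```

Under this reading, when the live content scores at least as high as the candidate replacement, the unmodified feed would be presented, consistent with Agrawal's ¶ 0098.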
5. Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Zhu et al. (US 2023/0421719) in view of Roper (US 2023/0126108).
Regarding claim 14, Zhu does not teach the non-transitory computer readable medium of claim 9, wherein the determination to filter the visual content is made prior to a start of the contact center engagement.
Roper teaches wherein the determination to filter the visual content is made prior to a start of the contact center engagement (see ¶ 0066. The backgrounds are pre-processed and provided before joining the meeting.).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Zhu to incorporate pre-defining a background before the start of the meeting, as taught by Roper. The motivation for the modification is to determine the background image prior to the start of the meeting.
Conclusion
6. Any inquiry concerning this communication or earlier communications from the examiner should be directed to ASSAD MOHAMMED whose telephone number is (571) 270-7253. The examiner can normally be reached 9:00 AM-5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached at 571-272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/ASSAD MOHAMMED/Examiner, Art Unit 2691
/DUC NGUYEN/Supervisory Patent Examiner, Art Unit 2691