Prosecution Insights
Last updated: April 19, 2026
Application No. 18/600,926

SYSTEMS AND METHODS FOR ARTIFICIAL-INTELLIGENCE ASSISTANCE IN VIDEO COMMUNICATIONS

Non-Final OA (§102, §103)
Filed
Mar 11, 2024
Examiner
NGUYEN, DUC MINH
Art Unit
2691
Tech Center
2600 — Communications
Assignee
Liveperson Inc.
OA Round
1 (Non-Final)
Grant Probability: 22% (At Risk)
Expected OA Rounds: 1-2
Time to Grant: 3y 11m
With Interview: 40%

Examiner Intelligence

Grants only 22% of cases
Career Allow Rate: 22% (19 granted / 85 resolved; -39.6% vs TC avg)

Strong +17% interview lift
Interview Lift: +17.4% (based on resolved cases with interview)

Typical timeline
Avg Prosecution: 3y 11m; 16 currently pending

Career history
Total Applications: 101 (across all art units)
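The ratios above fall straight out of per-case outcomes. A minimal sketch of that arithmetic, assuming hypothetical per-case records (the field names and toy data below are illustrative; only the aggregate figures, 19 granted / 85 resolved and +17.4% lift, come from the panel):

```python
# Hypothetical per-case records; the schema is illustrative, not the
# vendor's. Only the headline figures above come from the panel itself.
cases = [
    {"granted": True,  "interview": True},
    {"granted": False, "interview": True},
    {"granted": False, "interview": False},
    {"granted": True,  "interview": False},
    {"granted": False, "interview": False},
]

def allow_rate(subset):
    """Share of resolved cases in `subset` that ended in a grant."""
    return sum(c["granted"] for c in subset) / len(subset)

career = allow_rate(cases)  # panel: 19 / 85 = 22.4%, shown as 22%
lift = (allow_rate([c for c in cases if c["interview"]])
        - allow_rate([c for c in cases if not c["interview"]]))  # panel: +17.4%
print(f"career allow rate {career:.1%}, interview lift {lift:+.1%}")
```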

Statute-Specific Performance

§101: 2.4% (-37.6% vs TC avg)
§103: 62.6% (+22.6% vs TC avg)
§102: 22.5% (-17.5% vs TC avg)
§112: 8.3% (-31.7% vs TC avg)

Black line = Tech Center average estimate • Based on career data from 85 resolved cases
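The deltas are percentage-point gaps against the chart's black line, so subtracting each delta from the examiner's rate recovers the Tech Center average. A quick check (rates and deltas from the table above; the implied TC average is computed here, not sourced):

```python
# Consistency check: examiner rate minus the "vs TC avg" delta should
# recover the Tech Center average (the chart's black line).
rates  = {"§101": 2.4, "§103": 62.6, "§102": 22.5, "§112": 8.3}    # %
deltas = {"§101": -37.6, "§103": 22.6, "§102": -17.5, "§112": -31.7}

for statute, rate in rates.items():
    tc_avg = rate - deltas[statute]   # implied Tech Center average
    print(f"{statute}: examiner {rate:.1f}% vs TC avg {tc_avg:.1f}% "
          f"({deltas[statute]:+.1f} pts)")

# Every statute's implied TC average comes out to exactly 40.0%, which is
# consistent with a single flat estimate line rather than per-statute averages.
```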

Office Action

Rejections: §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 11, 2024, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-2 and 4-6 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kelly (US 20220327961 A1).

Regarding claim 1, Kelly discloses a computer-implemented method comprising: extracting one or more video frames from video streams of a set of communication sessions (“a sequence of these per-frame features” Kelly [0019]), wherein the video frames include a representation of an object (Fig. 5, 53; Kelly), and wherein the object is associated with an issue for which the communication session is established (“sign language information” Kelly [0019]); defining a training dataset from the one or more video frames and features extracted from the set of communication sessions (Fig. 3; Kelly); training a neural network using the training dataset, the neural network being configured to generate predictions of actions associated with the object (Fig. 4, 407-408; Kelly); extracting a video frame from a new video stream of a new communication session, wherein the video frame includes a representation of a particular object, and wherein the object is associated with a particular issue (“In real time, or after the signing is completed, the sign language information is sent to 12, which extracts out features (e.g. body pose keypoints, hand keypoints, hand pose, thresholded image, etc. . . . )” Kelly [0019]); executing the neural network using the video frame from the new video stream, wherein the neural network generates a predicted action associated with the particular object (“The features produced by 12 are then transmitted to component 13 which extracts sign language information (e.g. detecting if an individual is signing, transcribing that signing into gloss, or translating that signing into a target language) from a sequence of these per-frame features.” Kelly [0019]); and facilitating a transmission of a communication to a device of the new communication session, the communication including a representation of the predicted action (“Finally, the output is displayed on 14.” Kelly [0019]; Fig. 1, 14; Kelly).

Regarding claim 2, Kelly discloses the computer-implemented method of claim 1, wherein the new communication session is between a user device (Fig. 1, 11; Kelly) and a terminal device (Fig. 1, 14; Kelly).

Regarding claim 4, Kelly discloses the computer-implemented method of claim 1, wherein the neural network is an ensemble network comprising two or more neural networks configured to generate outputs of different types (“In the processing module 202, the feature train is split into each individual sign via the sign-splitting component 209 via a 1D Convolutional Neural Network which highlights the sign transition periods.” Kelly [0028]).
Regarding claim 5, Kelly discloses the computer-implemented method of claim 1, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).

Regarding claim 6, Kelly discloses the computer-implemented method of claim 1, wherein the neural network is configured to generate a predicted identification of the object (Fig. 4, 407-408; Kelly).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 3 and 7-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kelly (US 20220327961 A1) in view of Ramakrishnan et al. (US 20240146871 A1, hereinafter Ramakrishnan).

Regarding claim 3, Kelly discloses the computer-implemented method of claim 1. Kelly does not disclose “wherein the particular issue is associated with a hardware or software fault in a device operated by a user.” However, Ramakrishnan does teach wherein the particular issue is associated with a hardware or software fault in a device operated by a user (Fig. 3, 312; Ramakrishnan). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.

Regarding claim 7, Kelly discloses the computer-implemented method of claim 1. Kelly does not disclose “wherein the predicted action associated with the particular object comprises a maintenance action or a repair action configured to restore operability in the particular action.” However, Ramakrishnan does teach wherein the predicted action associated with the particular object comprises a maintenance action or a repair action configured to restore operability in the particular action (Fig. 5, 502, 504, 508; Ramakrishnan). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.
Regarding claim 8, Kelly discloses a system comprising: extracting one or more video frames from video streams of a set of communication sessions (“a sequence of these per-frame features” Kelly [0019]), wherein the video frames include a representation of an object (Fig. 5, 53; Kelly), and wherein the object is associated with an issue for which the communication session is established (“sign language information” Kelly [0019]); defining a training dataset from the one or more video frames and features extracted from the set of communication sessions (Fig. 3; Kelly); training a neural network using the training dataset, the neural network being configured to generate predictions of actions associated with the object (Fig. 4, 407-408; Kelly); extracting a video frame from a new video stream of a new communication session, wherein the video frame includes a representation of a particular object, and wherein the object is associated with a particular issue (“In real time, or after the signing is completed, the sign language information is sent to 12, which extracts out features (e.g. body pose keypoints, hand keypoints, hand pose, thresholded image, etc. . . . )” Kelly [0019]); executing the neural network using the video frame from the new video stream, wherein the neural network generates a predicted action associated with the particular object (“The features produced by 12 are then transmitted to component 13 which extracts sign language information (e.g. detecting if an individual is signing, transcribing that signing into gloss, or translating that signing into a target language) from a sequence of these per-frame features.” Kelly [0019]); and facilitating a transmission of a communication to a device of the new communication session, the communication including a representation of the predicted action (“Finally, the output is displayed on 14.” Kelly [0019]; Fig. 1, 14; Kelly).

Kelly does not expressly teach “one or more processors; and a non-transitory machine-readable storage medium storing instructions that when executed by the one or more processors, cause the one or more processors to perform operations including.” However, Ramakrishnan does teach one or more processors (Fig. 1, 101; Ramakrishnan); and a non-transitory machine-readable storage medium storing instructions that when executed by the one or more processors, cause the one or more processors to perform operations including (Fig. 1, 102-103; Ramakrishnan). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.

Regarding claim 9, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the new communication session is between a user device (Fig. 1, 11; Kelly) and a terminal device (Fig. 1, 14; Kelly).

Regarding claim 10, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the particular issue is associated with a hardware or software fault in a device operated by a user (Fig. 3, 312; Ramakrishnan).
Regarding claim 11, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is an ensemble network comprising two or more neural networks configured to generate outputs of different types (“In the processing module 202, the feature train is split into each individual sign via the sign-splitting component 209 via a 1D Convolutional Neural Network which highlights the sign transition periods.” Kelly [0028]).

Regarding claim 12, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).

Regarding claim 13, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is configured to generate a predicted identification of the object (Fig. 4, 407-408; Kelly).

Regarding claim 14, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is configured to generate a predicted identification of the object (Fig. 4, 407-408; Kelly).

Regarding claim 15, Kelly discloses extracting one or more video frames from video streams of a set of communication sessions (“a sequence of these per-frame features” Kelly [0019]), wherein the video frames include a representation of an object (Fig. 5, 53; Kelly), and wherein the object is associated with an issue for which the communication session is established (“sign language information” Kelly [0019]); defining a training dataset from the one or more video frames and features extracted from the set of communication sessions (Fig. 3; Kelly); training a neural network using the training dataset, the neural network being configured to generate predictions of actions associated with the object (Fig. 4, 407-408; Kelly); extracting a video frame from a new video stream of a new communication session, wherein the video frame includes a representation of a particular object, and wherein the object is associated with a particular issue (“In real time, or after the signing is completed, the sign language information is sent to 12, which extracts out features (e.g. body pose keypoints, hand keypoints, hand pose, thresholded image, etc. . . . )” Kelly [0019]); executing the neural network using the video frame from the new video stream, wherein the neural network generates a predicted action associated with the particular object (“The features produced by 12 are then transmitted to component 13 which extracts sign language information (e.g. detecting if an individual is signing, transcribing that signing into gloss, or translating that signing into a target language) from a sequence of these per-frame features.” Kelly [0019]); and facilitating a transmission of a communication to a device of the new communication session, the communication including a representation of the predicted action (“Finally, the output is displayed on 14.” Kelly [0019]; Fig. 1, 14; Kelly).
Kelly does not expressly teach “a non-transitory machine-readable storage medium storing instructions that when executed by one or more processors, cause the one or more processors to perform operations including.” However, Ramakrishnan does teach a non-transitory machine-readable storage medium storing instructions that when executed by one or more processors, cause the one or more processors to perform operations including (Fig. 1, 186-187; Ramakrishnan). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.

Regarding claim 16, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the new communication session is between a user device (Fig. 1, 11; Kelly) and a terminal device (Fig. 1, 14; Kelly).

Regarding claim 17, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the particular issue is associated with a hardware or software fault in a device operated by a user (Fig. 3, 312; Ramakrishnan).

Regarding claim 18, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the neural network is an ensemble network comprising two or more neural networks configured to generate outputs of different types (“In the processing module 202, the feature train is split into each individual sign via the sign-splitting component 209 via a 1D Convolutional Neural Network which highlights the sign transition periods.” Kelly [0028]).

Regarding claim 19, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).

Regarding claim 20, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAAD AHMED SYED whose telephone number is (571) 272-6777. The examiner can normally be reached Monday - Friday 8:30 am - 5:00 pm.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen can be reached at (571) 272-7503.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/SAAD AHMED SYED/
Examiner, Art Unit 2691

/DUC NGUYEN/
Supervisory Patent Examiner, Art Unit 2691
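Read as mapped above, independent claim 1 recites a train-then-infer loop over video frames: extract frames from past sessions, build a training dataset, train a network to predict actions associated with an object, then run it on a frame from a new session and send the predicted action to a session device. Below is a minimal PyTorch sketch of that shape only; every name, dimension, and the toy data are illustrative assumptions, not the application's claimed models or Kelly's.

```python
import torch
import torch.nn as nn

# Toy stand-ins: "frames" are flattened feature vectors and "actions" are
# class indices. A real system would extract per-frame features first
# (pose keypoints, detections, etc.), as in the Kelly citations above.
FEATURE_DIM, NUM_ACTIONS = 64, 4
train_frames = torch.randn(32, FEATURE_DIM)           # frames from past sessions
train_actions = torch.randint(0, NUM_ACTIONS, (32,))  # labeled actions

# A small classifier standing in for the claimed action-prediction network.
model = nn.Sequential(
    nn.Linear(FEATURE_DIM, 128), nn.ReLU(),
    nn.Linear(128, NUM_ACTIONS),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# "Training a neural network using the training dataset."
for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(model(train_frames), train_actions)
    loss.backward()
    opt.step()

# "Executing the neural network using the video frame from the new video stream."
new_frame = torch.randn(1, FEATURE_DIM)
with torch.no_grad():
    predicted_action = model(new_frame).argmax(dim=1).item()

# "Facilitating a transmission of a communication ... including a
# representation of the predicted action." Here, we just print it.
print(f"predicted action id: {predicted_action}")
```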

Prosecution Timeline

Mar 11, 2024
Application Filed
Dec 09, 2025
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12563110: IMAGE SECURITY USING SOURCE IDENTIFICATION. Granted Feb 24, 2026 (2y 5m to grant).
Patent 12549900: LOUDSPEAKER TRANSDUCER ARRANGEMENT. Granted Feb 10, 2026 (2y 5m to grant).
Patent 12477275: Speaker Device and Acoustic System. Granted Nov 18, 2025 (2y 5m to grant).
Patent 12389183: PLAYER DEVICE AND ASSOCIATED SIGNAL PROCESSING METHOD. Granted Aug 12, 2025 (2y 5m to grant).
Patent 11889221: Selective Video Conference Segmentation. Granted Jan 30, 2024 (2y 5m to grant).
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 22%
With Interview: 40% (+17.4%)
Median Time to Grant: 3y 11m
PTA Risk: Low

Based on 85 resolved cases by this examiner. Grant probability derived from career allow rate.
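The with-interview figure looks like the career allow rate plus the interview lift, stacked additively. A two-line check (the unrounded base of 19/85 and the clamp at 100% are my assumptions):

```python
base = 19 / 85   # career allow rate from the panel, unrounded (~22.4%)
lift = 0.174     # interview lift from the panel
with_interview = min(base + lift, 1.0)  # clamping at 100% is an assumption
print(f"{base:.1%} + {lift:.1%} -> {with_interview:.1%}")
# prints: 22.4% + 17.4% -> 39.8%, which the panel rounds to 40%
```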
