DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 11, 2024, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-2 and 4-6 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kelly (US 20220327961 A1).
Regarding claim 1, Kelly discloses a computer-implemented method comprising: extracting one or more video frames from video streams of a set of communication sessions (“a sequence of these per-frame features” Kelly [0019]), wherein the video frames include a representation of an object (Fig. 5, 53; Kelly), and wherein the object is associated with an issue for which the communication session is established (“sign language information” Kelly [0019]); defining a training dataset from the one or more video frames and features extracted from the set of communication sessions (Fig. 3; Kelly); training a neural network using the training dataset, the neural network being configured to generate predictions of actions associated with the object (Fig. 4, 407-408; Kelly); extracting a video frame from a new video stream of a new communication session, wherein the video frame includes a representation of a particular object, and wherein the object is associated with a particular issue (“In real time, or after the signing is completed, the sign language information is sent to 12, which extracts out features (e.g. body pose keypoints, hand keypoints, hand pose, thresholded image, etc. . . . )” Kelly [0019]); executing the neural network using the video frame from the new video stream, wherein the neural network generates a predicted action associated with the particular object (“The features produced by 12 are then transmitted to component 13 which extracts sign language information (e.g. detecting if an individual is signing, transcribing that signing into gloss, or translating that signing into a target language) from a sequence of these per-frame features.” Kelly [0019]); and facilitating a transmission of a communication to a device of the new communication session, the communication including a representation of the predicted action (“Finally, the output is displayed on 14.” Kelly [0019]; Fig. 1, 14; Kelly).
Regarding claim 2, Kelly discloses the computer-implemented method of claim 1, wherein the new communication session is between a user device (Fig. 1, 11; Kelly) and a terminal device (Fig. 1, 14; Kelly).
Regarding claim 4, Kelly discloses the computer-implemented method of claim 1, wherein the neural network is an ensemble network comprising two or more neural networks configured to generate outputs of different types (“In the processing module 202, the feature train is split into each individual sign via the sign-splitting component 209 via a 1D Convolutional Neural Network which highlights the sign transition periods.” Kelly [0028]).
Regarding claim 5, Kelly discloses the computer-implemented method of claim 1, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).
Regarding claim 6, Kelly discloses the computer-implemented method of claim 1, wherein the neural network is configured to generate a predicted identification of the object (Fig. 4, 407-408; Kelly).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 3 and 7-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kelly (US 20220327961 A1) in view of Ramakrishnan et al. (US 20240146871 A1, hereinafter Ramakrishnan).
Regarding claim 3, Kelly discloses the computer-implemented method of claim 1.
Kelly does not disclose “wherein the particular issue is associated with a hardware or software fault in a device operated by a user.”
However, Ramakrishnan does teach wherein the particular issue is associated with a hardware or software fault in a device operated by a user (Fig. 3, 312; Ramakrishnan).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.
Regarding claim 7, Kelly discloses the computer-implemented method of claim 1.
Kelly does not disclose “wherein the predicted action associated with the particular object comprises a maintenance action or a repair action configured to restore operability in the particular action.”
However, Ramakrishnan does teach wherein the predicted action associated with the particular object comprises a maintenance action or a repair action configured to restore operability in the particular action (Fig. 5, 502, 504, 508; Ramakrishnan).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.
Regarding claim 8, Kelly discloses a system comprising: extracting one or more video frames from video streams of a set of communication sessions (“a sequence of these per-frame features” Kelly [0019]), wherein the video frames include a representation of an object (Fig. 5, 53; Kelly), and wherein the object is associated with an issue for which the communication session is established (“sign language information” Kelly [0019]); defining a training dataset from the one or more video frames and features extracted from the set of communication sessions (Fig. 3; Kelly); training a neural network using the training dataset, the neural network being configured to generate predictions of actions associated with the object (Fig. 4, 407-408; Kelly); extracting a video frame from a new video stream of a new communication session, wherein the video frame includes a representation of a particular object, and wherein the object is associated with a particular issue (“In real time, or after the signing is completed, the sign language information is sent to 12, which extracts out features (e.g. body pose keypoints, hand keypoints, hand pose, thresholded image, etc. . . . )” Kelly [0019]); executing the neural network using the video frame from the new video stream, wherein the neural network generates a predicted action associated with the particular object (“The features produced by 12 are then transmitted to component 13 which extracts sign language information (e.g. detecting if an individual is signing, transcribing that signing into gloss, or translating that signing into a target language) from a sequence of these per-frame features.” Kelly [0019]); and facilitating a transmission of a communication to a device of the new communication session, the communication including a representation of the predicted action (“Finally, the output is displayed on 14.” Kelly [0019]; Fig. 1, 14; Kelly).
Kelly does not expressly teach “one or more processors; and a non-transitory machine-readable storage medium storing instructions that when executed by the one or more processors, cause the one or more processors to perform operations including.”
However, Ramakrishnan does teach one or more processors (Fig. 1, 101; Ramakrishnan); and a non-transitory machine-readable storage medium storing instructions that when executed by the one or more processors, cause the one or more processors to perform operations including (Fig. 1, 102-103; Ramakrishnan).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.
Regarding claim 9, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the new communication session is between a user device (Fig. 1, 11; Kelly) and a terminal device (Fig. 1, 14; Kelly).
Regarding claim 10, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the particular issue is associated with a hardware or software fault in a device operated by a user (Fig. 3, 312; Ramakrishnan).
Regarding claim 11, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is an ensemble network comprising two or more neural networks configured to generate outputs of different types (“In the processing module 202, the feature train is split into each individual sign via the sign-splitting component 209 via a 1D Convolutional Neural Network which highlights the sign transition periods.” Kelly [0028]).
Regarding claim 12, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).
Regarding claim 13, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is configured to generate a predicted identification of the object (Fig. 4, 407-408; Kelly).
Regarding claim 14, Kelly, in view of Ramakrishnan, discloses the system of claim 8, wherein the neural network is configured to generate a predicted identification of the object (Fig. 4, 407-408; Kelly).
Regarding claim 15, Kelly discloses extracting one or more video frames from video streams of a set of communication sessions (“a sequence of these per-frame features” Kelly [0019]), wherein the video frames include a representation of an object (Fig. 5, 53; Kelly), and wherein the object is associated with an issue for which the communication session is established (“sign language information” Kelly [0019]); defining a training dataset from the one or more video frames and features extracted from the set of communication sessions (Fig. 3; Kelly); training a neural network using the training dataset, the neural network being configured to generate predictions of actions associated with the object (Fig. 4, 407-408; Kelly); extracting a video frame from a new video stream of a new communication session, wherein the video frame includes a representation of a particular object, and wherein the object is associated with a particular issue (“In real time, or after the signing is completed, the sign language information is sent to 12, which extracts out features (e.g. body pose keypoints, hand keypoints, hand pose, thresholded image, etc. . . . )” Kelly [0019]); executing the neural network using the video frame from the new video stream, wherein the neural network generates a predicted action associated with the particular object (“The features produced by 12 are then transmitted to component 13 which extracts sign language information (e.g. detecting if an individual is signing, transcribing that signing into gloss, or translating that signing into a target language) from a sequence of these per-frame features.” Kelly [0019]); and facilitating a transmission of a communication to a device of the new communication session, the communication including a representation of the predicted action (“Finally, the output is displayed on 14.” Kelly [0019]; Fig. 1, 14; Kelly).
Kelly does not expressly teach “a non-transitory machine-readable storage medium storing instructions that when executed by one or more processors, cause the one or more processors to perform operations including.”
However, Ramakrishnan does teach a non-transitory machine-readable storage medium storing instructions that when executed by one or more processors, cause the one or more processors to perform operations including (Fig. 1, 186-187; Ramakrishnan).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the sign recognition application (as taught by Kelly) with the greenhouse emission advising application (as taught by Ramakrishnan). The rationale to do so is to combine prior art elements according to known methods to yield the predictable result of utilizing sign language to communicate an issue with the software or hardware in a device in a video conference environment.
Regarding claim 16, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the new communication session is between a user device (Fig. 1, 11; Kelly) and a terminal device (Fig. 1, 14; Kelly).
Regarding claim 17, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the particular issue is associated with a hardware or software fault in a device operated by a user (Fig. 3, 312; Ramakrishnan).
Regarding claim 18, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the neural network is an ensemble network comprising two or more neural networks configured to generate outputs of different types (“In the processing module 202, the feature train is split into each individual sign via the sign-splitting component 209 via a 1D Convolutional Neural Network which highlights the sign transition periods.” Kelly [0028]).
Regarding claim 19, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).
Regarding claim 20, Kelly, in view of Ramakrishnan, discloses the non-transitory machine-readable storage medium of claim 15, wherein the neural network is configured to generate a boundary box over the object (“These results are combined to find the bounding box of both the dominant and non-dominant hand by iterating through all bounding boxes found from 205 and finding the one closest to each wrist joint produced by 206.” Kelly [0023]).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAAD AHMED SYED whose telephone number is (571) 272-6777. The examiner can normally be reached Monday - Friday 8:30 am - 5:00 pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Duc Nguyen, can be reached at (571) 272-7503. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SAAD AHMED SYED/Examiner, Art Unit 2691
/DUC NGUYEN/Supervisory Patent Examiner, Art Unit 2691