Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This is responsive to applicant’s amendment filed on 4/20/2026. Claims 1, 3, 7, 12 and 13 have been amended. No claims have been added and no claims have been canceled. Claims 1 – 18 are pending. Claims 1, 7 and 13 are independent.
The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.
Drawings
The drawings were received on 4/20/2026. These drawings are approved.
Specification
The substitute specification filed 4/20/2026 has not been entered.
Claim Rejections - 35 USC § 103
Claims 1, 7 and 13, as best understood, are rejected under 35 U.S.C. 103 as being unpatentable over Stoker et al., hereinafter “Stoker” (US 20200321007 A1), in view of Ghulman (US 20120078628 A1).
Regarding claim 1, Stoker discloses a method for the supplementation of an alternative form of audio (text transcription) in a communicative conference (video conferencing) for the benefit of the hard of hearing (hearing-impaired person); see the abstract and PP 0085, which teaches the use of a live American Sign Language (ASL) interpreter for “supplementing the transcription capabilities of the system”.
The method taught by Stoker comprises:
capturing audio in a communicative conference between different parties to the conference; see real-time audio transcription, abstract, PP 0016
transforming the captured audio into an alternative form (text transcription), see PP 0019, “transcribed audio messages 107” in Fig. 6 and transcription bar 210 in Fig. 7;
transmitting the captured audio to the live audio interpreter, see PP 0085 which recites the use of live American Sign Language (ASL) interpreter for one or more users. Inherently, the audio is transmitted to the interpreter.
transmitting the captured audio to the live audio interpreter (inherent and needed) while concurrently delivering the captured audio and the alternative form to at least one of the different parties of the conference who are not hard of hearing. Stoker teaches that transcribed text is transmitted to a user interface for viewing by hearing-impaired persons, and it further contemplates delivery of the transcription (alternative form) to multiple users/devices participating in the same event.
Stoker teaches [PP 0035] that the disclosure relates generally to audio transcription, video conferencing, and online collaboration systems … Embodiments of systems described herein provide real-time display of audio-to-text transcription in a lecture setting, such as a lecture from a professor in a classroom. It is beneficial to make the text transcript available to all participating students in settings such as university lectures. Stoker also teaches [PP 0053] that “multiple users can log in to an event and receive transcribed text from a single microphone.”
While Stoker teaches the use of an “alternative form of audio” (text transcription), teaches the use of a device such as a personal computer with user interface 101 (Fig. 2) for viewing the conference participants 102, presentation 104 and transcribed audio messages 107, and teaches the use of an ASL interpreter for supplementation of the transcription, it does not explicitly teach:
“invoking a companion window to the communicative conference with a view to a live audio interpreter for viewing by a hard of hearing one of the different parties to the conference; and
merging the view to the live audio interpreter with the communicative conference and the alternative form, ……. delivering the alternative form and the view to the live interpreter to the hard of hearing one of the different parties in place of the captured audio.”
That is, the hard of hearing will receive the view of the interpreter and alternative form (text), but not the captured audio.
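For illustration only, the delivery arrangement described in the quoted limitation can be sketched as follows. This is a minimal sketch under stated assumptions: the names `Participant`, `streams_for`, and the stream labels are hypothetical and do not appear in the claims, in Stoker, or in Ghulman.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Hypothetical conference party; not taken from any cited reference."""
    name: str
    hard_of_hearing: bool


def streams_for(participant: Participant) -> set[str]:
    """Return the set of streams delivered to one party.

    Assumed labels: "audio" is the captured audio, "transcript" is the
    alternative form, "interpreter_view" is the companion window.
    """
    if participant.hard_of_hearing:
        # Alternative form and interpreter view in place of the captured audio.
        return {"transcript", "interpreter_view"}
    # Remaining parties receive the captured audio and the alternative form.
    return {"audio", "transcript"}


parties = [Participant("A", hard_of_hearing=True),
           Participant("B", hard_of_hearing=False)]
routing = {p.name: streams_for(p) for p in parties}
```

In this sketch the hard of hearing party never receives the "audio" stream, which is the distinction at issue.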
On one hand, since Stoker teaches the use of a sign language (ASL) interpreter, it is inherent that the hearing-impaired user needs to see the ASL interpreter on the user interface 101, Fig. 2. Thus, one of ordinary skill in the art would use one of the available displays such as 104 (if available) or even add a view for the sign language interpreter in order to supplement the transcription.
On the other hand, Ghulman discloses a head-mounted text display for the hearing impaired and teaches [PP 0024] that the memory 46 of controller 14 includes a database of video data representative of individual words, such as graphical depictions of sign language, and teaches the use of speech-to-text conversion. The textual data signal and the corresponding video data are transmitted simultaneously to the receiver 18, and the textual data and the corresponding video images may then be displayed simultaneously (“merge the view”) to the user. In FIGS. 2A and 2B, a sign language display 40 is shown adjacent to the textual displays 30, 32. The graphical display 40 allows for simultaneous display of sign language with the textual display (“merging the view”). The user may selectively display only text, only the graphical display, or both simultaneously. In Ghulman, the audio is converted to text (alternative form in place of the captured audio). The captured audio is NOT played to the user in Ghulman.
Thus, it would have been obvious to one of ordinary skill in the art before the filing date of the current application to utilize the features taught by Ghulman in the conferencing system disclosed by Stoker so that the hearing-impaired user of Stoker would be able to “invoke a companion window” on interface 101, Fig. 6, to view the live audio interpreter while simultaneously reviewing the alternative form (transcribed audio messages 107). This will give the user versatility (to see both the sign language and the transcription), and it also improves accuracy since some expressions or gestures may be more accurate in one form than in the other. Further, this provides a back-up if one of them has an issue or a problem (e.g., the user may rely on the transcript if there is a temporary video problem with viewing the sign language interpreter).
It is noted that Ghulman does not send the captured audio to the hearing-impaired user. Advantages of not sending the audio include saving resources/bandwidth and providing added privacy, since a hearing-impaired person may generally need to turn the volume up considerably if audio is available, which is unnecessary because the video of the interpreter and the text are both available (no need for audio). Also, turning the volume up may be annoying to others who are in the same place/room as the hearing-impaired person but are not participating in the conference.
Claims 7 and 13 are rejected for the same reasons discussed above with respect to claim 1. Stoker discloses (PP 37) an on-line collaboration system 10 (Fig. 1) which comprises one or more processors 12 having stored thereon computer-executable instructions 16, and teaches the use of cloud service 26 (read as the claimed “supplementation module”).
Claims 2 – 5, 8 – 11 and 14 – 17 are rejected under 35 U.S.C. 103 as being unpatentable over Stoker in view of Ghulman as applied to claims 1, 7 and 13 above, and further in view of Sjoberg (US 10025776 B1).
The combination of Stoker and Ghulman does not specifically teach the claimed techniques for selecting the live audio [sign language] interpreter.
Regarding claims 2, 8 and 14, Sjoberg teaches that the user/customer selects a translator (interpreter) from a selection of available translators who are available “now” (col. 9, lines 4 - 35). The system looks up "one or more translators that are qualified and available" to provide translation to the customer (col. 2, lines 40-54, and col. 6, line 56 – col. 7, line 8).
Regarding claims 3-5, 9-11 and 15-17, Sjoberg teaches (PP 39) that a translator may be selected based on language (demographic), "most qualified", or highest rating. After a customer selects the "parameters" such as language, etc., the system (PP 62) may populate a list of translators associated with the customer’s preference (criteria specified by the one of the different parties). For example, the customer may select "French", "Fastest Translation" and "Jacue with a 4.5-star average rating". The system looks up "one or more translators that are qualified and available" (“polling”) to provide translation to the customer (PP 12 and 30).
Sjoberg teaches that once the language has been identified (as indicated by the customer's translation request), the LTM system 100 may select one or more translators qualified to provide the translation. Selection of one or more translators may be based on other factors including an estimated or predicted response time (e.g. fastest response time), availability of the one or more translators, and quality ratings.
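For illustration only, selection based on the factors listed above (language, availability, quality rating, and predicted response time) can be sketched as follows. This is a minimal sketch, not Sjoberg's implementation: the `Translator` record, the `select_translators` function, and the ordering (highest rating first, then fastest response) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translator:
    """Hypothetical record; field names are assumptions for illustration."""
    name: str
    languages: frozenset
    available: bool
    rating: float            # average quality rating, higher is better
    response_minutes: float  # predicted response time, lower is better


def select_translators(pool, language, limit=3):
    """Filter by availability and language, then rank the candidates."""
    qualified = [t for t in pool if t.available and language in t.languages]
    # Assumed ordering: highest rating first, ties broken by fastest response.
    ranked = sorted(qualified, key=lambda t: (-t.rating, t.response_minutes))
    return ranked[:limit]


pool = [
    Translator("T1", frozenset({"French"}), True, 4.5, 10.0),
    Translator("T2", frozenset({"French", "Spanish"}), True, 4.5, 5.0),
    Translator("T3", frozenset({"French"}), False, 5.0, 1.0),
    Translator("T4", frozenset({"Spanish"}), True, 4.9, 2.0),
]
shortlist = [t.name for t in select_translators(pool, "French")]
```

Here the unavailable translator and the translator without the requested language are excluded before ranking.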
For claims 2 – 5, 8 – 11 and 14 – 17, Sjoberg teaches the details related to the “selection” of a translator (interpreter) from among available and qualified translators. It would have been obvious to one of ordinary skill in the art to use these selection techniques in the combination of Stoker in view of Ghulman so that the user may utilize these beneficial selection techniques while deciding which interpreter to select. These obvious selection techniques are actually needed since, for example, if the conference meeting is in Spanish, a Spanish-speaking sign language interpreter will be needed, or if the conference is about “medical issues”, an interpreter with medical expertise would be needed.
Claims 6, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Stoker in view of Ghulman as applied to claims 1, 7 and 13 above, and further in view of Legatski (US 20230353819).
Regarding claims 6, 12 and 18, Legatski teaches (PP 22 and 47) receiving a request from a hard of hearing participant (“the at least one of the different parties of the conference”) to be presented with a sign language interpreter (“SLI”) view within the user interface (UI) and presenting, within the UI of a client device associated with the first participant, an SLI view. Legatski also teaches [PP 60] that other individuals who do not require interpretation may view a different generated recording where the sign language interpreter (SLI) is not visible (“obscuring the view to the live audio interpreter with respect to the remaining ones of the different parties to the conference”).
Thus, it would have been obvious to one of ordinary skill in the art before the filing date of the current application to utilize the teachings of Legatski in the combination of Stoker in view of Ghulman so that the view of the sign language interpreter is presented to the hearing-impaired participant and not presented to the other participants, since the other participants do not need the sign language interpreter, which could possibly distract them. Also, presenting the view of the sign language interpreter to those who do not need it (e.g., all participants) may unnecessarily use up some of the available bandwidth.
Response to Arguments
Applicant's arguments filed on 4/20/2026 have been fully considered but they are not persuasive.
Applicant’s arguments have been addressed in the above rejections.
To summarize, as discussed above, Stoker discloses real-time audio transcription and video conferencing, and facilitates the distribution and streaming of a live American Sign Language (“ASL”) interpreter to one or more users, supplementing the transcription capabilities of the system.
Ghulman discloses a system in which spoken words are converted into text and corresponding video data representative of sign language may be transmitted and displayed simultaneously.
Applicant argues (page 28) that “the captured audio is not transmitted to the hard of hearing one of the participants. Instead, only the alternative form of the captured audio and a view to the sign language interpreter is provided to the hard of hearing conference participant while the non-hard of hearing participants receive both the captured audio and the alternative form of the captured audio.” First, the above rejection addresses applicant’s arguments. Second, Ghulman teaches that the hearing-impaired user receives the transcript and the video of the sign language interpreter (no audio). As discussed above, sending audio to the hearing-impaired user is not needed and is disadvantageous. In the combination of Stoker and Ghulman, the hearing-impaired user does not receive audio.
Once the references teach a conference system with real-time transcription and live interpreter supplementation, routing conference audio and available text/visual outputs according to participants’ needs would have been an obvious design choice within the level of ordinary skill in the art. The claimed arrangement is simply a predictable implementation of known accessibility features and does not rise to the level of patentable distinction. That is, a hearing-impaired user needs the transcript and/or the video of the sign language interpreter (both according to Ghulman), and other participants can benefit from the transcript in addition to the audio (according to Stoker).
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHMAD F. MATAR whose telephone number is (571)272-7488. The examiner can normally be reached M-F 9 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AHMAD F. MATAR/Supervisory Patent Examiner, Art Unit 2693