Prosecution Insights
Last updated: May 29, 2026
Application No. 18/649,563

MERGING OF SUPPLEMENTATIVE LIVE AUDIO TRANSFORMATION

Final Rejection §103
Filed
Apr 29, 2024
Examiner
MATAR, AHMAD
Art Unit
2693
Tech Center
2600 — Communications
Assignee
Nagish Inc.
OA Round
2 (Final)
50%
Grant Probability
Moderate
3-4
OA Rounds
2y 0m
Est. Remaining
62%
With Interview

Examiner Intelligence

Grants 50% of resolved cases
50%
Career Allowance Rate
8 granted / 16 resolved
-12.0% vs TC avg
+12.5%
Interview Lift (moderate)
among resolved cases with interview
Typical timeline
4y 1m
Avg Prosecution
5 currently pending
Career history
19
Total Applications
across all art units

Statute-Specific Performance

§103
85.3%
+45.3% vs TC avg
§102
5.9%
-34.1% vs TC avg
§112
8.8%
-31.2% vs TC avg
Tech Center averages are estimates. Based on career data from 16 resolved cases.
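As a quick consistency check on the panel above, the sketch below sums the three statute shares and backs out the Tech Center baseline each delta implies. It assumes the "vs TC avg" figures are percentage-point differences, which the page does not state; the variable names are illustrative only.

```python
# Figures copied from the Statute-Specific Performance panel.
# Assumption: each "vs TC avg" delta is expressed in percentage points.
shares = {"103": 85.3, "102": 5.9, "112": 8.8}
deltas = {"103": 45.3, "102": -34.1, "112": -31.2}

# The three shares should account for all of the examiner's rejections.
total_share = round(sum(shares.values()), 1)

# Baseline implied for each statute: examiner share minus the reported delta.
implied_tc_avg = {s: round(shares[s] - deltas[s], 1) for s in shares}

print(total_share)
print(implied_tc_avg)
```

Under that assumption, every statute resolves to the same baseline, which suggests the Tech Center average here is a single flat estimate rather than a per-statute measurement.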

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

This is responsive to applicant’s amendment filed on 4/20/2026. Claims 1, 3, 7, 12 and 13 have been amended. No claims have been added and no claims have been canceled. Claims 1 – 18 are pending. Claims 1, 7 and 13 are independent. The text of those sections of Title 35, U.S. Code not included in this action can be found in a prior Office action.

Drawings

The drawings were received on 4/20/2026. These drawings are approved.

Specification

The substitute specification filed 4/20/2026 has not been entered.

Claim Rejections - 35 USC § 103

Claims 1, 7 and 13, as best understood, are rejected under 35 U.S.C. 103 as being unpatentable over Stoker et al., hereinafter “Stoker” (US 20200321007 A1), in view of Ghulman (US 20120078628 A1).

Regarding claim 1, Stoker discloses a method for the supplementation of an alternative form of audio (text transcription) in a communicative conference (video conferencing) for the benefit of the hard of hearing (hearing-impaired person); see the abstract and PP 0085, which teaches the use of a live American Sign Language (ASL) interpreter for “supplementing the transcription capabilities of the system”. The method taught by Stoker comprises:

capturing audio in a communicative conference between different parties to the conference; see real-time audio transcription, abstract, PP 0016;

transforming the captured audio into an alternative form (text transcription); see PP 0019, transcribed audio messages 107 in Fig. 6 and transcription bar 210 in Fig. 7;

transmitting the captured audio to the live audio interpreter; see PP 0085, which recites the use of a live American Sign Language (ASL) interpreter for one or more users. Inherently, the audio is transmitted to the interpreter.
transmitting the captured audio to the live audio interpreter (inherent and needed) while concurrently delivering the captured audio and the alternative form to at least one of the different parties of the conference who are not hard of hearing. Stoker teaches that transcribed text is transmitted to a user interface for viewing by hearing-impaired persons, and it further contemplates delivery of the transcription (alternative form) to multiple users/devices participating in the same event. Stoker teaches [PP 0035] that the disclosure relates generally to audio transcription, video conferencing, and online collaboration systems… Embodiments of systems described herein provide real-time display of audio-to-text transcription in a lecture setting, such as a lecture from a professor in a classroom. In settings such as university lectures, it is beneficial to make the text transcript available to all participating students. Stoker also teaches [PP 0053] that “multiple users can log in to an event and receive transcribed text from a single microphone.”

While Stoker teaches the use of an “alternative form of audio” (text transcription), teaches the use of a device such as a personal computer with user interface 101 (Fig. 2) for viewing the conference participants 102, presentation 104 and transcribed audio messages 107, and teaches the use of an ASL interpreter for supplementation of the transcription, it does not explicitly teach: “invoking a companion window to the communicative conference with a view to a live audio interpreter for viewing by a hard of hearing one of the different parties to the conference; and merging the view to the live audio interpreter with the communicative conference and the alternative form, …
delivering the alternative form and the view to the live interpreter to the hard of hearing one of the different parties in place of the captured audio.” That is, the hard of hearing party will receive the view of the interpreter and the alternative form (text), but not the captured audio.

On one hand, since Stoker teaches the use of a sign language (ASL) interpreter, it is inherent that the hearing-impaired user needs to see the ASL interpreter on the user interface 101, Fig. 2. Thus, one of ordinary skill in the art would use one of the available displays such as 104 (if available) or even add a view for the sign language interpreter in order to supplement the transcription.

On the other hand, Ghulman discloses a head-mounted text display for the hearing impaired. Ghulman teaches [PP 0024] that the memory 46 of controller 14 includes a database of video data representative of individual words, such as graphical depictions of sign language, and teaches the use of speech-to-text conversion. The textual data signal and the corresponding video data are transmitted simultaneously to the receiver 18, and the textual data and the corresponding video images may then be displayed simultaneously (“merge the view”) to the user. In FIGS. 2A and 2B, a sign language display 40 is shown adjacent to the textual displays 30, 32. The graphical display 40 allows for simultaneous display of sign language with the textual display (“merging the view”). The user may selectively display only text, only the graphical display, or both simultaneously. In Ghulman, the audio is converted to text (alternative form in place of the captured audio).
The captured audio is NOT played to the user in Ghulman. Thus, it would have been obvious to one of ordinary skill in the art before the filing date of the current application to utilize the features taught by Ghulman in the conferencing system disclosed by Stoker so that the hearing-impaired user of Stoker would be able to “invoke a companion window” on interface 101, Fig. 6 to view the live audio interpreter simultaneously while reviewing the alternative form (transcribed audio messages 107). This will give the user versatility (to see both the sign language and the transcription), and it also improves accuracy since some expressions or gestures may be more accurate in one form versus the other. Further, this provides a back-up if one of them has an issue or a problem (e.g., relying on the transcript if there is a temporary video problem with viewing the sign language interpreter).

It is noted that Ghulman does not send the captured audio to the hearing-impaired user. Advantages of doing so include saving resources/bandwidth and also providing added privacy, since generally a hearing-impaired person may need to blast the volume up if audio is available, which is unnecessary since the video of the interpreter and the text are both available (no need for audio). Also, blasting the volume up by the hearing-impaired person may be annoying to others who are in the same place/room with the hearing-impaired person but not participating in the conference.

Claims 7 and 13 are rejected for the same reasons discussed above with respect to claim 1. Stoker discloses (PP 37) an on-line collaboration system 10 (Fig. 1) which comprises one or more processors 12 having stored thereon computer-executable instructions 16 and teaches the use of cloud service 26 (read as the claimed “supplementation module”).

Claims 2 – 5, 8 – 11 and 14 – 17 are rejected under 35 U.S.C.
103 as being unpatentable over Stoker in view of Ghulman as applied to claims 1, 7 and 13 above, and further in view of Sjoberg (US 10025776 B1). The combination of Stoker and Ghulman does not specifically teach the claimed techniques for selecting the live audio [sign language] interpreter.

Regarding claims 2, 8 and 14, Sjoberg teaches that the user/customer selects a translator (interpreter) from a selection of available translators who are available “now” (col. 9, lines 4 - 35). The system looks up “one or more translators that are qualified and available to provide translation to the customer” (col. 2, lines 40-54, and col. 6, line 56 – col. 7, line 8).

Regarding claims 3-5, 9-11 and 15-17, Sjoberg teaches (PP 39) that a translator may be selected based on language (demographic), “most qualified”, or highest rating. After a customer selects the “parameters” such as language, etc., the system (PP 62) may populate a list of translators associated with the customer’s preference (criteria specified by the one of the different parties). For example, the customer may select “French”, “Fastest Translation” and “Jacue with a 4.5-star average rating”. The system looks up “one or more translators that are qualified and available” (“polling”) to provide translation to the customer (PP 12 and 30). Sjoberg teaches that once the language has been identified (as indicated by the customer's translation request), the LTM system 100 may select one or more translators qualified to provide the translation. Selection of one or more translators may be based on other factors including an estimated or predicted response time (e.g., fastest response time), availability of the one or more translators, and quality ratings.

For claims 2 – 5, 8 – 11 and 14 – 17, Sjoberg teaches the details related to the “selection” of a translator (interpreter) from among available and qualified translators.
It would have been obvious to one of ordinary skill in the art to use these selection techniques in the combination of Stoker in view of Ghulman so that the user may utilize these beneficial selection techniques while deciding which interpreter to select. These obvious selection techniques are actually needed since, for example, if the conference meeting is in Spanish, a Spanish-speaking sign language interpreter will obviously be needed, or if the conference is about “medical issues”, an interpreter with medical expertise would obviously be needed.

Claims 6, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Stoker in view of Ghulman as applied to claims 1, 7 and 13 above, and further in view of Legatski (US 20230353819).

Regarding claims 6, 12 and 18, Legatski teaches (PP 22 and 47) receiving a request from a hard of hearing participant (“the at least one of the different parties of the conference”) to be presented with a sign language interpreter (“SLI”) view within the user interface (UI) and presenting, within the UI of a client device associated with the first participant, an SLI view. Legatski also teaches [PP 60] that other individuals who do not require interpretation may view a different generated recording where the Sign Language Interpreter (SLI) is not visible (“obscuring the view to the live audio interpreter with respect to the remaining ones of the different parties to the conference”).

Thus, it would have been obvious to one of ordinary skill in the art before the filing date of the current application to utilize the teachings of Legatski in the combination of Stoker in view of Ghulman so that the view of the sign language interpreter is presented to the hearing-impaired participant and not presented to the other participants, since the other participants do not need the sign language interpreter, which could possibly distract them.
Also, presenting the view of the sign language interpreter to those who do not need it (e.g., all participants) may unnecessarily use up some of the available bandwidth.

Response to Arguments

Applicant's arguments filed on 4/20/2026 have been fully considered but they are not persuasive. Applicant’s arguments have been addressed in the above rejections. To summarize, as discussed above, Stoker discloses real-time audio transcription and video conferencing, and facilitates the distribution and streaming of a live American Sign Language (‘ASL’) interpreter to one or more users, supplementing the transcription capabilities of the system. Ghulman discloses a system in which spoken words are converted into text and corresponding video data representative of sign language may be transmitted and displayed simultaneously.

Applicant argues (page 28) that “the captured audio is not transmitted to the hard of hearing one of the participants. Instead, only the alternative form of the captured audio and a view to the sign language interpreter is provided to the hard of hearing conference participant while the non-hard of hearing participants receive both the captured audio and the alternative form of the captured audio.”

First, the above rejection addresses applicant’s arguments. Second, Ghulman teaches that the hearing-impaired user receives the transcript and the video of the sign language interpreter (no audio). As discussed above, sending audio to the hearing-impaired user is not needed and is disadvantageous. In the combination of Stoker and Ghulman, the hearing-impaired user does not receive audio. Once the references teach a conference system with real-time transcription and live interpreter supplementation, routing conference audio and available text/visual outputs according to participants’ needs would have been an obvious design choice within the level of ordinary skill in the art.
The claimed arrangement is simply a predictable implementation of known accessibility features and does not rise to the level of patentable distinction. That is, a hearing-impaired user needs the transcript and/or the video of the sign language interpreter (both according to Ghulman), and other participants can benefit from the transcript, in addition to the audio (according to Stoker).

Conclusion

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHMAD F. MATAR, whose telephone number is (571)272-7488. The examiner can normally be reached M-F 9 - 5:30.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center.
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AHMAD F. MATAR/
Supervisory Patent Examiner, Art Unit 2693

Prosecution Timeline

Apr 29, 2024
Application Filed
Nov 19, 2025
Non-Final Rejection mailed — §103
Apr 20, 2026
Response Filed
May 12, 2026
Final Rejection mailed — §103 (current)
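The reply periods stated in the conclusion of the office action can be turned into calendar dates from the mailing date in this timeline. The sketch below is illustrative only: `add_months` is a hypothetical helper, and it assumes the May 12, 2026 entry is the official mailing date and does not model weekend or holiday adjustments or the advisory-action exception. The dates on the office action itself control.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's end."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Assumption: mailing date taken from the "Final Rejection mailed" timeline entry.
mailing_date = date(2026, 5, 12)

two_month_mark = add_months(mailing_date, 2)        # first-reply window for the advisory-action rule
shortened_period_end = add_months(mailing_date, 3)  # shortened statutory period
statutory_limit = add_months(mailing_date, 6)       # outer limit, reachable only with extensions

print("Two-month mark:      ", two_month_mark.isoformat())
print("Shortened period end:", shortened_period_end.isoformat())
print("Six-month limit:     ", statutory_limit.isoformat())
```

Replies filed after the shortened period but before the six-month limit would require an extension of time under 37 CFR 1.136(a), as the office action notes.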

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12574458
METHOD FOR TRANSMITTING CALL AUDIO DATA AND APPARATUS
2y 6m to grant Granted Mar 10, 2026
Patent 12563143
Pre-Authentication for Interactive Voice Response System
2y 4m to grant Granted Feb 24, 2026
Patent 12549669
System and method to evaluate microservices integrated in Interactive Voice Response (IVR) operations
2y 2m to grant Granted Feb 10, 2026
Patent 12462816
AUDIO ENCODING METHOD AND CODING DEVICE
2y 10m to grant Granted Nov 04, 2025
Patent 9137370
Call center input/output agent utilization arbitration system
3y 4m to grant Granted Sep 15, 2015
Based on 5 most recent grants.

Prosecution Projections

3-4
Expected OA Rounds
50%
Grant Probability
62%
With Interview (+12.5%)
4y 1m (~2y 0m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 16 resolved cases by this examiner. Grant probability derived from career allowance rate.
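The projections above can be reproduced from figures shown elsewhere on this page. The sketch below is one plausible reading, not the site's documented method: it assumes the interview lift is added to the allowance rate in percentage points, and that the remaining time is the examiner's average prosecution length minus the months elapsed between the filing date and the "Last updated" date.

```python
# Career record reported for this examiner.
granted, resolved = 8, 16
allowance_rate = granted / resolved  # shown on the page as 50%

# Assumption: the +12.5% interview lift is additive, in percentage points.
interview_lift = 0.125
with_interview = allowance_rate + interview_lift  # 62.5%; the page displays 62%

# Assumption: remaining time = average prosecution length - time elapsed so far.
avg_prosecution_months = 4 * 12 + 1  # "4y 1m"
filed = (2024, 4)                    # Apr 29, 2024
last_updated = (2026, 5)             # May 29, 2026
elapsed_months = (last_updated[0] - filed[0]) * 12 + (last_updated[1] - filed[1])
remaining_months = avg_prosecution_months - elapsed_months
years, months = divmod(remaining_months, 12)

print(f"Grant probability: {allowance_rate:.1%}")
print(f"With interview:    {with_interview:.1%}")
print(f"Est. remaining:    {years}y {months}m")
```

With only 16 resolved cases behind the 50% figure, each additional grant or abandonment would move the rate by several points, so these projections are best read as rough indicators.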
