Last updated: April 19, 2026

Application No. 18/649,563

MERGING OF SUPPLEMENTATIVE LIVE AUDIO TRANSFORMATION

Non-Final OA §103 §112

Filed

Apr 29, 2024

Examiner

MATAR, AHMAD

Art Unit

2693

Tech Center

2600 — Communications

Assignee

Nagish Inc.

OA Round

1 (Non-Final)

Interview Optional

+11.5% interview lift. This examiner's allow rate (38%) is below the Tech Center average, but the interview lift is moderate; a written response may suffice.

Based on 13 resolved cases, 2023–2026

Examiner Intelligence

MATAR, AHMAD

Grants only 38% of cases

Career Allow Rate

5 granted / 13 resolved

-23.5% vs TC avg

Moderate +12% lift

Without

With

+11.5%

Interview Lift

resolved cases with interview

Typical timeline

2y 7m

Avg Prosecution

6 currently pending

Career history

Total Applications

across all art units

Statute-Specific Performance

§101

3.9%

-36.1% vs TC avg

§103

46.2%

+6.2% vs TC avg

§102

23.1%

-16.9% vs TC avg

§112

23.1%

-16.9% vs TC avg

Tech Center average is an estimate. Based on career data from 13 resolved cases.
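The deltas in the panel above can be sanity-checked with a little arithmetic. The sketch below assumes each "vs TC avg" figure is a plain percentage-point difference between the examiner's rate and the Tech Center estimate; the page does not state how the deltas are computed, so the function name and that interpretation are assumptions.

```python
# Figures copied from the Statute-Specific Performance panel:
# statute -> (examiner rate in %, delta vs TC avg in percentage points).
stats = {
    "§101": (3.9, -36.1),
    "§103": (46.2, 6.2),
    "§102": (23.1, -16.9),
    "§112": (23.1, -16.9),
}

def implied_tc_average(rate, delta):
    """Back out the Tech Center estimate, assuming delta = rate - tc_average."""
    return round(rate - delta, 1)

implied = {statute: implied_tc_average(rate, delta)
           for statute, (rate, delta) in stats.items()}

for statute, tc_avg in implied.items():
    print(f"{statute}: implied TC average {tc_avg}%")
```

Under that assumption, every statute backs out to the same Tech Center estimate, which suggests the panel compares all four rates against a single baseline rather than a per-statute one.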

Office Action

§103 §112

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement
The information disclosure statement (IDS) submitted on 4/29/2024 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the references in the IDS have been considered by the examiner EXCEPT for 2019171716 and 2006026001. It appears that applicant has provided incorrect or incomplete numbers for these two references.

Drawings
New corrected drawings in compliance with 37 CFR 1.121(d) are required in this application because:
In Fig. 3, the “Yes” and “No” lines after decision “315” overlap and create an unclear path.
Auxiliary stream 130 (mentioned in PP 22) is not shown in Fig. 1.
In Figs. 1 – 3, the use of the terms “translator” and “interpreter” interchangeably is confusing. See the objection to the specification. The drawings need to be consistent with the terminology in the specification.
The abstract and the disclosure (e.g., PP 7 and PP 13) state “transmitting the captured audio to the live audio interpreter while concurrently delivering the captured audio and the alternative form to at least one of the different parties [hard of hearing] of the conference”, but Fig. 1 shows only the “captioning” of the audio being transmitted to the hard of hearing.
Applicant is advised to employ the services of a competent patent draftsperson outside the Office, as the U.S. Patent and Trademark Office no longer prepares new drawings. The corrected drawings are required in reply to the Office action to avoid abandonment of the application. The requirement for corrected drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities:
PP 22, “selectedbased” should be --selected based--.
Appropriate correction is required.

Claim Objections
Claim 3 is objected to because of the following informality:
On line 1, “according criteria” should be --according to criteria--.
Appropriate correction is required.

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the enablement requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to enable one skilled in the art to which it pertains, or with which it is most nearly connected, to make and/or use the invention.
In engineering and product documentation the terms “translator” and “interpreter” have different and widely accepted meanings. A translator is normally associated with “written text” and an interpreter is normally associated with real-time (live) spoken or sign language. Using the terms interchangeably in the specification brings confusion, especially since the specification has live sign language interpretation and captioning/text.  It is unclear if the “translator” and “interpreter” are the same or different conference participants and it is unclear if they have different tasks. For example, it is not clear if the claimed “live audio interpreter” is the same as the disclosed “live translator”.

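The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION. The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph: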
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 1 – 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
In claim 1, lines 11-12, it is not clear what is meant by “delivering … to at least one of the different parties of the conference”. Is this the “hard of hearing” conference participant or any arbitrary “party of the conference”? Note that “at least one” may imply delivering to a few or even all of the “different parties”. It is noted that the “hard of hearing” is only recited in the preamble and that none of the limitations is tied to the “hard of hearing”.
Also, it is not clear whether the “displaying” (line 5) is provided to “the different parties of the conference” or just to the “hard of hearing” conference participant.
The use of the term “live audio interpreter” in association with the “hard of hearing” is confusing since a “live audio interpreter” is generally understood in the art to provide “audio” interpretation (e.g., from language to language).  Based on the claimed limitations such as “hard of hearing” and “merging the view ….”, it appears that the claim should recite “live sign language interpreter”. 
Independent claims 7 and 13 are rejected for the same reasons discussed above for claim 1.  Dependent claims 2-6, 8-12 and 14-18 are rejected for being dependent upon a rejected claim. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 7 and 13, as best understood, are rejected under 35 U.S.C. 103 as being unpatentable over Stoker et al., hereinafter “Stoker” (US 20200321007 A1), in view of Ghulman (US 20120078628 A1).
Regarding claim 1, Stoker discloses a method for the supplementation of an alternative form of audio (text transcription) in a communicative conference (video conferencing) for the benefit of the hard of hearing (hearing-impaired person); see the abstract and PP 0085, which teaches the use of a live American Sign Language (ASL) interpreter for “supplementing the transcription capabilities of the system”.
The method taught by Stoker comprises:
capturing audio in a communicative conference between different parties to the conference; see real-time audio transcription, abstract, PP 0016;
transforming captured audio into an alternative form (text transcription) and displaying the alternative form in connection with the communicative conference; see PP 0019, “transcribed audio messages” 107 in Fig. 6 and transcription bar 210 in Fig. 7;
transmitting the captured audio to the live audio interpreter; see PP 0085, which recites the use of a live American Sign Language (ASL) interpreter for one or more users. Inherently, the audio is transmitted to the interpreter.
While Stoker teaches the use of “alternative form of audio” (text transcription), teaches the use of a device such as a personal computer with user interface 101 (Fig. 2) for viewing the conference participants 102, presentation 104 and transcribed audio messages 107, and teaches the use of an ASL interpreter for supplementation of the transcription, it does not explicitly teach “invoking a companion window to the communicative conference with a view to a live audio interpreter; and, merging the view to the live audio interpreter with the communicative conference and the alternative form”. 
On one hand, since Stoker teaches the use of a sign language (ASL) interpreter, it is inherent that the hearing-impaired user needs to see the ASL interpreter on the user interface 101, Fig. 2.  Thus, one of ordinary skill in the art would use one of the available displays such as 104 (if available) or even add a view for the sign language interpreter in order to supplement the transcription.  
On the other hand, Ghulman discloses a head-mounted text display for the hearing impaired and teaches [PP 0024] the memory 46 of controller 14 includes a database of video data representative of individual words, such as graphical depictions of sign language and teaches the use of speech-to-text conversion. The textual data signal and the corresponding video data are transmitted simultaneously to the receiver 18, and the textual data and the corresponding video images may then be displayed simultaneously to the user. In FIGS. 2A and 2B, a sign language display 40 is shown adjacent to the textual displays 30, 32. The graphical display 40 allows for simultaneous display of sign language with the textual display (“merging the view”). The user may selectively display only text, only the graphical display, or both simultaneously.
Thus, it would have been obvious to one of ordinary skill in the art before the filing date of the current application to utilize the feature taught by Ghulman in the conferencing system disclosed by Stoker so that the hearing-impaired user of Stoker would be able to “invoke a companion window” on interface 101, Fig. 6, to view the live audio interpreter simultaneously while reviewing the alternative form (transcribed audio messages 107). This will give the user versatility (to see both the sign language and the transcription) and it also improves accuracy, since some expressions or gestures may be more accurate in one form than in the other. Further, this provides a back-up if one of them has an issue or a problem (e.g., relying on the transcript if there was a temporary video problem with viewing the sign language interpreter).
As for “transmitting the captured audio to the live audio interpreter while concurrently delivering the captured audio and the alternative form to at least one of the different parties of the conference”, normally, by default audio is sent to all conference participants in Stoker and it would be up to the hearing-impaired person to turn the volume down or totally mute the audio.  
 
Claims 7 and 13 are rejected for the same reasons discussed above with respect to claim 1. Stoker discloses (PP 37) an on-line collaboration system 10 (Fig. 1), which comprises one or more processors 12 having stored thereon computer-executable instructions 16, and teaches the use of cloud service 26 (read as the claimed “supplementation module”).

Claims 2 – 5, 8 – 11 and 14 – 17 are rejected under 35 U.S.C. 103 as being unpatentable over Stoker in view of Ghulman as applied to claims 1, 7 and 13 above, and further in view of Sjoberg (US 10025776 B1).
The combination of Stoker and Ghulman does not specifically teach the claimed techniques for selecting the live audio [sign language] interpreter.

Regarding claims 2, 8 and 14, Sjoberg teaches that the user/customer selects a translator (interpreter) from a selection of translators who are available “now” (col. 9, lines 4 - 35). The system looks up "one or more translators that are qualified and available" to provide translation to the customer (col. 2, lines 40-54, and col. 6, line 56 – col. 7, line 8).

Regarding claims 3-5, 9-11 and 15-17, Sjoberg teaches (PP 39) that a translator may be selected based on language (demographic), "most qualified" status, or highest rating. After a customer selects the "parameters" such as language, etc., the system (PP 62) may populate a list of translators associated with the customer’s preferences (criteria specified by the one of the different parties). For example, the customer may select "French", "Fastest Translation" and "Jacue" with a 4.5-star average rating. The system looks up "one or more translators that are qualified and available" (“polling”) to provide translation to the customer (PP 12 and 30).
Sjoberg teaches that once the language has been identified (as indicated by the customer's translation request), the LTM system 100 may select one or more translators qualified to provide the translation.  Selection of one or more translators may be based on other factors including an estimated or predicted response time (e.g. fastest response time), availability of the one or more translators, and quality ratings.
For claims 2 – 5, 8 – 11 and 14 – 17, Sjoberg teaches the details related to the “selection” of a translator (interpreter) from among available and qualified translators. It would have been obvious to one of ordinary skill in the art to use these selection techniques in the combination of Stoker in view of Ghulman so that the user may utilize these beneficial selection techniques while deciding which interpreter to select. These obvious selection techniques are actually needed since, for example, if the conference meeting is in Spanish, obviously a Spanish-speaking sign language interpreter will be needed, or if the conference is about “medical issues”, obviously an interpreter with medical expertise would be needed.
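The selection logic the examiner attributes to Sjoberg (filter by language, availability and expertise, then prefer higher ratings) can be illustrated with a minimal sketch. This is not Sjoberg's implementation or the applicant's; the `Interpreter` class, the `select_interpreters` function and the sample pool are hypothetical names and data introduced only for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Interpreter:
    # Hypothetical record; field names are illustrative, not from any reference.
    name: str
    languages: set
    specialties: set = field(default_factory=set)
    available: bool = True
    rating: float = 0.0

def select_interpreters(pool, language, specialty=None, min_rating=0.0):
    """Return available interpreters matching the criteria, best-rated first."""
    matches = [
        person for person in pool
        if person.available
        and language in person.languages
        and person.rating >= min_rating
        and (specialty is None or specialty in person.specialties)
    ]
    return sorted(matches, key=lambda person: person.rating, reverse=True)

# Made-up sample pool for the Spanish / medical example in the rejection.
pool = [
    Interpreter("A", {"Spanish"}, {"medical"}, True, 4.5),
    Interpreter("B", {"Spanish"}, set(), True, 4.8),
    Interpreter("C", {"French"}, {"medical"}, True, 4.9),
    Interpreter("D", {"Spanish"}, {"medical"}, False, 5.0),
]

medical_spanish = select_interpreters(pool, "Spanish", specialty="medical")
any_spanish = select_interpreters(pool, "Spanish")
print([p.name for p in medical_spanish])
print([p.name for p in any_spanish])
```

In this sketch availability acts as a hard filter and rating only orders the remaining candidates; a response-time criterion such as "fastest translation" could be added as another sort key.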

Claims 6, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Stoker in view of Ghulman as applied to claims 1, 7 and 13 above, and further in view of Legatski (US 20230353819).

Regarding claims 6, 12 and 18, Legatski teaches (PP 22 and 47) receiving a request from a Hard of Hearing participant (“the at least one of the different parties of the conference”) to be presented with a sign language interpreter (“SLI”) view within the user interface (UI), and presenting an SLI view within the UI of a client device associated with the first participant. Legatski also teaches [PP 60] that other individuals who do not require interpretation may view a different generated recording where the Sign Language Interpreter (SLI) is not visible (“obscuring the view to the live audio interpreter with respect to the remaining ones of the different parties to the conference”).
Thus, it would have been obvious to one of ordinary skill in the art before the filing date of the current application to utilize the teachings of Legatski in the combination of Stoker in view of Ghulman so that the view of the sign language interpreter is presented to the hearing-impaired participant and not presented to the other participants, since the other participants do not need the sign language interpreter, which could possibly distract them. Also, presenting the view of the sign language interpreter to those who do not need it (e.g., all participants) may unnecessarily use up some of the available bandwidth.
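The per-participant behavior described in this rejection (interpreter view shown only to the participant who requested it, hidden from everyone else) can be sketched as a simple layout function. The function name `build_layout`, the pane labels and the participant IDs are hypothetical and do not come from Stoker, Ghulman, Legatski or the application.

```python
def build_layout(participant_id, sli_requesters, captions_enabled=True):
    """Return the ordered list of panes shown to one participant.

    sli_requesters is the set of participant IDs that asked for a sign
    language interpreter (SLI) view; all names here are illustrative.
    """
    panes = ["conference"]
    if captions_enabled:
        panes.append("transcript")
    # Only requesting participants get the interpreter pane merged in;
    # for everyone else the interpreter view is simply omitted.
    if participant_id in sli_requesters:
        panes.append("interpreter")
    return panes

requesters = {"user-1"}
layout_requester = build_layout("user-1", requesters)
layout_other = build_layout("user-2", requesters)
print(layout_requester)
print(layout_other)
```

Omitting the pane for non-requesters, rather than rendering and hiding it, is one way to reflect the bandwidth rationale the examiner gives, since the interpreter video would not need to be sent to those clients at all.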

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
Oh et al, 20150046148, [0058] The controller 110 converts the first caption data into second caption data by adding the sign language animation to the first caption data. When video data is displayed on the screen 190, the controller 110 controls the screen 190 to display the second caption data including the sign language animation together. In other words, the controller 110 controls the screen 190 to simultaneously display the first caption data and the sign language data.

Ayanoglu, 20240163124 A1, [PP 0060] teaches that the mapping data structure of 401 can define a number of profiles 430.  For instance, K can be automatically selected as an assistant for users having a prerequisite indicating need for a Spanish translator, a mediator or an administrator;  L can be automatically selected as an assistant for users having a prerequisite indicating need for a French translator or a Spanish translator; and A can be automatically selected as an assistant for users having a prerequisite indicating need for a sign language interpreter.

Sanders et al (US 8,244,222) teaches a method in a data processing system for matching end users with providers of translation and interpretation services, based on a plurality of factors, predefined or preset by the end users, comprising the steps of: receiving an end user request for translation or interpretation services over an Internet-enabled global communications network or mobile and wireless network; defining a data set consisting of information from the group of: native language, dialect, culture, location, gender, age, education level, availability, rating range, experience, reviews, subject matter expertise, the desired need, including but not limited to medical, business, academic, legal, and personal reasons (see claim 1). 

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AHMAD F. MATAR whose telephone number is (571)272-7488. The examiner can normally be reached M-F 9 - 5:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AHMAD F. MATAR/, Supervisory Patent Examiner, Art Unit 2693

Prosecution Timeline

Apr 29, 2024

Application Filed

Nov 15, 2025

Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/281,130

Patent 12574458

METHOD FOR TRANSMITTING CALL AUDIO DATA AND APPARATUS

2y 5m to grant Granted Mar 10, 2026

18/380,314

Patent 12563143

Pre-Authentication for Interactive Voice Response System

2y 5m to grant Granted Feb 24, 2026

18/525,494

Patent 12549669

System and method to evaluate microservices integrated in Interactive Voice Response (IVR) operations

2y 5m to grant Granted Feb 10, 2026

18/146,616

Patent 12462816

AUDIO ENCODING METHOD AND CODING DEVICE

2y 5m to grant Granted Nov 04, 2025

13/468,030

Patent 9137370

Call center input/output agent utilization arbitration system

2y 5m to grant Granted Sep 15, 2015

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

38%

Grant Probability

50%

With Interview (+11.5%)

2y 7m

Median Time to Grant

Low

PTA Risk

Based on 13 resolved cases by this examiner. Grant probability derived from career allow rate.
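The projection figures can be reproduced from the examiner's career numbers with a few lines of arithmetic. The sketch below assumes the "With Interview" figure is simply the career allow rate plus the 11.5-point interview lift; the page does not spell out its formula, so that additive model and the variable names are assumptions.

```python
# Career figures shown on this page.
granted = 5
resolved = 13
interview_lift_points = 11.5  # percentage points, from the Interview Lift panel

# Career allow rate as a percentage.
allow_rate = 100 * granted / resolved

# Assumed model: the interview lift is added directly to the allow rate.
with_interview = allow_rate + interview_lift_points

print(f"Grant probability: {allow_rate:.0f}%")
print(f"With interview: {with_interview:.0f}%")
```

With only 13 resolved cases, each additional grant or abandonment moves the allow rate by several points, so these projections are best read as rough indicators.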