DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 1/20/2026 has been entered. Claims 1 – 3, 5 – 19, and 21 – 22 are pending. Claims 4 and 20 have been cancelled.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1 – 3 and 5 – 19 are rejected under 35 U.S.C. 103 as being unpatentable over Cloran et al. (US 2010/0063815 A1) in view of Chang et al. (US 8,423,361 B1).
Re claims 1, 14:
1. Cloran teaches [A] caption device for use by a hard of hearing assisted user (AU) to assist the AU during voice communications with a hearing user (HU) using an HU device (Cloran, fig. 1; [0010], “a display”; [0022], “a computer-aided transcription (“CAT”) terminal”; [0017], “media server”), the caption device comprising:
a display screen; a memory; at least one communication link element for linking to a communication network; a speaker (Cloran, [0023]; [0010]);
a captioning actuator input feature for receiving a request to initiate captioning of an HU voice signal during a voice communication session between the caption device and the HU device (Cloran, [0010], “Each participant in the call 110, 120, and 130 has at least a voice connection 115, 125, and 135 to central system 140 through a telephone 114, 124, and 134”; Abstract, “multiple audio lines are used to provide a real-time transcript of a conference call”);
at least one processor programmed to perform the steps of:
(i) receiving an HU voice signal from the HU device during a call (Cloran, [0010], “Each participant in the call 110, 120, and 130 has at least a voice connection 115, 125, and 135 to central system 140 through a telephone 114, 124, and 134”; Abstract, “multiple audio lines are used to provide a real-time transcript of a conference call”);
(iii) broadcasting the HU voice signal to the AU via the speaker (Cloran, [0010], “Each participant in the call 110, 120, and 130 has at least a voice connection 115, 125, and 135 to central system 140 through a telephone 114, 124, and 134”; Abstract, “multiple audio lines are used to provide a real-time transcript of a conference call”);
(iv) receiving, prior to any captioning of the at least the portion of the HU voice signal that is stored in memory, a caption request via the captioning actuator (Cloran, pg. 6, claim 11, “11. The system of claim 10, wherein: the interpreted intent is to suspend transcription of audio from all audio streams; and the automatic response is to suspend transcription of audio from all audio streams until the system receives instruction from a user to resume”);
(v) upon receiving the caption request, retrieving the portion of the HU voice signal (Cloran, [0045]; [0032], “The automated transcription subsystem converts the analyst's speech into text”);
(vi) obtaining a text caption corresponding to the portion of the HU voice signal (Cloran, [0037], “the user interface of an update to the transcript text”; [0039], “requests a transcription update”; [0032], “The automated transcription subsystem converts the analyst's speech into text”); and
(vii) presenting the text caption to the AU via the display screen (Cloran, [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”; [0046], “Updates and corrections to the transcript text are, in some embodiments, sent to the user interface as soon as possible. In some of these embodiments, updates are provided on a per-word basis, while in others updates are sent with each keystroke of the analyst”).
Cloran teaches 14. A caption device for use by a hard of hearing assisted user (AU) to assist the AU during voice communications with a hearing user (HU) using an HU device (Cloran, fig. 1; [0010], “a display”; [0022], “a computer-aided transcription (“CAT”) terminal”; [0017], “media server”), the caption device comprising:
a display screen; a memory; at least one communication link element for linking to a communication network; a speaker (Cloran, [0023]; [0010]);
a captioning actuator input feature for receiving a request to initiate captioning of an HU voice signal during a voice communication session between the caption device and the HU device (Cloran, [0010], “Each participant in the call 110, 120, and 130 has at least a voice connection 115, 125, and 135 to central system 140 through a telephone 114, 124, and 134”; Abstract, “multiple audio lines are used to provide a real-time transcript of a conference call”);
at least one processor programmed to perform the steps of:
(i) receiving an HU voice signal from the HU device during a call (Cloran, [0010], “Each participant in the call 110, 120, and 130 has at least a voice connection 115, 125, and 135 to central system 140 through a telephone 114, 124, and 134”; Abstract, “multiple audio lines are used to provide a real-time transcript of a conference call”);
(iii) broadcasting the HU voice signal to the AU via the speaker (Cloran, [0010], “Each participant in the call 110, 120, and 130 has at least a voice connection 115, 125, and 135 to central system 140 through a telephone 114, 124, and 134”; Abstract, “multiple audio lines are used to provide a real-time transcript of a conference call”);
(iv) receiving, prior to any captioning of the at least the portion of the HU voice signal that is stored in memory, a caption request via the captioning actuator (Cloran, pg. 6, claim 11, “11. The system of claim 10, wherein: the interpreted intent is to suspend transcription of audio from all audio streams; and the automatic response is to suspend transcription of audio from all audio streams until the system receives instruction from a user to resume”);
(v) upon receiving the caption request, retrieving the portion of the HU voice signal from the memory (Cloran, [0045]; [0032], “The automated transcription subsystem converts the analyst's speech into text”);
(vi) using an automatic speech recognition (ASR) program to generate a text caption corresponding to the portion of the HU voice signal (Cloran, [0037], “the user interface of an update to the transcript text”; [0039], “requests a transcription update”; [0032], “The automated transcription subsystem converts the analyst's speech into text”); and
(vii) presenting the text caption to the AU via the display screen (Cloran, [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”; [0046], “Updates and corrections to the transcript text are, in some embodiments, sent to the user interface as soon as possible. In some of these embodiments, updates are provided on a per-word basis, while in others updates are sent with each keystroke of the analyst”).
Cloran does not explicitly disclose (ii) storing at least a most recent portion of the HU voice signal in the memory … (iv) receiving, prior to any captioning of the at least the most recent portion of the HU voice signal that is stored in memory, a caption request via the captioning actuator; (v) upon receiving the caption request, retrieving the most recent portion of the HU voice signal from the memory; (vi) running an automatic speech recognition (ASR) program to generate a text caption corresponding to the most recent portion of the HU voice signal.
Chang et al. (US 8,423,361 B1) teaches an invention related to multi-core processing for parallel speech-to-text processing. Chang further teaches the limitation: (ii) storing at least a most recent portion of the HU voice signal in the memory … (iv) receiving, prior to any captioning of the most recent portion of the HU voice signal that is stored in memory, a caption request via the captioning actuator; (v) upon receiving the caption request, retrieving the most recent portion of the HU voice signal from the memory; (vi) running an automatic speech recognition (ASR) program to generate a text caption corresponding to the most recent portion of the HU voice signal (Chang, col. 7, lines 62 – 67, “the system 500 can distribute the processor job descriptors for the portions of the queue in a last-in-first-out order” – the most recent voice signal from the memory is processed first, i.e., in “last-in-first-out order”; fig. 5; col. 2, lines 1 – 22, “the queue including multiple jobs where each job includes one or more identifiers of a respective portion of the audio file classified as belonging to the one or more speech types; distributing the jobs in the queue to processors for speech-to-text processing of the corresponding portion of the audio file; performing speech-to-text processing on each portion to generate a corresponding text file”; the system 500 issues a command to start a captioning session where the voice signal is already buffered). Therefore, in view of Chang, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the ASR system described in Cloran, by processing the latest speech signal first as taught by Chang, in order to allow a user to conduct real-time communication by processing the latest speech signal first and presenting the latest transcription first.
Furthermore, it allows a user to assign a priority for which voice signal to process first, wherein job descriptors with high priority are served first (Chang, col. 8, lines 1 – 10).
Re claims 2 – 3, 9 – 11:
2. The device of claim 1 wherein the step of obtaining a text caption includes initiating a process whereby an automated speech recognition (ASR) engine converts the retrieved HU voice signal to text (Cloran, [0045]; [0032], “The automated transcription subsystem converts the analyst's speech into text”).
3. The device of claim 2 wherein at least one of: the ASR is run by the at least one processor (Cloran, [0045]; [0032], “The automated transcription subsystem converts the analyst's speech into text”); or wherein, after presenting the text caption via the display screen, the ASR engine error corrects the text, the at least one processor using the corrections to make in line corrections to the text captions presented to the AU via the display screen.
9. The device of claim 3 wherein the at least one processor is further programmed to continue obtaining text captions associated with HU voice signal received during the call for at least a period after receiving the caption request and presenting the text captions associated with the HU voice signal received subsequent to the request via the display screen (Cloran, [0020], “the audio stream at moments that are apparently moments of silence between words or sentences”; [0024]; [0027], “receiving updates to the transcript as new chunks are transcribed and corrections are made”; [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”).
10. The device of claim 9 for use with a remote relay that performs captioning services, the step of obtaining text captions for voice signal received after the caption request including the at least one processor transmitting at least a portion of the HU voice signal received after the caption request to the relay and receiving the text caption back from the relay (Cloran, figs. 3 – 4; [0020], “the audio stream at moments that are apparently moments of silence between words or sentences”; [0024]; [0027], “receiving updates to the transcript as new chunks are transcribed and corrections are made”; [0035]).
11. The device of claim 10 wherein at least a portion of the HU voice signal received after the caption request is also transcribed by the ASR and the resulting text caption is presented via the display screen (Cloran, [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”; [0046], “Updates and corrections to the transcript text are, in some embodiments, sent to the user interface as soon as possible. In some of these embodiments, updates are provided on a per-word basis, while in others updates are sent with each keystroke of the analyst”).
Re claims 6 – 8:
6. The device of claim 1 for use with a remote relay that performs captioning services, the step of obtaining a text caption including the at least one processor transmitting the retrieved most recent portion of the HU voice signal to the relay and receiving the text caption back from the relay.
7. The device of claim 6 further including, after receiving the caption request, establishing a communication link with the relay.
8. The device of claim 1 wherein the at least one processor is further programmed to continue obtaining text captions associated with HU voice signal received during the call for at least a period after receiving the caption request and presenting the text captions associated with the HU voice signal received subsequent to the request via the display screen (Cloran, [0037], “the user interface of an update to the transcript text”; [0039], “requests a transcription update”; [0032], “The automated transcription subsystem converts the analyst's speech into text”; [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”; [0046], “Updates and corrections to the transcript text are, in some embodiments, sent to the user interface as soon as possible. In some of these embodiments, updates are provided on a per-word basis, while in others updates are sent with each keystroke of the analyst”; fig. 1).
Re claims 12 – 13:
12. The device of claim 1 wherein the step of obtaining a text caption includes performing at least first and second different transcription processes to generate first and second text captions corresponding to the retrieved HU voice signal and using the first and second text captions to derive an output text caption that is presented via the display screen (Cloran, [0030], “Automatic transcription subsystem 430 sends the audio chunk to service factory”; first text captions from speakers; [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”; [0046], “Updates and corrections to the transcript text are, in some embodiments, sent to the user interface as soon as possible”; [0032], “repeats the words into a microphone”; second text captions from analyst by revoice).
13. The device of claim 12 wherein the first transcription process includes using an automatic speech recognition (ASR) engine to generate the first text caption and the second transcription process includes a re-voicing process whereby a call assistant listens to the retrieved voice signal and revoices the retrieved voice signal to an ASR to generate the second text caption (Cloran, [0032], “an analyst listens to the speech of the conference call participant and repeats the words into a microphone. The automated transcription subsystem converts the analyst's speech into text that is then associated with the speech of the conference call user … The analyst is then presented with the audio and repeats the speech as an input into the automatic transcription subsystem 430”; second transcription – repeats the words).
Re Claim 5:
5. The device of claim 3 wherein the in-line corrections are visually distinguished from other text that is not corrected (Cloran, [0035], “issued corrections”; [0049], “mark that word or passage as inaccurately transcribed, and the system may respond by providing the audio to one or more additional analysts for transcription and double-checking”; [0050], “check those segments marked as ‘bad’ or ‘incorrect’”; [0048]).
Re claims 15 – 16:
15. The device of claim 14 wherein, after presenting the text caption via the display screen, the ASR engine error corrects the text, the at least one processor using the corrections to make in line corrections to the text captions presented to the AU via the display screen.
16. The device of claim 15 wherein the in-line corrections are visually distinguished from other text that is not corrected (Cloran, [0035], “issued corrections”; [0049], “mark that word or passage as inaccurately transcribed, and the system may respond by providing the audio to one or more additional analysts for transcription and double-checking”; [0050], “check those segments marked as ‘bad’ or ‘incorrect’”; [0048]).
Re claim 17:
17. The device of claim 12 wherein the at least one processor is further programmed to continue generating text captions associated with HU voice signal received during the call for at least a period after receiving the caption request and presenting the text captions associated with the HU voice signal received subsequent to the request via the display screen (Cloran, [0030], “Automatic transcription subsystem 430 sends the audio chunk to service factory”; first text captions from speakers; [0035], “Newly transcribed passages and newly issued corrections are provided as updates to the user interface in substantially real time”; [0046], “Updates and corrections to the transcript text are, in some embodiments, sent to the user interface as soon as possible”; [0032], “repeats the words into a microphone”; second text captions from analyst by revoice).
Re claim 18:
18. The device of claim 17 wherein text captions corresponding to the retrieved voice signal are visually distinguished from text captions corresponding to HU voice signal received after the caption request is received (Cloran, [0048], “marking inaccurate transcription”; [0050]).
Re claim 19:
19. The device of claim 12 for use with a remote relay that performs caption error correction services, the at least one processor transmitting the generated captions and the HU voice signal to the relay as it is generated and receiving error corrections to the generated captions back from the relay and using the error corrections to make in line corrections to captions presented via the display screen (Cloran, [0035], “issued corrections”; [0049], “mark that word or passage as inaccurately transcribed, and the system may respond by providing the audio to one or more additional analysts for transcription and double-checking”; [0050], “check those segments marked as ‘bad’ or ‘incorrect’”; [0048]).
Claim 21 is rejected under 35 U.S.C. 103 as being unpatentable over Cloran and Chang as applied to claim 14 above, and further in view of Carraux et al. (US 2008/0221881 A1).
Re claim 21:
Cloran does not explicitly disclose continuing to receive the HU voice signal during the ongoing call for at least a duration of time. Carraux teaches a speech processing system that divides a spoken audio stream into partial audio streams (Carraux, Abstract). Carraux further teaches 21. The device of claim 14, wherein the at least one processor is further programmed to: after receiving the caption request: continuing to receive the HU voice signal during the ongoing call for at least a duration of time; obtaining text captions corresponding to the continuing voice signal; and presenting the text captions corresponding to the continuing voice signal via the display screen (Carraux, [0021], “While the recording device 106 is recording each snippet, the recording device may keep track of a start time 130 of the snippet relative to the beginning of the dictation 104 (or to any other reference point within the dictation 104), and a real (absolute) start time 132 of the snippet 202 (to maintain the correspondence of the snippet to other forms of user input, such as the click of a button in a GUI)”; fig. 1, 142 - “Complete Event”; [0016], “enable speech to be transcribed automatically and in real-time (i.e., as the speaker is speaking and before completion of the speech)”; [0037], “end time of the first recorded snippet”; [0044], “updates the current transcription position 140 to point to the end of the text transcribed”; [0013], “for displaying and editing a transcript according to one embodiment of the present invention”). Therefore, in view of Carraux, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the device described in Cloran, by providing the entire transcription at the end of the transcription process as taught by Carraux, in order to enable speech to be transcribed automatically and in real-time (i.e., as the speaker is speaking and before completion of the speech) (Carraux, [0016]).
Claim 22 is rejected under 35 U.S.C. 103 as being unpatentable over Bijl et al. (US 6,366,882 B1) in view of Othmer et al. (US 2011/0269429 A1).
Re claim 22:
Bijl teaches 22. A caption device (Bijl, Abstract) comprising:
a display; a memory; a speaker; a user input (Bijl, col. 13, lines 22 – 27; col. 7, lines 24 - 34); and
at least one processor in electrical communication with the display, the memory, the speaker, and the user input (Bijl, col. 13, lines 22 – 27; col. 7, lines 24 - 50), the at least one processor being configured to:
receive, from a hearing user (HU), a HU voice signal (Bijl, Abstract, “plurality of user terminals for recording speeches”; col. 7, lines 35 – 50, “Once recorded … which includes the recorded speech”);
store at least a portion of the HU voice signal in the memory (Bijl, col. 7, lines 35 – 50, “Once recorded … which includes the recorded speech”);
receive, from the user input, a caption request (Bijl, col. 7, lines 35 – 50, “Once recorded, a request for dictation, which includes the recorded speech, is sent to the server 6. In preparing the dictation request, in some types of user terminal 2”); and
in response to receiving the caption request, obtain a text caption of at least a portion of the HU voice signal, wherein the portion of the HU voice signal has not been previously transcribed (Bijl, fig. 3; col. 7, lines 35 – 50, “Once recorded, a request for dictation, which includes the recorded speech, is sent to the server 6. In preparing the dictation request, in some types of user terminal 2”).
Bijl does not explicitly disclose receive, from a hearing user (HU) during a call. Othmer teaches a communication device that may participate in telephone calls. The communication device may allow a user to request transcription of a telephone call by prompting the user when the telephone call is completed (Othmer, Abstract). Othmer further teaches a caption device comprising: a display; a memory; a speaker; a user input; and at least one processor in electrical communication with the display, the memory, the speaker, and the user input (Othmer, fig. 1, 102), the at least one processor configured to receive, from a hearing user (HU) during a call (Othmer, Abstract). Therefore, in view of Othmer, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the ASR system described in Bijl, by providing a transcription service for phone calls as taught by Othmer, in order to provide a textual record of what was said during a phone call (Othmer, [0005]).
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the “right to exclude” granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional, the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to a final Office action, see 37 CFR 1.113(c). A request for reconsideration, while not provided for in 37 CFR 1.113(c), may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1 and 14 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 39 of U.S. Patent No. 11,627,221. Although the claims at issue are not identical, they are not patentably distinct from each other because the subject matter claimed in the instant application is fully disclosed in the more specific claims of the ’221 patent.
Claims 1 and 14 are rejected on the ground of nonstatutory double patenting as being unpatentable over claim 19 of U.S. Patent No. 12,137,183. Although the claims at issue are not identical, they are not patentably distinct from each other because the subject matter claimed in the instant application is fully disclosed in the more specific claims of the ’183 patent.
Response to Arguments
Applicant's arguments filed 1/20/2026 have been fully considered but they are not persuasive.
Applicant argues:
Independent claim 1 calls for, among other things, at least one processor programmed to perform the steps of: (iv) receiving, prior to any captioning of the at least the most recent portion of the HU voice signal that is stored in memory, a caption request via the captioning actuator. Neither Cloran, Chang, nor combinations thereof, teaches or suggests at least these claim elements.
The examiner submits that Cloran teaches the limitation: prior to any captioning of a portion of the HU voice signal that is stored in memory and during an ongoing call, a caption request via the captioning actuator (Cloran, pg. 6, claim 11, “11. The system of claim 10, wherein: the interpreted intent is to suspend transcription of audio from all audio streams; and the automatic response is to suspend transcription of audio from all audio streams until the system receives instruction from a user to resume”). Hence, Cloran teaches a feature which allows a user to select a portion of an audio chunk to be transcribed or to start / suspend a transcription of an audio chunk.
Applicant argues:
First, the Office construes the terminal equipment 112 as being the claimed caption device. However, Cloran does not describe audio data being stored in the memory of terminal equipment 112. Rather, Cloran describes that, " ... service factory 320 retains (or manages retention of) the audio captured from each audio line 115, 125 ... " Cloran, ¶ [0021]. Cloran also describes that, "[b]ack office system 350 stores the audio data and transcription in back office repository 360." Cloran, ¶ [0022]. Further, then, Cloran does not retrieve a HU voice signal from the memory of the terminal equipment 112, at least because Cloran does not first store the HU voice signal in the memory.
The examiner submits that a user may issue a start, suspension, or resumption of a transcription (Cloran, pg. 6, claims 11 – 12). The “request,” “start,” and “suspension” are caption requests, where the audio data is stored in the memory while the transcription process is “on hold” or suspended.
Applicant argues:
The Office asserted that such a combination would permit a user to conduct real-time communication. However, as described above, Cloran already describes real-time communication-and so there would be no reason to modify Cloran with Chang for the purpose of real-time communication. Chang also does not improve real-time communication. Chang is unrelated to real-time communications, and is rather focused on audio files of sound sources that have ceased outputting sounds, such as, for example, audio files of a movie, audio files derived from a CD, DVD, MP3, etc. Indeed, Chang does not mention an audio file originating from a telephone call or other real-time communication data source, further bolstering this point.
The examiner disagrees. Chang teaches “… The audio file can be locally stored or retrieved from a remote location. The audio file can be received, for example, in response to a user selection of a particular audio file. In some implementations, a user uploads the audio file from a client device. The system 200 can receive the audio file from an application. For example, a digital audio workstation can provide the audio file before or after digital signal processing” (Chang, col. 4, lines 40 – 50; col. 8, lines 49 - 61). Chang also teaches a system capable of processing an audio signal in real time (Chang, col. 9, lines 61 – 68), and it allows the system to assign a priority for which voice signal to process first, wherein job descriptors with high priority are served first (Chang, col. 8, lines 1 – 10).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACK YIP whose telephone number is (571) 270-5048. The examiner can normally be reached Monday through Friday, 9:00 AM - 5:00 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, XUAN THAI can be reached at (571) 272-7147. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACK YIP/Primary Examiner, Art Unit 3715