DETAILED ACTION
Introduction
This office action is in response to applicant’s amendment filed 11/21/2025. Claims 2-21 are currently pending and have been examined. Applicant’s IDS have been considered. There is no claim to foreign priority.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed 11/21/2025 have been fully considered but they are not persuasive. More specifically, applicant argues, regarding claim 2:
“The applied references fail to disclose and would not have rendered obvious at least "receive, from the first calling device, a first set of translation preferences associated with the inmate that identify an output language associated with the translation functionality to be provided to the first calling device" and "initiating language translation of the first caller voice data and the second caller voice data, wherein the first caller voice data comprises receiving pre-processed voice data from the first calling device, the language translation including receiving voice data in a second language from the second calling device, translating the received voice data from the second language to the output language associated with the first set of translation preferences to generate translated voice data," as recited in independent claim 2.
The Office Action relies primarily on Vigliotti and Gao as allegedly disclosing the claim limitations. Office Action, p. 3. Gao discloses speech-to-speech translation that converts a source language to a target language. Gao, [0001]. But nothing in Gao describes customizable output languages that can be configured by a user. Likewise, Vigliotti describes providing translation services only to text-based communications, such as email, SMS text, or chat messages.”
However, the Examiner does not concur with the applicant’s assessment of the prior art. Vigliotti explicitly describes translation preferences associated with the translation functionality to be provided to the first calling device, and the combination of Vigliotti with Gao makes obvious initiating language translation of the first caller voice data and the second caller voice data, wherein the first caller voice data comprises receiving pre-processed voice data from the first calling device, the language translation including receiving voice data in a second language from the second calling device, translating the received voice data from the second language to the output language associated with the first set of translation preferences to generate translated voice data (see the rejection below). The applicant’s own arguments appear to supply the motivation for combining the references: “But nothing in Gao describes customizable output languages that can be configured by a user. Likewise, Vigliotti describes providing translation services only to text-based communications, such as email, SMS text, or chat messages.” A clear motivation would therefore be to provide Vigliotti with that which Gao teaches, and vice versa, such that a preference-based translation system would include voice translation to enhance a text-based translation system (see the corresponding rejection below). Therefore, the applicant’s corresponding arguments are deemed non-persuasive. Applicant’s remaining arguments are based on the above arguments and are deemed non-persuasive as well.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 2-6, 9-13, 16-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Vigliotti et al. (Vigliotti, US 2015/0229591) in view of Gao et al. (Gao, US 2016/0147740).
As per claim 2, Vigliotti teaches a method for improving efficiency of bidirectional language translation by a call translation server associated with a controlled environment, comprising:
receiving, from a first calling device, a request to enable call translation functionality on the first calling device, wherein the request includes an identifier of an inmate in the controlled environment (Fig. 1A, as containing his calling devices connected via a network, translation server, paragraphs [0021-0023, 0033, 0034] and corresponding call translation service/server, and prison user, initiating call, as the controlled environment and inmate, see also his registered user, as the identifier as received);
enabling, responsive to receiving the request, the call translation functionality based on an authorization of the identifier of the inmate (ibid-after login, thus authorization, communication and translation functionality is enabled, see paragraphs [0036-0041, 0040-0043]-see all call data received by the first user, as translation enabled);
receive, from the first calling device, a first set of translation preferences associated with the inmate that identify an output language associated with the translation functionality to be provided to the first calling device (ibid-see above inmate discussion-see also, paragraphs [0041-0043]-his first user and device, via GUI, language preferences entered and all communication therefrom to be translated into that selected language preference);
initiating a call session between the first calling device located within the controlled environment and a second calling device (ibid-paragraph [0043, 0044], see his first and second mobile device discussion);
detecting the first set of translation preferences associated with the first calling device and a second set of translation preferences associated with the second calling device (paragraph [0035, 0041, 0043, 0046]-his “language preferences” and corresponding translation of all data to the preference language, as part of the translation preferences and settings for each user);
determining that the first set of translation preferences does not match the second set of translation preferences (paragraph [0047]-as his translation preference match determination, Fig. 3);
receiving, in the call session, first caller [voice] data from the first calling device and second caller [voice] data from the second calling device (ibid-paragraph [0043]-Figs. 2E, 3, his to and from data, first and second device communication);
initiating language translation of the first caller [voice] data and the second caller [voice] data, [wherein the first caller voice data comprises receiving pre-processed voice data from the first calling device] (ibid-see above translation discussion, paragraphs [0043-0047], Figs. 2E, 3 and 4, language translation discussion),
the language translation including receiving [voice] data in a second language from the second calling device (ibid-see above language translation, from a second language as receiving), translating the received [voice] data from the second language to the output language associated with the first set of translation preferences to generate translated [voice] data (the received data is translated from the second language into the translation preference language as discussed above); and
outputting the translated [voice] data to the second calling device (ibid-see his displayed translation, Figs. 2E, 3, his abstract).
Vigliotti lacks teaching that which Gao teaches: the above “data”, in each claim limitation, as voice data, wherein the first caller voice data comprises receiving pre-processed voice data from the first calling device (Gao, paragraphs [0021, 0026]-his S2S translation, and Fig. 1, receiving spoken input utterance, by the first and second device, wherein his processing pipeline includes pre-processed voice data, see Fig. 1 item 132; the Examiner notes that the data is interpreted as caller voice data hereinafter, based on the combination of Vigliotti with Gao).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of the teachings of Vigliotti and Gao to combine the prior art element of translating call data from a first device to a second device in a controlled environment as taught by Vigliotti with a speech to speech voice translation, from a calling device to a second calling device as taught by Gao as each element performs the same function as it does separately, as the combination would yield predictable results, KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be enhancing a messaging system to include translation of a communication via a calling device to include voice data (ibid, Gao, paragraphs [0020, 0026]).
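For illustration only, the preference detection, mismatch determination, and translation limitations of claim 2 discussed above can be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Vigliotti, Gao, or the application: the names `TranslationPreferences`, `preferences_match`, `translate_incoming`, and `stub_translate` are hypothetical, voice data is stood in for by plain strings, and the translator is a placeholder that only labels its input.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TranslationPreferences:
    """Hypothetical container for a device's translation preferences."""
    output_language: str


# Hypothetical translator signature: (data, source_language, target_language) -> translated data
Translator = Callable[[str, str, str], str]


def preferences_match(first: TranslationPreferences,
                      second: TranslationPreferences) -> bool:
    """Return True when both devices identify the same output language."""
    return first.output_language == second.output_language


def translate_incoming(data: str,
                       source_language: str,
                       first_prefs: TranslationPreferences,
                       second_prefs: TranslationPreferences,
                       translate: Translator) -> str:
    """Translate data received in source_language into the output language
    of the first set of preferences, but only when the two sets differ."""
    if preferences_match(first_prefs, second_prefs):
        return data  # no mismatch, so no translation is initiated
    return translate(data, source_language, first_prefs.output_language)


def stub_translate(data: str, source: str, target: str) -> str:
    """Placeholder translator (assumption): labels the data, translates nothing."""
    return f"[{source}->{target}] {data}"


if __name__ == "__main__":
    first = TranslationPreferences(output_language="en")
    second = TranslationPreferences(output_language="es")
    print(translate_incoming("hola", "es", first, second, stub_translate))
```

The sketch keeps the mismatch determination separate from the translation call so that each corresponds to one recited step; a real speech-to-speech pipeline would replace the stub.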
As per claims 3, 10 and 17, Vigliotti further makes obvious the method of claim 2, wherein enabling the call translation functionality comprises installing a call translation application on the first calling device (paragraph [0033]-his downloaded application to the devices (first and second), Fig. 1).
As per claims 4, 11 and 18, Vigliotti with Gao further makes obvious the method of claim 3, wherein the pre-processed voice data is received from the call translation application (ibid-Gao, paragraph [0020, 0029]-and Fig. 1, his video conferencing application stored on the user device, and corresponding S2S processing pipeline therein, as similarly combined and motivated, see above Gao voice data discussion).
As per claims 5, 12 and 19, Vigliotti with Gao further makes obvious the method of claim 2, wherein the pre-processed voice data comprises a preliminary call translation of a portion of the voice data (ibid, see claims 1, 2, pre-processed voice data, discussion see also Gao, paragraphs [0027, 0028, 0033, 0034]-as his multiple stage translation of the voice input data, conversion into a format suitable for translation, as similarly combined and motivated).
As per claims 6, 13 and 20, Vigliotti with Gao further makes obvious the method of claim 2, wherein the language translation further comprises:
generating first translated caller voice data via a first translation of the first caller voice data upon receiving the first caller voice data from the first calling device (ibid-see Vigliotti, translation in both directions from a first device to a second device, paragraph [0043]-each device, Fig. 1, item 100, sending and receiving translation data as sent, the voice data as combined with Gao);
generating second translated caller voice data via a second translation of the second caller voice data upon receiving the second caller voice data from the second calling device (ibid); and
transmitting the first translated caller voice data to the second calling device and the second translated caller voice data to the first calling device (ibid).
As per claim 9, claim 9 sets forth limitations similar to claim 2 and is thus rejected under similar reasons and rationale, wherein Vigliotti with Gao make obvious a call translation server within a controlled environment (see claim 2, corresponding call translation server discussion), comprising:
a memory (Vigliotti, Fig. 1B); and
a processor coupled to the memory (ibid), the processor configured to:
receive, from a first calling device, a request to enable call translation functionality on the first calling device, wherein the request includes an identifier of an inmate in the controlled environment (ibid-see claim 2, corresponding and similar limitation);
enable, responsive to receiving the request, the call translation functionality based on an authorization of the identifier of the inmate (ibid);
receive, from the first calling device, a first set of translation preferences associated with the inmate that identify an output language associated with the translation functionality to be provided to the first calling device (ibid);
initiate a call session between the first calling device located within the controlled environment and a second calling device (ibid);
detect the first set of translation preferences associated with the first calling device and a second set of translation preferences associated with the second calling device (ibid);
determine that the first set of translation preferences does not match the second set of translation preferences (ibid);
receive, in the call session, first caller voice data from the first calling device and second caller voice data from the second calling device (ibid);
initiate language translation of the first caller voice data and the second caller voice data, wherein the first caller voice data comprises receiving pre-processed voice data from the first calling device (ibid), the language translation including receiving voice data in a second language from the second calling device, translating the received voice data from the second language to the output language associated with the first set of translation preferences to generate translated voice data (ibid); and
output the translated voice data to the second calling device (ibid).
As per claim 16, claim 16 sets forth limitations similar to claim 2 and is thus rejected under similar reasons and rationale, wherein the non-transitory computer-readable medium is deemed to embody the method, such that Vigliotti with Gao make obvious a non-transitory computer-readable medium having instructions stored therein, which when executed by a processor in a call translation server in a controlled environment cause the processor to perform operations, the operations comprising (paragraph [0004]):
receiving, from a first calling device, a request to enable call translation functionality on the first calling device, wherein the request includes an identifier of an inmate in the controlled environment (ibid-see claim 2, corresponding and similar limitation);
enabling, responsive to receiving the request, the call translation functionality based on an authorization of the identifier of the inmate (ibid);
receiving, from the first calling device, a first set of translation preferences associated with the inmate that identify an output language associated with the translation functionality to be provided to the first calling device (ibid);
initiating a call session between the first calling device located within the controlled environment and a second calling device (ibid);
detecting the first set of translation preferences associated with the first calling device and a second set of translation preferences associated with the second calling device (ibid);
determining that the first set of translation preferences does not match the second set of translation preferences (ibid);
receiving, in the call session, first caller voice data from the first calling device and second caller voice data from the second calling device (ibid);
initiating language translation of the first caller voice data and the second caller voice data, wherein the first caller voice data comprises receiving pre-processed voice data from the first calling device (ibid), the language translation including receiving voice data in a second language from the second calling device, translating the received voice data from the second language to the output language associated with the first set of translation preferences to generate translated voice data (ibid); and
outputting the translated voice data to the second calling device (ibid).
Claim(s) 7, 14 and 21 is/are rejected under 35 U.S.C. 103 as being unpatentable over Vigliotti et al. (Vigliotti, US 2015/0229591) in view of Gao et al. (Gao, US 2016/0147740), as applied to claim 2, and further in view of Waibel et al. (Waibel, US 2022/0092278).
As per claims 7, 14 and 21, Vigliotti with Gao further makes obvious the method of claim 2, but lacks further comprising, that which Waibel teaches:
storing the first caller voice data and the first translated caller voice data in a first profile associated with the first calling device (paragraph [0142]-his original speech and translated speech pairs, specific to users, see also paragraphs [0142-0145]-MT, speech and translation pairs, his claim 1); and
storing the second caller voice data and the second translated caller voice data in a second profile associated with the second calling device (ibid).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of the teachings of Vigliotti, Gao and Waibel to combine the prior art element of a call translation server, as taught by Vigliotti, with adapting and training a bidirectional machine translation engine/server as taught by Gao with a user specific bilingual speech repository as taught by Waibel as each element performs the same function as it does separately, as the combination would yield predictable results, KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be providing synthesis in a target language using voice/speech translation memory data (ibid-see also Gao, abstract, using ASR features for training a machine translation server, ibid-see Waibel citation).
Claim(s) 8 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Vigliotti et al. (Vigliotti, US 2015/0229591) in view of Gao et al. (Gao, US 2016/0147740) and Waibel et al. (Waibel, US 2022/0092278), as applied to claim 7, and further in view of Bangalore et al. (Bangalore, US 2010/0082326).
As per claims 8 and 15, Vigliotti with Gao with Waibel make obvious the method of claim 7, but lack further comprising, that which Bangalore teaches:
generating a first speech profile based on first speech metrics detected in the first caller voice data and the first translated caller voice data (paragraphs [0028-0031]-his stored user/speaker preferences, including speech details, rate of speech, word order, length of pauses, pitch, etc. as his multiple speech metrics for the original data and translated voice data, ibid-as his generated details as discussed above, comprising multiple speech metrics, the user speech metrics are based on original voice data and the translated voice data, see his translated output speech factors as the corresponding associated details generated); and
generating a second speech profile based on second speech metrics detected in the second caller voice data and the second translated caller voice data (ibid-see above speech metrics as applied to each caller voice data).
Thus, it would have been obvious to one of ordinary skill in the linguistics art, before the effective filing date of the invention, as all the claimed elements were known in the prior art and one skilled in the art could have combined the elements as claimed by known methods (computer implemented techniques and algorithms combining processes and steps in natural language processing), in view of the teachings of Vigliotti, Gao, Waibel and Bangalore to combine the prior art element of adapting and training a bidirectional machine translation engine/server as taught by Gao with enhancing and training the translation engine with speaker/user specific and preference prosodic information as taught by Bangalore as each element performs the same function as it does separately, as the combination would yield predictable results, KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007), wherein the predictable result would be enriching spoken language translation with prosodic information (ibid-Bangalore, abstract, see also Gao, abstract, using ASR features for training a machine translation server).
Conclusion
Applicant's amendment necessitated the ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
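For illustration only, the reply-period rule stated above can be sketched as a short calculation. This is a minimal sketch under stated assumptions, not an official computation: the function names `add_months` and `shortened_period_expiry` are hypothetical, "within TWO MONTHS" is treated as on or before the two-month date, month arithmetic clamps to the last day of a shorter month, and weekend and holiday adjustments and extension fees are not modeled. The dates in the usage lines are hypothetical and are not the mailing date of this action.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(d: date, months: int) -> date:
    """Add calendar months, clamping the day to the end of a shorter month (assumption)."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(mailing: date,
                            first_reply: Optional[date] = None,
                            advisory_mailed: Optional[date] = None) -> date:
    """Return the date on which the shortened statutory period expires.

    Default: three months from the mailing date. If a first reply was filed
    within two months and the advisory action was mailed after the three-month
    date, the period expires on the advisory action mailing date, but never
    later than six months from the mailing date.
    """
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    if (first_reply is not None and advisory_mailed is not None
            and first_reply <= add_months(mailing, 2)
            and advisory_mailed > three_months):
        return min(advisory_mailed, six_months)
    return three_months


if __name__ == "__main__":
    mailing = date(2026, 2, 2)  # hypothetical mailing date
    print(shortened_period_expiry(mailing))
    print(shortened_period_expiry(mailing, date(2026, 3, 20), date(2026, 5, 20)))
```

The six-month cap is applied with `min` so that a late advisory action cannot push the expiry past the statutory maximum.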
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LAMONT M SPOONER whose telephone number is (571)272-7613. The examiner can normally be reached 8:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn, can be reached at (571)272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LAMONT M SPOONER/Primary Examiner, Art Unit 2657
1/30/2026