Prosecution Insights
Last updated: April 19, 2026
Application No. 16/992,489

SYSTEM AND METHOD USING CLOUD STRUCTURES IN REAL TIME SPEECH AND TRANSLATION INVOLVING MULTIPLE LANGUAGES, CONTEXT SETTING, AND TRANSCRIPTING FEATURES

Final Rejection §103
Filed: Aug 13, 2020
Examiner: JACKSON, JAKIEDA R
Art Unit: 2657
Tech Center: 2600 — Communications
Assignee: Wordly Inc.
OA Round: 8 (Final)
Grant Probability: 74% (Favorable)
Expected OA Rounds: 9-10
Time to Grant: 3y 0m
With Interview: 89%

Examiner Intelligence

Career Allow Rate: 74% — above average (669 granted / 905 resolved; +11.9% vs TC avg)
Interview Lift: +15.4% among resolved cases with interview
Typical Timeline: 3y 0m average prosecution; 35 applications currently pending
Career History: 940 total applications across all art units

Statute-Specific Performance

§101: 25.8% (-14.2% vs TC avg)
§103: 42.5% (+2.5% vs TC avg)
§102: 21.8% (-18.2% vs TC avg)
§112: 3.5% (-36.5% vs TC avg)
Comparisons are against the Tech Center average estimate. Based on career data from 905 resolved cases.
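Assuming the dashboard derives its headline numbers in the straightforward way (an inference on our part; the methodology note only says grant probability is derived from the career allow rate), the figures reconcile from the career counts:

```python
# Reproduce the headline examiner statistics from the career counts shown above.
granted, resolved = 669, 905

allow_rate = granted / resolved  # career allow rate
print(f"Career allow rate: {allow_rate:.1%}")  # 73.9%, displayed as 74%

# Reported lift for resolved cases that included an examiner interview.
interview_lift = 0.154
print(f"With interview: {allow_rate + interview_lift:.1%}")  # 89.3%, displayed as 89%
```

The "+15.4%" lift appears to be an additive percentage-point adjustment, since 73.9% + 15.4% rounds to the displayed 89%.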

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

In response to the Office Action mailed June 3, 2025, applicant submitted an amendment filed on December 3, 2025, in which the applicant amended the claims and requested reconsideration.

Response to Arguments

Applicant argues that the cited prior art fails to teach the claims as amended. However, Diamant teaches translating using artificial intelligence (p. 0023, 0033, 0109). Applicant's arguments have therefore been considered, but are not persuasive.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-5, 9-11, 14-16, 18-19, 21, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Diamant et al. (PGPUB 2019/0341050), hereinafter Diamant, in view of Lin et al. (PGPUB 2014/0358516), hereinafter Lin.

Regarding claims 1 and 9, Diamant discloses a system and method (hereinafter a system) for using cloud structures in real-time speech and translation involving multiple languages, comprising: a computer server including a processor and a memory for storing an application that configures the computer server to perform functions (application; p. 0155-0158) including: receiving audio content in a first spoken language from a first user's speaking device (speaker speaking Chinese; p. 0061-0062); transcribing the audio content into transcribed audio content in the first spoken language (p. 0023-0024, 0060-0064); receiving a first language preference from a first user's client device and a second language preference from a second client device, the first and second language preferences differing from the spoken language and from each other (preferred language; p. 0061-0062); and a first translation engine configured to translate by artificial intelligence (p. 0023, 0033, 0109) and to receive a digital transmission of the transcribed audio content and the first language preference (translate and send data; p. 0061-0062). Diamant does not, however, specifically teach: receiving a first language preference from a first client device associated with a second spoken language, wherein the first language preference associated with the second spoken language differs from the first spoken language; receiving a second language preference from a second client device associated with a third spoken language, wherein the second language preference associated with the third spoken language differs from the first spoken language and the second spoken language; a first translation engine configured to receive a digital transmission of the transcribed audio content and the first language preference; a second translation engine configured to receive a digital transmission of the transcribed audio content and the second language preference; wherein the application further configures the computer server to receive the first and second translated transcribed content from the first and second translation engines, to send the first translated transcribed content to the first client device, to send the second translated transcribed content to the second client device, and to develop a context for the audio content in order to improve transcription and translation of the audio content from the first speaking device.

Lin discloses a system comprising: receiving a second user's language preference from a second user's client device associated with a second spoken language, wherein the second user's language preference associated with the second spoken language differs from the first spoken language (users 102 and 108 speak different languages; p. 0019); receiving a third user's language preference from a third user's client device associated with a third spoken language (third language), wherein the third user's language preference associated with the third spoken language differs from the first spoken language and the second spoken language (p. 0046-0047); a first translation engine configured to receive a digital transmission of the transcribed audio content and the second user's language preference, wherein the first translation engine is cloud-based and configured to translate the transcribed audio content from the first spoken language into a second user's translated transcribed content (fig. 2, element 208, and fig. 3; p. 0033-0034, 0040-0044); and a second translation engine configured to receive a digital transmission of the transcribed audio content and the third user's language preference, wherein the second translation engine is cloud-based and configured to translate the transcribed audio content from the first spoken language into a third user's translated transcribed content (fig. 2, element 212, and fig. 3; p. 0035-0036, 0040-0044); wherein the application further configures the computer server to receive the second user's translated transcribed content and the third user's translated transcribed content, to send the second user's translated transcribed content to the second user's client device, and to send the third user's translated transcribed content to the third user's client device (figs. 2-3 with p. 0033-0036, 0040-0044); and to develop a first user's context from audio content of the first user, to develop a second user's context from audio content of the second user, and to blend the first user's context with the second user's context to generate a blended context (fig. 3) that improves transcription and translation for all users. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method as described above to provide a flexible, convenient system.

Regarding claims 2 and 14, Diamant discloses a system wherein the application further configures the computer server to carry the context forward across content provided by additional speaking devices and spoken languages beyond the first spoken language (context; p. 0077).

Regarding claim 3, Diamant discloses a system wherein the application further configures the computer server to maintain a running transcript of the spoken audio content and permits client devices to submit annotations to the transcript (annotations; p. 0124).

Regarding claim 4, Diamant discloses a system wherein the submitted annotations at least one of summarize, explain, add to, and question portions of transcripts highlighted by the annotation (summarize aspects of the conference transcription; p. 0024, 0082, 0124).

Regarding claim 5, Diamant discloses a system wherein the application further configures the computer server to selectively rely on a third translation engine to supplement translation actions of the first translation engine (add data; p. 0082, 0124).
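The claims 1 and 9 mapping above describes a multi-engine pipeline: transcribe the speaker's audio once, then route the transcript to a separate translation engine for each distinct client language preference while developing a shared context. A minimal sketch of that shape (all class and function names, and the in-memory context list, are illustrative assumptions, not taken from the application or the cited references):

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    # Running context developed from received audio (claims 1/9 recite
    # developing and blending a context to improve later translations).
    context: list[str] = field(default_factory=list)

    def transcribe(self, audio: str, source_lang: str) -> str:
        # Stand-in for a real speech-to-text call.
        return f"[{source_lang} transcript of {audio!r}]"

    def translate(self, transcript: str, target_lang: str) -> str:
        # Stand-in for a cloud-based translation engine; one engine
        # instance per client language preference.
        self.context.append(transcript)  # develop context as content arrives
        return f"[{target_lang} translation of {transcript}]"

    def handle_utterance(self, audio: str, source_lang: str,
                         preferences: dict[str, str]) -> dict[str, str]:
        transcript = self.transcribe(audio, source_lang)
        # Route the same transcript to a per-preference engine, skipping
        # clients whose preference matches the spoken language.
        return {client: self.translate(transcript, lang)
                for client, lang in preferences.items()
                if lang != source_lang}

server = Server()
out = server.handle_utterance("hello", "en", {"client1": "fr", "client2": "de"})
```

Here `client1` receives the French translation and `client2` the German one, each produced from the single shared English transcript.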
Regarding claim 10, Diamant discloses a system wherein actions of establishing and adjusting the context are based on factors comprising at least one of subject matter of the third user's translated transcribed audio, settings in which the portions are spoken, audiences of the portions including at least one client device requesting translation into the third language, and cultural considerations of users of the at least one client device (population, distinct characteristics, accents; p. 0059-0062).

Regarding claim 11, Diamant discloses a system wherein the factors further include cultural and linguistic nuances associated with translation of the first language to the third language and translation of the second language to the third language (population, distinct characteristics, accents; p. 0059).

Regarding claim 15, it is interpreted and rejected for similar reasons as set forth above in claims 1 and 9. In addition, Lin discloses a system for using cloud structures in real-time speech and translation involving multiple languages and transcript development, comprising: a translation engine configured to receive a call (translation; p. 0033-0037); and receiving at least one tag in the translated transcribed content placed by the client device (tags; p. 0023-0025), wherein the tag is associated with a portion of the translated transcribed content; receiving error commentary associated with the tag, wherein the error commentary alleges an error in the tagged portion of the translated transcribed content; and correcting the tagged portion of the translated transcribed content in the transcript in accordance with the error commentary (correcting error translation; p. 0031).

Regarding claim 16, Diamant discloses a system wherein the application further configures the computer server to verify the commentary prior to correcting the portion in the transcript (edit transcript; p. 0111).

Regarding claim 18, Diamant discloses a system wherein the error alleged concerns at least one of translation, contextual issues, and idiomatic issues (correct transcript; p. 0111).

Regarding claim 19, it is interpreted and rejected for similar reasons as set forth above. In addition, Lin discloses a system wherein the application further configures the computer server to call up a third translation engine in the cloud structure that translates the transcribed audio content into a fourth user's translated transcribed content in a fourth spoken language differing from the first, second, and third spoken languages (different/multiple languages; p. 0019, 0033-0035, 0048).

Regarding claim 21, it is interpreted and rejected for similar reasons as set forth above. In addition, Lin discloses a system further comprising: generating second user's translated audio content in the second user's language preference from the third user's translated transcribed content (figs. 2-3); and generating third user's translated audio content in the third user's language preference from the third user's translated transcribed content (figs. 2-3).

Regarding claim 24, it is interpreted and rejected for similar reasons as set forth above. In addition, Diamant discloses transcribed data (p. 0023-0024, 0060-0064). Furthermore, Lin discloses a system further comprising: receiving a request, from another client device, for translation into a fourth spoken language differing from the first, second, and third spoken languages (different/multiple languages; p. 0019, 0033-0035, 0048); translating, by using a third translation engine in the cloud structure (fig. 2), the first audio content into a fourth user's translated audio content of text in the fourth spoken language differing from the first, second, and third spoken languages (different/multiple languages; p. 0019, 0033-0035, 0048); and translating, by using a fourth translation engine in the cloud structure, the second audio content into more of the fourth user's translated audio content of text in the fourth spoken language differing from the first, second, and third spoken languages (different/multiple languages; p. 0019, 0033-0035, 0048).

Claims 6-8, 17, 20, and 22-23 are rejected under 35 U.S.C. 103 as being unpatentable over Diamant in view of Lin and in further view of Thomson et al. (PGPUB 2020/0175987), hereinafter Thomson.

Regarding claim 6, Diamant in view of Lin discloses a system as described above, but does not specifically teach a system wherein the application further configures the computer server to selectively blend translated content provided by the first translation engine with translated content provided by the second translation engine. Thomson discloses a system wherein the processor receives audio content comprising human speech spoken in a first language (language translation; p. 1587); translates the content into a second language (p. 1587); displays the translated content in a transcript displayed on a client device viewable by a user speaking the second language (display; p. 0097); receives at least one tag in the translated content placed by the client device, the tag associated with a portion of the content (annotations; p. 1064); receives commentary associated with the tag, the commentary alleging an error in the portion of the content (spelling and language translation errors; p. 0175, 1013); and corrects the portion of the content in the transcript in accordance with the commentary (correcting spelling and language translation errors; p. 0175, 1013), wherein the application verifies the commentary prior to correcting the portion in the transcript (verify; p. 0897-0899, 0682, 0935); wherein the error alleged concerns at least one of translation, contextual issues, and idiomatic issues (correcting spelling and language translation; p. 0175, 1013); wherein the application sends the audio content to at least a first cloud-based translation engine for the translation (p. 0205, 1013-1016, 1691, 1701); and wherein the application configures the computer server to selectively blend translated content provided by the first translation engine with translated content provided by the second translation engine (fusing multiple transcriptions; p. 0097, 0140-0160, 1704), to provide a more accurate transcription. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the method as described above to provide a finalized, summarized, and accurate translation.

Regarding claim 7, it is interpreted and rejected for similar reasons as set forth above. In addition, Thomson discloses a system wherein the application further configures the computer server to selectively blend translated content based on factors comprising at least one of the first spoken language and the first and second language preferences, subject matter of the content, voice characteristics of the spoken audio content, demonstrated listening abilities and attention levels of users of the second user's client device, and technical quality of transmission (quality; p. 1023-1031).

Regarding claim 8, it is interpreted and rejected for similar reasons as set forth above in claim 6. In addition, Diamant discloses a system wherein the application further configures the computer server to dynamically build a model of translation based on at least one of the factors, on locations of users of the client devices, and on observed attributes of the translation engines (location; p. 0023).
Furthermore, Thomson discloses a system wherein the application dynamically builds a model of translation based on at least one of the factors, on locations of users of the client devices, and on observed attributes of the translation engines (p. 0510, 0735, 1094).

Regarding claim 17, Diamant discloses a system wherein users of a plurality of client devices hearing the audio content and reading the transcript additionally provide summaries (summarize aspects of the conference transcription; p. 0024, 0082, 0124) and annotations (annotations; p. 0124). In addition, Thomson discloses a system wherein users of a plurality of client devices hearing the audio content and reading the first or second transcript additionally provide summaries (summarize; p. 0168, 0521, 0590), annotations (annotations; p. 1064), and highlighting to the first or second transcript (p. 0161, 0492, 0720-0721), to assist with clearly identifying data.

Regarding claim 20, it is interpreted and rejected for similar reasons as set forth above. In addition, Thomson discloses a system wherein the application further sends the audio content to a second cloud-based translation engine for the translation and selectively blends translated content provided by the first translation engine and the second translation engine with translated content provided by the third translation engine (fusing multiple transcriptions with different engines; p. 0097, 0140-0160, 1704).

Regarding claim 22, it is interpreted and rejected for similar reasons as set forth above. In addition, Thomson discloses a system wherein the second user's translated audio content is generated by the second user's client device (figure 1 with first language/acoustic mode; p. 1552); and the third user's translated audio content is generated by the third user's client device (figure 1 with second language/acoustic mode; p. 1552).

Regarding claim 23, it is interpreted and rejected for similar reasons as set forth above. In addition, Thomson discloses a system wherein the second user's translated audio content is generated by the first translation engine (figure 1 with first language/acoustic mode; p. 1552); and the third user's translated audio content is generated by the second translation engine (figure 1 with first language/acoustic mode; p. 1552).

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JAKIEDA R JACKSON, whose telephone number is (571) 272-7619. The examiner can normally be reached Mon-Fri, 6:30a-2:30p. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Daniel Washburn, can be reached at 571.272.5551.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/JAKIEDA R JACKSON/
Primary Examiner, Art Unit 2657
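Claims 6 and 20 turn on Thomson's "fusing multiple transcriptions" teaching: candidate outputs from several engines are selectively combined. One simple way to picture such a blend (the per-segment confidence scores and the max-score rule are invented for this illustration; the actual fusion logic is not in the record) is a per-segment vote:

```python
# Illustrative blend of per-segment outputs from two translation engines:
# for each segment, keep the candidate with the higher (hypothetical)
# confidence score. This mirrors the "selectively blend" limitation only
# in spirit.
engine_a = [("seg1", "Hello everyone", 0.92), ("seg2", "lets begin", 0.61)]
engine_b = [("seg1", "Hi all", 0.85), ("seg2", "let's begin", 0.88)]

blended = [max(a, b, key=lambda cand: cand[2])  # higher confidence wins
           for a, b in zip(engine_a, engine_b)]
```

With these scores, the blend keeps engine A's first segment and engine B's second, yielding "Hello everyone" / "let's begin".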

Prosecution Timeline

Aug 13, 2020: Application Filed
Mar 12, 2022: Non-Final Rejection — §103
Apr 06, 2022: Response after Non-Final Action
Aug 16, 2022: Response Filed
Dec 14, 2022: Final Rejection — §103
Apr 20, 2023: Request for Continued Examination
Apr 26, 2023: Response after Non-Final Action
May 09, 2023: Non-Final Rejection — §103
Jul 12, 2023: Applicant Interview (Telephonic)
Jul 12, 2023: Examiner Interview Summary
Aug 15, 2023: Response Filed
Nov 01, 2023: Final Rejection — §103
May 07, 2024: Request for Continued Examination
May 10, 2024: Response after Non-Final Action
May 18, 2024: Non-Final Rejection — §103
Nov 20, 2024: Applicant Interview (Telephonic)
Nov 22, 2024: Examiner Interview Summary
Nov 23, 2024: Response Filed
Jan 17, 2025: Final Rejection — §103
May 23, 2025: Request for Continued Examination
May 27, 2025: Response after Non-Final Action
May 30, 2025: Non-Final Rejection — §103
Dec 03, 2025: Response Filed
Mar 05, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12603079: PROVIDING A REPOSITORY OF AUDIO FILES HAVING PRONUNCIATIONS FOR TEXT STRINGS TO PROVIDE TO A SPEECH SYNTHESIZER
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12603088: TRAINING A DEVICE SPECIFIC ACOUSTIC MODEL
Granted Apr 14, 2026 (2y 5m to grant)

Patent 12598092: SYSTEMS, METHODS, AND APPARATUS FOR NOTIFYING A TRANSCRIBING AND TRANSLATING SYSTEM OF SWITCHING BETWEEN SPOKEN LANGUAGES
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12597427: CONFIGURABLE NATURAL LANGUAGE OUTPUT
Granted Apr 07, 2026 (2y 5m to grant)

Patent 12597418: AUDIO SIGNAL PROCESSING DEVICE AND METHOD FOR SYNCHRONIZING SPEECH AND TEXT BY USING MACHINE LEARNING MODEL
Granted Apr 07, 2026 (2y 5m to grant)
Study what changed to get past this examiner, based on the 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 9-10
Grant Probability: 74%
With Interview: 89% (+15.4%)
Median Time to Grant: 3y 0m
PTA Risk: High
Based on 905 resolved cases by this examiner. Grant probability is derived from the career allow rate.
