Last updated: May 29, 2026
Application No. 17/890,454
SPEECH TRANSMISSION FROM A TELECOMMUNICATION ENDPOINT USING PHONETIC CHARACTERS

Non-Final OA §101 §103
Filed: Aug 18, 2022
Examiner: BECKER, TYLER JUSTIN
Art Unit: 2657
Tech Center: 2600 (Communications)
Assignee: Avaya Management L.P.
OA Round: 4 (Non-Final)
Interview Optional

An interview has already been conducted in this application's prosecution history. This examiner has a 75% grant rate with a +16.5% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 20 resolved cases, 2023–2026
Examiner Intelligence

BECKER, TYLER JUSTIN
Career Allowance Rate: 75% (15 granted / 20 resolved), above average, +13.0% vs TC avg
Interview Lift: +16.5% (resolved cases with interview vs. without)
Avg Prosecution (typical timeline): 2y 7m
11 currently pending
Career history
Total Applications
across all art units
Statute-Specific Performance

§101: 1.2% (-38.8% vs TC avg)
§103: 90.4% (+50.4% vs TC avg)
§102: 3.6% (-36.4% vs TC avg)
§112: 4.8% (-35.2% vs TC avg)
Based on career data from 20 resolved cases
Office Action

§101 §103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment
	The amendment filed on November 4th, 2025 has been entered. Claims 1, 3-5, 9-17, and 19-20 have been amended. Claims 1-20 are pending and have been examined. Applicant’s amendments to the claims have overcome the rejections under 35 U.S.C. 112 previously set forth in the non-final office action mailed July 10th, 2025.

Response to Arguments
Applicant's arguments regarding the rejection of claim 1 under 35 U.S.C. 103 filed November 4th, 2025 have been fully considered but they are not persuasive. The applicant argues that the amended limitation of claim 1, originally from dependent claim 4, is not disclosed or taught by the references. The examiner respectfully disagrees. While none of the cited references expressly disclose storing a string of phonetic characters, Maeda does disclose storing a string of characters, and the examiner holds that it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have converted the string to a string of phonetic characters as taught by Krishnan. As such, based on the communication system of Maeda, as modified by the teachings of Krishnan, it would be obvious to transmit and store a string of phonetic characters rather than a string of text characters.

Applicant’s arguments with respect to claim(s) 11-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 1-6, 8-9, 11-16, and 18-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1, the claim recites elements (a) “receiving audio including speech from a user at the first endpoint”, (b) “translating the speech to a string of phonetic characters”, (c) “transmitting the string to the second endpoint over the communication session”, (d) “in the second endpoint: receiving the string”, and (e) “storing the string for access after the communication session.” These may be practically performed in the human mind with pen and paper. For example, two people can be speaking, but decide to pass written notes if the environment becomes too noisy to hear. Therefore, elements (a)-(e) are the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Under its broadest reasonable interpretation when read in light of the specification, the elements encompass mental processes. Accordingly, the claim recites an abstract idea (Step 2A, Prong one).
The judicial exception is not integrated into a practical application. The claim does not recite any additional elements. Even when viewed in combination, the claim elements do not integrate the judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES).
The claim does not include any other additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, elements (a)-(e) amount to no more than a mental process, and there are no other additional elements. Even when considered in combination, these additional elements represent mere instructions to implement an abstract idea or other exception on a computer, and do not provide an inventive concept (step 2B).

Claim 2 depends on claim 1, and thus recites the limitations of claim 1, with the additional element (f) “before transmitting the string, determining that audio quality of a communication channel with the second endpoint does not satisfy a quality criterion.”
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas. The additional element of claim 2 does not preclude the steps of claim 1 from practically being performed in the human mind. Element (f) further modifies the abstract idea by disclosing determining the audio quality does not satisfy a quality criterion. Here, element (f) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 2 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 3 depends on claim 2, and thus recites the limitations of claim 2, with the additional elements (g) “before determining that the audio quality does not satisfy the quality criterion, receiving prior audio captured from the user” and (h) “transmitting the prior audio over the communication session to the second endpoint.”
For the reasons discussed above for claim 2, the claim 2 limitations recite abstract ideas. The additional elements of claim 3 do not preclude the steps of claim 2 from practically being performed in the human mind. Elements (g) and (h) further modify the abstract idea by disclosing receiving and transferring audio before determining the audio quality. Here, elements (g) and (h) fall under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 3 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 4 depends on claim 1, and thus recites the limitations of claim 1, with the additional element (i) “in the second endpoint, upon receiving a request to playback the recreated audio, playing the recreated audio of sounds represented by the string.”
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas. The additional element of claim 4 does not preclude the steps of claim 1 from practically being performed in the human mind. Element (i) further modifies the abstract idea by disclosing recreating the audio at a user’s request. Here, element (i) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 4 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 5 depends on claim 1, and thus recites the limitations of claim 1, with the additional elements (j) “determining that the user has a first accent that is different from a second accent of a second user of the second endpoint” and (k) “changing one or more of the phonetic characters to adjust sounds represented by the string from the first accent to the second accent.”
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas. The additional elements of claim 5 do not preclude the steps of claim 1 from practically being performed in the human mind. Elements (j) and (k) further modify the abstract idea by disclosing determining the accent of each user and modifying the string based on the accents. Here, elements (j) and (k) fall under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 5 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 6 depends on claim 5, and thus recites the limitations of claim 5, with the additional element (l) “receiving a user instruction to enable adjusting the sounds from the first accent to the second accent.”
For the reasons discussed above for claim 5, the claim 5 limitations recite abstract ideas. The additional element of claim 6 does not preclude the steps of claim 5 from practically being performed in the human mind. Element (l) further modifies the abstract idea by disclosing adjusting the sounds based on accent based on instructions from the user. Here, element (l) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 6 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 8 depends on claim 1, and thus recites the limitations of claim 1, with the additional element (m) “wherein the phonetic characters are characters in the International Phonetic Alphabet.”
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas. The additional element of claim 8 does not preclude the steps of claim 1 from practically being performed in the human mind. Element (m) further modifies the abstract idea by disclosing the phonetic characters are characters in the International Phonetic Alphabet. Here, element (m) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 8 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 9 depends on claim 1, and thus recites the limitations of claim 1, with the additional element (n) “determining voice communication on an audio channel of the communication session does not satisfy a quality criterion prior to transmitting the string.”
For the reasons discussed above for claim 1, the claim 1 limitations recite abstract ideas. The additional element of claim 9 does not preclude the steps of claim 1 from practically being performed in the human mind. Element (n) further modifies the abstract idea by disclosing determining the voice communication does not satisfy a quality criterion before transmitting the string. Here, element (n) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 9 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Regarding claim 11, the claim recites elements (a) “receive a string of phonetic characters from a first endpoint on a communication session with the apparatus, wherein the phonetic characters are translated from speech captured from a user at the first endpoint”, (b) “generate recreated audio of sounds represented in the string”, and (c) “processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt.” These may be practically performed in the human mind with pen and paper. For example, two people can be speaking, but decide to pass written notes if the environment becomes too noisy to hear, read the string of letters out loud, and determine what a person’s response to a prompt was based on the spoken string. Therefore, elements (a)-(c) are the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Under its broadest reasonable interpretation when read in light of the specification, the elements encompass mental processes. Accordingly, the claim recites an abstract idea (Step 2A, Prong one).
The judicial exception is not integrated into a practical application. The claim recites the additional elements (d) “one or more computer readable storage media”, (e) “a processing system operatively coupled with the one or more computer readable storage media” and (f) “program instructions stored on the one or more computer readable storage media that, when read and executed by the processing system, direct the apparatus”. Here elements (d)-(f) account for computing components recited at a high level of generality (MPEP 2106.04(a)(2)(III)(C)). Even when viewed in combination, the claim elements do not integrate the judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES).
The claim does not include any other additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, elements (a)-(c) amount to no more than a mental process, and elements (d)-(f) amount to no more than generic computing components. Even when considered in combination, these additional elements represent mere instructions to implement an abstract idea or other exception on a computer, and do not provide an inventive concept (step 2B).

Claim 12 depends on claim 11, and thus recites the limitations of claim 11, with the additional element (g) “before transmitting the string, the first endpoint determines that audio quality of a communication channel with the apparatus does not satisfy a quality criterion.”
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas. The additional element of claim 12 does not preclude the steps of claim 11 from practically being performed in the human mind. Element (g) further modifies the abstract idea by disclosing determining the voice communication does not satisfy a quality criterion before transmitting the string. Here, element (g) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 12 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 13 depends on claim 12, and thus recites the limitations of claim 12, with the additional element (h) “before the first endpoint determines that the audio quality does not satisfy the quality criterion, receive prior audio captured from the user over the communication channel.”
For the reasons discussed above for claim 12, the claim 12 limitations recite abstract ideas. The additional element of claim 13 does not preclude the steps of claim 12 from practically being performed in the human mind. Element (h) further modifies the abstract idea by disclosing receiving audio before determining the audio quality. Here, element (h) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 13 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 14 depends on claim 11, and thus recites the limitations of claim 11, with the additional element (i) “store the string for access after the communication session.”
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas. The additional element of claim 14 does not preclude the steps of claim 11 from practically being performed in the human mind. Element (i) further modifies the abstract idea by disclosing storing the string. Here, element (i) falls under the mental process of collecting data (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 14 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 15 depends on claim 11, and thus recites the limitations of claim 11, with the additional element (j) “wherein the first endpoint determines that the user has a first accent that is different from a second accent and changes one or more of the phonetic characters to adjust the sounds from the first accent to the second accent.”
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas. The additional element of claim 15 does not preclude the steps of claim 11 from practically being performed in the human mind. Element (j) further modifies the abstract idea by disclosing determining the accent of each user and modifying the string based on the accents. Here, element (j) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 15 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 16 depends on claim 15, and thus recites the limitations of claim 15, with the additional element (k) “wherein the first endpoint receives a user instruction to enable adjusting the sounds from the first accent to the second accent.”
For the reasons discussed above for claim 15, the claim 15 limitations recite abstract ideas. The additional element of claim 16 does not preclude the steps of claim 15 from practically being performed in the human mind. Element (k) further modifies the abstract idea by disclosing adjusting the sounds based on accent based on instructions from the user. Here, element (k) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 16 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 18 depends on claim 11, and thus recites the limitations of claim 11, with the additional element (l) “wherein the phonetic characters are characters in the International Phonetic Alphabet.”
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas. The additional element of claim 18 does not preclude the steps of claim 11 from practically being performed in the human mind. Element (l) further modifies the abstract idea by disclosing the phonetic characters are characters in the International Phonetic Alphabet. Here, element (l) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 18 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Claim 19 depends on claim 11, and thus recites the limitations of claim 11, with the additional element (m) “wherein the string is transmitted in response to the first endpoint determining voice communication over the communication session does not satisfy a quality criterion.”
For the reasons discussed above for claim 11, the claim 11 limitations recite abstract ideas. The additional element of claim 19 does not preclude the steps of claim 11 from practically being performed in the human mind. Element (m) further modifies the abstract idea by disclosing determining the voice communication does not satisfy a quality criterion before transmitting the string. Here, element (m) falls under the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Accordingly, the claim recites a judicial exception (Step 2A).
Claim 19 does not recite any additional elements and therefore, the claim is not practically integrated into a practical application and does not amount to significantly more than a judicial exception (Step 2A Prong two and Step 2B).

Regarding claim 20, the claim recites elements (a) “receive a string of phonetic characters from a first endpoint on a communication session with the apparatus, wherein the phonetic characters are translated from speech captured from a user at the first endpoint”, (b) “generate recreated audio of sounds represented in the string”, and (c) “processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt.” These may be practically performed in the human mind with pen and paper. For example, two people can be speaking, but decide to pass written notes if the environment becomes too noisy to hear, read the string of letters out loud, and determine what a person’s response to a prompt was based on the spoken string. Therefore, elements (a)-(c) are the mental process of collecting data, evaluating it, and outputting the results of the evaluation (MPEP 2106.04(a)(2)(III)(A)). Under its broadest reasonable interpretation when read in light of the specification, the elements encompass mental processes. Accordingly, the claim recites an abstract idea (Step 2A, Prong one).
The judicial exception is not integrated into a practical application. The claim recites the additional element (d) “One or more computer readable storage media having program instructions stored thereon that, when read and executed by a processing system, direct the processing system”. Here element (d) accounts for computing components recited at a high level of generality (MPEP 2106.04(a)(2)(III)(C)). Even when viewed in combination, the claim elements do not integrate the judicial exception into a practical application (Step 2A, Prong Two: NO), and the claim is directed to the judicial exception (Step 2A: YES).
The claim does not include any other additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above, elements (a)-(c) amount to no more than a mental process, and element (d) amounts to no more than generic computing components. Even when considered in combination, these additional elements represent mere instructions to implement an abstract idea or other exception on a computer, and do not provide an inventive concept (step 2B).

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-4 and 7-9 is/are rejected under 35 U.S.C. 103 as being unpatentable over Maeda et al. (US Pat. No. 7,286,979 B2 hereinafter Maeda), in view of Krishnan, Arjun (US Pat. Pub. No. 2005/0273327 A1 hereinafter Krishnan).
Regarding claim 1, Maeda discloses a method comprising: in a first endpoint of a communication session between a first endpoint and a second endpoint: receiving audio including speech from a user at the first endpoint (Maeda, Fig. 1; Col. 2, lines 43-45: " An A/D converting unit 101 converts an analog voice signal obtained by a microphone 100 into a digital voice signal."); and transmitting the string to the second endpoint over the communication session (Maeda, Fig. 1; Col. 2, lines 61-63: "The transmitting unit 107 transmits both voice data and character data, or character data via a communication network 3 to the communication terminal 2."); and in the second endpoint: receiving the string; and storing the string for access after the communication session (Maeda, Fig. 1; Col. 3, lines 18-24: “the character decoding unit 202 decodes the digital signal supplied from the receiving unit 200 so as to derive character information, and then, sends the derived character information to both the display unit 105 and the recording unit 205. The recording unit 205 records therein the character information sent from the character decoding unit 202.”). However, Maeda fails to expressly recite translating the speech to a string of phonetic characters.
Krishnan teaches translating the speech to a string of phonetic characters (Krishnan, [0026]: “the input speech can be converted to phonemes in the International Phonetic Alphabet of the International Phonetic Association (as shown in FIG. 3) in accordance with any of a number of different phonetic transcription techniques, as such are well known to those skilled in the art.”).
Maeda and Krishnan are analogous arts because they both belong to the same field of communication systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda to incorporate the teachings of Krishnan to use the International Phonetic Alphabet when converting speech to text strings. Using a standardized phonetic alphabet in the system increases the clarity and usability of the system (Krishnan, [0026]). This helps ensure that users, particularly those looking to understand the technical implementation of the system, can fully understand how the system works.

Regarding claim 2, the rejection of claim 1 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. Maeda further discloses before transmitting the string, determining that audio quality of a communication channel with the second endpoint does not satisfy a quality criterion (Maeda, Fig. 3; Col. 5, lines 17-23: "as shown in FIG. 3, in the case that the communication error rate is low, the voice communication may be carried out (step S211), whereas in the case that the communication error rate is high, the voice-to-character converting operation by the character converting unit 103 may be commenced (step S213) to execute the character communication.").
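
For illustration only, the error-rate-based switch quoted from Maeda (steps S211 and S213), combined with a phonetic conversion of the kind Krishnan describes, might be sketched as follows. The threshold value, the function names, and the toy word-to-IPA table are assumptions introduced for this sketch; neither reference specifies them, and a real system would rely on a speech recognizer and a full phonetic transcription technique.

```python
# Hypothetical sketch of switching from voice to character communication
# when the communication error rate is high (compare Maeda, FIG. 3).
# ERROR_RATE_THRESHOLD and TOY_IPA are illustrative assumptions only.
ERROR_RATE_THRESHOLD = 0.05

# Toy lookup standing in for a real phonetic transcription technique.
TOY_IPA = {"hello": "həˈloʊ", "world": "wɜːld"}


def to_phonetic(words):
    """Map each word to an IPA string; words missing from the toy table pass through unchanged."""
    return " ".join(TOY_IPA.get(word.lower(), word) for word in words)


def choose_payload(error_rate, voice_frame, recognized_words):
    """Return ("voice", frame) on a good channel, otherwise ("characters", phonetic string)."""
    if error_rate <= ERROR_RATE_THRESHOLD:
        # Low error rate: carry out voice communication (analogous to step S211).
        return ("voice", voice_frame)
    # High error rate: convert speech to characters and send those (analogous to step S213).
    return ("characters", to_phonetic(recognized_words))
```

In this sketch the quality criterion is a single fixed threshold; the claim language only requires that some quality criterion is not satisfied before the string is transmitted.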

Regarding claim 3, the rejection of claim 2 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. Maeda further discloses before determining that the audio quality does not satisfy the quality criterion, receiving prior audio captured from the user; and transmitting the prior audio over the communication session to the second endpoint (Maeda, Fig. 3; Col. 5, lines 17-23: "as shown in FIG. 3, in the case that the communication error rate is low, the voice communication may be carried out (step S211), whereas in the case that the communication error rate is high, the voice-to-character converting operation by the character converting unit 103 may be commenced (step S213) to execute the character communication."; Here, voice communication occurs before checking the error rate.).

Regarding claim 4, the rejection of claim 1 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. Maeda further discloses in the second endpoint, upon receiving a request to playback recreated audio, playing the recreated audio of sounds represented by the string (Maeda, Col. 7, lines 1-17: “a description will be made of such a method capable of retrieving voice data recorded on either the recording unit 106 or the recording unit 205 and of reproducing the retrieved voice data. In the case that the voice data is reproduced, the user enters a keyword (step S100). The character data 7 is retrieved based upon this keyword (step S101). The retrieving operation is repeatedly carried out until the keyword can be found out ("NG" in step S101). When the keyword is found out ("OK" in step S101), the time information 9 contained in this found character 7 is derived (step S102). Next, a retrieving operation is carried out as to such a voice data containing the same time information 9 as the derived time information 9 (step S103). When the voice data 8 having the same time information 9 is found out ("OK" in step S103), the reproducing operation is commenced from the data portion of the found voice data 8 (step S104).”).
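
As a rough illustration of the retrieval flow quoted from Maeda (steps S100 to S104), the keyword lookup that ties recorded character data to recorded voice data through shared time information might look like the sketch below. The record types, field names, and the integer representation of time information are assumptions made for this example and are not taken from Maeda.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CharacterRecord:
    text: str
    time_info: int  # hypothetical: seconds since the start of the session


@dataclass
class VoiceRecord:
    samples: bytes
    time_info: int  # same time base as CharacterRecord.time_info


def find_playback_start(keyword: str,
                        characters: List[CharacterRecord],
                        voice: List[VoiceRecord]) -> Optional[int]:
    """Return the index of the voice record where playback would begin, or None."""
    for record in characters:
        # S100-S101: search the recorded character data for the keyword.
        if keyword in record.text:
            # S102: derive the time information of the matching character data.
            target_time = record.time_info
            # S103: look for voice data carrying the same time information.
            for index, voice_record in enumerate(voice):
                if voice_record.time_info == target_time:
                    # S104: reproduction would start from this voice record.
                    return index
            return None
    return None
```

The sketch stops at the first character record containing the keyword and returns None when no voice record shares its time information; how Maeda handles that case is not stated in the quoted passage.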

Regarding claim 7, the rejection of claim 1 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. Maeda further discloses transmitting each of the characters in real-time (Maeda, Col. 5, lines 49-56: "in such a case that the communication condition on the reception side is deteriorated, even when the communication terminal provided on the reception side is not equipped with the voice/character converting function, since the voice communication is switched to the character communication in the communication terminal provided on the transmission side, interruptions in communications may be prevented."). However, Maeda fails to expressly recite phonetic characters.
Krishnan further teaches the use of phonetic characters (Krishnan, [0026]: “the input speech can be converted to phonemes in the International Phonetic Alphabet of the International Phonetic Association (as shown in FIG. 3) in accordance with any of a number of different phonetic transcription techniques, as such are well known to those skilled in the art.”). The same motivation of claim 1 applies equally to claim 7.

Regarding claim 8, the rejection of claim 1 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. Krishnan further teaches wherein the phonetic characters are characters in the International Phonetic Alphabet (Krishnan, [0026]: “the input speech can be converted to phonemes in the International Phonetic Alphabet of the International Phonetic Association (as shown in FIG. 3) in accordance with any of a number of different phonetic transcription techniques, as such are well known to those skilled in the art.”). The same motivation of claim 1 applies equally to claim 8.

Regarding claim 9, the rejection of claim 1 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. Maeda further discloses determining voice communication on an audio channel of the communication session does not satisfy a quality criterion prior to transmitting the string (Maeda, Fig. 3; Col. 5, lines 17-23: "as shown in FIG. 3, in the case that the communication error rate is low, the voice communication may be carried out (step S211), whereas in the case that the communication error rate is high, the voice-to-character converting operation by the character converting unit 103 may be commenced (step S213) to execute the character communication.").
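The switching behavior quoted from Maeda's FIG. 3 can be sketched as a simple mode selection. Maeda describes the error rate only as "low" or "high"; the numeric threshold and the function name `select_mode` below are assumptions made for illustration.

```python
def select_mode(error_rate: float, threshold: float = 0.05) -> str:
    """Choose the communication mode from a measured error rate.

    The 0.05 threshold is a hypothetical value; the quoted passage gives no number.
    """
    if error_rate <= threshold:
        return "voice"       # corresponds to step S211: voice communication
    return "character"       # corresponds to step S213: voice-to-character conversion


mode_on_good_link = select_mode(0.01)
mode_on_bad_link = select_mode(0.20)
```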

Claim(s) 5-6 is/are rejected under 35 U.S.C. 103 as being unpatentable over Maeda, in view of Krishnan, as applied to claims 1-4 and 7-9 above, and further in view of LeVoit, Violet (US Pat. Pub. No. 2018/0277132 A1 hereinafter LeVoit).
Regarding claim 5, the rejection of claim 1 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. However, Maeda, in view of Krishnan, fails to expressly recite determining that the user has a first accent that is different from a second accent of a second user of the second endpoint; and changing one or more of the phonetic characters to adjust sounds represented by the string from the first accent to the second accent.
LeVoit teaches determining that the user has a first accent that is different from a second accent of a second user of the second endpoint (LeVoit, Fig. 8, 806 & 808; [0131]: "Process 800 continues to 806, where the media guidance application compares the first accent type of the human speech with preferences stored in a user profile."); and changing one or more of the phonetic characters to adjust sounds represented by the string from the first accent to the second accent (LeVoit, Fig. 8, 820; [0138]: when the two accents are determined to be different, the system compares the similarity of each phoneme of a given word, and when necessary, replaces the phonemes to match the user’s accent).
Maeda, Krishnan, and LeVoit are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan, to incorporate the teachings of LeVoit to detect and change phonetic characters in text communication to accommodate the user’s accent. Changing the phonemes in communication data based on different accents increases the accessibility of the system (LeVoit, [0002]). This improves the overall user experience of the system.
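The claimed step of changing phonetic characters to move from a first accent to a second accent can be illustrated with a minimal sketch. The lookup table and the function `adjust_accent` are hypothetical; LeVoit is characterized above as comparing phoneme similarity before replacing, which this sketch simplifies to a fixed substitution table.

```python
from typing import Dict, List

# Hypothetical mapping between two accents, for illustration only.
# It is not taken from LeVoit or from the application.
ACCENT_MAP: Dict[str, str] = {
    "ɑː": "æ",
    "ɒ": "ɑ",
}


def adjust_accent(phonemes: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each phonetic character that has an entry in the mapping."""
    return [mapping.get(phoneme, phoneme) for phoneme in phonemes]


bath = ["b", "ɑː", "θ"]
adjusted = adjust_accent(bath, ACCENT_MAP)
```

Phonetic characters with no entry in the table pass through unchanged, so a string that already matches the second accent is left as it is.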

Regarding claim 6, the rejection of claim 5 is incorporated. Maeda, in view of Krishnan and LeVoit, discloses all of the elements of the current invention as stated above. LeVoit further discloses wherein determining that the user has the first accent that is different from the second accent comprises: receiving a user instruction to enable adjusting the sounds from the first accent to the second accent (LeVoit, Fig. 8, 806; [0131]: "Process 800 continues to 806, where the media guidance application compares the first accent type of the human speech with preferences stored in a user profile.").
Maeda, Krishnan, and LeVoit are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan, to incorporate the teachings of LeVoit to enable accent adjustment based on user input. Allowing the user to specify when accents should be adjusted increases the accessibility of the system (LeVoit, [0002]). This improves the overall user experience of the system.

Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Maeda, in view of Krishnan, as applied to claims 1-4 and 7-9 above, and further in view of Chisu et al. (US Pat. Pub. No. 2018/0373488 A1 hereinafter Chisu).
Regarding claim 10, the rejection of claim 9 is incorporated. Maeda, in view of Krishnan, discloses all of the elements of the current invention as stated above. However, Maeda, in view of Krishnan, fails to expressly recite wherein the quality criterion is satisfied when an amount of packet loss for the audio channel is below a threshold.
Chisu teaches wherein the quality criterion is satisfied when an amount of packet loss for the audio channel is below a threshold (Chisu, [0023]: “Alternatively or additionally, poor audio quality may be detected when delays between sequential data packets carrying audio data increase to a point where individual data packets might be dropped before the audio data can be extracted and played back at the computing device. In these instances, the poor audio quality indicates that a user is unlikely to hear call audio played back at the computing device.”).
Maeda, Krishnan, and Chisu are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan, to incorporate the teachings of Chisu to monitor the audio quality of a call between two or more users. This allows the system to adapt if it detects low audio quality between users (Chisu, [0009]). This helps ensure that the communication between users is complete and effective.
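The quality criterion recited in claim 10 (satisfied when packet loss for the audio channel is below a threshold) can be sketched as follows. The function names, the packet counters, and the 2% default threshold are assumptions for illustration; neither the claim nor the cited passage of Chisu specifies a value.

```python
def packet_loss_ratio(packets_sent: int, packets_received: int) -> float:
    """Fraction of sent packets that were not received (0.0 when nothing was sent)."""
    if packets_sent <= 0:
        return 0.0
    lost = max(packets_sent - packets_received, 0)
    return lost / packets_sent


def quality_criterion_satisfied(packets_sent: int,
                                packets_received: int,
                                max_loss: float = 0.02) -> bool:
    """True when packet loss is below the (hypothetical) threshold."""
    return packet_loss_ratio(packets_sent, packets_received) < max_loss


good_channel = quality_criterion_satisfied(1000, 995)
poor_channel = quality_criterion_satisfied(1000, 900)
```

Under claim 9, the string would be transmitted only in the case where this check returns false.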

Claim(s) 11-14 and 17-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Maeda, in view of Krishnan and Biswas et al. (US Pat. Pub. No. 2023/016990 A1 hereinafter Biswas).
Regarding claim 11, Maeda discloses an apparatus comprising: one or more computer readable storage media; a processing system operatively coupled with the one or more computer readable storage media; and program instructions stored on the one or more computer readable storage media that, when read and executed by the processing system, direct the apparatus to (Maeda, Col. 3, lines 38-40: "A control unit 207 shown in FIG. 1 may execute the below-mentioned control process operation in accordance with a program stored in the recording unit 106."): receive a string of characters from a first endpoint on a communication session with the apparatus, wherein the characters are translated from speech captured from a user at the first endpoint (Maeda, Fig. 1; Col. 3, lines 18-24: “the character decoding unit 202 decodes the digital signal supplied from the receiving unit 200 so as to derive character information, and then, sends the derived character information to both the display unit 105 and the recording unit 205. The recording unit 205 records therein the character information sent from the character decoding unit 202.”); generate recreated audio of sounds represented in the string (Maeda, Col. 8, lines 14-21: "such a data produced by converting voice as a character may be recorded on the recording units 106 and 205 in combination with a voice signal and a picture signal. As a consequence, since a character may be employed as a keyword in a retrieving operation, for example, a moving picture containing voice such as a conversation and a news delivery may be easily read out from the recording units 106 and 205 so as to be reproduced."). However, Maeda fails to expressly recite phonetic characters; and processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt.
Krishnan teaches the use of phonetic characters (Krishnan, [0026]: “the input speech can be converted to phonemes in the International Phonetic Alphabet of the International Phonetic Association (as shown in FIG. 3) in accordance with any of a number of different phonetic transcription techniques, as such are well known to those skilled in the art.”).
Maeda and Krishnan are analogous arts because they both belong to the same field of communication systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda to incorporate the teachings of Krishnan to use the International Phonetic Alphabet when converting speech to text strings. Using a standardized phonetic alphabet in the system increases the clarity and usability of the system (Krishnan, [0026]). This helps ensure that users, particularly those seeking to understand the technical implementation of the system, can fully understand how the system works. However, Maeda, in view of Krishnan, fails to expressly recite processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt.
Biswas teaches processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt (Biswas, Fig. 6; [0043]: "FIG. 6 provides an example of an emotionally-aware voice response generation process flow used in accordance with one or more embodiments of the present disclosure. The emotionally-aware voice response generation process flow 600 can be performed by emotionally-aware voice response generator 100 in response to voice input 102 from a user via user interface 104."; [0044]: "Some non-limiting examples of automated voice response applications include voice bot applications, interactive voice response applications, or other dialog-based applications.").
Maeda, Krishnan, and Biswas are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan, to incorporate the teachings of Biswas to determine a response of the user to an interactive voice response prompt. This allows an automated system to interact with a user, rather than requiring another human to interact with the user (Biswas, [0001]). This ensures that a user can interact with a variety of different systems through the communication system.

Regarding claim 12, the rejection of claim 11 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. Maeda further discloses wherein before transmitting the string, the first endpoint determines that audio quality of a communication channel with the apparatus does not satisfy a quality criterion (Maeda, Fig. 3; Col. 5, lines 17-23: "as shown in FIG. 3, in the case that the communication error rate is low, the voice communication may be carried out (step S211), whereas in the case that the communication error rate is high, the voice-to-character converting operation by the character converting unit 103 may be commenced (step S213) to execute the character communication.").

Regarding claim 13, the rejection of claim 12 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. Maeda further discloses wherein the program instructions direct the apparatus to: before the first endpoint determines that the audio quality does not satisfy the quality criterion, receive prior audio captured from the user over the communication channel (Maeda, Fig. 3; Col. 5, lines 17-23: "as shown in FIG. 3, in the case that the communication error rate is low, the voice communication may be carried out (step S211), whereas in the case that the communication error rate is high, the voice-to-character converting operation by the character converting unit 103 may be commenced (step S213) to execute the character communication.").

Regarding claim 14, the rejection of claim 11 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. Maeda further discloses wherein the program instructions direct the apparatus to: store the string for access after the communication session (Maeda, Fig. 1; Col. 3, lines 18-24: “the character decoding unit 202 decodes the digital signal supplied from the receiving unit 200 so as to derive character information, and then, sends the derived character information to both the display unit 105 and the recording unit 205. The recording unit 205 records therein the character information sent from the character decoding unit 202.”).

Regarding claim 17, the rejection of claim 11 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. Maeda further discloses wherein each of the phonetic characters is received in real-time (Maeda, Col. 5, lines 49-56: "in such a case that the communication condition on the reception side is deteriorated, even when the communication terminal provided on the reception side is not equipped with the voice/character converting function, since the voice communication is switched to the character communication in the communication terminal provided on the transmission side, interruptions in communications may be prevented.").

Regarding claim 18, the rejection of claim 11 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. Krishnan further teaches wherein the phonetic characters are characters in the International Phonetic Alphabet (Krishnan, [0026]: “the input speech can be converted to phonemes in the International Phonetic Alphabet of the International Phonetic Association (as shown in FIG. 3) in accordance with any of a number of different phonetic transcription techniques, as such are well known to those skilled in the art.”). The same motivation for claim 11 applies equally to claim 18.

Regarding claim 19, the rejection of claim 11 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. Maeda further discloses wherein the string is transmitted in response to the first endpoint determining voice communication over the communication session does not satisfy a quality criterion (Maeda, Fig. 3; Col. 5, lines 17-23: "as shown in FIG. 3, in the case that the communication error rate is low, the voice communication may be carried out (step S211), whereas in the case that the communication error rate is high, the voice-to-character converting operation by the character converting unit 103 may be commenced (step S213) to execute the character communication.").

Regarding claim 20, Maeda discloses one or more computer readable storage media having program instructions stored thereon that, when read and executed by a processing system, direct the processing system to (Maeda, Col. 3, lines 38-40: "A control unit 207 shown in FIG. 1 may execute the below-mentioned control process operation in accordance with a program stored in the recording unit 106."): receive a string of characters from a first endpoint on a communication session with the apparatus, wherein the characters are translated from speech captured from a user at the first endpoint (Maeda, Fig. 1; Col. 3, lines 18-24: “the character decoding unit 202 decodes the digital signal supplied from the receiving unit 200 so as to derive character information, and then, sends the derived character information to both the display unit 105 and the recording unit 205. The recording unit 205 records therein the character information sent from the character decoding unit 202.”); generate recreated audio of sounds represented in the string (Maeda, Col. 8, lines 14-21: "such a data produced by converting voice as a character may be recorded on the recording units 106 and 205 in combination with a voice signal and a picture signal. As a consequence, since a character may be employed as a keyword in a retrieving operation, for example, a moving picture containing voice such as a conversation and a news delivery may be easily read out from the recording units 106 and 205 so as to be reproduced."). However, Maeda fails to expressly recite phonetic characters; and processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt.
Krishnan teaches the use of phonetic characters (Krishnan, [0026]: “the input speech can be converted to phonemes in the International Phonetic Alphabet of the International Phonetic Association (as shown in FIG. 3) in accordance with any of a number of different phonetic transcription techniques, as such are well known to those skilled in the art.”).
Maeda and Krishnan are analogous arts because they both belong to the same field of communication systems. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda to incorporate the teachings of Krishnan to use the International Phonetic Alphabet when converting speech to text strings. Using a standardized phonetic alphabet in the system increases the clarity and usability of the system (Krishnan, [0026]). This helps ensure that users, particularly those seeking to understand the technical implementation of the system, can fully understand how the system works. However, Maeda, in view of Krishnan, fails to expressly recite processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt.
Biswas teaches processing the recreated audio to determine a response of the user to an Interactive Voice Response (IVR) prompt (Biswas, Fig. 6; [0043]: "FIG. 6 provides an example of an emotionally-aware voice response generation process flow used in accordance with one or more embodiments of the present disclosure. The emotionally-aware voice response generation process flow 600 can be performed by emotionally-aware voice response generator 100 in response to voice input 102 from a user via user interface 104."; [0044]: "Some non-limiting examples of automated voice response applications include voice bot applications, interactive voice response applications, or other dialog-based applications.").
Maeda, Krishnan, and Biswas are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan, to incorporate the teachings of Biswas to determine a response of the user to an interactive voice response prompt. This allows an automated system to interact with a user, rather than requiring another human to interact with the user (Biswas, [0001]). This ensures that a user can interact with a variety of different systems through the communication system.

Claim(s) 15-16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Maeda, Krishnan, and Biswas, as applied to claims 11-14 and 17-20 above, and further in view of LeVoit.
Regarding claim 15, the rejection of claim 11 is incorporated. Maeda, in view of Krishnan and Biswas, discloses all of the elements of the current invention as stated above. However, Maeda, in view of Krishnan and Biswas, fails to expressly recite wherein the first endpoint determines that the user has a first accent that is different from a second accent and changes one or more of the phonetic characters to adjust the sounds from the first accent to the second accent.
LeVoit teaches wherein the first endpoint determines that the user has a first accent that is different from a second accent (LeVoit, Fig. 8, 806 & 808; [0131]: "Process 800 continues to 806, where the media guidance application compares the first accent type of the human speech with preferences stored in a user profile."); and changes one or more of the phonetic characters to adjust the sounds from the first accent to the second accent (LeVoit, Fig. 8, 820; [0138]: when the two accents are determined to be different, the system compares the similarity of each phoneme of a given word, and when necessary, replaces the phonemes to match the user’s accent).
Maeda, Krishnan, Biswas, and LeVoit are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan and the voice response generation system of Biswas, to incorporate the teachings of LeVoit to detect and change phonetic characters in text communication to accommodate the user’s accent. Changing the phonemes in communication data based on different accents increases the accessibility of the system (LeVoit, [0002]). This improves the overall user experience of the system.

Regarding claim 16, the rejection of claim 15 is incorporated. Maeda, in view of Krishnan, Biswas, and LeVoit, discloses all of the elements of the current invention as stated above. LeVoit further discloses wherein the first endpoint receives a user instruction to enable adjusting the sounds from the first accent to the second accent (LeVoit, Fig. 8, 806; [0131]: "Process 800 continues to 806, where the media guidance application compares the first accent type of the human speech with preferences stored in a user profile.").
Maeda, Krishnan, Biswas, and LeVoit are analogous arts because they all belong to the same field of endeavor of communication systems. It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the communication system of Maeda, as modified by the message transmitting system of Krishnan and the voice response generation system of Biswas, to incorporate the teachings of LeVoit to enable accent adjustment based on user input. Allowing the user to specify when accents should be adjusted increases the accessibility of the system (LeVoit, [0002]). This improves the overall user experience of the system.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TYLER J BECKER whose telephone number is (703)756-1271. The examiner can normally be reached M-Th, 7:15am-5:45pm PT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Daniel Washburn can be reached at (571) 272-5551. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/TYLER BECKER/ Examiner, Art Unit 2657

/DANIEL C WASHBURN/ Supervisory Patent Examiner, Art Unit 2657
Prosecution Timeline

Jun 23, 2025: Response after Non-Final Action
Jul 10, 2025: Non-Final Rejection mailed (§101, §103)
Oct 21, 2025: Interview Requested
Oct 27, 2025: Examiner Interview Summary
Oct 27, 2025: Applicant Interview (Telephonic)
Nov 04, 2025: Response Filed
Jan 16, 2026: Final Rejection mailed (§101, §103)
Mar 12, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

18/346,232 (Patent 12632657): Joint Speech and Text Streaming Model for ASR. 2y 10m to grant; granted May 19, 2026.
18/274,767 (Patent 12614560): REVERBERATION REMOVAL DEVICE, PARAMETER ESTIMATION DEVICE, REVERBERATION REMOVAL METHOD, PARAMETER ESTIMATION METHOD, AND PROGRAM. 2y 9m to grant; granted Apr 28, 2026.
18/484,927 (Patent 12597433): SPEECH SIGNAL ENHANCEMENT METHOD AND APPARATUS, AND ELECTRONIC DEVICE. 2y 5m to grant; granted Apr 07, 2026.
18/334,771 (Patent 12585893): Full Media Translator. 2y 9m to grant; granted Mar 24, 2026.
17/692,070 (Patent 12518777): SYSTEMS AND METHODS FOR AUTHENTICATION USING SOUND-BASED VOCALIZATION ANALYSIS. 3y 10m to grant; granted Jan 06, 2026.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 4-5
Grant Probability: 75%
Grant Probability With Interview (+16.5%): 92%
Median Time to Grant: 2y 7m (~0m remaining)
PTA Risk: High
Based on 20 resolved cases by this examiner. Grant probability derived from career allowance rate.