DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 2/11/2026 has been entered.
Response to Amendment
The rejection of claim 21 under 35 U.S.C. §112(a) is withdrawn in view of the cancellation of claim 21.
Examiner acknowledges the amendments to the claims received on 2/11/2026 have been entered, and that no new matter has been added.
Response to Arguments
Argument 1: Applicant argues on page 10 in the filing on 2/11/2026 that the cited prior art does not teach the newly amended portions of claim 1.
Response to Argument 1: Argument 1 is moot in view of new grounds of rejection. The scope of the amendment has changed and new art has been applied.
Applicant’s Argument 1 filed on 2/11/2026 is therefore moot in view of the new grounds of rejection necessitated by Applicant’s amendment. Applicant’s remaining statements regarding the remaining independent and dependent claims are moot or not persuasive for the reasons stated above.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-7, 9-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Screen captures from YouTube video clip entitled “Potter’s integrated voice system,” 10 pages, uploaded Jan 25, 2022 by user “Potter Signal,” URL: https://www.youtube.com/watch?v=OdSDol9Bjls (hereinafter “Potter”), included in form 892 dated 5/23/2025, in view of Thuy, Phuong. "How to Use Third-Party Text-to-Speech Voices in ActivePresenter 9" [published 8/10/2023], [online], [retrieved on 11/19/2024]. Retrieved from the internet <URL: https://atomisystems.com/tutorials/ap9/use-third-party-text-to-speech-voices-activepresenter-9/> (hereinafter “Thuy”), included in form 892 dated 5/23/2025, in view of Lindqvist, Patent Application Publication number US 20250197166 A1 (hereinafter “Lindqvist”).
Claim 1: Potter teaches “A method comprising:
providing a user interface (Potter Fig. 2-10 shows an interface) for configuring an alarm control panel (Potter Fig. 1 “fire alarm systems”) to generate audio voice messages (Potter Fig. 2-3 “support for predefined messages and tones, imported audio, and in-software recorded audio”), wherein the user interface displays at least a text field for inputting text (Potter Fig. 6 shows text box with text input: “Attention, Attention, this is a fire alarm. Calmy proceed to the nearest exit. Thank you”), a list for selecting audio configuration settings (Potter Fig. 5 shows a dropdown list of “voice” and “speaking rate” configurations), a first icon for generating text-to-speech (Potter Fig. 6 shows “Add audio source,” “Text to speech,” “Voice,” “Speaking Rate,” “Text,” playback icon [media_image1.png], save icon [media_image2.png], and “OK” button, any of which relate to generating text-to-speech), and a second icon for saving (Potter Fig. 6 shows save icon [media_image2.png]);
obtaining, via the text field in the user interface, text to be converted to an audio voice message (Potter Fig. 6 shows text box with text input: “Attention, Attention, this is a fire alarm. Calmy proceed to the nearest exit. Thank you”);
obtaining, via the list in the user interface, audio configuration settings for generating the audio voice message (Potter Fig. 5 shows a dropdown list of “voice” and “speaking rate” configurations);…
obtaining,… a speech response generated using a neural voice network based on the text and the audio configuration settings (Potter Fig. 5’s text overlay “Wavenet text-to-speech.” WaveNet is a TTS system based on a neural network);
storing the speech response in an audio file library, in response to receiving a selection of the second icon via the user interface, for playback in the alarm control panel (Potter Fig. 5 shows an item “Text-to-speech – Fire Alarm” is visible in the library of audio sources, indicating that the voice message generated with text-to-speech is saved under that name, e.g. with the save button).”
Potter is silent regarding “transmitting, to a cloud platform in response to receiving a selection of the first icon via the user interface, a text-to-speech request comprising the text to be converted and the audio configuration settings;” and regarding obtaining the speech response “from the cloud platform.”
Thuy teaches “transmitting, to a cloud platform (i.e. you will be able to access external cloud voices from different voice providers to create your own audio track [Thuy pg 2]… click More Voice… to access other cloud voices [Thuy pg 3]) in response to receiving a selection of the first icon via the user interface, a text-to-speech request comprising the text to be converted and the audio configuration settings (i.e. 1. Select a voice in the Available Voice list. 2. Enter a text in the Preview text box. 3. Click Speak to listen to the voice [Thuy pg 5]);
obtaining, from the cloud platform, a speech response generated using a neural voice network based on the text and the audio configuration settings (i.e. 1. Select a voice in the Available Voice list. 2. Enter a text in the Preview text box. 3. Click Speak to listen to the voice [Thuy pg 5]);”
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention/combination of Potter to include the feature of having the ability to use a cloud service as disclosed by Thuy.
One would have been motivated to do so before the effective filing date of the invention because it provides the benefit of performing text-to-speech conversion externally, which saves processing power at the client device.
Potter and Thuy are silent regarding “detecting, by the alarm control panel, an alarm triggered at a location; generating, using a text-to-speech generator directly integrated into the alarm control panel, a second speech response that is indicative of the alarm and the location where the alarm is triggered, responsive to detecting the alarm; and causing, by the alarm control panel, one or more speakers to play an announcement comprising the second speech response indicative of the alarm and the location where the alarm is triggered.”
Lindqvist teaches “detecting, by the alarm control panel, an alarm triggered at a location (i.e. When the alarm is triggered… in the elevator… comprising the elevator location [Lindqvist 0061, Fig. 4A-4B]);
generating, using a text-to-speech generator directly integrated into the alarm control panel (i.e. The alarm phone 416 may be configured to perform at 456 a text-to-speech conversion [Lindqvist 0061, Fig. 4A-4B] note: this is performed by the alarm itself, rather than a cloud service), a second speech response that is indicative of the alarm and the location where the alarm is triggered, responsive to detecting the alarm (i.e. alarm phone 416 may be configured to perform at 456 a text-to-speech conversion of the elevator information stored in the memory 414 to generate an audio file comprising the elevator location in an audio form [Lindqvist 0061, Fig. 4A-4B]… a call may then be established from the elevator 410 to a service entity 418, for example, a rescue operator, for example, automatically when the alarm is triggered [Lindqvist 0061, Fig. 4A-4B] note: a call to a rescue operator is indicative of the alarm. The text-to-speech audio includes the location of the elevator. Note2: Fig 4B element 456 converts text to speech in response to the triggered alarm in element 454); and
causing, by the alarm control panel, one or more speakers to play an announcement comprising the second speech response indicative of the alarm and the location where the alarm is triggered (i.e. alarm phone 416 may be configured to perform at 456 a text-to-speech conversion of the elevator information stored in the memory 414 to generate an audio file comprising the elevator location in an audio form [Lindqvist 0061, Fig. 4A-4B]… a call may then be established from the elevator 410 to a service entity 418, for example, a rescue operator, for example, automatically when the alarm is triggered… audio file… can be played back to the service entity 418 [Lindqvist 0061, Fig. 4A-4B] note: alarm phone causes a speaker to announce audio to a rescue operator (the call/audio itself is indicative of an alarm), where the audio includes the location of the elevator).”
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention/combination of Potter and Thuy to include the feature of having the ability to generate a speech alarm with location information as disclosed by Lindqvist.
One would have been motivated to do so before the effective filing date of the invention because it provides the benefit of conveying more information to the affected users.
Claim 2: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “further comprising: outputting the speech response, in response to receiving a selection of a third icon via the user interface, wherein the user interface further displays the third icon for playback (Potter Fig. 5 shows a playback icon to the left of the item “Text-to-speech – Fire Alarm,” which plays the pattern that includes “Text-to-speech – Fire Alarm”).” Thuy teaches “further comprising: outputting the speech response, in response to receiving a selection of a third icon via the user interface, wherein the user interface further displays the third icon for playback (i.e. 1. Select a voice in the Available Voice list. 2. Enter a text in the Preview text box. 3. Click Speak to listen to the voice [Thuy pg 5]).”
One would have been motivated to combine Potter, Thuy, and Lindqvist before the effective filing date of the invention because the combination provides the benefit of performing text-to-speech conversion externally, which saves processing power at the client device.
Claim 3: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “further comprising: deleting the speech response, in response to receiving a selection of a fourth icon via the user interface, wherein the user interface further displays the fourth icon for deletion (Potter Fig. 5 shows “Delete Audio Source” analogously next to the “Add Audio Source” button, which creates the text-to-speech files).”
Claim 4: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Thuy teaches “wherein the text-to-speech request comprises an authentication key, and wherein the speech response is generated from the cloud platform based on the authentication key being validated by the cloud platform (i.e. to make voices accessible, you first have to get authentication from the voice providers… dialog popping up allows you to enter authentication keys… to get these access keys, you need to create an account in each corresponding provider [Thuy 3-4, Fig. on pg 4]).”
One would have been motivated to combine Potter, Thuy, and Lindqvist before the effective filing date of the invention because the combination provides the benefit of performing text-to-speech conversion externally, which saves processing power at the client device.
Claim 5: Potter and Thuy and Lindqvist teach all the limitations of claim 4, above. Thuy teaches “further comprising: displaying an error message based on the authentication key not being validated by the cloud platform (i.e. Step 2: After entering the keys, click the Test Authentication button to check if your keys are valid [Thuy pg 4] note: as this is a test for validity, an error message is displayed if the keys are not valid).”
One would have been motivated to combine Potter, Thuy, and Lindqvist before the effective filing date of the invention because the combination provides the benefit of performing text-to-speech conversion externally, which saves processing power at the client device.
Claim 6: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “wherein the audio configuration settings include at least one of: language (Potter Fig. 5 shows “Juan (Spanish)” and “Maria (Portuguese)” as voice options), gender (Potter Fig. 5 shows Voices dropdown list including gendered names, including “David” and “Elizabeth”), tone, accent, pre-set profile, pitch, speech rate (Potter Fig. 5 shows “Speaking rate” dropdown), cloud platform engine, or audio quality setting.”
Claim 7: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “wherein the audio configuration settings are obtained by loading a pre-configured audio voice profile (Potter Fig. 5 shows Voices dropdown list loads pre-configured audio voice profiles).”
Claim 9: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “obtaining, via an additional text field via the user interface (Potter Fig. 5-6 shows “Add Audio Source” button which can be clicked on (again) to add additional text-to-speech to the library at the bottom of Fig. 5-6. This includes an additional text box for the additional text-to-speech input), in response to receiving a selection of a fifth icon via the user interface (Potter Fig. 5-6 shows an OK button, which saves the text-to-speech as an audio source to the library at the bottom of Fig. 5-6), an additional text to be converted to an audio voice message and additional audio configuration settings for generating the audio voice message (Potter Fig. 6 shows a text box and voice dropdown list and speaking rate dropdown list, which is again displayed for any additional text-to-speech), wherein the user interface further displays the additional text field and the fifth icon for adding additional speech (Potter Fig. 5-6 shows an additional text-to-speech box and the OK button);…
obtaining…, an additional speech response generated using the neural voice network based on the additional text and the additional audio configuration settings (Potter Fig. 5’s text overlay “Wavenet text-to-speech.” Wavenet is a TTS system based on neural network); and
storing the additional speech response in an audio file library for playback in the alarm control panel, in response to receiving a sixth user input via the user interface, wherein the user interface comprises a sixth icon for playback (Potter Fig. 5 shows an item “Text-to-speech – Fire Alarm” is visible in the library of audio sources, indicating that the voice message generated with text-to-speech is saved under that name, e.g. with the save button).”
Thuy teaches “transmitting a text-to-speech request to a cloud platform (i.e. you will be able to access external cloud voices from different voice providers to create your own audio track [Thuy pg 2]… click More Voice… to access other cloud voices [Thuy pg 3]) in response to receiving an additional selection of the first icon via the user interface, the text-to-speech request comprising the additional text to be converted and the additional audio configuration settings (i.e. 1. Select a voice in the Available Voice list. 2. Enter a text in the Preview text box. 3. Click Speak to listen to the voice [Thuy pg 5]);…
obtaining, from the cloud platform, an additional speech response generated using the neural voice network based on the additional text and the additional audio configuration settings (i.e. 1. Select a voice in the Available Voice list. 2. Enter a text in the Preview text box. 3. Click Speak to listen to the voice [Thuy pg 5]);”
One would have been motivated to combine Potter, Thuy, and Lindqvist before the effective filing date of the invention because the combination provides the benefit of performing text-to-speech conversion externally, which saves processing power at the client device.
Claim 10: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “wherein the text to be converted to the audio voice message is obtained by dragging and dropping pre-generated phrases from a second portion of the user interface, wherein the user interface further displays the second portion displaying a list of available pre-generated phrases (Potter Fig. 2 shows window “Add Audio Source” and a currently selected option of “predefined messages” showing text to be converted. Upon adding these “predefined messages” as “added audio sources,” they will show up in the audio source list shown in Potter Fig. 5 bottom half. Then the predefined messages are dragged and dropped to be obtained in the playback area).”
Claim 11: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter teaches “further comprising: adding tones to the speech response by selecting the tones from a third portion of the user interface, wherein the user interface further displays the third portion displaying a list of available tones (Potter Fig. 5 shows “Temporal 3 – 520Hz” in a list of audio sources in the bottom half of the screen. In the top half of the screen, it shows that “Temporal 3 – 520Hz” has been added to the beginning of the speech response “Text-to-speech – Fire Alarm”).”
Claim 12: Potter and Thuy and Lindqvist teach a computing device, comprising:
one or more memories, individually or in combination, having instructions; and
one or more processors each coupled to at least one of the one or more memories (Potter Fig. 2-10 teach software running on a computer. Components of generic computers include processors and memory) and configurable to perform operations corresponding to the method of claim 1; therefore, it is rejected under the same rationale.
Claim 13: Claim 13 is similar in content and in scope to claim 2, thus it is rejected under the same rationale.
Claim 14: Claim 14 is similar in content and in scope to claim 3, thus it is rejected under the same rationale.
Claim 15: Potter and Thuy and Lindqvist teach all the limitations of claim 12, above. Thuy teaches “wherein the text-to-speech request comprises an authentication key, and wherein the speech response is generated from the cloud platform based on the authentication key being validated by the cloud platform (i.e. to make voices accessible, you first have to get authentication from the voice providers… dialog popping up allows you to enter authentication keys… to get these access keys, you need to create an account in each corresponding provider [Thuy 3-4, Fig. on pg 4]), and wherein the one or more processors each coupled to at least one of the one or more memories and configurable to further execute the instructions to:
display an error message based on the authentication key not being validated by the cloud platform (i.e. Step 2: After entering the keys, click the Test Authentication button to check if your keys are valid [Thuy pg 4] note: as this is a test for validity, an error message is displayed if the keys are not valid).”
One would have been motivated to combine Potter, Thuy, and Lindqvist before the effective filing date of the invention because the combination provides the benefit of performing text-to-speech conversion externally, which saves processing power at the client device.
Claim 16: Claim 16 is similar in content and in scope to claim 6, thus it is rejected under the same rationale.
Claim 17: Claim 17 is similar in content and in scope to claim 7, thus it is rejected under the same rationale.
Claim 19: Claim 19 is similar in content and in scope to claim 9, thus it is rejected under the same rationale.
Claim 20: Potter and Thuy and Lindqvist teach a computer program product configured to configure audio voice messages, the computer program product comprising one or more non-transitory computer-readable media, having instructions stored thereon that when executed by one or more processors (Potter Fig. 2-10 teach software running on a computer. Components of generic computers include processors and memory) cause the one or more processors to perform operations corresponding to the method of claim 1; therefore, it is rejected under the same rationale.
Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Potter, in view of Thuy, in view of Lindqvist, in view of Gattis, Patent Application Publication number US 20170300289 A1 (hereinafter “Gattis”).
Claim 8: Potter and Thuy and Lindqvist teach all the limitations of claim 1, above. Potter and Thuy and Lindqvist teach voice audio (see claim 1, above). Potter and Thuy and Lindqvist teach audio configuration settings (see claim 1, above). Lindqvist teaches “exporting the speech response to an alarm control panel (i.e. audio file may be transmitted to the alarm phone 416 that stores the audio file in the memory 414 at 432 [Lindqvist 0057, Fig. 4A]).”
Potter and Thuy and Lindqvist are silent regarding “further comprising: storing the audio configuration settings as an audio voice profile.”
Gattis teaches “further comprising: storing the audio configuration settings as an audio voice profile (i.e. a user interface where a user can adjust and save equalization settings… and select between different audio themes or profiles to use. The control device 422 may be used to control the content device 402 and/or audio processing server 412 to adjust and save equalization settings for different types of audio… user can create an equalization setting by selecting which frequencies or frequency ranges are to be amplified relative to other frequency ranges and save the equalization setting as an audio theme or profile [Gattis 0036]);”
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention/combination of Potter and Thuy and Lindqvist to include the feature of having the ability to save audio settings as disclosed by Gattis.
One would have been motivated to do so before the effective filing date of the invention because it provides the benefit of reusing saved settings for future audio files, reducing manual input.
Claim 18: Claim 18 is similar in content and in scope to claim 8, thus it is rejected under the same rationale.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Cree (US 20150348399 A1), listed on the form 892, relates to describing the location in a status of a device (e.g. an alarm), created from text-to-speech audio.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SAMUEL SHEN whose telephone number is (469)295-9169 and email address is samuel.shen@uspto.gov. The examiner can normally be reached Monday-Thursday, 7:00 am - 5:00 pm CT.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Fred Ehichioya can be reached on (571) 272-4034. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/S.S./Examiner, Art Unit 2179
/IRETE F EHICHIOYA/Supervisory Patent Examiner, Art Unit 2179