DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because reference characters "504" and "603" have both been used to designate the receiving module, and reference characters "503" and "604" have both been used to designate the display module. Further revision is required, and any additional typographical errors should be corrected accordingly.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, requires the specification to be written in “full, clear, concise, and exact terms.” The specification is replete with terms that are not clear, concise, and exact, and should be revised carefully in order to comply with 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph. Examples of the unclear, inexact, or inconsistent usage in the specification include the following:
(1) Many important terms that are labeled in the Drawings and introduced in the Detailed Description are not accompanied by their reference characters in the remaining paragraphs of the Detailed Description. For example, paragraph 0037 discloses Fig. 1 and labels “cloud server 130” with reference character 130, yet paragraph 0049 and subsequent paragraphs mention “cloud server” multiple times without the reference character. Likewise, paragraph 0037 labels “user terminal 120” with reference character 120, yet “user terminal” appears throughout the detailed disclosure without the reference character, for example in paragraphs 0044, 0045, and 0047. Many other terms throughout the detailed disclosure are similarly missing their reference characters.
(2) Several previously labeled terms are given inconsistent reference characters. For example, “subtitle display apparatus” is referenced by two different numbers, 500 and 600. Many other terms throughout the detailed disclosure have similarly inconsistent reference numbering.
Further revision is required, and any additional typographical errors should be corrected accordingly.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 8 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite. Claim 8 recites “The method according to claim 7, further comprising: performing, by the augmented reality glasses, parameter adjustment by using a depth learning model or a machine learning model, wherein the parameter adjustment comprises at least one of following: beamforming parameter adjustment or signal processing parameter adjustment, wherein the capturing an ambient audio by using a beamforming technology comprises: capturing the ambient audio by using an adjusted beamforming parameter; and wherein the performing, by the augmented reality glasses, signal processing on the ambient audio comprises: performing, by the augmented reality glasses, signal processing on the ambient audio based on an adjusted signal processing parameter”. The claim first recites “at least one of following: beamforming parameter adjustment or signal processing parameter adjustment”, but then joins the subsequent “wherein” clauses with “and”. It is therefore unclear whether the claim requires only one of the beamforming parameter adjustment and the signal processing parameter adjustment, or requires both. Because the scope of the claim cannot be clearly ascertained, the claim is indefinite. Further revision is required to provide a clear recitation of the claimed subject matter.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 10-12, 14, 19, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Goldstein (US-20210174823-A1, hereinafter “Goldstein”).
Regarding claim 10, Goldstein teaches “A subtitle display method, comprising: acquiring, by a user terminal, audio information, wherein the audio information comprises an ambient audio or a phone audio sent by augmented reality glasses connected to the user terminal, or an audio stream of an audio or a video played by the user terminal;” (The present invention may also enable the hearing-impaired users to enjoy audio programs transmitted through podcast on the Internet. The controller device 206 may connect to the Internet and download podcast from the desired websites; 0035);
“uploading, by the user terminal, the audio information to a cloud server, so that the cloud server performs voice transcription on the audio information to acquire a text corresponding to the audio information;” (See Figure 3, step 302 Capture speech; the audio conversion feature; 0029; See Figure 5, Audio Capturing unit; The method may be implemented, for example, by executing a sequence of machine-readable instructions. The instructions can reside in various types of signal-bearing or data storage media … accessible by, or residing within, the components of the network device; 0037): The data storage media can store the captured audio. Therefore, one of ordinary skill in the art would recognize that the data storage media, as a location to which the audio information is uploaded and from which it is transcribed, reads on the cloud server of the claimed subject matter.
“receiving, by the user terminal, the text, sent by the cloud server, corresponding to the audio information; and” (See Figure 5, speech conversion unit; The audio input are captured in audio files and sent to the speech conversion unit 504 for speech-text conversion; 0031);
“displaying, by the user terminal by using subtitles, the text corresponding to the audio information.” (The audio program will be converted to the text and the text displayed on the AR glasses as described above; 0035);
Regarding claim 11, Goldstein teaches “The method according to claim 10, further comprising: sending, by the user terminal, the text corresponding to the audio information to the augmented reality glasses, so that the augmented reality glasses display, by using subtitles, the text corresponding to the audio information.” (the processor in the wearable device then converts the speech to text and visibly displays the text in the AR glasses so the user can see which individual is speaking and what is being said; 0019);
Regarding claim 12, Goldstein teaches “The method according to claim 10, wherein the displaying, by the user terminal by using subtitles, the text corresponding to the audio information comprises:
displaying, by the user terminal in a floating box by using subtitles, the text corresponding to the audio information.” (the text file is displayed with an out of range indicator on the display screen of the augmented reality glasses, and if the speaker's position is within the visual range, the text file is displayed on the screen of the AR glasses adjacent to the position of the speaker on the display screen; 0005);
Regarding claim 14, Goldstein teaches “The method according to claim 10, further comprising:
sending, by the user terminal, a subtitle adjustment instruction to the augmented reality glasses, so that the augmented reality glasses perform adjustment of a position, a size, or a color on the subtitles according to the subtitle adjustment instruction.” (If it is determined that the speaker is not in front of the speaker, turning information will be displayed to the user, step 406. The captured speech will be displayed to the user in different color or with some indicator when the speaker is not within the visual range of the user; 0030);
Regarding claim 19, Goldstein teaches “An electronic device, comprising:
a processor; and” (the processor in the wearable device; 0019);
“a memory, configured to store executable instructions of the processor,”
(The computer readable medium can be the memory of the server; 0034);
(The method may be implemented, for example, by executing a sequence of machine-readable instructions; 0037);
“wherein the processor is configured to execute the subtitle display method according to claim 10” (the processor in the wearable device then converts the speech to text and visibly displays the text in the AR glasses; 0019);
Regarding claim 20, Goldstein teaches “A subtitle display system, comprising augmented reality glasses, a user terminal, and a cloud server,” (The controller 206 may have a user interface to allow the user to control the AR glasses 202; 0028);
(where the program directs a server; 0034);
(The method may be implemented, for example, by executing a sequence of machine-readable instructions. The instructions can reside in various types of signal-bearing or data storage media … accessible by, or residing within, the components of the network device; 0037):
The data storage media can store the captured audio. Therefore, one of ordinary skill in the art would recognize that the data storage media, as a location to which the audio information is uploaded and from which it is transcribed, reads on the cloud server of the claimed subject matter.
“wherein the user terminal is configured to obtain audio information, the audio information comprises an ambient audio or a phone audio sent by the augmented reality glasses, or an audio stream of an audio or a video played by the user terminal;” (The present invention may also enable the hearing-impaired users to enjoy audio programs transmitted through podcast on the Internet. The controller device 206 may connect to the Internet and download podcast from the desired websites; 0035);
“the user terminal is further configured to upload the audio information to the cloud server, so that the cloud server performs voice transcription on the audio information to acquire a text corresponding to the audio information and returns the text corresponding to the audio information to the user terminal;” (See Figure 3, step 302 Capture speech; the audio conversion feature; 0029; See Figure 5, Audio Capturing unit);
“the user terminal is further configured to display, by using subtitles, the text corresponding to the audio information, and send the text corresponding to the audio information to the augmented reality glasses; and” (The text of the captured audio speech and the position information of the speaker are displayed on the display unit 502; 0031);
(the speeches will be converted to text in real time and displayed to the user on his AR glasses 202; 0033);
“the augmented reality glasses are configured to display, by using subtitles, the text corresponding to the audio information.” (The audio program will be converted to the text and the text displayed on the AR glasses as described above; 0035);
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-7, 9, 13, and 15-18 are rejected under 35 U.S.C. 103 as being unpatentable over Goldstein in view of Lipman (US-20210312940-A1, hereinafter “Lipman”).
Regarding claim 1, Goldstein teaches “A subtitle display method, applied to augmented reality glasses, wherein the augmented reality glasses are connected to a user terminal, and the method comprises:” (a method for displaying text on augmented reality glasses with a plurality of microphones; 0005);
(The controller 206 may have a user interface to allow the user to control the AR glasses 202; 0028);
“acquiring, by the augmented reality glasses based on the phone audio, a text corresponding to the phone audio; and” (converting said audio file into a text file; 0005);
“displaying, by the augmented reality glasses by using subtitles, the text corresponding to the phone audio.” (the text file is displayed on the screen of the AR glasses adjacent to the position of the speaker on the display screen; 0005);
Goldstein does not teach “capturing, by the augmented reality glasses, a phone audio sent by the user terminal, wherein the phone audio is used to represent audio information generated during an incoming call or an outgoing call”.
Lipman teaches “capturing, by the augmented reality glasses, a phone audio sent by the user terminal, wherein the phone audio is used to represent audio information generated during an incoming call or an outgoing call;” (Where glasses with display 100 are used to receive incoming phone calls or function with communications module 150 to make outgoing phone calls; 0042);
The motivation for the above modification is to provide a user-friendly device that is versatile enough to also handle phone calls.
Goldstein and Lipman are analogous art, as both are directed to augmented reality glasses that display caption text for hearing-impaired users.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified Goldstein to capture, by the augmented reality glasses, a phone audio sent by the user terminal, wherein the phone audio is used to represent audio information generated during an incoming call or an outgoing call, as taught by Lipman, and to use that capability with Goldstein’s method for displaying subtitles from spoken audio on augmented reality glasses.
Regarding claim 2, Goldstein teaches “The method according to claim 1, further comprising:
when a phone status is call in progress, receiving, by the augmented reality glasses, a phone audio capture instruction sent by the user terminal,” (See Figure 3; FIG. 3 depicts a process 300 for capturing speech; 0011);
(The method may be implemented, for example, by executing a sequence of machine-readable instructions; 0037);
“wherein the capturing, by the augmented reality glasses, a phone audio sent by the user terminal comprises:” (Having an array of microphones (properly positioned along the temples to either side of the glasses) can precisely capture the speech taking place around the user; 0021);
“capturing, by the augmented reality glasses, the phone audio according to the phone audio capture instruction.” (capturing audible speech into an audio file from the person speaking, converting said audio file into a text file; 0005);
Regarding claim 3, Goldstein teaches “The method according to claim 1, wherein the acquiring, by the augmented reality glasses based on the phone audio, a text corresponding to the phone audio comprises:
sending, by the augmented reality glasses, the phone audio to the user terminal, so that the user terminal acquires, based on the phone audio, the text corresponding to the phone audio and sends the text corresponding to the phone audio to the augmented reality glasses.” (The array of microphones serves as an audio sensor that captures all audible sound, sends the sound data to a processor that filters out noise, identifies individual speech; 0019);
(the processor in the wearable device then converts the speech to text and visibly displays the text in the AR glasses; 0019);
Regarding claim 4, Goldstein teaches “The method according to claim 1, wherein the acquiring, by the augmented reality glasses based on the phone audio, a text corresponding to the phone audio comprises:
uploading, by the augmented reality glasses, the phone audio to a cloud server, so that the cloud server performs voice transcription on the phone audio to acquire the text corresponding to the phone audio and returns the text corresponding to the phone audio to the augmented reality glasses.” (The audio input are captured in audio files and sent to the speech conversion unit 504 for speech-text conversion; 0031);
(The controller unit 510 may also save the audio files and the text files of the converted speeches in the storage unit 512 for later retrieval; 0031);
(The method may be implemented, for example, by executing a sequence of machine-readable instructions. The instructions can reside in various types of signal-bearing or data storage media … accessible by, or residing within, the components of the network device; 0037):
The data storage media can store the captured audio. Therefore, it would have been obvious to a person of ordinary skill in the art to use the data storage media as a location to which the phone audio is uploaded and from which it is transcribed, corresponding to the cloud server of the claimed subject matter.
Regarding claim 5, Goldstein teaches “The method according to claim 1, further comprising:
acquiring, by the augmented reality glasses, a text corresponding to an audio stream of an audio or a video played by the user terminal; and” (At the social gathering, the user may be talking to multiple friends. The speeches from these friends will be captured by the microphones 208 attached to the AR glasses 202 and the speeches will be converted to text in real time; 0033);
“displaying, by the augmented reality glasses by using subtitles, the text corresponding to the audio stream of the audio or the video.” (the speeches will be converted to text in real time and displayed to the user on his AR glasses 202; 0033);
Regarding claim 6, Goldstein teaches “The method according to claim 1, further comprising:
capturing, by the augmented reality glasses by using a linear microphone array, an ambient audio by using a beamforming technology;” (The array of microphones serves as an audio sensor that captures all audible sound; 0019);
(The controller 206 may be attached through wires to the AR glasses 202 or wirelessly through Bluetooth waves; 0028);
“acquiring, by the augmented reality glasses based on the ambient audio, a text corresponding to the ambient audio; and” (After the speech recognition, the recognized speech is converted to text, step 308; 0029);
“displaying, by the augmented reality glasses by using subtitles, the text corresponding to the ambient audio.” (converted to text, step 308, and displayed by the controller 206 on the AR glasses, step 310; 0029);
Regarding claim 7, Goldstein teaches “The method according to claim 6, further comprising:
performing, by the augmented reality glasses, signal processing on the ambient audio, wherein the signal processing comprises at least one of following: filtering processing, noise reduction processing, or echo cancellation processing,” (If the audio conversion is turned on, the speech is captured, step 302, and the captured speech is filtered to eliminate noises, step 304; 0029);
“wherein the acquiring, by the augmented reality glasses based on the ambient audio, a text corresponding to the ambient audio comprises:” (After the speech recognition, the recognized speech is converted to text, step 308; 0029);
“acquiring, by the augmented reality glasses based on an ambient audio obtained after the signal processing, the text corresponding to the ambient audio.” (The audio program will be converted to the text and the text displayed on the AR glasses; 0035);
Regarding claim 9, Goldstein fails to teach “receiving, by the augmented reality glasses, a subtitle adjustment instruction sent by the user terminal, and performing adjustment of a position, a size, or a color on the subtitles according to the subtitle adjustment instruction”.
However, Lipman teaches “The method according to claim 1, further comprising:
receiving, by the augmented reality glasses, a subtitle adjustment instruction sent by the user terminal, and performing adjustment of a position, a size, or a color on the subtitles according to the subtitle adjustment instruction.” (In some embodiments, mobile device 151 may be used to configure parameters of display 109 including, for example and without limitation, display brightness, positioning of elements of display 109, font type, font size, text color, and what elements of display 109 are enabled; 0041);
The motivation for the above modification is to allow user-friendly customization of the subtitle display in the augmented reality glasses.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified Goldstein to receive, by the augmented reality glasses, a subtitle adjustment instruction sent by the user terminal, and to perform adjustment of a position, a size, or a color on the subtitles according to the subtitle adjustment instruction, as taught by Lipman, and to use that capability with Goldstein’s method for displaying subtitles on augmented reality glasses.
Regarding claim 13, Goldstein does not teach all of the limitations of claim 13. However, Lipman teaches “The method according to claim 10, further comprising:
when there is an incoming call or an outgoing call, monitoring, by the user terminal, a phone status; and” (On/off switch 161 may also be used to answer an incoming phone call, make an outgoing phone call, or terminate a call; 0044);
“when the phone status is call in progress, sending, by the user terminal, a phone audio capture instruction to the augmented reality glasses, so that the augmented reality glasses capture the phone audio according to the phone audio capture instruction.” (executing a sequence of machine-readable instructions; 0037);
(Glasses with display 100 may display text of the spoken words of the person on the incoming call as described above with respect to other speakers; 0042);
The motivation for the above modification is to provide a user-friendly device that is versatile enough to also handle phone calls, as discussed above with respect to claim 1.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified Goldstein such that, when there is an incoming call or an outgoing call, the user terminal monitors a phone status, and, when the phone status is call in progress, the user terminal sends a phone audio capture instruction to the augmented reality glasses so that the augmented reality glasses capture the phone audio according to the phone audio capture instruction, as taught by Lipman, and to use that capability with Goldstein’s method for displaying subtitles on augmented reality glasses.
Regarding claim 15, Goldstein teaches “An electronic device, comprising:
a processor; and” (the processor in the wearable device; 0019);
“a memory, configured to store executable instructions of the processor,” (The computer readable medium can be the memory of the server; 0034);
(The method may be implemented, for example, by executing a sequence of machine-readable instructions; 0037);
“wherein the processor is configured to execute the subtitle display method according to claim 1.” (the processor in the wearable device then converts the speech to text and visibly displays the text in the AR glasses; 0019);
Regarding claim 16, Goldstein teaches “The electronic device according to claim 15, wherein the electronic device comprises augmented reality glasses”. (A wearable device with augmented reality glasses; Abstract);
Regarding claim 17, Goldstein teaches “The electronic device according to claim 16, wherein the augmented reality glasses comprise a linear microphone array, and the linear microphone array comprises a plurality of microphone sensors distributed along a straight line”. (The apparatus comprises a frame, display lens connected to the frame, a plurality of microphones connected to the frame, the plurality of microphones capturing a speech; 0006);
Regarding claim 18, Goldstein teaches “The electronic device according to claim 16, wherein the augmented reality glasses comprise wearing glasses for a hearing-impaired person”. (an augmented reality apparatus for hearing-impaired people; 0006);
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Goldstein in view of Lipman, and further in view of Olwal (US-20230132041-A1, hereinafter “Olwal”).
Regarding claim 8, Goldstein and Lipman do not teach all of the limitations of claim 8. However, Olwal teaches
“The method according to claim 7, further comprising:
performing, by the augmented reality glasses, parameter adjustment by using a depth learning model or a machine learning model, wherein the parameter adjustment comprises at least one of following: beamforming parameter adjustment or signal processing parameter adjustment,” (The adaptive beamforming may be further aided by signal processing to separate sound sources (i.e., sound separation); 0023);
(Returning to FIG. 1 , the audio/behavior event correlator may be realized through one or more machine learning models that are trained (e.g., during a training process) on various behavior events correlated with audio events; 0035);
“wherein the capturing an ambient audio by using a beamforming technology comprises:
capturing the ambient audio by using an adjusted beamforming parameter; and” (For example, the precision (e.g., beam width) of beamforming may be adjusted based on an interest of a user; 0053);
“wherein the performing, by the augmented reality glasses, signal processing on the ambient audio comprises:
performing, by the augmented reality glasses, signal processing on the ambient audio based on an adjusted signal processing parameter” (noise cancellation algorithms may be adjusted based on interest of a user; 0053);
The motivation for the above modification is to facilitate the capture and processing of audio by using a learning model.
Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to have modified Goldstein to perform, by the augmented reality glasses, parameter adjustment by using a depth learning model or a machine learning model, wherein the parameter adjustment comprises at least one of the following: beamforming parameter adjustment or signal processing parameter adjustment; wherein the capturing an ambient audio by using a beamforming technology comprises capturing the ambient audio by using an adjusted beamforming parameter; and wherein the performing, by the augmented reality glasses, signal processing on the ambient audio comprises performing, by the augmented reality glasses, signal processing on the ambient audio based on an adjusted signal processing parameter, as taught by Olwal, and to use that capability with Goldstein’s method for displaying subtitles on augmented reality glasses.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US-20240137622-A1 (Pohl) – Discloses a device and a method for selective presentation of subtitles, which are capable of increasing the immersive experience of a user by selectively presenting subtitles depending on a language skill of the user.
WO-2020125493-A1 (Wang) – Discloses an augmented-reality smart glasses system comprising a wearable augmented-reality remote video system and a video call method.
KR-20080003494-A (Jang) – Discloses an Internet caption video phone and a real-time text transmission relay system that enable hearing-impaired people to receive call content in real time during a call with a hearing person.
US-20210136508-A1 (Donley) – Discloses systems and methods for classifying beamformed signals for binaural audio playback.
Surale, Hemant et al., "ARcall: Real-Time AR Communication using Smartphones and Smartglasses", March 13, 2022, Association for Computing Machinery, pp. 46-57 (Year: 2022) – Discloses a novel augmented reality-based real-time communication system that enables an immersive, delightful, and privacy-preserving experience between a smartphone user and a smart glasses wearer.
A. M. Ridha and W. Shehieb, "Assistive Technology for Hearing-Impaired and Deaf Students Utilizing Augmented Reality", 2021 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), pp. 1-5 (Year: 2021) – Discloses an intelligent software solution developed for affordable augmented reality glasses that will assist students in their educational journey with real-time transcribing, speech emotion recognition, sound indications features, as well as classroom assistive tools.
Jingya Li, "Augmented Reality Visual-Captions: Enhancing Captioning Experience for Real-Time Conversations", July 9, 2023, Springer, Cham, vol. 14037, pp. 380-396 (Year: 2023) – Discloses AR-based visual-captions, which aim to help people better receive information in live conversations.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BRIGITER D PROTAZI whose telephone number is (571) 272-7995. The examiner can normally be reached Monday through Friday, 7:30-5:00.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Said A Broome, can be reached at 571-272-2931. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/B.D.P./Examiner, Art Unit 2612
/Said Broome/Supervisory Patent Examiner, Art Unit 2612