DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-3 and 7-9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Vaquero Aviles-Casco et al. (US 2020/0279568 A1).
As per claims 1 and 9, Vaquero Aviles-Casco et al. teach a voice registration device/method (0080, 0138 – voice enrollment) comprising:
an acquisition unit that acquires a voice signal of an utterance voice of a speaker (Fig. 4, 0081);
a detection unit that detects, from the voice signal, a first utterance section of the speaker and a second utterance section different from the first utterance section (0082, 0090 – voice detection, speaker change detection process);
a sensing unit that compares a voice signal of the first utterance section with a voice signal of the second utterance section and senses switching from the speaker to another speaker different from the speaker (0094, 0128-0129, 0138); and
a registration unit that registers the voice signal of the speaker in a database based on the sensing of the switching by the sensing unit (0138 – 0139).
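For illustration, the claim 1 limitations mapped above can be sketched as a minimal pipeline. All function and variable names below are hypothetical; the claim does not specify the detection or sensing logic, so both are taken as caller-supplied functions, and registration is assumed to occur only when no speaker switch is sensed:

```python
def register_voice(voice_signal, database, detect, sense_switch):
    # Acquisition unit: the raw voice signal is received as input.
    # Detection unit: split the signal into a first and a second
    # utterance section (detect is a caller-supplied function).
    first_section, second_section = detect(voice_signal)
    # Sensing unit: compare the two sections and sense whether the
    # speaker switched to another, different speaker.
    switched = sense_switch(first_section, second_section)
    # Registration unit: register the speaker's voice signal in the
    # database based on the sensing result (here: only when no
    # switch was sensed).
    if not switched:
        database.append(voice_signal)
    return switched
```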
As per claim 2, Vaquero Aviles-Casco et al. teach the voice registration device according to claim 1, further comprising: a similarity calculation unit that calculates similarity between two different voice signals, wherein the acquisition unit further acquires speaker information capable of identifying the speaker, the similarity calculation unit acquires a registration voice signal associated with speaker information identical to the acquired speaker information among respective pieces of speaker information of a plurality of speakers registered in the database, and calculates first similarity between the registration voice signal and the first utterance section and second similarity between the registration voice signal and the second utterance section, and the sensing unit senses the switching from the speaker to the another speaker based on a change between the first similarity and the second similarity (0090 – "The initial diarisation block 68 may include a speaker change detection block 70. The speaker change detection block 70 may then perform a speaker change detection process on the received audio signal. This speaker change detection process may then generate a speaker change flag whenever it is determined that the identity of the person speaking has changed.", 0094, 0174).
As per claim 3, Vaquero Aviles-Casco et al. teach the voice registration device according to claim 2, wherein the sensing unit detects the switching from the speaker to the another speaker in a case where it is determined that the similarity is not equal to or greater than a threshold value (0090 – "The initial diarisation block 68 may include a speaker change detection block 70. The speaker change detection block 70 may then perform a speaker change detection process on the received audio signal. This speaker change detection process may then generate a speaker change flag whenever it is determined that the identity of the person speaking has changed.", 0094, 0174).
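The similarity-threshold sensing recited in claims 2 and 3 can be illustrated with a minimal sketch. The names, the threshold value, and the use of cosine similarity over speaker-embedding vectors are all assumptions for illustration; neither claim specifies the similarity measure:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two speaker-embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def senses_switch(registration_emb, first_emb, second_emb, threshold=0.7):
    # First similarity: registered voice vs. first utterance section.
    # Second similarity: registered voice vs. second utterance section.
    first_sim = cosine_similarity(registration_emb, first_emb)
    second_sim = cosine_similarity(registration_emb, second_emb)
    # A switch is sensed when the first section still matches the
    # registered speaker but the second section's similarity is not
    # equal to or greater than the threshold (claim 3 wording).
    return first_sim >= threshold and second_sim < threshold
```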
As per claim 7, Vaquero Aviles-Casco et al. teach the voice registration device according to claim 1, wherein each of the first utterance section and the second utterance section includes at least the same utterance section (0082).
As per claim 8, Vaquero Aviles-Casco et al. teach the voice registration device according to claim 2, wherein the speaker information is a telephone number of a voice collecting device that collects the utterance voice (0192).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 4-6 are rejected under 35 U.S.C. 103 as being unpatentable over Vaquero Aviles-Casco et al. (US 2020/0279568 A1) in view of Aguilar Alas et al. (US 2021/0104245 A1).
As per claim 4, Vaquero Aviles-Casco et al. teach the voice registration device according to claim 1. However, Vaquero Aviles-Casco et al. do not specifically teach an emotion identification unit that identifies at least one type of emotion included in the voice signal, and a deletion unit that deletes an utterance section including the emotion based on an identification result by the emotion identification unit, wherein the detection unit detects the first utterance section and the second utterance section of the speaker based on the voice signal from which the utterance section including the emotion is deleted, as recited in claim 4.
Aguilar Alas et al. do teach the claimed emotion identification unit that identifies at least one type of emotion included in the voice signal, and a deletion unit that deletes an utterance section including the emotion based on an identification result by the emotion identification unit, wherein the detection unit detects the first utterance section and the second utterance section of the speaker based on the voice signal from which the utterance section including the emotion is deleted (0023-0025, Fig. 8, 0028, 0117, 0139). Therefore, it would have been obvious to one of ordinary skill in the art to incorporate emotion detection, as taught by Aguilar Alas et al., in the device of Vaquero Aviles-Casco et al., because this would provide the user with a model that processes acoustic and lexical information during runtime to determine the user's sentiment (0025).
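The emotion-based deletion recited in claim 4 can be sketched as follows. The function names are hypothetical, and the emotion identifier is assumed to return an emotion label for a section or None when no emotion is identified; the deletion unit then drops every section in which an emotion was found before the detection unit operates on the remainder:

```python
def strip_emotional_sections(sections, identify_emotion):
    # Emotion identification unit: identify_emotion(section) returns
    # an emotion label, or None if no emotion is identified.
    # Deletion unit: delete each utterance section that includes an
    # identified emotion, keeping the rest for later detection.
    return [s for s in sections if identify_emotion(s) is None]
```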
As per claim 5, Vaquero Aviles-Casco et al. in view of Aguilar Alas et al. teach the voice registration device according to claim 1, further comprising: an emotion identification unit that identifies at least one type of emotion included in the voice signal, and an input unit that receives an operation as to whether to delete an utterance section including the emotion based on an identification result by the emotion identification unit, wherein in a case where the input unit receives an operation to delete an utterance section, the detection unit deletes the utterance section including the emotion and detects the first utterance section and the second utterance section of the speaker based on the voice signal from which the utterance section including the emotion is deleted (Aguilar Alas et al., 0023-0025, Fig. 8, 0028, 0117, 0139).
As per claim 6, Vaquero Aviles-Casco et al. in view of Aguilar Alas et al. teach the voice registration device according to claim 4, further comprising: a conversion unit that converts the voice signal acquired by the acquisition unit to have a predetermined utterance rate, wherein the emotion identification unit identifies the emotion using a voice signal converted to have the predetermined utterance rate (Aguilar Alas et al., 0023-0025, Fig. 8, 0028, 0117, 0139).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Please see attached form PTO-892.
The following prior art is applicable to Applicant's claimed invention.
Lee et al. (US 2020/0118544 A1) teach an intelligent voice recognizing method, a voice recognizing apparatus, and an intelligent computing device. The intelligent voice recognizing method according to an embodiment of the present disclosure receives a voice, acquires a sequential start language uttered sequentially with an utterance language from the voice, and sets the sequential start language as an additional start language other than a basic start language when the sequential start language is recognized as a start language of a voice recognizing apparatus, thereby being able to authenticate a user and recognize a voice even through a seamless-scheme voice that is uttered in an actual situation. According to the present disclosure, one or more of the voice recognizing device, intelligent computing device, and server may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
Choi (US 2019/0362726 A1) teaches an electronic apparatus that includes an inputter comprising input circuitry, a voice receiver comprising voice receiving circuitry, a storage, and a processor configured to: provide a guide prompting a user utterance based on user authentication being performed according to user information input through the inputter, generate a speaker recognition model corresponding to the user information based on a voice corresponding to the guide being received through the voice receiver, store the speaker recognition model in the storage, and identify a user corresponding to a voice received through the voice receiver based on the speaker recognition model updated by comparing a voice received through the voice receiver with the speaker recognition model.
Krupka et al. (US 2019/0341055 A1) teach a method of voice identification enrollment comprising, during a meeting in which two or more human speakers speak at different times, determining whether one or more conditions of a protocol for sampling meeting audio used to establish human speaker voiceprints are satisfied, and in response to determining that the one or more conditions are satisfied, selecting a sample of meeting audio according to the protocol, the sample representing an utterance made by one of the human speakers. The method further comprises establishing, based at least on the sample, a voiceprint of the human speaker.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to VIJAY B CHAWAN whose telephone number is (571)272-7601. The examiner can normally be reached 7-5, Monday through Thursday.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Richemond Dorvil, can be reached at 571-272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/VIJAY B CHAWAN/Primary Examiner, Art Unit 2658