DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 7, 10-11, 17, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Yurick et al. (USPG 2007/0011012, hereinafter Yurick) in view of Blanchflower (USPN 10176392, hereinafter Blanchflower).
Regarding claims 1, 11, and 20, Yurick discloses a display device, method, and CRM comprising: a memory storing one or more instructions; and at least one processor configured to execute the one or more instructions stored in the memory to: obtain a first character string by using a character recognition model to determine whether there is at least one character on a play screen of content and recognizing a character string including the at least one character as the first character string in response to determining that there is the at least one character on the play screen of the content (figure 2, step 100, OCR), obtain a second character string including at least one character by using a speech recognition model to determine whether there is speech in audio data included in a play section of the content where there is the at least one character (figure 2, step 80, speech recognition), and recognizing the speech and converting the recognized speech into a character string as the second character string in response to determining that there is the speech in the audio data (figure 2, step 80, speech recognition), and compare the first character string with the second character string (paragraph 34, comparing words of speech recognition and character recognition processes).
Yurick fails to explicitly disclose, but Blanchflower teaches, updating the character recognition model based on a mismatched part (process in figure 2 and/or claim 8; finding and resolving the difference between two texts (the mismatched part) and then updating the OCR model).
Because Yurick and Blanchflower are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of resolving differences between two texts to update the OCR model. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).
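For illustration only, the comparison of a first (character-recognized) string with a second (speech-recognized) string, and the identification of a mismatched part, can be sketched as follows. This is a minimal sketch under stated assumptions and does not represent the implementation of Yurick, Blanchflower, or the present application. The function name `mismatched_parts` and the choice of word-level alignment with Python's standard `difflib` module are hypothetical.

```python
from difflib import SequenceMatcher


def mismatched_parts(ocr_text, asr_text):
    """Return (ocr_segment, asr_segment) pairs where the two strings differ.

    Assumption: a word-level alignment is adequate for illustration.
    Neither cited reference is stated to use difflib or this granularity.
    """
    ocr_words = ocr_text.split()
    asr_words = asr_text.split()
    matcher = SequenceMatcher(a=ocr_words, b=asr_words)

    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Record the differing span from each string.
            mismatches.append(
                (" ".join(ocr_words[i1:i2]), " ".join(asr_words[j1:j2]))
            )
    return mismatches


if __name__ == "__main__":
    # Hypothetical strings: OCR misreads "news" as "nevvs".
    print(mismatched_parts("breaking nevvs tonight", "breaking news tonight"))
```

In such a sketch, each returned pair could serve as a candidate correction for updating a character recognition model. How that update would actually be performed is outside the scope of this example.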
Regarding claims 7 and 17, Yurick further discloses wherein the at least one processor is configured to execute the one or more instructions stored in the memory to: repeat, multiple times, a procedure for using the speech recognition model to recognize the speech and convert the recognized speech into a character string, and obtain a most frequent value of the converted character string as the second character string (paragraph 29, “the engine can produce both an acoustic score (that represents how well it matches the acoustic model for that phoneme or word) and a language model score (which uses word context and frequency information to find probable word choices and sequences). The acoustic score and language model score can be combined to produce an overall score for the best hypothesis words as well as alternative words within the given utterance”; the speech recognition process operates repeatedly on received speech in the same or subsequent sessions).
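For illustration only, the claimed step of repeating recognition multiple times and taking the most frequent converted string can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of Yurick or of the present application. The function name `most_frequent_transcript` is hypothetical, and the input list stands in for the outputs of repeated runs of an unspecified speech recognition model.

```python
from collections import Counter


def most_frequent_transcript(transcripts):
    """Return the string that occurs most often among repeated recognitions.

    Assumption: each element is the text produced by one recognition pass.
    On a tie, Counter.most_common returns the earliest-seen string first.
    """
    if not transcripts:
        raise ValueError("at least one transcript is required")
    return Counter(transcripts).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical outputs of three recognition passes over the same audio.
    runs = ["news at nine", "news at mine", "news at nine"]
    print(most_frequent_transcript(runs))
```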
Regarding claim 10, Yurick fails to explicitly disclose, but Blanchflower further teaches, the display device of claim 1, wherein the at least one processor is configured to execute the one or more instructions stored in the memory to: determine whether a function of automatically updating the character recognition model is activated (figure 2, step 240 and/or col. 7, lines 31-42; the updating of the character recognition model has to be activated/allowed before the model can be updated).
Because Yurick and Blanchflower are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of updating the OCR model. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).
Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Yurick in view of Blanchflower, and further in view of Dong et al. (USPG 2023/0197085, hereinafter Dong).
Regarding claims 6 and 16, the modified Yurick fails to explicitly disclose, but Dong further teaches, wherein the speech recognition model is an artificial intelligence (AI) model and comprises a first speech recognition model and a second speech recognition model, and the at least one processor is configured to execute the one or more instructions stored in the memory to obtain the second character string by using the first speech recognition model to determine whether there is speech in audio data included in a play section where there is the at least one character (paragraphs 32, 36, 58-59; determining a location of the audio and selecting a corresponding recognition model to recognize speech), using the second speech recognition model to recognize the speech in response to determining that there is the speech in the audio data, and converting the recognized speech into a character string as the second character string (paragraphs 32, 36, 58-59; determining a location of the audio and selecting a corresponding recognition model to recognize speech).
Because the modified Yurick and Dong are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of selecting a location-specific recognition model to perform speech recognition. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).
Claims 8 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Yurick in view of Blanchflower, and further in view of Mahyar et al. (USPN 11216684, hereinafter Mahyar).
Regarding claims 8 and 18, the modified Yurick fails to explicitly disclose, but Mahyar teaches, wherein the at least one processor is configured to execute the one or more instructions stored in the memory to: determine whether the first character string and the second character string are recognized in a same language (col. 5, lines 44-50, “when the audio content is in a language other than the burned-in subtitles, by further translating the text from the OCR algorithm and/or translating the text from the speech recognition algorithm such that the two texts are in the same language”).
Because the modified Yurick and Mahyar are analogous art from the same field of endeavor, it would have been obvious to one of ordinary skill in the art before the effective filing date of the invention to use the known technique of determining whether or not the audio text and OCR text are in the same language. One of ordinary skill in the art would have recognized that the results of the combination were predictable since the use of that known technique provides the rationale to arrive at a conclusion of obviousness. See KSR International Co. v. Teleflex Inc., 82 USPQ2d 1385 (U.S. 2007).
Allowable Subject Matter
Claims 2-5, 9, 12-15, and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Maeda et al. (USPG 2022/0382964) teaches a method of comparing speech-recognized text to character recognition text to determine whether or not they match.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HUYEN X VO whose telephone number is (571)272-7631. The examiner can normally be reached M-F, 8-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Bhavesh Mehta, can be reached at 571-272-7453. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HUYEN X VO/
Primary Examiner, Art Unit 2656