Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This office action is in response to correspondence filed 01/26/26 regarding application 17/947,248, in which claims 1 and 11 were amended and claims 4 and 14 were cancelled. Claims 1-3, 5-13, and 15-20 are pending and have been considered.
Response to Arguments
The examiner agrees with Applicant on page 1 that the amendments to claims 1 and 11 do not add new matter.
The examiner respectfully disagrees with Applicant’s assertions on pages 2-3 regarding Popat. Applicant argues that “The Popat reference mentions frequency, but is not a frequency with which a character appears within domain resources. Rather, it is the frequency with which one character precedes other characters within a language”. In response, it is noted that the frequency with which one character precedes another character is still a frequency with which a character appears within domain resources within a certain context or usage, i.e. a “usage frequency”.
Applicant’s arguments regarding the 35 U.S.C. 103 rejections of claims 1-3, 5-13, and 15-20 based on Popat, Ratnakar, and Wu, namely that the combination of references does not disclose or suggest all the elements of independent claims 1 and 11, are moot in view of the new grounds of rejection, which are based in part on the newly discovered reference to Chan and were necessitated by Applicant’s amendments.
Priority Document Still Missing
Applicant’s priority claim to prior application PCT/CN2022/119176, filed 09/16/2022, is acknowledged, but a copy of the PCT/CN2022/119176 application has still not been received.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 5, 7-13, 15, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Popat, Ashok (US 20130259378) in view of Ratnakar et al. (US 20100188419), in further view of Chan et al. (“LEXTALE_CH: A Quick, Character-Based Proficiency Test for Mandarin Chinese”. Proceedings of the 42nd annual Boston University Conference on Language Development, ed. Anne B. Bertolini and Maxwell J. Kaplan, 114-130. Somerville, MA: Cascadilla Press, 2018).
Consider claim 1, Popat discloses a computerized method (computer-implemented method, [0010]), comprising:
obtaining a text object from a computing resource, the object being configurable for display on at least one client device (obtaining text from optical character recognition engine, [0025], the text configurable for display on e.g. display 210, [0036]);
analyzing the text object for intelligibility (using quality score to identify the presence of garbage text, e.g., unintelligible text produced by attempting to recognize characters in a picture, [0006], [0009], [0043], [0056]), including:
determining a usage frequency of each character in the text object based on a frequency that the character appears in a set of domain resources associated with an enterprise (enumerated frequency, i.e. usage rate of the character being used to precede another token in the corpora, for each token, each token being a Unicode character, the corpora associated with each writing system, i.e. enterprise, [0042]);
applying a weight to each character in the text object based on the usage frequency (determining language-conditional character probabilities for each character, [0044], each token is a Unicode character, [0042], the probabilities based on frequencies in corpora specific to the different languages, i.e. domain resources, [0041]-[0042]);
determining a total weight for the text object based on the weight applied to each character (LLCL module assigns weights to neighboring characters, identifies the set of language-conditional character probabilities, and combines the probabilities to generate a local language-conditional likelihood, [0051]-[0052]);
determining a viability rate for words in the text object (average word quality over a given period of time, for example, [0058-0060]); and
generating an intelligibility analysis for the text object based on the total weight and viability rate (e.g. identifying periods of time associated with low text quality scores based on the average word quality over the period, which is generated using the local language conditional likelihoods, [0057-0059]); and
identifying an operational change at the computing resource based on the intelligibility analysis (identifying a period of time associated with poor text quality that is concurrent with an update in code for software used by the OCR engine, [0059]).
Popat does not specifically mention effectuating an operational change.
Ratnakar discloses effectuating an operational change (a garbage score is calculated for each text segment of the OCR’d document, and if the garbage score is above a threshold, the text is automatically replaced with the image, [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat by effectuating an operational change in order to minimize the impact of OCR errors, as suggested by Ratnakar ([0006]). Doing so would have led to predictable results of increased ability to use and enjoy the advantages of OCR’d text, as suggested by Ratnakar ([0006]). The references cited are analogous art in the same field of text processing.
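For illustration only, the threshold behavior attributed to Ratnakar above can be sketched as follows. This is a minimal sketch under stated assumptions, not Ratnakar's implementation: the function name `render_segments`, the dictionary keys, and the threshold value 0.5 are all hypothetical and do not appear in the reference.

```python
# Hypothetical sketch: replace OCR'd text with its source image when the
# garbage score exceeds a threshold. Names and the 0.5 default are assumptions.

def render_segments(segments, threshold=0.5):
    """Return a list of ("text", str) or ("image", str) items, one per segment.

    Each segment is a dict with assumed keys:
      "text"          - the OCR'd text of the segment
      "image_ref"     - a reference to the original image region
      "garbage_score" - a score where higher means less intelligible
    """
    rendered = []
    for seg in segments:
        if seg["garbage_score"] > threshold:
            # Score above threshold: show the image instead of the text.
            rendered.append(("image", seg["image_ref"]))
        else:
            rendered.append(("text", seg["text"]))
    return rendered
```

In this sketch the replacement itself is the effectuated change; the comparison direction (above the threshold triggers replacement) follows the description of [0009] given above.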
Popat and Ratnakar do not specifically mention assigning each character to a predetermined usage frequency tier based on the usage frequency of each character.
Chan discloses assigning each character to a predetermined usage frequency tier based on the usage frequency of each character (frequency distribution tiers of 90 Chinese characters in occurrences per million characters, Table 1, page 117).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat and Ratnakar by assigning each character to a predetermined usage frequency tier based on the usage frequency of each character, as in Chan, and applying a weight to each character in the text object, as in Popat, based on the usage frequency tier to which the character is assigned, as in Chan, in order to utilize the well-known properties suggested by Chan (page 117) that characters from low-frequency tiers tend to increase the difficulty of character recognition, while characters from high-frequency tiers tend to decrease it, predictably resulting in a tiered contribution toward text intelligibility from the standpoint of the reader recognizing characters. The references cited are analogous art in the same field of text processing.
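For illustration only, the combination discussed above (a per-character usage frequency, assignment to a predetermined frequency tier, a tier-based weight, and a total weight) can be sketched as follows. This is a minimal sketch under stated assumptions: the tier boundaries, the weight values, the function names, and the choice of a simple average for the total weight are hypothetical and are not taken from Popat, Chan, or the claims.

```python
from collections import Counter

# Hypothetical tiers: (minimum occurrences per million characters, weight).
# Boundaries and weights are illustrative assumptions only.
TIERS = [(1000.0, 1.0), (100.0, 0.6), (0.0, 0.2)]


def usage_frequencies(domain_texts):
    """Map each character to its occurrences per million characters in the corpus."""
    counts = Counter(ch for text in domain_texts for ch in text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n * 1_000_000 / total for ch, n in counts.items()}


def tier_weight(rate):
    """Return the weight of the first tier whose minimum rate is met."""
    for min_rate, weight in TIERS:
        if rate >= min_rate:
            return weight
    return TIERS[-1][1]


def total_weight(text, freqs):
    """Average the tier weights of all characters; unseen characters get rate 0."""
    if not text:
        return 0.0
    return sum(tier_weight(freqs.get(ch, 0.0)) for ch in text) / len(text)
```

Under these assumptions, a character never seen in the domain corpus falls into the lowest tier, so text dominated by such characters receives a low total weight.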
Consider claim 11, Popat discloses a computing system (computer of Fig. 2, [0016]), comprising:
a memory (memory 206, Fig. 2); and
a processor coupled to the memory (processor 202, Fig. 2) and configured to analyze a text object from a computing resource for intelligibility (using quality score to identify the presence of garbage text, e.g., unintelligible text produced by attempting to recognize characters in a picture, [0006], [0009], [0043], [0056]) according to a process that includes:
determining a usage frequency of each character in the text object based on a frequency that the character appears in a set of domain resources associated with an enterprise (enumerated frequency, i.e. usage rate of the character being used to precede another token in the corpora, for each token, each token being a Unicode character, the corpora associated with each writing system, i.e. enterprise, [0042]);
applying a weight to each character in the text object based on the usage frequency (determining language-conditional character probabilities for each character, [0044], each token is a Unicode character, [0042], the probabilities based on frequencies in corpora specific to the different languages, i.e. domain resources, [0041]-[0042]);
determining a total weight for the text object based on the weight applied to each character (LLCL module assigns weights to neighboring characters, identifies the set of language-conditional character probabilities, and combines the probabilities to generate a local language-conditional likelihood, [0051]-[0052]);
determining a viability rate for words in the text object (average word quality over a given period of time, for example, [0058-0060]); and
generating an intelligibility analysis for the text object based on the total weight and viability rate (e.g. identifying periods of time associated with low text quality scores based on the average word quality over the period, which is generated using the local language conditional likelihoods, [0057-0059]); and
identifying an operational change at the computing resource based on the intelligibility analysis (identifying a period of time associated with poor text quality that is concurrent with an update in code for software used by the OCR engine, [0059]).
Popat does not specifically mention effectuating an operational change.
Ratnakar discloses effectuating an operational change (a garbage score is calculated for each text segment of the OCR’d document, and if the garbage score is above a threshold, the text is automatically replaced with the image, [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat by effectuating an operational change for reasons similar to those for claim 1.
Popat and Ratnakar do not specifically mention assigning each character to a predetermined usage frequency tier based on the usage frequency of each character.
Chan discloses assigning each character to a predetermined usage frequency tier based on the usage frequency of each character (frequency distribution tiers of 90 Chinese characters in occurrences per million characters, Table 1, page 117).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat and Ratnakar by assigning each character to a predetermined usage frequency tier based on the usage frequency of each character, as in Chan, and applying a weight to each character in the text object, as in Popat, based on the usage frequency tier to which the character is assigned, as in Chan, for reasons similar to those for claim 1.
Consider claim 2, Popat discloses the computing resource includes at least one of a web server or an app server (OCR Engine connected via network 114, considered a web or app server, Fig 1, [0024]).
Consider claim 3, Popat discloses the modeled Unicode weights are determined with a machine learning algorithm (the weights for Unicode characters are determined by language model training module 312, Fig 3, [0042]).
Consider claim 5, Popat discloses the total weight of the text object is an average of all weights applied to the characters in the text object (the LLCL Module 342 averages the set of language-conditional character probabilities, [0053]).
Consider claim 7, Popat discloses the intelligibility analysis includes: determining each language used in the text object (for different languages or writing systems, generating language-conditional character probabilities specifying the likelihood a character is written in a language, [0041], [0044]); and calculating a separate viability rate for each language (assessing language-specific quality at the word-level using the local language conditional likelihoods, [0056], which are language specific).
Consider claim 8, Popat does not, but Ratnakar discloses the operational change effectuating at the computing resource includes altering a display output to the at least one client device to inform a user of a potential intelligibility issue (altering the display such that the text is automatically replaced with the image, informing the user of the high garbage score, [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat such that the operational change effectuating at the computing resource includes altering a display output to the at least one client device to inform a user of a potential intelligibility issue for reasons similar to those for claim 1.
Consider claim 9, Popat does not, but Ratnakar discloses the operational change effectuating at the computing resource includes outputting an alert condition to an administrator of the computing resource (altering the display such that the text is automatically replaced with the image, alerting the user, who is administering the OCR, to the high garbage score, [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat such that the operational change effectuating at the computing resource includes outputting an alert condition to an administrator of the computing resource for reasons similar to those for claim 1.
Consider claim 10, Popat discloses the set of domain resources share a common domain with the computing resource, the common domain being selected from a group consisting of: a web portal, a business, an industry, a technology area, a social media platform, a field of endeavor, an enterprise, a government agency, and an industry (OCR Engine and Text Evaluation Engine, which includes language corpora dataset 322, are both networked computer equipment, Figs. 1-3, [0024], [0041]; they are considered to share a common domain of “computing”, a technology area).
Consider claim 12, Popat discloses the computing resource includes at least one of a web server or an app server (OCR Engine connected via network 114, considered a web or app server, Fig 1, [0024]).
Consider claim 13, Popat discloses the modeled Unicode weights are determined with a machine learning algorithm (the weights for Unicode characters are determined by language model training module 312, Fig 3, [0042]).
Consider claim 15, Popat discloses the total weight of the text object is an average of all weights applied to the characters in the text object (the LLCL Module 342 averages the set of language-conditional character probabilities, [0053]).
Consider claim 17, Popat discloses the intelligibility analysis includes: determining each language used in the text object (for different languages or writing systems, generating language-conditional character probabilities specifying the likelihood a character is written in a language, [0041], [0044]); and calculating a separate viability rate for each language (assessing language-specific quality at the word-level using the local language conditional likelihoods, [0056], which are language specific).
Consider claim 18, Popat does not, but Ratnakar discloses the operational change effectuating at the computing resource includes altering a display output to the at least one client device to inform a user of a potential intelligibility issue (altering the display such that the text is automatically replaced with the image, informing the user of the high garbage score, [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat such that the operational change effectuating at the computing resource includes altering a display output to the at least one client device to inform a user of a potential intelligibility issue for reasons similar to those for claim 1.
Consider claim 19, Popat does not, but Ratnakar discloses the operational change effectuating at the computing resource includes outputting an alert condition to an administrator of the computing resource (altering the display such that the text is automatically replaced with the image, alerting the user, who is administering the OCR, to the high garbage score, [0009]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat such that the operational change effectuating at the computing resource includes outputting an alert condition to an administrator of the computing resource for reasons similar to those for claim 1.
Consider claim 20, Popat discloses the set of domain resources share a common domain with the computing resource, the common domain being selected from a group consisting of: a web portal, a business, an industry, a technology area, a social media platform, a field of endeavor, an enterprise, a government agency, and an industry (OCR Engine and Text Evaluation Engine, which includes language corpora dataset 322, are both networked computer equipment, Figs. 1-3, [0024], [0041]; they are considered to share a common domain of “computing”, a technology area).
Claims 6 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Popat, Ashok (US 20130259378) in view of Ratnakar et al. (US 20100188419), Chan et al. (“LEXTALE_CH: A Quick, Character-Based Proficiency Test for Mandarin Chinese”. Proceedings of the 42nd annual Boston University Conference on Language Development, ed. Anne B. Bertolini and Maxwell J. Kaplan, 114-130. Somerville, MA: Cascadilla Press, 2018), in further view of Wu, Hao (US 20200064977).
Consider claim 6, Popat discloses the viability rate for words in the text object is determined by: selecting a set of characters in the text object (an ordered sequence of characters, [0065]); evaluating characters adjacent each selected character to judge whether a word or phrase can be formed using each selected character (language-conditional character probabilities to evaluate word or word sequence probability, [0047]); generating the viability rate based on a number of selected characters that are judged to form words or phrases (average word quality over a given period of time, for example, [0058-0060]).
Popat, Ratnakar, and Chan do not specifically mention randomly selecting a set of characters in the text object.
Wu discloses randomly selecting a set of characters in the text object (characters randomly selected, [0082]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat, Ratnakar, and Chan by randomly selecting a set of characters in the text object in order to increase accuracy of UI problem detection, as suggested by Wu ([0005]), predictably avoiding the time consuming process of manual testing, as suggested by Wu ([0003]). The references cited are analogous art in the same field of text processing.
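For illustration only, the viability rate of claim 6 as modified by Wu's random selection can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the vocabulary set, the sample size, and the rule that a selected character "forms a word" when some substring of length 2 to `max_len` containing it is in the vocabulary are all hypothetical and are not drawn from Popat, Wu, or the claims.

```python
import random


def forms_word(text, i, vocabulary, max_len=4):
    """Check whether any substring of length 2..max_len covering index i is a known word."""
    for length in range(2, max_len + 1):
        first = max(0, i - length + 1)
        last = min(i, len(text) - length)
        for start in range(first, last + 1):
            if text[start:start + length] in vocabulary:
                return True
    return False


def viability_rate(text, vocabulary, sample_size=5, max_len=4, rng=None):
    """Fraction of randomly selected characters that are judged to form a word."""
    if not text:
        return 0.0
    rng = rng or random.Random()
    k = min(sample_size, len(text))
    # Randomly select k distinct character positions in the text object.
    positions = rng.sample(range(len(text)), k)
    hits = sum(1 for i in positions if forms_word(text, i, vocabulary, max_len))
    return hits / k
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible; single-character words are not counted in this sketch, which is a simplifying assumption.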
Consider claim 16, Popat discloses the viability rate for words in the text object is determined by: selecting a set of characters in the text object (an ordered sequence of characters, [0065]); evaluating characters adjacent each selected character to judge whether a word or phrase can be formed using each selected character (language-conditional character probabilities to evaluate word or word sequence probability, [0047]); generating the viability rate based on a number of selected characters that are judged to form words or phrases (average word quality over a given period of time, for example, [0058-0060]).
Popat, Ratnakar, and Chan do not specifically mention randomly selecting a set of characters in the text object.
Wu discloses randomly selecting a set of characters in the text object (characters randomly selected, [0082]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Popat, Ratnakar, and Chan by randomly selecting a set of characters in the text object for reasons similar to those for claim 6.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20120254181 Schofield discloses using frequency of occurrence of characters to detect a language fingerprint for language classification of text.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jesse Pullias whose telephone number is 571/270-5135. The examiner can normally be reached on M-F 8:00 AM - 4:30 PM. The examiner’s fax number is 571/270-6135.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Andrew Flanders, can be reached on 571/272-7516.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/Jesse S Pullias/
Primary Examiner, Art Unit 2655 02/12/26