Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 10/26/2025 has been entered.
Response to Arguments
An RCE was submitted along with amendments to the independent claims. Applicant first notes that the 112(b) rejection raised against independent claims 1 and 13 has been addressed; examiner agrees and withdraws that 112(b) rejection.

Applicant also argues that the Ren reference does not teach the presented limitations “a specification unit configured to specify, from among the handwritten characters, a handwritten character whose circumscribed rectangle has a height that is higher than a value determined based on circumscribed rectangles of other handwritten characters; and an exclusion unit configured to exclude, from the handwritten area image, an image corresponding to the specified handwritten character.” After analysis of the arguments, examiner agrees, removes the Ren reference from the rejections of the independent claims, and presents a new reference, Yoshida, that teaches the newly amended limitation along with the other limitations.

Applicant further argues the 101 rejection, stating that Applicant has “amended Claim 1 to recite an image processing system comprising: "one or more processors, connected to one or more memories, the one or more processors being configured to operate" as the various subsequently recited units.” Examiner disagrees for the following reasons. The purported technical solution can be performed mentally with the use of generic computer components and insignificant extra-solution activity. The amended claims do no more than generally link a judicial exception (mental processes) to a particular field of use, such as image processing (imaging) / an imaging apparatus, and thus fail to add an inventive concept to the claims. Please see MPEP 2106.05(h) for more information about merely indicating a field of use or technological environment (the terms are used interchangeably) in which to apply a judicial exception.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 11 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Regarding claim 11, the claim recites “a unit configured to, for each circumscribed rectangle of a plurality of the handwritten character extracted by the extraction unit, generate a single line image in which a height of a circumscribed rectangle that is a determination target is made to be a standard; a unit configured to compare a height of a circumscribed rectangle of a handwritten character included in the generated single line image and a threshold based on the height of the circumscribed rectangle that is the determination target and counts the number of circumscribed rectangles that is greater than or equal to the threshold and the number of circumscribed rectangles that is less than the threshold; and a unit configured to specify the handwritten character that is the determination target for which the number of circumscribed rectangles greater than or equal to the threshold is larger than the number of circumscribed rectangle that is less than the threshold.” It is unclear from the context of the claim which circumscribed rectangle is chosen as the determination target. The claim specifies that a single line image is generated in which the height of the circumscribed rectangle that is the determination target is made standard, then recites a threshold based on the determination target, and in the last limitation specifies a determination target based on the count. One of ordinary skill in the art would ask: “Are there several determination targets chosen?”; “What is the standard of the determination target?”; “If a single line image is generated for each character separately, how can a determination target be made standard if another determination target is then chosen after?”; “Is the determination target chosen in the beginning or the end? Are they different?” Therefore, one of ordinary skill in the art would not be able to ascertain the scope of the claim for reasons regarding clarity.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
35 U.S.C. 101 requires that a claimed invention must fall within one of the four eligible categories of invention (i.e., process, machine, manufacture, or composition of matter) and must not be directed to subject matter encompassing a judicially recognized exception as interpreted by the courts. MPEP 2106. Three categories of subject matter are found to be judicially recognized exceptions to 35 U.S.C. § 101 (i.e., patent ineligible): (1) laws of nature, (2) physical phenomena, and (3) abstract ideas. MPEP 2106(II). To be patent-eligible, a claim directed to a judicial exception must as a whole be integrated into a practical application or directed to significantly more than the exception itself (MPEP 2106). Hence, the claim must describe a process or product that applies the exception in a meaningful way, such that it is more than a drafting effort designed to monopolize the exception.
Claims 1-9 and 11-13 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., an abstract idea) without integration into a practical application or recitation of significantly more. In the analysis below, the system of claim 1 is considered representative of independent claims 1 and 13 since both independent claims recite identical steps despite being directed to different statutory categories. Furthermore, each of independent claims 1 and 13 is directed to one of the four statutory categories of eligible subject matter; thus, the claims pass Step 1 of the Subject Matter Eligibility Test (See flowchart in MPEP 2106).
Step 2A, Prong 1 Analysis
The independent claims are directed to “acquire a target image by scanning a document that includes handwritten characters”, “extract from the target image, a handwritten area image indicating an approximate shape of the handwritten characters”, “specify, from among the handwritten characters, a handwritten character whose circumscribed rectangle has a height that is higher than a value determined based on circumscribed rectangles of other handwritten characters;”, “exclude, from the handwritten area image, an image corresponding to the specified handwritten character”, “determine, a line boundary of handwritten characters using the handwritten area image from which the specified handwritten character is excluded” and “separate, based on the determined line boundary, a handwritten area corresponding to the handwritten area image into a plurality of lines.”
Each of the above steps can be performed mentally. An individual can acquire an image of handwriting by looking at a document (a piece of paper) and copying it onto paper. The individual can then draw a circumscribed rectangle around each character to specify its shape, measure each rectangle with a ruler, write down each value, and choose the tallest one based on the measured values or by inspection of the page. The chosen character can then be excluded and written on another paper; a line boundary can be determined by observing the handwriting and drawing a line between the lines of characters; and finally each line can be separated by drawing boxes from the drawn line boundaries. As such, the description in independent claims 1 and 13 is an abstract idea, namely a mental process. Accordingly, the analysis under prong one of step 2A of the Subject Matter Eligibility Test does not result in a conclusion of eligibility (See flowchart in MPEP 2106).
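Purely as an illustration of why the specification and exclusion steps amount to a simple observation and comparison (hypothetical Python; the function name, the 1.5 comparison factor, and the sample heights are the examiner's assumptions, not drawn from the claims or the record), the height comparison can be sketched as:

```python
def specify_tall_characters(rect_heights, factor=1.5):
    """Return indices of circumscribed rectangles whose height exceeds a
    value (here, factor times the average) determined from the heights of
    the other rectangles; these are the characters to be excluded."""
    tall = []
    for i, h in enumerate(rect_heights):
        others = rect_heights[:i] + rect_heights[i + 1:]
        if others and h > factor * (sum(others) / len(others)):
            tall.append(i)
    return tall

heights = [10, 11, 30, 9, 10]                      # one tall outlier
excluded = specify_tall_characters(heights)        # index 2 is specified
remaining = [h for i, h in enumerate(heights) if i not in excluded]
```

A person can perform the same comparison with a ruler and a written list of measurements, which is why the step is treated as a mental process.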
Additional Elements
The additional elements recited in independent claim 1 are “an acquisition unit configured to”, “an extraction unit configured to”, “a specification unit configured to”, “an exclusion unit configured to”, “a determination unit configured to”, “a separation unit configured to” and “one or more processors, connected to one or more memories, the one or more processors being configured to operate as”.
Step 2A, prong 2 analysis
The above-identified additional elements do not integrate the judicial exception into a practical application.
Each of the additional elements (“an acquisition unit configured to”, “an extraction unit configured to”, “a specification unit configured to”, “an exclusion unit configured to”, “a determination unit configured to”, “a separation unit configured to” and “one or more processors, connected to one or more memories, the one or more processors being configured to operate as”) amounts to merely using a computer as a tool to perform the claimed mental process. Implementing an abstract idea on a computer does not integrate a judicial exception into a practical application (See MPEP 2106.05(f)).
Moreover, the additional elements of the claims do not recite an improvement in the functioning of a computer or other technology or technical field, the claimed steps are not performed using a particular machine, the claimed steps do not effect a transformation, and the claims do not apply the judicial exception in any meaningful way beyond generically linking the use of the judicial exception to a particular technological environment (See MPEP 2106.04(d)). Therefore, the analysis under prong two of step 2A of the Subject Matter Eligibility Test does not result in a conclusion of eligibility (See flowchart in MPEP 2106).
Step 2B
Finally, the claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception.
Each of the additional elements (“an acquisition unit configured to”, “an extraction unit configured to”, “a specification unit configured to”, “an exclusion unit configured to”, “a determination unit configured to”, “a separation unit configured to” and “one or more processors, connected to one or more memories, the one or more processors being configured to operate as”) is a generic computer feature which performs generic computer functions that are well-understood, routine, and conventional, and does not amount to more than implementing the abstract idea with a computerized system. Thus, taken alone, the additional elements do not amount to significantly more than the above-identified judicial exception (the abstract idea).
Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of a computer or improves any other technology. Their collective functions merely provide conventional computer implementation, and mere implementation on a generic computer does not add significantly more to the claims. Accordingly, the analysis under step 2B of the Subject Matter Eligibility Test does not result in a conclusion of eligibility (See flowchart in MPEP 2106).
For all of the foregoing reasons, independent claims 1 and 13 do not recite eligible subject matter under 35 USC 101.
Dependent Claims
Dependent claims 2-9 and 11-12 depend from independent claim 1. Therefore, they include all of the limitations of claim 1 and recite the same abstract idea, a mental process which can be performed in the mind.
Claim 2 recites “generate a learning model using learning data associating a handwritten character image and a handwritten area image that are extracted from an original sample image, wherein the extraction unit extracts the handwritten character image and the handwritten area image using the learning model”. The claim is part of the abstract idea because this step can be done mentally by an individual. A human can create a guide with pen and paper on how to determine specific handwriting and draw a box around each word. Accordingly, claim 2 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 3 recites “set from the original sample image a handwritten character image and a handwritten area in accordance with a user input,” and “generates, for each character in the handwritten character image set by the setting unit, ground truth data for a handwritten area image by overlapping an expansion image subjected to an expansion process in a horizontal direction and a reduction image in which a circumscribed rectangle encompassing a character of the handwritten character image has been reduced in a vertical direction, and generates a learning model using the generated ground truth data”. The claim is part of the abstract idea because this step can be done mentally by an individual. A human can create a guide using pen and paper to use as a comparison for other written words, in which each word contains a redrawn rectangle that has been reduced vertically and expanded horizontally. Accordingly, claim 3 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 4 recites “overlaps an image for which an expansion process in a horizontal direction and a reduction process in a vertical direction have been performed on a circumscribed rectangle encompassing a character of the extracted handwritten character image and a line connecting a center of gravity of the circumscribed rectangle between adjacent circumscribed rectangles, and extracts a result as the handwritten area image.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can redraw the images and circumscribed rectangles, reduced vertically and expanded horizontally, and then connect the rectangles by drawing a line that goes through the center of each rectangle. Accordingly, claim 4 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 5 recites “specifies a line connecting the center of gravity of the circumscribed rectangle of each character between adjacent circumscribed rectangles, specifies a space between two specified lines as a candidate interval in which there is a line boundary, and determines as a boundary in the candidate interval a line whose frequency of a pixel indicating a handwritten area is the lowest.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can specify the space between two specified lines by writing two words, one below the other, and then drawing a box between the two. Accordingly, claim 5 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
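As an illustration of the simplicity of the boundary determination recited in claim 5 (hypothetical Python; representing the handwritten area image as a 0/1 pixel grid is the examiner's assumption), picking the line with the lowest frequency of handwritten pixels within a candidate interval is a single minimum search:

```python
def lowest_ink_row(pixels, row_start, row_end):
    """Within the candidate interval [row_start, row_end), determine as
    the line boundary the row whose count of handwritten pixels (1s) is
    the lowest, as recited in claim 5."""
    return min(range(row_start, row_end), key=lambda r: sum(pixels[r]))

# Example: row 1 is blank, so it is chosen as the boundary.
grid = [[1, 1, 0],
        [0, 0, 0],
        [1, 0, 1]]
boundary = lowest_ink_row(grid, 0, 3)
```

A person can perform the same determination by eye, choosing the emptiest horizontal strip between two written lines.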
Claim 6 recites “wherein in a case where a height of the handwritten area that is a processing target is higher than a predetermined threshold based on an average of a height of a circumscribed rectangle corresponding to each of a plurality of characters included in the handwritten area… determines that handwriting of a plurality of lines is included in the handwritten area.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can measure the height of each drawn rectangle, calculate an average, and use it as a limit; then, if the handwritten area exceeds that limit, the area is determined to include a plurality of lines. Accordingly, claim 6 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
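The threshold comparison of claim 6 can likewise be sketched (hypothetical Python; the 1.5 multiplier standing in for the claim's “predetermined threshold” is the examiner's assumption, not a value from the claims or specification):

```python
def contains_multiple_lines(area_height, char_heights, factor=1.5):
    """Judge that a handwritten area contains handwriting of a plurality
    of lines when its height exceeds a threshold based on the average
    height of the circumscribed rectangles of its characters."""
    average = sum(char_heights) / len(char_heights)
    return area_height > factor * average
```

The comparison is a single measurement against an average, which a person can carry out with a ruler and basic arithmetic.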
Claim 7 recites “for each handwritten area separated by the separation unit, perform an OCR process on a corresponding handwritten character image and output text data that corresponds to a handwritten character.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can observe characters on a piece of paper and then write them on another piece of paper. Accordingly, claim 7 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 8 recites “extracts a printed character image included in the target image and a printed character area encompassing a printed character, and the character recognition unit further performs an OCR process on the printed character image included in the printed character area and outputs text data corresponding to a printed character.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can observe characters on a piece of paper and then write them on another piece of paper. Accordingly, claim 8 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 9 recites “estimate relevance between a result of recognition of a handwritten character and a result of recognition of a printed character by the character recognition unit using at least one of content of text data according to the recognition results and positions of the handwritten character and the printed character in the target image.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can estimate the relevance by comparing two written words to see how similar they are based on their characters and the location of each written word. Accordingly, claim 9 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 11 recites “generate a single line image in which a height of a circumscribed rectangle that is a determination target is made to be a standard”, “compare a height of a circumscribed rectangle of a handwritten character included in the generated single line image and a threshold based on the height of the circumscribed rectangle that is the determination target and counts the number of circumscribed rectangles that is greater than or equal to the threshold and the number of circumscribed rectangles that is less than the threshold”, and “specify the handwritten character that is the determination target for which the number of circumscribed rectangles greater than or equal to the threshold is larger than the number of circumscribed rectangle that is less than the threshold.” The claim is part of the abstract idea because these steps can be done mentally by an individual. A human can choose which drawn circumscribed rectangle to use as a standard when drawing other rectangles. A human can also compare each drawn rectangle and write down both the number of rectangles that surpass the established limit and the number that do not. Afterwards, the human can read again to observe and specify the outlier by highlighting it. Accordingly, claim 11 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
Claim 12 recites “the threshold is set to a value that is approximately half the height of the circumscribed rectangle that is the determination target.” The claim is part of the abstract idea because this step can be done mentally by an individual. A human can establish a threshold by measuring the height of a drawn circumscribed rectangle and then calculating half of the measured value. Accordingly, claim 12 does not include additional elements that integrate the judicial exception into a practical application or amount to significantly more than the judicial exception.
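Read together and as best understood (which rectangle serves as the determination target is precisely what the 112(b) rejection above finds unclear), the counting step of claim 11 with the half-height threshold of claim 12 reduces to simple tallying. The following is a hypothetical Python sketch under those assumptions:

```python
def target_is_specified(target_height, line_heights):
    """Claims 11 and 12 as best understood: the threshold is approximately
    half the height of the determination target (claim 12); the target is
    specified when the count of circumscribed rectangles at or above the
    threshold exceeds the count below it (claim 11)."""
    threshold = target_height / 2
    at_or_above = sum(1 for h in line_heights if h >= threshold)
    below = len(line_heights) - at_or_above
    return at_or_above > below
```

Each operation (halving a measurement, counting, comparing two counts) can be performed with pen and paper, which is why the limitations are treated as part of the mental process.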
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-3, 7-9, 11, and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Ikeda et al., hereafter Ikeda (US Publication No. 20210056336 A1), in view of Yoshida et al., hereafter Yoshida (US Publication No. 20230306184 A1), and further in view of Salameh (US Publication No. 20190332860 A1).
As per claim 1, Ikeda teaches “An image processing system comprising: one or more processors, connected to one or more memories, the one or more processors being configured to operate as:
an acquisition unit configured to acquire a target image by scanning a document that includes handwritten characters.” (Paragraph 75 talks about acquiring a “target image”, which is a document that has handwriting, e.g., a form. “The image processing apparatus 101 generates image data (to be referred to as a “document sample image” hereinafter) by scanning a document such as a form. The image processing apparatus 101 obtains a plurality of document sample images by scanning a plurality of documents. These documents include a handwritten document. The image processing apparatus 101 transmits the document sample images to the learning apparatus 102 via the network 105. Furthermore, when converting a form into text, the image processing apparatus 101 obtains image data (to be referred to as a “processing target image” hereinafter) as a processing target by scanning a document including a handwritten character (handwritten symbol or handwritten figure)” Ikeda.)
“an extraction unit configured to extract from the target image a handwritten area image indicating an approximate shape of the handwritten characters;” (Examiner interprets a handwritten character image as any image that includes handwritten characters. Examiner interprets a handwritten area image as an area that includes handwritten characters; an approximate shape of a handwritten character is interpreted as any shape demonstrating or encompassing handwritten characters. Fig 4E shows an example of a “processing target image”, Fig 4F shows an example of “a handwritten character image”, and Figs 4F and 4H show an example of “a handwritten area image indicating an approximate shape of a handwritten character”.)
(Paragraph 77 talks about extracting handwritten pixels (handwritten characters) to create a handwriting extracted image (an image that includes handwriting) “… At this time, the image processing apparatus 101 extracts (specifies) handwriting pixels (pixel positions) in the processing target image by inference by the neural network for handwriting pixel extraction using the learning result generated by the learning apparatus 102, thereby obtaining a handwriting extracted image.” Ikeda)
(Paragraph 77 also talks about estimating a handwritten area image that indicates an approximate shape (the width and height of the handwritten area). “The estimated handwriting area is represented as position information indicating the position of the area where the handwritten character string is entered. For example, the handwriting area is expressed as information formed by a specific pixel position (for example, the upper left coordinates of the handwritten character area) of the area and the width and height from the specific pixel position. A plurality of handwriting areas may be obtained in accordance with the number of items entered in a form.” Ikeda)
“a determination unit configured to determine a line boundary of handwritten characters using the handwritten area image from which the specified handwritten character is excluded;” Examiner interprets “a line boundary” as any edge used to encompass a handwritten area. Fig 4F shows a handwritten area being surrounded by a plurality of lines, which are also the line boundaries. The handwritten area is encompassed by vertical and horizontal lines. The “broken line frames” in this case are the line boundaries. (“[0340] In S357, the image conversion unit 114 performs handwriting extraction for the processing target image based on the learning model for handwriting pixel extraction obtained in S356, and also performs handwriting area estimation for the thus obtained handwriting extracted image using the learning model for handwriting area estimation. FIG. 4F shows examples of handwriting areas obtained by the handwriting area estimation.
[0341] Referring to FIG. 4F, as indicated by broken line frames, handwriting areas 481 to 485 are obtained as handwriting areas on a handwriting extracted image 480, and each handwriting area includes one handwritten character line.” Ikeda)
Ikeda does not teach “a specification unit configured to specify, from among the handwritten characters, a handwritten character whose circumscribed rectangle has a height that is higher than a value determined based on circumscribed rectangles of other handwritten characters;”, “an exclusion unit configured to exclude, from the handwritten area image, an image corresponding to the specified handwritten character” and “a separation unit configured to separate, based on the determined line boundary, a handwritten area corresponding to the handwritten area image into a plurality of lines.”
Yoshida teaches “a specification unit configured to specify, from among the handwritten characters, a handwritten character whose circumscribed rectangle has a height that is higher than a value determined based on circumscribed rectangles of other handwritten characters;” (See paragraphs 196-199 and fig. 8. Paragraph 196 says “When the distance L1 is smaller than the threshold (or smaller than or equal to the threshold), the display position control unit 25 determines whether the first text data 101 and the handwritten data 03 overlap when viewed in a horizontal direction or a vertical direction. In FIG. 8 (a), the coordinates of the upper left corner of the circumscribing rectangle of the first text data 101A are (x.sub.1, y.sub.1) and the coordinates of the lower right corner are (x.sub.2, y.sub.2). Therefore, it is determined that the first text data 101 and the handwritten data 03 overlap when viewed in a horizontal direction, when y.sub.1 or y.sub.2 falls within the height (between y.sub.3 and y.sub.4) of the circumscribing rectangle of the handwritten data 03, i.e., is greater than or equal to y.sub.3 and smaller than or equal to y.sub.4.” As seen in fig. 8 and in the cited paragraphs, this covers the broadest reasonable interpretation of the claim language, as it shows a case in which the height is higher than a value from the other circumscribed rectangles; when the overlapping decision is made, the handwritten character is implicitly specified. Yoshida) Yoshida also teaches “an exclusion unit configured to exclude, from the handwritten area image, an image corresponding to the specified handwritten character” (See paragraphs 196-199 and fig. 8. As seen in fig. 8 (a), when the circumscribed rectangle of a character is specified (unit 03 in this case), that original image from the handwritten area is excluded, as shown in fig. 8 (b) (unit 03 was removed and another image was put in). Yoshida)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ikeda with the teachings of Yoshida to specify a handwritten character whose circumscribed rectangle has a height that is higher than a value from the other circumscribed rectangles and to exclude it. The modification would have been motivated by the desire to correctly display the specified character so that it can be read, by correcting overlapping, as suggested by Yoshida (See fig. 8 and paragraphs 196-199: “[0198] … When the display position control unit 25 thus determines that overlapping when viewed in a horizontal direction occurs, second text data 102 that is converted from the handwritten data 03 is continuously displayed without a space (a “space” means a space that is used to represent a word separation or a space from another character) at the right edge of the first text data 101A, using text data information depicted in Table 1 (d). [0199] Here, a character size of text data is automatically determined according to a size of a circumscribing rectangle of handwritten data. Therefore, a size of the first text data 101A does not necessarily correspond to a size of the second text data 102…This allows the display apparatus 2 to display aligned text data for a user to easily read.” Yoshida).
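The overlap determination quoted from Yoshida paragraph 196 is a plain coordinate-range check. As a hypothetical Python sketch (rectangles reduced to their vertical extents; names are the examiner's, not Yoshida's):

```python
def overlaps_horizontally(text_rect, handwriting_rect):
    """Per the passage quoted from Yoshida par. 196: the text data and the
    handwritten data overlap when viewed in a horizontal direction if
    either y coordinate of the text rectangle falls within the height of
    the handwriting's circumscribing rectangle."""
    y1, y2 = text_rect          # top and bottom of the text rectangle
    y3, y4 = handwriting_rect   # top and bottom of the handwriting rectangle
    return (y3 <= y1 <= y4) or (y3 <= y2 <= y4)
```

The check itself is a comparison of measured coordinates, consistent with the mental-process analysis above.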
Salameh teaches “a separation unit configured to separate, based on the determined line boundary, a handwritten area corresponding to the handwritten area image into a plurality of lines.” (Figs 2A, 2B, and 2C show how each handwritten area is separated based on the line boundary (which is also a plurality of lines); Figs 2B and 2C show each area separated. Salameh)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ikeda and Yoshida with the teachings of Salameh to determine a line boundary and line direction and then separate each area. The modification would have been motivated by the desire to correctly divide each character in order to read the complete, correct character with improved accuracy, as suggested by Salameh (“[0017] Traditional OCR techniques focus on separated characters (e.g., typed English characters). Accordingly, the error rate of the traditional OCR techniques in recognizing connected characters is much lower in languages using the Latin alphabet (e.g., English, French, German), than languages using the Arabic alphabet (e.g., Arabic, Farsi, Azerbaijani, Pashto, Kurdish, Lurish, Mandinka, etc.). Higher error rates can be costly. For example, an incorrect OCR can cause a wrong classification, or even loss of data. As another example, a wrongly recognized character may cause misleading information that can cause unintended consequences. Further, additional technical resources may need be expended to improve accuracy of OCR with such languages.” Salameh).
Claim 13 is rejected under the same analysis as claim 1. (See figs. 1 and 2A-2D)
As per claim 2, Ikeda in view of Yoshida and Salameh already teaches “The image processing system according to claim 1,”; however, only Ikeda teaches “wherein the extraction unit further extracts a handwritten character image from the target image, the one or more processors being configured to operate as:
a learning unit configured to generate a learning model using learning data associating a handwritten character image and a handwritten area image that are extracted from an original sample image”
“wherein the extraction unit extracts the handwritten character image and the handwritten area image using the learning model generated by the learning unit.” (Ikeda uses a learning apparatus to generate learning data that includes a handwritten image and handwritten area image. “[0076] The learning apparatus 102 functions as an image accumulation unit 115 that accumulates the document sample images generated by the image processing apparatus 101. The learning apparatus 102 also functions as a learning data generation unit 112 that generates learning data from the thus accumulated images. The learning data is data to be used to cause each of the neural networks for handwriting extraction and handwriting area estimation to perform learning.”
The abstract also describes that it uses a learning model that extracts handwriting (pixels) and estimates a handwriting area. Abstract: “using a first learning model for extracting the pixels of the handwritten character, estimates a handwriting area including the handwritten character using a second learning model for estimating the handwriting area, and performs handwriting OCR processing based on the generated first image and the estimated handwriting area.” Ikeda)
As per claim 3, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 2”; however, only Ikeda teaches “the one or more processors being configured to operate as
“a setting unit configured to set from the original sample image a handwritten character image and a handwritten area in accordance with a user input”, (Ikeda teaches that a user can input a handwritten image and handwritten area by creating a ground truth. Paragraph 146 “In step S644, the CPU 231 determines whether the ground truth data input by the user is ground truth data of handwriting extraction. If the user performs an operation of instructing to create ground truth data of handwriting extraction (the user selects the extraction button 525), the CPU 231 determines YES and the process transitions to step S645; otherwise, that is, if the ground truth data input by the user is ground truth data of handwriting area estimation (the user selects the estimation button 526), the process transitions to step S646.” Ikeda)
“wherein the learning unit generates, for each character in the handwritten character image set by the setting unit, ground truth data for a handwritten area image”(Paragraph 160 “In step S706, the CPU 231 extracts part of the handwriting area estimation ground truth image read out in step S703, thereby generating a ground truth label image (to be referred to as a “handwriting area estimation ground truth label image” hereinafter) to be used for learning data of handwriting area estimation.” Ikeda)
“by overlapping an expansion image subjected to an expansion process in a horizontal direction and a reduction image in which a circumscribed rectangle encompassing a character of the handwritten character image has been reduced in a vertical direction,”
Examiner interprets “expansion image” as a rectangle that includes handwritten characters.
Fig 5F shows a handwriting area (a rectangle) including handwritten characters that can be reduced or expanded “[0349] A ground truth data addition instruction by the user in S305 of FIG. 3A is issued via this screen. The user performs an operation based on display contents of the ground truth data creation screen to instruct to create ground truth data. As shown in FIG. 5F, the ground truth data creation screen 560 includes an image display area 561, an image selection button 562, an enlargement button 563, a reduction button 564, a selection button 565, an excluding button 566, and a save button 567.” Ikeda)
( Paragraph 352 and 353 teach that it can include the whole character or exclude to use only a single character. [0352] “The user operates a cursor via the input device 236, as shown in FIG. 5F, to select an area including handwriting in the sample image displayed in the image display area 561. At this time, the user makes a selection so the handwritten characters included in the selected rectangular area do not have a line position non-coincidence multi-column. That is, the user performs an operation so that the selected rectangular area includes only one character line in one column. Upon receiving this operation, the learning data generation unit 112 records, as ground truth data, the area selected by this operation. That is, the ground truth data includes an area (one-line handwriting area) on the sample image.” Ikeda)
(“[0353] On the other hand, if the excluding button 566 is selected, the user can select, as an exclusion target, part of the selected handwriting area, such as a printed character or symbol included in the selected handwriting area.” Ikeda)
“and generates a learning model using the generated ground truth data.” (Paragraph 84 “The ground truth data is data to be used for learning of a neural network.”)
As per claim 7, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 1,”; however, Ikeda also teaches “the one or more processors being configured to operate as:
a character recognition unit configured to, for each handwritten area separated by the separation unit, perform an OCR process on a corresponding handwritten character image and output text data that corresponds to a handwritten character.” (Paragraph 72 describes performing OCR on a handwriting area (the “corresponding handwritten character image”) and converting it into text (the output “text data”). “[0072] Furthermore, estimation of an area including a handwritten character will be referred to as “handwriting area estimation” hereinafter. An area obtained by handwriting area estimation will be referred to as a “handwriting area” hereinafter. A handwriting area in a scan image can be recognized by handwriting OCR and converted into text.” Ikeda)
As per claim 8, Ikeda in view of Yoshida and Salameh already teaches “The image processing system according to claim 7”, however Salameh also teaches “wherein the extraction unit further extracts a printed character image included in the target image and a printed character area encompassing a printed character, and
the character recognition unit further performs an OCR process on the printed character image included in the printed character area and outputs text data corresponding to a printed character.” (Fig 4 shows extracted printed characters and the OCR output is 402. Salameh)
As per claim 9, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 8,”, however Ikeda also teaches “further comprising:
an estimation unit configured to estimate relevance between a result of recognition of a handwritten character and a result of recognition of a printed character by the character recognition unit using at least one of content of text data according to the recognition results and positions of the handwritten character and the printed character in the target image.” (Paragraph 193 demonstrates that it estimates the relevance (relationship) between a result for a handwritten character and a result for a printed character; the “using at least one of content of text data… and positions…” in Ikeda is the semantic relationship and the positional relationship. “[0193] In step S961, the CPU 261 integrates the handwriting OCR result and the printed character OCR result respectively received from the handwriting OCR unit 116 and the printed character OCR unit 117. The CPU 261 estimates the relationship between the handwriting OCR result and the printed character OCR result by evaluating the positional relationship between the original handwriting areas and printed character areas and the semantic relationship between the text data as the handwriting OCR result and the printed character OCR result.” Ikeda)
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Ikeda in view of Yoshida and Salameh, and further in view of Ren et al., hereafter Ren (US Publication No. 20200242389 A1).
As per claim 11, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 1”, however only Yoshida teaches “wherein the specification unit includes: a unit configured to, for each circumscribed rectangle of a plurality of the handwritten character extracted by the extraction unit, generate a single line image in which a height of a circumscribed rectangle that is a determination target is made to be a standard;” (See paragraphs 196-199 and fig. 8, “[0198]… When the display position control unit 25 thus determines that overlapping when viewed in a horizontal direction occurs, second text data 102 that is converted from the handwritten data 03 is continuously displayed without a space (a “space” means a space that is used to represent a word separation or a space from another character) at the right edge of the first text data 101A, using text data information depicted in Table 1 (d). More specifically, the display position control unit 25 causes the upper right corner of the circumscribing rectangle of the first text data 101A to be coincident with the upper left corner of the circumscribing rectangle of the second text data 102, and causes the lower right corner of the circumscribing rectangle of the first text data 101A to be coincident with the lower left corner of the circumscribing rectangle of the second text data 102, when displaying the second text data 102. FIG. 8 (b) depicts the second text data 102 aligned with the first text data 101A. ” As seen in fig. 8 (b), a single line image is generated based on the circumscribed rectangle height chosen to be the standard. Yoshida) “a unit configured to compare a height of a circumscribed rectangle of a handwritten character included in the generated single line image and a threshold based on the height of the circumscribed rectangle that is the determination target” (See paragraphs 196-200 and fig. 8. 
The heights in the cases are also interpreted as thresholds based on the height of the determination target. “[0197] The display position control unit 25 may not only determine whether the first text data 101A overlaps with the handwritten data 03 at least in part, but may add an overlap rate to the predetermined conditions. A rate of overlapping when viewed in a horizontal direction is calculated as (y.sub.2−y.sub.3)/(y.sub.2−y.sub.1) for a case of y.sub.1<y.sub.3<y.sub.2 and is calculated as (y.sub.4−y.sub.1)/(y.sub.2−y.sub.1) for a case of y.sub.1<y.sub.4<y.sub.2, for example. In this case, the display position control unit 25 determines that the predetermined condition (ii) is satisfied when the two sets of text data overlap when viewed in a horizontal direction and the overlap rate is greater than or equal to a threshold (or is greater than the threshold).” Yoshida) , however Ikeda in view of Yoshida and Salameh does not teach “and counts the number of circumscribed rectangles that is greater than or equal to the threshold and the number of circumscribed rectangles that is less than the threshold; and”, “a unit configured to specify the handwritten character that is the determination target for which the number of circumscribed rectangles greater than or equal to the threshold is larger than the number of circumscribed rectangle that is less than the threshold.”
Ren teaches “a unit configured to compare a height of a circumscribed rectangle of a handwritten character included in the generated single line image and a threshold based on the height of the circumscribed rectangle that is the determination target and counts the number of circumscribed rectangles that is greater than or equal to the threshold and the number of circumscribed rectangles that is less than the threshold; and” (Fig. 5 shows the number of rectangles included as actual characters based on a threshold that determines if a character is included or not. “[0046] FIG. 5 illustrates another example of a character recognition result by the character recognition apparatus 10. This example corresponds to the input image 100 exemplarily illustrated in FIG. 3. For the eleven characters disposed side by side in the order from the left in the input image 100, the noise determination unit 16 determines that the eleven characters correspond to actual characters based on the three components of the size, the distance from the nearest character, and the confidence level.” Ren also teaches the use of a range that uses the character size (circumscribed rectangle size) to determine a threshold (limit) “[0063] In an example, a classification model includes information indicating a range of values of an element when a character is an actual character, for each of three elements (that is, coordinate components) of the size of a character in a feature vector of a character to be recognized, the distance from the nearest character, and the confidence level. The information indicating the range is information indicating the upper limit and the lower limit of the range. Alternatively, the information indicating the range may determine one of the upper limit and the lower limit. In this case, the other one of the upper limit and the lower limit is a value of the lowermost limit or the uppermost limit of values available for the element. 
In this example, if the value of at least one element of elements of the feature vector obtained by the feature-vector calculation unit 14 for each of a character having a character recognition result, that is, the size of the character, the distance from the nearest character, and the confidence level, is not within the range corresponding to the element, the noise determination unit 16 determines that the character corresponds to a noise. In contrast, if all the three elements of the feature vector fall within the ranges corresponding to the respective elements, the character is an actual character.” Ren) “a unit configured to specify the handwritten character that is the determination target for which the number of circumscribed rectangles greater than or equal to the threshold is larger than the number of circumscribed rectangle that is less than the threshold.” (See fig. 5 and paragraphs 46 and 63. Fig. 5 also shows the number of rectangles that are counted as being outliers. Paragraph 46: “In addition, the noise determination unit 16 determines the residual five character recognition results in the frame 112 as noises based on the size, the distance from the nearest character (that is, from “custom-character (ban)”), and the confidence level of each character recognition result.” Ren)
It would have been obvious to one of ordinary skill in the art before the effective filing
date of the claimed invention to combine the teachings of Ikeda with the teachings of Yoshida, Salameh, and Ren to count the circumscribed rectangles and specify the determination target based on the larger number of circumscribed rectangles and the threshold. The modification would have been motivated by the desire to correctly specify handwritten characters and not noise, as suggested by Ren (See fig. 5, which shows the number counted as outliers, along with paragraphs 46, 53, and 63; those with lower values based on circumscribed rectangle values are considered noise and are removed from the image; therefore, the goal is to have only handwritten characters in the result. “[0046]… For the eleven characters disposed side by side in the order from the left in the input image 100, the noise determination unit 16 determines that the eleven characters correspond to actual characters based on the three components of the size, the distance from the nearest character, and the confidence level.” Ren).
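For illustration only (this sketch is not part of the record or the claim mapping; the function name and the 0.5 ratio are assumptions, the latter chosen only because a half-height threshold appears later in the Nakao rejection), the counting-and-specifying logic recited in claim 11 could be sketched as:

```python
def specify_by_count(target_height, line_heights, ratio=0.5):
    """Compare each circumscribed-rectangle height in a single-line image
    against a threshold based on the determination target's height (an
    assumed ratio * target_height), count the rectangles at or above the
    threshold and those below it, and specify the target when the
    'greater than or equal' count is the larger of the two.
    """
    threshold = ratio * target_height
    ge = sum(1 for h in line_heights if h >= threshold)  # count >= threshold
    lt = len(line_heights) - ge                          # count < threshold
    return ge > lt, ge, lt
```

Under these assumptions, a tall determination target surrounded mostly by rectangles that still clear the derived threshold is specified; a target surrounded mostly by much smaller rectangles is not.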
Claims 4, 5, and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Ikeda in view of Yoshida and Salameh, and further in view of Nakao et al., hereafter Nakao (US Patent No. 6064769 A).
As per claim 4, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 1, wherein the extraction unit overlaps an image for which an expansion process in a horizontal direction and a reduction process in a vertical direction have been performed on a circumscribed rectangle encompassing a character of the extracted handwritten character image”; however, Ikeda in view of Yoshida and Salameh does not teach “and a line connecting a center of gravity of the circumscribed rectangle between adjacent circumscribed rectangles, and extracts a result as the handwritten area image.”
Nakao teaches “ and a line connecting a center of gravity of the circumscribed rectangle between adjacent circumscribed rectangles, and extracts a result as the handwritten area image.” (Examiner interprets center of gravity as the center position. Column 5 Lines 20-35 “a rectangle center computing unit for computing center position of the circumscribed rectangles from the position data extracted by the rectangle position extracting unit; a slant computing unit for obtaining a slant value of a line connecting centers of rectangles computed by the rectangle center computing unit; and a first linking unit for judging whether an absolute value of the slant value is not less than a first threshold value and linking a pair of arrays of continuous first pixels of the rectangles if the slant value is not less than the first threshold value.” “With such construction, partly-divided characters such as "i" and "j" are correctly extracted.” Nakao)
It would have been obvious to one of ordinary skill in the art before the effective filing
date of the claimed invention to combine the teachings of Ikeda with the teachings of Yoshida, Salameh, and Nakao to connect the center of each adjacent circumscribed rectangle. The modification would have been motivated by the desire to correctly divide each character so that the complete, correct character can be read with improved accuracy, as suggested by Nakao (See Column 5, Lines 20-35: “With such construction, partly-divided characters such as "i" and "j" are correctly extracted.” Nakao).
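For illustration only (this sketch is not part of the record; the (x, y, w, h) rectangle representation and the function name are assumptions, and it simplifies Nakao's rectangle center and slant computing units to the geometry alone), the center-connecting step could be sketched as:

```python
def centers_and_slants(rects):
    """Compute the center position of each circumscribed rectangle
    (interpreted, per the rejection, as its 'center of gravity') and the
    slant of the line connecting each adjacent pair of centers, as in
    Nakao's rectangle center and slant computing units.
    """
    centers = [(x + w / 2.0, y + h / 2.0) for x, y, w, h in rects]
    slants = []
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        dx = x1 - x0
        # Vertically stacked parts (e.g., the dot and body of "i") give an
        # infinite slant, which in Nakao triggers linking of the parts.
        slants.append(float('inf') if dx == 0 else (y1 - y0) / dx)
    return centers, slants
```

A large slant between adjacent centers indicates vertically stacked fragments of one character rather than two side-by-side characters, which is the condition Nakao's first linking unit tests against its threshold.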
As per claim 5, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 3, wherein the determination unit specifies…” and “specifies a space between two specified lines as a candidate interval in which there is a line boundary, and determines as a boundary in the candidate interval a line whose frequency of a pixel indicating a handwritten area is the lowest.” (See Figs. 4C and 4D; paragraph 124 discusses the interval between handwritten areas (the candidate interval) not being a handwriting area, this being the area with the lowest frequency of pixels (a value of 0). “Pixels corresponding to the handwriting area selected by the user have a value (for example, 255) (the same applies to the following) indicating a handwriting area. The remaining pixels have a value (for example, 0) (the same applies to the following) indicating not a handwriting area. Such image as the ground truth data of handwriting area estimation will be referred to as a “handwriting area estimation ground truth image” hereinafter. FIG. 4D shows an example of the handwriting area estimation ground truth image.” Ikeda); however, Ikeda in view of Yoshida and Salameh does not teach “specifies a line connecting the center of gravity of the circumscribed rectangle of each character between adjacent circumscribed rectangles”.
Nakao teaches “specifies a line connecting the center of gravity of the circumscribed rectangle of each character between adjacent circumscribed rectangles” (Examiner interprets center of gravity as the center position. Column 5 Lines 20-35 “a rectangle center computing unit for computing center position of the circumscribed rectangles from the position data extracted by the rectangle position extracting unit; a slant computing unit for obtaining a slant value of a line connecting centers of rectangles computed by the rectangle center computing unit; and a first linking unit for judging whether an absolute value of the slant value is not less than a first threshold value and linking a pair of arrays of continuous first pixels of the rectangles if the slant value is not less than the first threshold value… With such construction, partly-divided characters such as "i" and "j" are correctly extracted.” Nakao)
It would have been obvious to one of ordinary skill in the art before the effective filing
date of the claimed invention to combine the teachings of Ikeda with the teachings of Yoshida, Salameh, and Nakao to connect the center of each adjacent circumscribed rectangle. The modification would have been motivated by the desire to correctly divide each character so that the complete, correct character can be read with improved accuracy, as suggested by Nakao (See Column 5, Lines 20-27: “…With such construction, partly-divided characters such as "i" and "j" are correctly extracted.” Nakao).
Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Ikeda in view of Yoshida, Salameh, and Ren, and further in view of Nakao et al., hereafter Nakao (US Patent No. 6064769 A).
As per claim 12, Ikeda in view of Yoshida, Salameh, and Ren already teaches “the image processing system according to claim 11 wherein the threshold is set…”; however, Ikeda in view of Yoshida, Salameh, and Ren does not teach “wherein the threshold is set to a value that is approximately half the height of the circumscribed rectangle that is the determination target.”
Nakao teaches “wherein the threshold is set to a value that is approximately half the height of the circumscribed rectangle that is the determination target.” (Column 12 Lines 1-4 “ In the character recognition apparatus, the threshold value used in the part extracting unit may be a half of a maximum value of heights of the circumscribed rectangles of the character images.” Nakao)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Ikeda with the teachings of Yoshida, Salameh, Ren, and Nakao to set the threshold to half the height of the circumscribed rectangle. The modification would have been motivated by substituting Yoshida’s threshold method (simply based on the height of the circumscribed rectangle) with Nakao’s threshold method (half the height) (Column 12, Lines 1-4, Nakao). It would have been predictable that Nakao’s threshold function could have been used, and would have succeeded, instead of Yoshida’s, since both have the purpose of presenting more readable text (See Column 35, Lines 50-67: “Then, word rectangle extracting unit 4503 compares each obtained distance with threshold value d. The threshold value d is, for example, a half of the height of a standard character rectangle. If an obtained value is equal to the threshold value d or more, the character row image is divided into parts having the distance in between, and circumscribed rectangles of the parts (hereinafter called word rectangles) are respectively extracted. For example, distance d2 between character images "1" and "S" is more than threshold value d as shown in FIG. 46A, and word rectangles 4606 and 4607 are extracted from character row image 4601 as shown in FIG. 46B.” Nakao). See MPEP § 2143(b). In addition, it is used to improve accuracy, as seen in Column 12, Lines 1-10 (“(80) With such construction, since the shape evaluation values are not obtained when it is judged that any of the candidate characters does not have any similar characters, the processing speed increases without decreasing the accuracy of character recognition for a text including similar characters.” Nakao).
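For illustration only (this sketch is not part of the record; the (x, y, w, h) rectangle representation and the function name are assumptions), Nakao's half-height threshold as applied in the word-rectangle passage quoted above could be sketched as:

```python
def split_words(rects):
    """Split a character row into word groups using Nakao's rule: the
    threshold d is half of the (here, maximum) character height, and a gap
    between adjacent circumscribed rectangles (x, y, w, h) that is >= d
    starts a new word rectangle.
    """
    d = max(h for _, _, _, h in rects) / 2.0  # half the character height
    words, current = [], [rects[0]]
    for prev, cur in zip(rects, rects[1:]):
        gap = cur[0] - (prev[0] + prev[2])    # horizontal distance between rectangles
        if gap >= d:
            words.append(current)
            current = []
        current.append(cur)
    words.append(current)
    return words
```

Under these assumptions, rectangles separated by less than half a character height stay in one word, and a wider gap (like the d2 gap between "1" and "S" in Nakao's FIG. 46A) starts a new one.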
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Ikeda in view of Yoshida and Salameh, and further in view of Minagawa (US Publication No. 20220291828 A1).
As per claim 6, Ikeda in view of Yoshida and Salameh already teaches “the image processing system according to claim 1,”; however, Ikeda in view of Yoshida and Salameh does not teach “wherein in a case where a height of the handwritten area that is a processing target is higher than a predetermined threshold based on an average of a height of a circumscribed rectangle corresponding to each of a plurality of characters included in the handwritten area, the determination unit determines that handwriting of a plurality of lines is included in the handwritten area.”
Minagawa teaches “wherein in a case where a height of the handwritten area that is a processing target is higher than a predetermined threshold based on an average of a height of a circumscribed rectangle corresponding to each of a plurality of characters included in the handwritten area, the determination unit determines that handwriting of a plurality of lines is included in the handwritten area.” (Paragraph 87 discusses using the circumscribed rectangles’ height, which can also be an average value, to expand the frame and include more handwriting. Paragraph 94 then discusses using the size corresponding to a single character as the threshold, which paragraph 87 specifies can be an average value.)
(“[0087] FIG. 7 is a diagram illustrating a method of determining a character size, which is a size corresponding to a single character, according to the present embodiment. The character size calculation unit 307 determines the largest width and the largest height in the stroke data already input as the handwriting input as a width and a height of the character size, which is a size corresponding to a single character. In other words, the character size calculation unit 307 obtains a circumscribed rectangle of each of strokes represented by the stroke data and extracts the maximum width and the maximum height. When expanding the frame 402 in the horizontal direction, the drawing object processing unit 308 expands the frame 402 by at least the maximum width to the right. When the frame 402 is expanded in a downward direction that is a direction perpendicular to the handwriting direction, the drawing object processing unit 308 expands the frame 402 by at least the maximum height. Note that the drawing, object processing unit 308 may adopt a minimum value, an average value, or a median value in alternative to a maximum value.” Minagawa) (“[0094] Next, the drawing object processing unit 308 determines whether there is space for a character to be written next in the frame 402 (S3). In the present embodiment, the character size, which is a size corresponding to a single character, calculated at step S2 is used as a threshold by the drawing object processing unit 308 to determine whether there is space or not. When space between the right end of the rightmost character of the handwriting input and the right side of the frame 402 is less than the character size, the drawing object processing unit 308 determines that there is not enough space (absence of the space).” Minagawa )
It would have been obvious to one of ordinary skill in the art before the effective filing
date of the claimed invention to combine the teachings of Ikeda with the teachings of Yoshida, Salameh, and Minagawa to detect whether handwriting of a plurality of lines is included in an area based on a threshold. The modification would have been motivated by the desire to correctly encompass all of the characters so that the complete, correct character can be read with improved accuracy, as suggested by Minagawa (Paragraph 94: “In the present embodiment, the character size, which is a size corresponding to a single character, calculated at step S2 is used as a threshold by the drawing object processing unit 308 to determine whether there is space or not.” Minagawa).
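For illustration only (this sketch is not part of the record or the claim mapping; the function name and the 1.5 multiplier are assumptions), the multi-line determination recited in claim 6 could be sketched as:

```python
def is_multi_line(area_height, char_heights, margin=1.5):
    """Determine that a handwritten area contains a plurality of lines when
    its height exceeds a threshold based on the average height of the
    circumscribed rectangles of the characters it contains (the margin
    multiplier is an assumed value).
    """
    avg = sum(char_heights) / len(char_heights)  # average circumscribed-rectangle height
    return area_height > margin * avg            # taller than ~1.5 characters -> multiple lines
```

Under these assumptions, an area roughly one character tall is treated as a single line, while an area well over the average character height is flagged as containing multiple lines.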
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DYLAN J MENDEZ MUNIZ whose telephone number is (703)756-5672. The examiner can normally be reached M-F, 8AM - 5PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Moyer can be reached at (571) 272-9523. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DYLAN JOHN MENDEZ MUNIZ/Examiner, Art Unit 2675
/ANDREW M MOYER/Supervisory Patent Examiner, Art Unit 2675