DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
Applicant’s Amendments filed on 02/05/2026 have been entered and made of record.
Currently pending Claim(s)
1, 3–9, and 11–17
Independent Claim(s)
1, 9, 17
Canceled Claim(s)
2, 10
Response to Arguments
This office action is responsive to Applicant’s Arguments/Remarks Made in an Amendment received on February 5, 2026.
Regarding the rejections made under 35 USC 103, Applicant’s Arguments/Remarks with respect to independent claims 1, 9, and 17, on pages 10–14, have been fully considered but they are not persuasive.
The Applicant argues, in summary, that Lee (“Data Augmentations for Document Images”) does not teach the following limitation:
[media_image1.png: claim limitation quoted from Applicant’s Remarks, reproduced as a greyscale image]
The Examiner respectfully disagrees. The Applicant argues, on page 11, the following:
[media_image2.png: Applicant’s argument from page 11, reproduced as a greyscale image]
The Applicant argues that Lee does not perform pixel-wise masking. However, Lee discloses bounding boxes or mask regions that use a fill_value matrix to fill the binary mask with 0, meaning black, or 1, meaning white [pg. 4, left column of section DocCutout, second paragraph]. In addition, B is defined as the bounding box area that includes the masking box coordinates used to fill the mask region [pg. 4, left column of section DocCutout, second paragraph]. Under the broadest reasonable interpretation, Lee discloses pixel-wise masking because the bounding box region or mask region uses a fill_value matrix (F) that fills only the pixel values inside the designated region. Furthermore, concerning the Applicant’s arguments that Lee is incompatible with pixel-wise foreground manipulation because the masking process does not alter pixels and preserves semantic consistency for training, the claims do not explicitly recite altering pixels for pixel-wise foreground masking. If the Applicant believes that altering pixels, in combination with the remaining limitations of the claim, makes the claimed invention patentably distinct, the Examiner advises the Applicant to incorporate this element into the claim limitations.
The Applicant continues, at the top of page 12, as follows:
[media_image3.png: Applicant’s argument from the top of page 12, reproduced as a greyscale image]
The Applicant argues that the claimed invention discloses the masks as “pixel-wise white patches applied only to the foreground.” However, the claim does not explicitly recite applying pixel-wise white patches only to the foreground while excluding the background. If the Applicant believes that this element, in combination with the remaining limitations of the claim, makes the claimed invention patentably distinct, the Examiner advises the Applicant to incorporate it into the claim limitations. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. With regard to Lee, under the broadest reasonable interpretation, the bounding boxes distinguish the foreground from the background and generate pixel-wise white patches [Figure 1b].
[media_image4.png: Figure 1b of the Lee reference, reproduced as a greyscale image]
The Applicant argues, in the middle of page 12, the following:
[media_image3.png: Applicant’s argument from the middle of page 12, reproduced as a greyscale image]
The Examiner respectfully disagrees. Lee discloses pixel-wise masking because it discloses filling pixel values with white or black values within a bounding box. Lee uses an element-wise multiplication operator to generate and fill the mask region, which discloses generating and filling mask regions at the pixel-wise level [pg. 4, left column of section DocCutout, second paragraph]. In addition, as previously stated, the claimed invention does not explicitly recite applying the pixel-wise masking selectively to the foreground while excluding the background.
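For illustration only, the element-wise masking operation described in the cited DocCutout passage (a binary mask combined with a fill_value matrix F via element-wise multiplication) may be sketched as follows. This is a minimal numpy sketch; the function name, the (rows, columns) coordinate convention, and the toy arrays are illustrative assumptions, not Lee’s implementation:

```python
import numpy as np

def apply_cutout_mask(image, box, fill_value=1.0):
    """Illustrative pixel-wise masking in the style of Lee's DocCutout:
    a binary mask M zeroes the bounding-box region, and an element-wise
    (Hadamard) product combines the image with a fill_value matrix F.
    `box` = (y0, y1, x0, x1) is a hypothetical coordinate convention."""
    mask = np.ones_like(image)              # M: 1 outside the box, 0 inside
    y0, y1, x0, x1 = box
    mask[y0:y1, x0:x1] = 0.0
    fill = np.full_like(image, fill_value)  # F: 1.0 -> white, 0.0 -> black
    # Element-wise combination fills only pixels inside the designated region
    return mask * image + (1.0 - mask) * fill

img = np.zeros((4, 6))                      # all-black toy "document"
out = apply_cutout_mask(img, (1, 3, 2, 5), fill_value=1.0)
assert (out[1:3, 2:5] == 1.0).all()         # masked region is white
assert (out[0, :] == 0.0).all()             # pixels outside are unchanged
```

Because the mask multiplies the image element-wise, only pixel values inside the designated bounding-box region are replaced by the fill value, which is the pixel-wise behavior discussed above.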
The Applicant argues, from the bottom of page 12 to the top of page 13, as follows:
[media_image5.png and media_image6.png: Applicant’s argument spanning the bottom of page 12 and the top of page 13, reproduced as greyscale images]
The Applicant argues that Lee’s augmentation is performed before inference, during model training, while the step of “randomly applying” masks to the final stylized image is a post-processing element. The Examiner respectfully disagrees. According to the Applicant’s disclosure, randomly applying pixel-wise masks to the final stylized image generates an image sample for training. “One or more pixel-wise masks may be randomly applied on the final stylized image ‘IF’ to obtain an image sample [...] the image sample obtained by applying the one or more pixel-wise masks may correspond to an artificial masked image ‘IM’ [...] the artificial masked image ‘IM’ may be provided as an input to the image binarization model 108 in order to train the image binarization model 108” [¶032]. The disclosure explicitly states that the image sample generated by randomly applying pixel-wise masks is used to train the image binarization model. Lee likewise discloses training by creating image samples with data augmentation [pg. 1, Abstract; pg. 3, right column of section DocCutout, first paragraph].
The Applicant argues, on page 13, the following:
[media_image6.png and media_image7.png: Applicant’s argument from page 13, reproduced as greyscale images]
In response to the Applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. Furthermore, there is motivation to combine Lee with Huang and Hu, and that motivation is grounded in data augmentation. Both Hu and Lee disclose data augmentation for generating additional training images. Hu discloses augmentation of the input image and the later annotated image, which is used to generate the final binarization result [¶00119-00121], while Lee discloses data augmentation to augment limited training data and improve the generalization capabilities of models [pg. 1, left column of section 1 Introduction, first paragraph]. Therefore, it would have been obvious to one of ordinary skill in the art to combine the references to obtain the claimed invention.
Claim Objections
Claims 1, 9, and 17 are objected to because of the following informalities:
Claim 1: in line 23, “obtain an image sample” should be changed to “to obtain an image sample.”
Claim 9: in line 27, “obtain an image sample” should be changed to “to obtain an image sample.”
Claim 17: in line 26, “obtain an image sample” should be changed to “to obtain an image sample.”
Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 3, 8, 9, 11, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. ("Arbitrary style transfer in real-time with adaptive instance normalization." Proceedings of the IEEE international conference on computer vision. 2017) (hereafter, “Huang”) in view of Hu et al. (CN 112837329 B) (hereafter, “Hu”) and further in view of Lee et al. (Lee, Yunsung, Teakgyu Hong, and Seungryong Kim. "Data Augmentations for Document Images." SDU@ AAAI. 2021) (hereafter, “Lee”).
Regarding claim 1, Huang discloses receiving, by a TransferNet framework, a source image and a corresponding target image from an image dataset [using MS-COCO [36] as content images and a dataset of paintings mostly collected from WikiArt [39] as style images, pg. 1505, 6.2 Training] via at least one encoder model [Figure 2; our style transfer network T takes a content image c (the examiner interprets the content image as the source image) and an arbitrary style image s (the examiner interprets the style image as the target image) as inputs, pg. 1504, 6.1 Architecture], and wherein the TransferNet framework comprises the at least one encoder model, an Adaptive Instance Normalization (AdaIN) module, and a decoder model [Figure 2; adopt a simple encoder-decoder architecture ... after encoding ... we feed both feature maps to an AdaIN layer, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a source image feature map corresponding to the source image and a target image feature map corresponding to the target image via the at least one encoder model [Figure 2; After encoding the content and style images in feature space, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a rough stylized image feature map through the AdaIN module based on each of the source image feature map and the target image feature map [we feed both feature maps to an AdaIN layer that aligns the mean and variance of the content feature maps to those of the style feature maps, producing the target feature maps, pg. 1504, 6.1 Architecture], wherein the rough stylized image feature map comprises a combination of background of the source image and foreground of the target image [Figure 2 & 4; take a content image c and an arbitrary style image s as input, and synthesizes an output image that recombines the content of the former and the style of the latter ... producing the target feature maps t ... decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture]; and transforming, by the TransferNet framework, the rough stylized image feature map into an image form through the decoder model to obtain a rough stylized image [Figure 2; a randomly initialized decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture].
Huang fails to explicitly disclose wherein the target image is a pixel-level ground truth image of the source image, and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image; and randomly applying, by the Refinement Network, one or more pixel-wise masks on the final stylized image to obtain an image sample, wherein each of the one or more pixel-wise masks is a white patch applied to foreground of the final stylized image.
However, Hu teaches wherein the target image is a pixel-level ground truth image of the source image [acquire an image of an ancient Tibetan book document, and perform a binarization process on the image of the Tibetan ancient book document to determine a preliminary binarized image, para 0088], and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image [the second binarized Tibetan ancient book document image generation unit is used to utilize the Ojin binarization algorithm Binarization processing is performed on the reduced Tibetan ancient book document image to generate a second binarized Tibetan ancient book document image; the final binarization result map generation unit is used to integrate the first binarized image, para 00122].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference and incorporate the teachings of Hu to more clearly and accurately display content in the image, as recognized by Hu.
Neither Huang nor Hu appears to explicitly disclose randomly applying, by the Refinement Network, one or more pixel-wise masks on the final stylized image to obtain an image sample, wherein each of the one or more pixel-wise masks is a white patch applied to foreground of the final stylized image.
However, Lee teaches randomly applying [Algorithm 1; we set the probability of applying augmentation to 0.5. DocCutOut has hyper-parameters called fill_value, λ, and patch_ratio, pg. 5, Parameters for DocCutout and DocCutMix, first paragraph], by the Refinement Network, one or more pixel-wise masks on the final stylized image to obtain an image sample [we propose novel augmentation methods, DocCutout and DocCutMix, that are more suitable for document images, by applying the transform to each word unit and thus preserving text semantic feature during augmentation, pg. 1, Abstract ... we mask the visual feature vectors of text images, pg. 4, DocCutOut, second paragraph], wherein each of the one or more pixel-wise masks is a white patch applied to foreground of the final stylized image [0 is replaced with the fill_value matrix F which represents the value to be filled in mask region. We experimented with F for 0, meaning black, and 1, meaning white, pg. 4, DocCutout, second paragraph].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference in view of Hu and incorporate the teachings of Lee to create more realistic data and show better precision [pg. 6, Hyper-parameter search for our methods, second paragraph], as recognized by Lee.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Lee with Huang and Hu to obtain the invention as specified in claim 1.
Regarding claim 3, which incorporates claim 1, Huang fails to explicitly disclose training the image binarization model using the image sample for performing image binarization.
However, Hu teaches training the image binarization model using the image sample for performing image binarization [use the image annotation map of the Tibetan ancient book document and the Tibetan ancient book document image to train the improved U-Net network model, generate a trained U-net network model, para 0095].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference and incorporate the teachings of Hu by training the image binarization model to train the network to focus on salient regions and suppress irrelevant background regions such as noise, as recognized by Hu.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Lee with Huang and Hu to obtain the invention as specified in claim 3.
Regarding claim 8, which incorporates claim 1, Huang discloses generating the rough stylized image feature map comprises: extracting, by the TransferNet framework, features corresponding to background of at least one image of the image dataset from a feature map of each of the at least one image and features corresponding to the foreground of the target image from the target image feature map [Figure 2; an AdaIN layer is used to perform style transfer in the feature space, pg. 1504 ... our style transfer network T takes a content image c and an arbitrary style image s as inputs, and synthesizes an output image that recombines the content of the former and the style of the latter ... after encoding the content and style images in feature space, we feed both feature maps to an AdaIN layer, 5. Adaptive Instance Normalization; pg. 1504, 6.1 Architecture]; and performing, by the TransferNet framework, channel-wise adjustments of mean and variance of the features corresponding to the background of the at least one image to match mean and variance of the features corresponding to the foreground of the target image to obtain the rough stylized image feature map [after encoding the content and style images in feature space, we feed both feature maps to an AdaIN layer that aligns the mean and variance of the content feature maps to those of the style feature maps ... AdaIN receives a content input x and a style input y, and simply aligns the channel-wise mean and variance of x to match those of y, pg. 1504, 6.1 Architecture; pg. 1504, 5. Adaptive Instance Normalization].
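For illustration only, the channel-wise mean and variance alignment quoted from Huang may be sketched as follows. This is a minimal numpy sketch; the (C, H, W) feature-map layout, the epsilon term, and the variable names are illustrative assumptions rather than Huang’s implementation:

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Adaptive Instance Normalization as described in Huang: align the
    channel-wise mean and variance of the content feature map x to those
    of the style feature map y.  Feature maps are (C, H, W) arrays."""
    mu_c = content.mean(axis=(1, 2), keepdims=True)
    std_c = content.std(axis=(1, 2), keepdims=True)
    mu_s = style.mean(axis=(1, 2), keepdims=True)
    std_s = style.std(axis=(1, 2), keepdims=True)
    # AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y)
    return std_s * (content - mu_c) / (std_c + eps) + mu_s

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8, 8))   # content features
y = rng.normal(size=(3, 8, 8))   # style features
t = adain(x, y)
# Per-channel statistics of t now match those of y (up to eps)
assert np.allclose(t.mean(axis=(1, 2)), y.mean(axis=(1, 2)), atol=1e-3)
assert np.allclose(t.std(axis=(1, 2)), y.std(axis=(1, 2)), atol=1e-3)
```

After the operation, each channel of the output carries the style input’s mean and standard deviation while retaining the spatial structure of the content input.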
Regarding claim 9, Huang discloses receiving, by a TransferNet framework, a source image and a corresponding target image from an image dataset [using MS-COCO [36] as content images and a dataset of paintings mostly collected from WikiArt [39] as style images, pg. 1505, 6.2 Training] via at least one encoder model [Figure 2; our style transfer network T takes a content image c (the examiner interprets the content image as the source image) and an arbitrary style image s (the examiner interprets the style image as the target image) as inputs, pg. 1504, 6.1 Architecture], and wherein the TransferNet framework comprises the at least one encoder model, an Adaptive Instance Normalization (AdaIN) module, and a decoder model [Figure 2; adopt a simple encoder-decoder architecture ... after encoding ... we feed both feature maps to an AdaIN layer, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a source image feature map corresponding to the source image and a target image feature map corresponding to the target image via the at least one encoder model [Figure 2; After encoding the content and style images in feature space, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a rough stylized image feature map through the AdaIN module based on each of the source image feature map and the target image feature map [we feed both feature maps to an AdaIN layer that aligns the mean and variance of the content feature maps to those of the style feature maps, producing the target feature maps, pg. 1504, 6.1 Architecture], wherein the rough stylized image feature map comprises a combination of background of the source image and foreground of the target image [Figure 2 & 4; take a content image c and an arbitrary style image s as input, and synthesizes an output image that recombines the content of the former and the style of the latter ... producing the target feature maps t ... decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture]; and transforming, by the TransferNet framework, the rough stylized image feature map into an image form through the decoder model to obtain a rough stylized image [Figure 2; a randomly initialized decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture].
Huang fails to explicitly disclose a processor; a memory coupled to the processor, wherein the memory stores processor-executable instructions, wherein the target image is a pixel-level ground truth image of the source image, and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image.
However, Hu teaches a processor [the present invention is divided into two branches: GPU branch and CPU branch, para 00109]; a memory coupled to the processor [for the CPU branch ... the larger-capacity RAM can completely save the operation result of the CPU, para 00113], wherein the memory stores processor-executable instructions, wherein the target image is a pixel-level ground truth image of the source image [acquire an image of an ancient Tibetan book document, and perform a binarization process on the image of the Tibetan ancient book document to determine a preliminary binarized image, para 0088], and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image [the second binarized Tibetan ancient book document image generation unit is used to utilize the Ojin binarization algorithm Binarization processing is performed on the reduced Tibetan ancient book document image to generate a second binarized Tibetan ancient book document image; the final binarization result map generation unit is used to integrate the first binarized image, para 00122].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference and incorporate the teachings of Hu to more clearly and accurately display content in the image, as recognized by Hu.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Lee with Huang and Hu to obtain the invention as specified in claim 9.
Regarding claim 11, (drawn to a system for generating training sample images for an image binarization model) the proposed combination of Huang in view of Hu and further in view of Lee explained in the rejection of method claim 3 renders obvious the steps of the system claim 11, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 3 are equally applicable to claim 11.
Regarding claim 16, (drawn to a system for generating training sample images for an image binarization model) the proposed combination of Huang in view of Hu and further in view of Lee explained in the rejection of method claim 8 renders obvious the steps of the system claim 16, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 8 are equally applicable to claim 16.
Regarding claim 17, (drawn to a non-transitory computer-readable medium) the proposed combination of Huang in view of Hu and further in view of Lee explained in the rejection of method claim 1 renders obvious the steps of the non-transitory computer-readable medium claim 17, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 1 are equally applicable to claim 17.
Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization") in view of Hu (CN 112837329 B), and further in view of Lee ("Data Augmentations for Document Images"), as applied above, and Li et al. ("Page segmentation using convolutional neural network and graphical model," July 26–29, 2020, Proceedings 14, pp. 231–245, Springer International Publishing) (hereafter, "Li").
Regarding claim 4, which incorporates claim 3, neither Huang, Hu, nor Lee discloses wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs).
However, Li teaches wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs) [Figure 2 & 4; after down-stream encoding and up-stream decoding ... node features and edge features can go through multiple GAT layers, 3.3 Deep Feature Learning; 3.4 Context Integration].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference in view of Hu and further in view of Lee and incorporate the teachings of Li to capture contextual information in a larger scope, as recognized by Li.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Li with Huang, Hu, and Lee to obtain the invention as specified in claim 4.
Regarding claim 12, (drawn to a system for generating training sample images for an image binarization model) the proposed combination of Huang in view of Hu and further in view of Lee and Li explained in the rejection of method claim 4 renders obvious the steps of the system claim 12, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 4 are equally applicable to claim 12.
Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization”) in view of Hu (CN 112837329 B), and further in view of Lee (“Data Augmentations for Document Images"), as applied above, and Suh et al. (Suh, Sungho, et al. "Two-stage generative adversarial networks for binarization of color document images." Pattern Recognition 130 (May 2022): 108810.) (hereafter, “Suh”).
Regarding claim 5, which incorporates claim 1, neither Huang, Hu, nor Lee appears to explicitly disclose performing, by the image binarization model, image binarization of the image sample through a binarization technique to recover inter-connectivity between broken characters in the image sample.
However, Suh teaches performing, by the image binarization model, image binarization of the image sample through a binarization technique to recover inter-connectivity between broken characters in the image sample [Figure 5 & 11; in the local binarization process, regions of different colors can be removed from text components, and text-only regions can be extracted by using the image result from the document image enhancement stage, pg. 5, right column, 3.2. Document image binarization using multi-stage feature fusion, second paragraph ... the proposed method differentiates textual components from background more effectively than the state-of-the-art methods ... the proposed method not only removes the shadow and noise better but also preserves the textual components, pg. 6, right column, 4.3.1. Results on DIBCO datasets, second paragraph, third paragraph].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference and incorporate the teachings of Suh to better extract strokes and prevent background misclassification, as recognized by Suh [pg. 2, right column, Introduction, second paragraph].
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Suh with Huang, Hu, and Lee to obtain the invention as specified in claim 5.
Regarding claim 13, (drawn to a system for generating training sample images for an image binarization model) the proposed combination of Huang in view of Hu and further in view of Lee and Suh explained in the rejection of method claim 5 renders obvious the steps of the system claim 13, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 5 are equally applicable to claim 13.
Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization”) in view of Hu (CN 112837329 B), and further in view of Lee (“Data Augmentations for Document Images") and Suh (“Two-stage generative adversarial networks for binarization of color document images”), as applied above, and Li (“Page segmentation using convolutional neural network and graphical model”).
Regarding claim 6, which incorporates claim 5, neither Huang, Hu, nor Lee discloses wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs).
However, Li teaches wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs) [Figure 2 & 4; after down-stream encoding and up-stream decoding ... node features and edge features can go through multiple GAT layers, 3.3 Deep Feature Learning; 3.4 Context Integration].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference in view of Hu and further in view of Lee and incorporate the teachings of Li to capture contextual information in a larger scope, as recognized by Li.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Li with Huang, Hu, and Lee to obtain the invention as specified in claim 6.
Regarding claim 14, (drawn to a system for generating training sample images for an image binarization model) the proposed combination of Huang in view of Hu and further in view of Lee and Li explained in the rejection of method claim 6 renders obvious the steps of the system claim 14, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 6 are equally applicable to claim 14.
Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization”) in view of Hu (CN 112837329 B), and further in view of Lee (“Data Augmentations for Document Images”), as applied above, and further in view of Wang et al. ("CLAST: Contrastive learning for arbitrary style transfer." IEEE Transactions on Image Processing 31 (Oct 2022): 6761-6772) (hereafter, “Wang”).
Regarding claim 7, which incorporates claim 1, neither Huang, Hu, nor Lee appears to explicitly disclose wherein the at least one encoder model comprises a source encoder model and a target encoder model, and wherein the source encoder model is configured to receive the source image and the target encoder model is configured to receive the target image.
However, Wang teaches wherein the at least one encoder model comprises a source encoder model and a target encoder model [Figure 3; our style transfer network (STNet) consists of a content encoder Ec, a style encoder Es, pg. 6765; pg. 6766, E. Overall Framework], and wherein the source encoder model is configured to receive the source image and the target encoder model is configured to receive the target image [Figure 3; denoting the input content image as Ic, the style image as Is, we first use the Ec and Es to extract the content and style feature of Ic and Is, pg. 6766, E. Overall Framework].
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang’s reference in view of Hu and further in view of Lee and incorporate the teachings of Wang, with a source encoder model and a target encoder model, to prevent loss of domain-specific information and visual degradation, as recognized by Wang.
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Wang with Huang, Hu, and Lee to obtain the invention as specified in claim 7.
Regarding claim 15, (drawn to a system for generating training sample images for an image binarization model) the proposed combination of Huang in view of Hu and further in view of Lee and Wang explained in the rejection of method claim 7 renders obvious the steps of the system claim 15, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 7 are equally applicable to claim 15.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
US20190244060A1 to Dundar et al. discloses a style transfer neural network that generates stylized synthetic images based on real images that provide the style to transfer to synthetic images.
DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization by Biswas et al. discloses a two-level vision transformer architecture for document image binarization.
ColdBin: Cold Diffusion for Document Image Binarization by Saifullah et al. discloses an end-to-end framework for binarization of degraded document images based on cold diffusion.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TOLUWANI MARY-JANE IJASEUN whose telephone number is (571)270-1877. The examiner can normally be reached Monday - Friday 7:30AM-4PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henok Shiferaw can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TOLUWANI MARY-JANE IJASEUN/Examiner, Art Unit 2676
/Henok Shiferaw/Supervisory Patent Examiner, Art Unit 2676