Prosecution Insights
Last updated: April 19, 2026
Application No. 18/124,610

METHOD AND SYSTEM FOR IMAGE BINARIZATION OF DEGRADED DOCUMENT IMAGES

Final Rejection §103
Filed: Mar 22, 2023
Examiner: SHIFERAW, HENOK ASRES
Art Unit: 2676
Tech Center: 2600 — Communications
Assignee: HCL Technologies Limited
OA Round: 3 (Final)

Grant Probability: 90% (Favorable)
Expected OA Rounds: 4-5
Median Time to Grant: 1y 10m
Grant Probability With Interview: 91%

Examiner Intelligence

Career Allow Rate: 90% — above average (518 granted / 578 resolved; +27.6% vs TC avg)
Interview Lift: +1.5% — minimal (resolved cases with vs. without interview)
Avg Prosecution: 1y 10m — fast prosecutor (19 applications currently pending)
Career History: 597 total applications across all art units
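
These card values are internally consistent. A quick arithmetic check, with the numbers taken directly from the cards above (Python used purely for illustration):

    # Consistency check of the examiner stats shown above.
    granted, resolved, pending = 518, 578, 19

    allow_rate = granted / resolved    # 0.896 -> displayed (rounded) as 90%
    total_apps = resolved + pending    # 597, matching "Career History"

    print(f"career allow rate: {allow_rate:.1%}")  # 89.6%
    print(f"total applications: {total_apps}")     # 597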

Statute-Specific Performance

§101: 12.3% (-27.7% vs TC avg)
§103: 72.7% (+32.7% vs TC avg)
§102: 6.2% (-33.8% vs TC avg)
§112: 4.0% (-36.0% vs TC avg)

Deltas are measured against the Tech Center average estimate (the black line in the original chart). Based on career data from 578 resolved cases.
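
Worth noting: all four displayed deltas are consistent with a single 40.0% baseline, which suggests the plotted Tech Center estimate is one figure rather than a per-statute average. A minimal sketch reproducing the chart's arithmetic (the 40.0% baseline is inferred from the deltas, not stated in the source):

    # Reproduce the "vs TC avg" deltas shown above.
    # The 40.0% baseline is an inference: every delta equals share - 40.0.
    examiner_share = {"§101": 12.3, "§103": 72.7, "§102": 6.2, "§112": 4.0}
    tc_avg_estimate = 40.0

    for statute, share in examiner_share.items():
        delta = share - tc_avg_estimate
        print(f"{statute}: {share:.1f}% ({delta:+.1f}% vs TC avg)")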

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Status of Claims

Applicant's Amendments filed on 02/05/2026 have been entered and made of record. Currently pending claims: 1, 3–9, and 11–17. Independent claims: 1, 9, and 17. Canceled claims: 2 and 10.

Response to Arguments

This Office action is responsive to Applicant's Arguments/Remarks Made in an Amendment received on February 5, 2026. Regarding the rejections made under 35 USC 103, Applicant's Arguments/Remarks with respect to independent claims 1, 9, and 17, on pages 10–14, have been fully considered but they are not persuasive.

The Applicant argues, in summary, that Lee ("Data Augmentations for Document Images") does not teach the following limitation: [quoted claim limitation reproduced as an image in the original action]. The Examiner respectfully disagrees.

The Applicant argues, on page 11, the following: [Applicant's remarks reproduced as an image in the original action]. The Applicant argues that Lee does not perform pixel-wise masking. However, Lee discloses bounding boxes or mask regions that have a fill_value matrix that fills the binary mask with 0, meaning black, or 1, meaning white [pg. 4, left column of section DocCutout, second paragraph]. In addition, B is defined as the bounding box area that includes masking box coordinates to fill the mask region [pg. 4, left column of section DocCutout, second paragraph]. Under the broadest reasonable interpretation, Lee discloses pixel-wise masking because the bounding box region or mask region uses a fill_value matrix (F) that only fills the pixel values inside the designated region.

Furthermore, concerning the Applicant's arguments that Lee is incompatible with pixel-wise foreground manipulation because the masking process does not alter pixels and preserves semantic consistency for training, the claims do not explicitly recite altering pixels for pixel-wise foreground masking. If the Applicant believes that altering pixels, in combination with the remaining limitations in the claim, makes the claimed invention patentably distinct, the Examiner advises the Applicant to incorporate this element into the claim limitations.

The Applicant continues, at the top of page 12, as follows: [Applicant's remarks reproduced as an image in the original action]. The Applicant argues that the claimed invention discloses the masks as "pixel-wise white patches applied only to the foreground." However, the claim does not explicitly recite applying pixel-wise white patches only to the foreground while excluding the background. If the Applicant believes that this element, in combination with the remaining limitations in the claim, makes the claimed invention patentably distinct, the Examiner advises the Applicant to incorporate this into the claim limitations. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. In regards to Lee, under the broadest reasonable interpretation, the bounding boxes distinguish the foreground from the background and generate pixel-wise white patches [Figure 1b of the Lee reference, reproduced as an image in the original action].

The Applicant argues, in the middle of page 12, the following: [Applicant's remarks reproduced as an image in the original action]. The Examiner respectfully disagrees.
Lee discloses pixel-wise masking because it discloses filling pixel values with white or black values within a bounding box. Lee uses an element-wise multiplication operator to generate and fill the mask region, which discloses generating and filling mask regions at the pixel level [pg. 4, left column of section DocCutout, second paragraph]. In addition, as previously stated, the claimed invention does not explicitly recite applying the pixel-wise masking selectively to the foreground and, in turn, excluding the background.

The Applicant argues, from the bottom of page 12 to the top of page 13, as follows: [Applicant's remarks reproduced as images in the original action]. The Applicant argues that Lee's augmentation is performed before inference, during model training, while the step of "randomly applying" to the final stylized image is a post-processing element. The Examiner respectfully disagrees. According to Applicant's disclosure, randomly applying pixel-wise masks on the final stylized image generates an image sample for training: "One or more pixel-wise masks may be randomly applied on the final stylized image 'IF' to obtain an image sample [...] the image sample obtained by applying the one or more pixel-wise masks may correspond to an artificial masked image 'IM' [...] the artificial masked image 'IM' may be provided as an input to the image binarization model 108 in order to train the image binarization model 108" [¶032]. The disclosure explicitly states that the image sample generated from randomly applying pixel-wise masks is used for training the image binarization model. Lee likewise discloses training by creating image samples with data augmentation [pg. 1, Abstract; pg. 3, right column of section DocCutout, first paragraph].

The Applicant argues, on page 13, the following: [Applicant's remarks reproduced as images in the original action]. In response to Applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. Furthermore, there is motivation to combine Lee with Huang and Hu because the motivation is grounded in data augmentation. Both Hu and Lee disclose data augmentation for the generation of additional training images. Hu discloses augmentation on the input image and the later annotated image, which is used to generate the final binarization result [¶00119–00121], while Lee discloses data augmentation to augment limited training data and improve the generalization capabilities of models [pg. 1, left column of section 1 Introduction, first paragraph]. Therefore, it would have been obvious to someone of ordinary skill in the art to combine the references to obtain the invention.

Claim Objections

Claims 1, 9, and 17 are objected to because of the following informalities:

Claim 1: in line 23, "obtain an image sample" should be changed to "to obtain an image sample."
Claim 9: in line 27, "obtain an image sample" should be changed to "to obtain an image sample."
Claim 17: in line 26, "obtain an image sample" should be changed to "to obtain an image sample."

Appropriate correction is required.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: "A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made."

Claims 1, 3, 8, 9, 11, 16, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Huang et al. ("Arbitrary style transfer in real-time with adaptive instance normalization." Proceedings of the IEEE International Conference on Computer Vision, 2017) (hereafter, "Huang") in view of Hu et al. (CN 112837329 B) (hereafter, "Hu") and further in view of Lee et al. (Lee, Yunsung, Teakgyu Hong, and Seungryong Kim. "Data Augmentations for Document Images." SDU@AAAI, 2021) (hereafter, "Lee").

Regarding claim 1, Huang discloses receiving, by a TransferNet framework, a source image and a corresponding target image from an image dataset [using MS-COCO [36] as content images and a dataset of paintings mostly collected from WikiArt [39] as style images, pg. 1505, 6.2 Training] via at least one encoder model [Figure 2; our style transfer network T takes a content image c (the Examiner interprets the content image as the source image) and an arbitrary style image s (the Examiner interprets the style image as the target image) as inputs, pg. 1504, 6.1 Architecture], and wherein the TransferNet framework comprises the at least one encoder model, an Adaptive Instance Normalization (AdaIN) module, and a decoder model [Figure 2; adopt a simple encoder-decoder architecture ... after encoding ... we feed both feature maps to an AdaIN layer, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a source image feature map corresponding to the source image and a target image feature map corresponding to the target image via the at least one encoder model [Figure 2; after encoding the content and style images in feature space, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a rough stylized image feature map through the AdaIN module based on each of the source image feature map and the target image feature map [we feed both feature maps to an AdaIN layer that aligns the mean and variance of the content feature maps to those of the style feature maps, producing the target feature maps, pg. 1504, 6.1 Architecture], wherein the rough stylized image feature map comprises a combination of background of the source image and foreground of the target image [Figures 2 & 4; take a content image c and an arbitrary style image s as input, and synthesizes an output image that recombines the content of the former and the style of the latter ... producing the target feature maps t ... decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture];
and transforming, by the TransferNet framework, the rough stylized image feature map into an image form through the decoder model to obtain a rough stylized image [Figure 2; a randomly initialized decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture].

Huang fails to explicitly disclose wherein the target image is a pixel-level ground truth image of the source image; generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image; and randomly applying, by the Refinement Network, one or more pixel-wise masks on the final stylized image to obtain an image sample, wherein each of the one or more pixel-wise masks is a white patch applied to foreground of the final stylized image.

However, Hu teaches wherein the target image is a pixel-level ground truth image of the source image [acquire an image of an ancient Tibetan book document, and perform a binarization process on the image of the Tibetan ancient book document to determine a preliminary binarized image, para 0088], and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image [the second binarized Tibetan ancient book document image generation unit is used to utilize the Ojin binarization algorithm ... binarization processing is performed on the reduced Tibetan ancient book document image to generate a second binarized Tibetan ancient book document image; the final binarization result map generation unit is used to integrate the first binarized image, para 00122]. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference and incorporate the teachings of Hu to more clearly and accurately display content in the image, as recognized by Hu.

Neither Huang nor Hu appears to explicitly disclose randomly applying, by the Refinement Network, one or more pixel-wise masks on the final stylized image to obtain an image sample, wherein each of the one or more pixel-wise masks is a white patch applied to foreground of the final stylized image. However, Lee teaches randomly applying [Algorithm 1; we set the probability of applying augmentation to 0.5. DocCutOut has hyper-parameters called fill_value, λ, and patch_ratio, pg. 5, Parameters for DocCutout and DocCutMix, first paragraph], by the Refinement Network, one or more pixel-wise masks on the final stylized image to obtain an image sample [we propose novel augmentation methods, DocCutout and DocCutMix, that are more suitable for document images, by applying the transform to each word unit and thus preserving text semantic features during augmentation, pg. 1, Abstract ... we mask the visual feature vectors of text images, pg. 4, DocCutOut, second paragraph], wherein each of the one or more pixel-wise masks is a white patch applied to foreground of the final stylized image [0 is replaced with the fill_value matrix F which represents the value to be filled in the mask region. We experimented with F for 0, meaning black, and 1, meaning white, pg. 4, DocCutout, second paragraph].
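
As context for the mapping above: the cutout-style operation the Examiner cites combines a binary mask M with a fill-value matrix F by element-wise multiplication, so only the pixels inside the designated box are overwritten. A minimal sketch of that operation as characterized in the rejection follows; the variable names (M, F, box) track the rejection's notation, but the function itself is an illustrative sketch, not Lee's published code:

    import numpy as np

    def cutout_mask(image, box, fill_value=1.0):
        """Cutout-style masking: keep pixels where the binary mask M is 1,
        overwrite with the fill value where M is 0. fill_value=1.0 gives
        a white patch; fill_value=0.0 gives a black patch. (Illustrative
        sketch of the operation as characterized in the rejection.)"""
        y0, y1, x0, x1 = box                 # mask-region coordinates (Lee's B)
        M = np.ones_like(image)              # 1 = keep the original pixel
        M[y0:y1, x0:x1] = 0.0                # 0 inside the mask region
        F = np.full_like(image, fill_value)  # fill-value matrix (Lee's F)
        return M * image + (1.0 - M) * F     # element-wise combination

    # Lee reportedly applies augmentation with probability 0.5, e.g.:
    rng = np.random.default_rng(0)
    img = rng.random((8, 8))                 # stand-in grayscale image
    if rng.random() < 0.5:
        img = cutout_mask(img, box=(2, 5, 3, 7), fill_value=1.0)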
It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference in view of Hu and incorporate the teachings of Lee to create more realistic data and show better precision [pg. 6, Hyper-parameter search for our methods, second paragraph], as recognized by Lee. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Lee with Huang and Hu to obtain the invention as specified in claim 1.

Regarding claim 3, which incorporates claim 1, Huang fails to explicitly disclose training the image binarization model using the image sample for performing image binarization. However, Hu teaches training the image binarization model using the image sample for performing image binarization [use the image annotation map of the Tibetan ancient book document and the Tibetan ancient book document image to train the improved U-Net network model, generate a trained U-Net network model, para 0095]. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference and incorporate the teachings of Hu by training the image binarization model to train the network to focus on salient regions and suppress irrelevant background regions such as noise, as recognized by Hu. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Lee with Huang and Hu to obtain the invention as specified in claim 3.

Regarding claim 8, which incorporates claim 1, Huang discloses that generating the rough stylized image feature map comprises: extracting, by the TransferNet framework, features corresponding to background of at least one image of the image dataset from a feature map of each of the at least one image and features corresponding to the foreground of the target image from the target image feature map [Figure 2; an AdaIN layer is used to perform style transfer in the feature space, pg. 1504 ... our style transfer network T takes a content image c and an arbitrary style image s as inputs, and synthesizes an output image that recombines the content of the former and the style of the latter ... after encoding the content and style images in feature space, we feed both feature maps to an AdaIN layer, 5. Adaptive Instance Normalization; pg. 1504, 6.1 Architecture]; and performing, by the TransferNet framework, channel-wise adjustments of mean and variance of the features corresponding to the background of the at least one image to match mean and variance of the features corresponding to the foreground of the target image to obtain the rough stylized image feature map [after encoding the content and style images in feature space, we feed both feature maps to an AdaIN layer that aligns the mean and variance of the content feature maps to those of the style feature maps ... AdaIN receives a content input x and a style input y, and simply aligns the channel-wise mean and variance of x to match those of y, pg. 1504, 6.1 Architecture; pg. 1504, 5. Adaptive Instance Normalization].
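
The channel-wise adjustment relied on in the claim 8 mapping is the standard AdaIN operation from Huang: normalize the content features per channel, then rescale and shift them to the style features' statistics, AdaIN(x, y) = σ(y) * (x - μ(x)) / σ(x) + μ(y). A minimal sketch follows; the (C, H, W) array layout is an assumption for illustration:

    import numpy as np

    def adain(content, style, eps=1e-5):
        """Adaptive Instance Normalization (Huang):
        AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y),
        with mean and std computed per channel over the spatial axes.
        Assumes (C, H, W) feature maps; the layout is illustrative."""
        mu_c = content.mean(axis=(1, 2), keepdims=True)
        sd_c = content.std(axis=(1, 2), keepdims=True) + eps
        mu_s = style.mean(axis=(1, 2), keepdims=True)
        sd_s = style.std(axis=(1, 2), keepdims=True) + eps
        return sd_s * (content - mu_c) / sd_c + mu_s

    # In Huang's pipeline, t = adain(f(content), f(style)) is the rough
    # stylized feature map, which the decoder g maps back to image space.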
Regarding claim 9, Huang discloses receiving, by a TransferNet framework, a source image and a corresponding target image from an image dataset [using MS-COCO [36] as content images and a dataset of paintings mostly collected from WikiArt [39] as style images, pg. 1505, 6.2 Training] via at least one encoder model [Figure 2; our style transfer network T takes a content image c (the Examiner interprets the content image as the source image) and an arbitrary style image s (the Examiner interprets the style image as the target image) as inputs, pg. 1504, 6.1 Architecture], and wherein the TransferNet framework comprises the at least one encoder model, an Adaptive Instance Normalization (AdaIN) module, and a decoder model [Figure 2; adopt a simple encoder-decoder architecture ... after encoding ... we feed both feature maps to an AdaIN layer, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a source image feature map corresponding to the source image and a target image feature map corresponding to the target image via the at least one encoder model [Figure 2; after encoding the content and style images in feature space, pg. 1504, 6.1 Architecture]; generating, by the TransferNet framework, a rough stylized image feature map through the AdaIN module based on each of the source image feature map and the target image feature map [we feed both feature maps to an AdaIN layer that aligns the mean and variance of the content feature maps to those of the style feature maps, producing the target feature maps, pg. 1504, 6.1 Architecture], wherein the rough stylized image feature map comprises a combination of background of the source image and foreground of the target image [Figures 2 & 4; take a content image c and an arbitrary style image s as input, and synthesizes an output image that recombines the content of the former and the style of the latter ... producing the target feature maps t ... decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture]; and transforming, by the TransferNet framework, the rough stylized image feature map into an image form through the decoder model to obtain a rough stylized image [Figure 2; a randomly initialized decoder g is trained to map t back to the image space, generating the stylized image T, pg. 1504, 6.1 Architecture].

Huang fails to explicitly disclose a processor; a memory coupled to the processor, wherein the memory stores processor-executable instructions; wherein the target image is a pixel-level ground truth image of the source image; and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image.

However, Hu teaches a processor [the present invention is divided into two branches: GPU branch and CPU branch, para 00109]; a memory coupled to the processor, wherein the memory stores processor-executable instructions [for the CPU branch ... the larger-capacity RAM can completely save the operation result of the CPU, para 00113]; wherein the target image is a pixel-level ground truth image of the source image [acquire an image of an ancient Tibetan book document, and perform a binarization process on the image of the Tibetan ancient book document to determine a preliminary binarized image, para 0088]; and generating, by a Refinement Network, a residual details image to obtain a final stylized image based on a combination of the residual details image and the rough stylized image [the second binarized Tibetan ancient book document image generation unit is used to utilize the Ojin binarization algorithm ... binarization processing is performed on the reduced Tibetan ancient book document image to generate a second binarized Tibetan ancient book document image; the final binarization result map generation unit is used to integrate the first binarized image, para 00122]. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference and incorporate the teachings of Hu to more clearly and accurately display content in the image, as recognized by Hu. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Lee with Huang and Hu to obtain the invention as specified in claim 9.

Regarding claim 11 (drawn to a system for generating training sample images for an image binarization model), the proposed combination of Huang in view of Hu and further in view of Lee explained in the rejection of method claim 3 renders obvious the steps of system claim 11, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 3 are equally applicable to claim 11.

Regarding claim 16 (drawn to a system for generating training sample images for an image binarization model), the proposed combination of Huang in view of Hu and further in view of Lee explained in the rejection of method claim 8 renders obvious the steps of system claim 16, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 8 are equally applicable to claim 16.

Regarding claim 17 (drawn to a non-transitory computer-readable medium), the proposed combination of Huang in view of Hu and further in view of Lee explained in the rejection of method claim 1 renders obvious the steps of non-transitory computer-readable medium claim 17, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 1 are equally applicable to claim 17.

Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization") in view of Hu (CN 112837329 B) and further in view of Lee ("Data Augmentations for Document Images"), as applied above, and further in view of Li et al. ("Page segmentation using convolutional neural network and graphical model," July 26–29, 2020, Proceedings 14, pp. 231–245, Springer International Publishing) (hereafter, "Li").

Regarding claim 4, which incorporates claim 3, neither Huang, Hu, nor Lee discloses wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs). However, Li teaches wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs) [Figures 2 & 4; after down-stream encoding and up-stream decoding ... node features and edge features can go through multiple GAT layers, 3.3 Deep Feature Learning; 3.4 Context Integration]. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference in view of Hu and further in view of Lee and incorporate the teachings of Li for contextual information in a larger scope, as recognized by Li. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Li with Huang, Hu, and Lee to obtain the invention as specified in claim 4.

Regarding claim 12 (drawn to a system for generating training sample images for an image binarization model), the proposed combination of Huang in view of Hu and further in view of Lee and Li explained in the rejection of method claim 4 renders obvious the steps of system claim 12, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 4 are equally applicable to claim 12.

Claims 5 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization") in view of Hu (CN 112837329 B) and further in view of Lee ("Data Augmentations for Document Images"), as applied above, and further in view of Suh et al. (Suh, Sungho, et al. "Two-stage generative adversarial networks for binarization of color document images." Pattern Recognition 130 (May 2022): 108810) (hereafter, "Suh").

Regarding claim 5, which incorporates claim 1, neither Huang, Hu, nor Lee appears to explicitly disclose performing, by the image binarization model, image binarization of the image sample through a binarization technique to recover inter-connectivity between broken characters in the image sample. However, Suh teaches performing, by the image binarization model, image binarization of the image sample through a binarization technique to recover inter-connectivity between broken characters in the image sample [Figures 5 & 11; in the local binarization process, regions of different colors can be removed from text components, and text-only regions can be extracted by using the image result from the document image enhancement stage, pg. 5, right column, 3.2. Document image binarization using multi-stage feature fusion, second paragraph ... the proposed method differentiates textual components from background more effectively than the state-of-the-art methods ... the proposed method not only removes the shadow and noise better but also preserves the textual components, pg. 6, right column, 4.3.1. Results on DIBCO datasets, second and third paragraphs].

It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference and incorporate the teachings of Suh for better extraction of strokes and prevention of background misclassification, as recognized by Suh [pg. 2, right column, Introduction, second paragraph]. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Suh with Huang, Hu, and Lee to obtain the invention as specified in claim 5.

Regarding claim 13 (drawn to a system for generating training sample images for an image binarization model), the proposed combination of Huang in view of Hu and further in view of Lee and Suh explained in the rejection of method claim 5 renders obvious the steps of system claim 13, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 5 are equally applicable to claim 13.

Claims 6 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization") in view of Hu (CN 112837329 B) and further in view of Lee ("Data Augmentations for Document Images") and Suh ("Two-stage generative adversarial networks for binarization of color document images"), as applied above, and further in view of Li ("Page segmentation using convolutional neural network and graphical model").

Regarding claim 6, which incorporates claim 5, neither Huang, Hu, nor Lee discloses wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs). However, Li teaches wherein the image binarization model comprises a set of encoder models, a set of decoder models, and a set of Graph Attention Networks (GATs) [Figures 2 & 4; after down-stream encoding and up-stream decoding ... node features and edge features can go through multiple GAT layers, 3.3 Deep Feature Learning; 3.4 Context Integration]. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference in view of Hu and further in view of Lee and incorporate the teachings of Li for contextual information in a larger scope, as recognized by Li. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Li with Huang, Hu, and Lee to obtain the invention as specified in claim 6.

Regarding claim 14 (drawn to a system for generating training sample images for an image binarization model), the proposed combination of Huang in view of Hu and further in view of Lee and Li explained in the rejection of method claim 6 renders obvious the steps of system claim 14, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 6 are equally applicable to claim 14.

Claims 7 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Huang ("Arbitrary style transfer in real-time with adaptive instance normalization") in view of Hu (CN 112837329 B) and further in view of Lee ("Data Augmentations for Document Images"), as applied above, and further in view of Wang et al. ("CLAST: Contrastive learning for arbitrary style transfer." IEEE Transactions on Image Processing 31 (Oct 2022): 6761–6772) (hereafter, "Wang").

Regarding claim 7, which incorporates claim 1, neither Huang, Hu, nor Lee appears to explicitly disclose wherein the at least one encoder model comprises a source encoder model and a target encoder model, and wherein the source encoder model is configured to receive the source image and the target encoder model is configured to receive the target image. However, Wang teaches wherein the at least one encoder model comprises a source encoder model and a target encoder model [Figure 3; our style transfer network (STNet) consists of a content encoder Ec, a style encoder Es, pg. 6765; pg. 6766, E. Overall Framework], and wherein the source encoder model is configured to receive the source image and the target encoder model is configured to receive the target image [Figure 3; denoting the input content image as Ic, the style image as Is, we first use the Ec and Es to extract the content and style features of Ic and Is, pg. 6766, E. Overall Framework]. It would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to modify Huang's reference in view of Hu and further in view of Lee and incorporate the teachings of Wang, with a source encoder model and a target encoder model, to prevent losing domain-specific information and visual degradation, as recognized by Wang. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine Wang with Huang, Hu, and Lee to obtain the invention as specified in claim 7.

Regarding claim 15 (drawn to a system for generating training sample images for an image binarization model), the proposed combination of Huang in view of Hu and further in view of Lee and Wang explained in the rejection of method claim 7 renders obvious the steps of system claim 15, because these steps occur in the operation of the method as discussed above. Thus, arguments similar to those presented above for claim 7 are equally applicable to claim 15.

Conclusion

The art made of record and not relied upon is considered pertinent to Applicant's disclosure:

US20190244060A1 to Dundar et al. discloses a style transfer neural network that generates stylized synthetic images based on real images that provide the style to transfer to synthetic images.

"DocBinFormer: A Two-Level Transformer Network for Effective Document Image Binarization" by Biswas et al. discloses a two-level vision transformer architecture for document image binarization.

"ColdBin: Cold Diffusion for Document Image Binarization" by Saifullah et al. discloses an end-to-end framework for binarization of degraded document images based on cold diffusion.

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TOLUWANI MARY-JANE IJASEUN, whose telephone number is (571) 270-1877. The examiner can normally be reached Monday through Friday, 7:30 AM to 4:00 PM. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, Applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Henok Shiferaw, can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/TOLUWANI MARY-JANE IJASEUN/
Examiner, Art Unit 2676

/Henok Shiferaw/
Supervisory Patent Examiner, Art Unit 2676

Prosecution Timeline

Mar 22, 2023: Application Filed
May 16, 2025: Non-Final Rejection — §103
Aug 19, 2025: Response Filed
Nov 05, 2025: Non-Final Rejection — §103
Feb 05, 2026: Response Filed
Mar 17, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12597117
METHOD, PROGRAM, APPARATUS, AND SYSTEM FOR ABNORMALITY DETECTION SUCH AS FOR DETERMINING WHETHER A PLURALITY OF CONTAINERS TO BE STACKED ON A PALLET IS NORMAL OR ABNORMAL
Granted Apr 07, 2026 (2y 5m to grant)
Patent 12555231
DETECTING ISCHEMIC STROKE MIMIC USING DEEP LEARNING-BASED ANALYSIS OF MEDICAL IMAGES
Granted Feb 17, 2026 (2y 5m to grant)
Patent 12536796
REMOTE SOIL AND VEGETATION PROPERTIES DETERMINATION METHOD AND SYSTEM
Granted Jan 27, 2026 (2y 5m to grant)
Patent 12525056
METHOD AND DEVICE FOR MULTI-DNN-BASED FACE RECOGNITION USING PARALLEL-PROCESSING PIPELINES
Granted Jan 13, 2026 (2y 5m to grant)
Patent 12499506
INFERENCE MODEL CONSTRUCTION METHOD, INFERENCE MODEL CONSTRUCTION DEVICE, RECORDING MEDIUM, CONFIGURATION DEVICE, AND CONFIGURATION METHOD
Granted Dec 16, 2025 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 4-5
Grant Probability: 90%
With Interview: 91% (+1.5%)
Median Time to Grant: 1y 10m
PTA Risk: High

Based on 578 resolved cases by this examiner. Grant probability is derived from the career allow rate.
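
These projections tie back to the examiner cards above. A minimal sketch of the apparent derivation (the dashboard's exact rounding behavior is an assumption):

    # "With Interview" appears to be the baseline allow rate plus the lift.
    baseline = 518 / 578                # career allow rate: 89.6%, shown as 90%
    interview_lift = 0.015              # +1.5% from the Examiner Intelligence card
    with_interview = baseline + interview_lift   # 91.1%, shown as 91%

    print(f"{baseline:.0%} baseline -> {with_interview:.0%} with interview")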
