Last updated: May 29, 2026
Application No. 18/252,647
Data Processing Method and Apparatus

Final Rejection §103
Filed
May 11, 2023
Priority
Nov 12, 2020 — CN 202011261210.8 +1 more
Examiner
BARNES JR, CARL E
Art Unit
2178
Tech Center
2100 — Computer Architecture & Software
Assignee
BEIJING JINGDONG CENTURY TRADING CO., LTD.
OA Round
2 (Final)
This examiner grants 32% of cases after interview (+24.2% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 205 resolved cases, 2023–2026
Examiner Intelligence

BARNES JR, CARL E
Grants only 32% of cases
Career Allowance Rate
66 granted / 205 resolved
-22.8% vs TC avg
Strong +24% interview lift
+24.2%
Interview Lift
resolved cases with interview
Typical timeline
3y 10m
Avg Prosecution
23 currently pending
Career history
238
Total Applications
across all art units
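The headline figures above can be cross-checked with simple arithmetic. The sketch below assumes the "vs TC avg" figure is a percentage-point difference, which the page does not state; the variable names are illustrative only.

```python
# Figures taken from the examiner profile above.
granted = 66
resolved = 205

# Career allowance rate as a percentage of resolved cases.
allowance_rate = granted / resolved * 100
print(f"Career allowance rate: {allowance_rate:.1f}%")

# ASSUMPTION: "-22.8% vs TC avg" is a gap in percentage points, so the
# implied Tech Center average is the examiner's rate plus 22.8 points.
implied_tc_avg = allowance_rate + 22.8
print(f"Implied Tech Center average: {implied_tc_avg:.1f}%")
```

If the gap were instead a relative difference, the implied Tech Center average would come out differently, so the second figure should be read as an estimate only.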
Statute-Specific Performance

§101
0.2%
-39.8% vs TC avg
§103
96.7%
+56.7% vs TC avg
§102
2.3%
-37.7% vs TC avg
§112
0.4%
-39.6% vs TC avg
Based on career data from 205 resolved cases
Office Action

§103
DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority
Acknowledgment is made of applicant’s claim for foreign priority under 35 U.S.C. 119 (a)-(d). The certified copy has been filed in parent Application No. CN202011261210.8, filed on 10/22/2021.

Response to Amendment
Claims 1-15, 17-20 and 22 were previously pending and subject to non-final action filed 10/29/2025. In the response filed on 01/26/2026, claims 1, 11, 13-14 and 22 were amended. Therefore, claims 1-15, 17-20 and 22 are currently pending and subject to the final action below.

Response to Arguments
Applicant’s arguments, see page 10, filed 01/26/2026, with respect to the objection to claims 13-14 have been fully considered and are persuasive. The objection to claims 13-14 has been withdrawn.

Applicant's arguments filed 01/21/2026, see pages 10-16, with respect to claims 1-20 under 35 U.S.C. 103 have been fully considered but they are not persuasive.
Applicant’s argument: Independent claims 1, 11, and 22 are allowable over Mannar and Kikuchi at least because Mannar and Kikuchi, whether considered alone or in any combination, fail to teach or fairly suggest each and every recitation of claims 1, 11, and 22.
The Office interprets the image formats (such as JPEG and BMP) in Mannar as the "container type" in claim 1 of the present application. (Office Action, p. 4). Applicant respectfully disagrees with this interpretation. The image formats (such as JPEG and BMP) disclosed by Mannar belong to data encoding standards, which define the storage method of image pixel data. For example, it is generally understood that JPEG reduces file size through lossy compression, while BMP retains original data through uncompressed storage. In contrast, it can be readily understood by one having ordinary skill in the art that the container type of an image is a page layout component and has no relation to data storage. Therefore, the image formats (such as JPEG and BMP) in Mannar are not equivalent to the container type of an image in the present application. Accordingly, Mannar does not and cannot disclose "a first image set for recognizing a container type" as recited in claim 1.
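The distinction the applicant draws can be illustrated with a short sketch. It is not taken from the application or from Mannar: the `LayoutBlock` class and the container names are hypothetical. It shows only that an encoding format is a property of the file bytes, whereas a layout container type, under the applicant's reading, is a property of where and how an image sits on the page.

```python
from dataclasses import dataclass


def sniff_image_format(data: bytes) -> str:
    """Identify the encoding format from the file's leading magic bytes."""
    if data.startswith(b"\xff\xd8\xff"):  # JPEG start-of-image marker
        return "JPEG"
    if data.startswith(b"BM"):            # BMP file header signature
        return "BMP"
    return "unknown"


@dataclass
class LayoutBlock:
    """A region of a page layout. The container_type values are hypothetical."""
    container_type: str  # e.g. "banner", "carousel", "product_grid"
    image_bytes: bytes


# Two blocks with the same layout role but different encodings.
jpeg_banner = LayoutBlock("banner", b"\xff\xd8\xff\xe0" + b"\x00" * 8)
bmp_banner = LayoutBlock("banner", b"BM" + b"\x00" * 10)

for block in (jpeg_banner, bmp_banner):
    print(block.container_type, sniff_image_format(block.image_bytes))
```

Both blocks report the same container type while their encoding formats differ, which is the separation the argument relies on. Whether the claim language requires that separation is the point in dispute.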
Further, the Office admits that Mannar does not disclose a page template or the step of performing a conversion as recited in claim 1, but argues that Kikuchi remedies these deficiencies. (Office Action, pp. 6-7). Applicant respectfully disagrees.
Kikuchi discloses data transmission between the server and the client, and also does not mention the conversion of data sets to achieve data normalization.
Examiner Response: After careful consideration and review of applicant’s arguments and specification, the examiner respectfully disagrees. During examination, the claims must be interpreted as broadly as their terms reasonably allow. In re American Academy of Science Tech Center, 367 F.3d 1359, 1369, 70 U.S.P.Q.2d 1827, 1834 (Fed. Cir. 2004). Under the broadest reasonable interpretation, a container type is a “wrapper” that holds various types of data, such as audio, video, images, and metadata, within a single file, allowing them to be packaged together for storage and transport. Furthermore, Kikuchi recites generating a template, which can be considered standardizing and structuring data (a specific language structure). The disclosure does not define the term container type or normalization, and the claim only recites “a specific language structure”, which can be a template, SQL, CSS, HTML, or other language structure formats.
Therefore, MANNAR teaches: annotating, in response to receiving a page image, (MANNAR − [0073] Referring to FIG. 4, with respect to external data 104 as disclosed herein, at 400, a crawler may be implemented by the data receiver 102 to identify locations and/or websites to identify and extract data and/or images) the page image to generate image sets corresponding to annotated data, (MANNAR − [0074] At 404, an annotated image data set may be generated to build the text extraction model. NOTE: annotated image data set is the “generate image sets”.)
wherein the image sets comprise a first image set for recognizing a container type, (MANNAR − [0074] Once the crawler extracts contents of a webpage, images may then be identified based on predefined formats such as JPEG, BMP etc. The extracted images may be stored in an image repository at 402. [0077] image identification (ID) and the values extracted. [0123] classification of images into categories based on a type of image (e.g., images containing drugs, chemical formula, mushroom, etc.). NOTE: The container types JPEG and BMP provide categories based on a type of image for the first image set.)
a second image set for recognizing text information, (MANNAR − [0024] The deep embedded clustering approach may combine continuous bag of words (CBOW) embedding based similarity for text with convolutional neural network (CNN) auto encoder similarity for images to identify clusters that are effective in the context of the dark web where some images may include informative text that may be used to better cluster a dataset.)
and a third image set for detecting an image element, (MANNAR − [0013] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus of FIG. 1 in accordance with an example of the present disclosure; [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. NOTE: Image element of counterfeit watches images)
and the page image is generated based on a page template; (Kikuchi − [0014] The present invention is directed to an apparatus and method of changing template layout in accordance with input data to reduce a load in generating a template when laying out data.)
inputting the image sets into a trained image recognition model to generate a container type data set corresponding to the first image set, (MANNAR − [0062] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc.)
a text data set corresponding to the second image set (MANNAR − [0062] [0063] At 302, the deep learning based data analyzer 110 may utilize a deep learning model (e.g., a text extraction model) to identify and extract text embedded in images.)
and an image element data set corresponding to the third image set, (MANNAR − [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories. For example, clusters may be generated for images related to prescription drugs as shown at 1100, drugs as shown at 1102, etc. NOTE: Image element of counterfeit watches images)
wherein the image recognition model is used to determine a container type of each image in the first image set, (MANNAR − [0074] Once the crawler extracts contents of a webpage, images may then be identified based on predefined formats such as JPEG, BMP etc. The extracted images may be stored in an image repository at 402. [0077] image identification (ID) and the values extracted. [0123] classification of images into categories based on a type of image (e.g., images containing drugs, chemical formula, mushroom, etc.). NOTE: The container types JPEG and BMP provide categories based on a type of image for the first image set.)
perform a word detection and text recognition for each image in the second image set, and detect and recognize an image element in each image in the third image set;  (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
Kikuchi teaches: and the page image is generated based on a page template; (Kikuchi − [0014] The present invention is directed to an apparatus and method of changing template layout in accordance with input data to reduce a load in generating a template when laying out data.)
and performing a conversion on the container type data set, the text data set and the image element data set based on template information of a page to generate a template data set corresponding to the page image, (Kikuchi − [0069] In FIG. 5A, a template 501 dynamically changes container layout, (including size and arrangement), in accordance with an orientation of an image in order to execute functionality of two distinct templates 504 and 505. The template 504 includes an image container on an upper half and a text container on a lower half and is linked in the vertical direction. The template 505 includes an image container on a left side and a text container on a right side linked in a horizontal direction)
and uploading the template data set, wherein the conversion is performed on the container type data set, the text data set and the image element data set based on a specific language structure. (Kikuchi − [0043] The server PC sends (upload) the document template back to Client PC.)
Therefore the rejection is maintained. 

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-15, 17-20 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over MANNAR (US 20200151222 A1, filed date: Oct. 1, 2019) in view of Kikuchi (US 20080189603 A1, filed date: Jan. 16, 2008).
Regarding independent claim 1, MANNAR teaches: A method for processing data, comprising: 
annotating, in response to receiving a page image, (MANNAR − [0073] Referring to FIG. 4, with respect to external data 104 as disclosed herein, at 400, a crawler may be implemented by the data receiver 102 to identify locations and/or websites to identify and extract data and/or images) the page image to generate image sets corresponding to annotated data, (MANNAR − [0074] At 404, an annotated image data set may be generated to build the text extraction model. NOTE: annotated image data set is the “generate image sets”.)
wherein the image sets comprise a first image set for recognizing a container type, (MANNAR − [0074] Once the crawler extracts contents of a webpage, images may then be identified based on predefined formats such as JPEG, BMP etc. The extracted images may be stored in an image repository at 402. [0077] image identification (ID) and the values extracted. [0123] classification of images into categories based on a type of image (e.g., images containing drugs, chemical formula, mushroom, etc.). NOTE: The container types JPEG and BMP provide categories based on a type of image for the first image set.)
a second image set for recognizing text information, (MANNAR − [0024] The deep embedded clustering approach may combine continuous bag of words (CBOW) embedding based similarity for text with convolutional neural network (CNN) auto encoder similarity for images to identify clusters that are effective in the context of the dark web where some images may include informative text that may be used to better cluster a dataset.)
and a third image set for detecting an image element, (MANNAR − [0013] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus of FIG. 1 in accordance with an example of the present disclosure; [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. NOTE: Image element of counterfeit watches images)
and the page image is generated based on a page template; (Kikuchi − [0014] The present invention is directed to an apparatus and method of changing template layout in accordance with input data to reduce a load in generating a template when laying out data.)
inputting the image sets into a trained image recognition model to generate a container type data set corresponding to the first image set, (MANNAR − [0062] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc.)
a text data set corresponding to the second image set (MANNAR − [0062] [0063] At 302, the deep learning based data analyzer 110 may utilize a deep learning model (e.g., a text extraction model) to identify and extract text embedded in images.)
and an image element data set corresponding to the third image set, (MANNAR − [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories. For example, clusters may be generated for images related to prescription drugs as shown at 1100, drugs as shown at 1102, etc. NOTE: Image element of counterfeit watches images)
wherein the image recognition model is used to determine a container type of each image in the first image set, (MANNAR − [0074] Once the crawler extracts contents of a webpage, images may then be identified based on predefined formats such as JPEG, BMP etc. The extracted images may be stored in an image repository at 402. [0077] image identification (ID) and the values extracted. [0123] classification of images into categories based on a type of image (e.g., images containing drugs, chemical formula, mushroom, etc.). NOTE: The container types JPEG and BMP provide categories based on a type of image for the first image set.)
perform a word detection and text recognition for each image in the second image set, and detect and recognize an image element in each image in the third image set;  (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
MANNAR does not explicitly teach: a page template;
However, Kikuchi teaches: and the page image is generated based on a page template; (Kikuchi − [0014] The present invention is directed to an apparatus and method of changing template layout in accordance with input data to reduce a load in generating a template when laying out data.)
and performing a conversion on the container type data set, the text data set and the image element data set based on template information of a page to generate a template data set corresponding to the page image, (Kikuchi − [0069] In FIG. 5A, a template 501 dynamically changes container layout, (including size and arrangement), in accordance with an orientation of an image in order to execute functionality of two distinct templates 504 and 505. The template 504 includes an image container on an upper half and a text container on a lower half and is linked in the vertical direction. The template 505 includes an image container on a left side and a text container on a right side linked in a horizontal direction)
and uploading the template data set, wherein the conversion is performed on the container type data set, the text data set and the image element data set based on a specific language structure. (Kikuchi − [0043] The server PC sends (upload) the document template back to Client PC.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing textual and image data in webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to reduce processing and load time when generating a webpage template.
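For orientation, the data flow recited in claim 1 can be sketched as follows. This is a minimal illustration of the claim language only, not the applicant's implementation and not the method of either reference: every name is hypothetical, the recognition step is a stub that performs no inference, and JSON is assumed as the "specific language structure" purely for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class Region:
    kind: str     # "container", "text", or "element", assigned during annotation
    payload: str  # stand-in for the cropped image pixels


def split_into_image_sets(regions):
    """Group annotated regions into the three image sets recited in claim 1."""
    image_sets = {"container": [], "text": [], "element": []}
    for region in regions:
        image_sets[region.kind].append(region)
    return image_sets


def recognize(image_sets):
    """Stub for the trained image recognition model; it only echoes labels."""
    return {
        "container_types": [r.payload for r in image_sets["container"]],
        "texts": [r.payload for r in image_sets["text"]],
        "elements": [r.payload for r in image_sets["element"]],
    }


def convert(data_sets, template_info):
    """Merge the three data sets with template information into one JSON document."""
    return json.dumps({"template": template_info, **data_sets}, sort_keys=True)


regions = [
    Region("container", "carousel"),
    Region("text", "Summer sale"),
    Region("element", "cart_icon"),
]
template_data_set = convert(recognize(split_into_image_sets(regions)), {"page": "home"})
print(template_data_set)
```

The final upload step is omitted because the claim does not say where the template data set is sent.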
Regarding dependent claim 2, depends on claim 1, MANNAR teaches: wherein annotating the page image to generate the image sets corresponding to the annotated data comprises: 
annotating the page image to obtain the annotated data corresponding to the page image; (MANNAR − [0074] At 404, an annotated image data set may be generated to build the text extraction model. NOTE: annotated image data set is the “generate image sets”.)
inputting the annotated data into a position determination model to generate position information of each block corresponding to the annotated data, (MANNAR − [0063] a CNN model may be used to extract text features from the image i.e., locations which contain text. [0062] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc.)
wherein the position determination model is trained and obtained through historical related data of the annotated data; and determining the image sets corresponding to the annotated data based on the position information of each block. (MANNAR − [0057-0058] [0057] identify which sellers on the web are potentially associated with known unauthorized entities in the region. [0059] the CNN classifier (e.g., to detect increase in trends related to a specific type of pattern), and DCGAN (e.g., to detect increase in imitator and/or clone websites) to predict risks.)
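The last step of claim 2, determining the image sets from the position information of each block, can be sketched as a cropping operation. The example assumes the position determination model emits one bounding box and one set label per block; that output format, the `Box` type, and the function names are illustrative assumptions not found in the claim or in Mannar.

```python
from typing import NamedTuple


class Box(NamedTuple):
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def crop(page, box):
    """Cut a block out of a page image stored as a list of pixel rows."""
    return [row[box.left:box.right] for row in page[box.top:box.bottom]]


def build_image_sets(page, blocks):
    """blocks: iterable of (set_name, Box) pairs, as a position model might emit."""
    image_sets = {"container": [], "text": [], "element": []}
    for set_name, box in blocks:
        image_sets[set_name].append(crop(page, box))
    return image_sets


# A tiny 4-row by 6-column "page" whose pixel values encode their coordinates.
page = [[10 * r + c for c in range(6)] for r in range(4)]
blocks = [("text", Box(0, 0, 3, 2)), ("element", Box(3, 2, 6, 4))]
image_sets = build_image_sets(page, blocks)
print(image_sets["text"])
```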
Regarding dependent claim 3, depends on claim 1, MANNAR teaches: wherein the image recognition model is trained and obtained by: acquiring a training sample set, wherein a training sample in the training sample set comprises the first image set for recognizing the container type, the second image set for recognizing the text information, the third image set for detecting the image element, the container type data set corresponding to the first image set, the text data set corresponding to the second image set and the image element data set corresponding to the third image set; and using a deep learning method to train and obtain the image recognition model with the first image set, the second image set and the third image set that are included in training sample in the training sample set as input data and the container type data set corresponding to the first image set, the text data set corresponding to the second image set and the image element data set corresponding to the third image set as expected output data. (MANNAR − [0044] [0061-0065] [0061] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc. [0064] At 304, the deep learning based data analyzer 110 may further implement deep embedded clustering to leverage text and images together in a model. In this regard, the deep learning based data analyzer 110 may implement the approach disclosed herein with respect to FIG. 5.  [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
Regarding dependent claim 4, depends on claim 1, MANNAR teaches: wherein the image recognition model comprises a container type recognition sub-model, a text recognition sub-model and an element recognition sub-model, (MANNAR − [0062] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc.)
and inputting the image sets into the trained image recognition model to generate the container type data set corresponding to the first image set, the text data set corresponding to the second image set and the image element data set corresponding to the third image set comprises: (MANNAR − [0076] At 408 the model trained at 406 may be applied on the entire image repository to identify images that have text, and to extract the text. When the model is executed on an image, the model may generate the text identified in the image and also the confidence of the model in the identified text (e.g., from the RNN model).)
inputting the first image set into the container type recognition sub-model to generate the container type data set corresponding to the first image set, wherein the container type recognition sub-model is used to represent the container type determination for each image in the first image set; (MANNAR − [0076] [0077] At 410, the results may be stored as an output table with a key being image identification (ID; e.g., filename of image) and the value being the text extracted by the model. These results may be used in the subsequent deep embedded clustering.)
inputting the second image set into the text recognition sub-model to generate the text data set corresponding to the second image set, wherein the text recognition sub-model is used to represent the word detection and text recognition for each image in the second image set; (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
and inputting the third image set into the element recognition sub-model to generate the image element data set corresponding to the third image set, wherein the element recognition sub-model is used to represent the image element detection and recognition for each image in the third image set. (MANNAR − [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories. For example, clusters may be generated for images related to prescription drugs as shown at 1100, drugs as shown at 1102, etc.)
Regarding dependent claim 5, depends on claim 4, MANNAR teaches: wherein the text recognition sub-model comprises a feature extraction sub-model and a word sequence extraction sub-model, and inputting the second image set into the text recognition sub-model to generate the text data set corresponding to the second image set comprises: (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
inputting the second image set into the feature extraction sub-model to obtain each feature matrix corresponding to the second image set, wherein the feature extraction sub-model is constructed based on a convolutional neural network; (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
inputting each feature matrix into the word sequence extraction sub-model to obtain a word sequence corresponding to each feature matrix, wherein the word sequence extraction sub-model is constructed based on a recursive neural network; (MANNAR − [0076] At 408 the model trained at 406 may be applied on the entire image repository to identify images that have text, and to extract the text. When the model is executed on an image, the model may generate the text identified in the image and also the confidence of the model in the identified text (e.g., from the RNN model).)
and determining, based on each word sequence, text information corresponding to each word sequence, and generating the text data set corresponding to each piece of text information. (MANNAR − [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories. For example, clusters may be generated for images related to prescription drugs as shown at 1100, drugs as shown at 1102, etc.)
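Claim 5 recites a convolutional feature extractor feeding a recurrent word sequence extractor, followed by a step that turns each word sequence into text. Neither the claim nor the cited Mannar passages say how per-step outputs become text. One common choice in recognizers of this shape is greedy CTC decoding, which is assumed here purely for illustration; the alphabet and the scores are made up.

```python
BLANK = 0
ALPHABET = ["<blank>", "s", "a", "l", "e"]  # hypothetical label set


def greedy_ctc_decode(step_scores):
    """Pick the best label per time step, merge repeats, then drop blanks."""
    best = [max(range(len(s)), key=s.__getitem__) for s in step_scores]
    decoded = []
    previous = None
    for label in best:
        if label != previous and label != BLANK:
            decoded.append(ALPHABET[label])
        previous = label
    return "".join(decoded)


# One row of scores per time step, as a recurrent layer might produce.
scores = [
    [0.1, 0.8, 0.0, 0.1, 0.0],  # s
    [0.1, 0.7, 0.1, 0.1, 0.0],  # s again, merged with the previous step
    [0.2, 0.0, 0.7, 0.1, 0.0],  # a
    [0.9, 0.0, 0.0, 0.1, 0.0],  # blank
    [0.1, 0.0, 0.0, 0.8, 0.1],  # l
    [0.1, 0.0, 0.0, 0.1, 0.8],  # e
]
print(greedy_ctc_decode(scores))
```

A blank between two identical labels keeps both, which is how such a decoder distinguishes a doubled letter from one letter spread over several time steps.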
Regarding dependent claim 6, depends on claim 4, MANNAR teaches: wherein the image recognition model is constructed based on a deep residual network model, and/or the container type recognition sub-model is constructed based on the deep residual network model. (MANNAR − [0076] At 408 the model trained at 406 may be applied on the entire image repository to identify images that have text, and to extract the text. When the model is executed on an image, the model may generate the text identified in the image and also the confidence of the model in the identified text (e.g., from the RNN model).)
Regarding dependent claim 7, depends on claim 1, MANNAR does not explicitly teach: performing a correction on the container type data set,
However, Kikuchi teaches: wherein, before performing the conversion on the container type data set, the text data set and the image element data set based on template information of the page to generate the template data set corresponding to the page image, the method further comprises: performing a correction on the container type data set, the text data set and the image element data set to obtain corrected container type data set, corrected text data set and corrected image element data set, wherein the correction is used to represent reordering data in the container type data set, the text data set and the image element data set based on an analysis result of an image position, image order and image repeatability of each image in the image sets. (Kikuchi − [0072] FIG. 5B illustrates an example of a link setting method. In response to a user's instruction to set a link, the application displays a link setting screen 506 including a link type setting area 507 and a link distance setting area 508. A fixed link in "Link Type" is a link having a fixed size, and a flexible link is a link having a size varying depending on content data input into each container. Further, a maximum value 510 and a minimum value 512 are set if a flexible link is selected, to define a flexible range of the link. Further, a reference value 511 is a size preset at the time of setting a link (or a size of the fixed link). A user checks the "transposition link" check box 509 through the link setting screen 506 to set the transposition link 523 and rotate a template in accordance with the content data inserted into the container. A value set on the link setting screen 506 is confirmed by clicking an OK button 513.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing textual and image data in webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to reduce processing and load time when generating a webpage template.
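The "correction" recited in claim 7, reordering data based on image position, image order and image repeatability, can be sketched as a sort followed by de-duplication. The reading-order rule (top to bottom, then left to right), the treatment of repeats as exact duplicates, and the `Item` type are assumptions for illustration; the claim does not fix any of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    value: str
    top: int
    left: int


def correct(items):
    """Sort into reading order (top, then left) and drop repeated entries."""
    seen = set()
    result = []
    for item in sorted(items, key=lambda i: (i.top, i.left)):
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


items = [
    Item("footer", 300, 0),
    Item("title", 0, 0),
    Item("logo", 0, 200),
    Item("title", 0, 0),  # repeated detection of the same block
]
print([item.value for item in correct(items)])
```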
Regarding dependent claim 8, depends on claim 7, MANNAR does not teach: wherein the correction is accomplished based on a combination of image scaling, image graying, an image enhancement, an image noise reduction and an image edge detection on each image in the image sets.
However, Kikuchi teaches: wherein the correction is accomplished based on a combination of image scaling, image graying, an image enhancement, an image noise reduction and an image edge detection on each image in the image sets. (Kikuchi − [0072] FIG. 5B illustrates an example of a link setting method. In response to a user's instruction to set a link, the application displays a link setting screen 506 including a link type setting area 507 and a link distance setting area 508. A fixed link in "Link Type" is a link having a fixed size, and a flexible link is a link having a size varying depending on content data input into each container. Further, a maximum value 510 and a minimum value 512 are set if a flexible link is selected, to define a flexible range of the link. Further, a reference value 511 is a size preset at the time of setting a link (or a size of the fixed link). A user checks the "transposition link" check box 509 through the link setting screen 506 to set the transposition link 523 and rotate a template in accordance with the content data inserted into the container. A value set on the link setting screen 506 is confirmed by clicking an OK button 513. NOTE: Size is image scaling correction.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
Regarding dependent claim 9, which depends on claim 7, MANNAR does not teach: wherein, before performing the correction on the container type data set, the text data set and the image element data set to obtain the corrected container type data set, the corrected text data set and the corrected image element data set, the method further comprises: performing content recognition on the image sets to obtain a first data set corresponding to the first image set, a second data set corresponding to the second image set and a third data set corresponding to the third image set; and performing a revision on the data in the container type data set, the text data set and the image element data set according to a comparison result of the first data set, the second data set and the third data set with the container type data set, the text data set and the image element data set, to obtain revised container type data set, revised text data set and revised image element data set.
However, Kikuchi teaches: wherein, before performing the correction on the container type data set, the text data set and the image element data set to obtain the corrected container type data set, the corrected text data set and the corrected image element data set, the method further comprises: performing content recognition on the image sets to obtain a first data set corresponding to the first image set, a second data set corresponding to the second image set and a third data set corresponding to the third image set; and performing a revision on the data in the container type data set, the text data set and the image element data set according to a comparison result of the first data set, the second data set and the third data set with the container type data set, the text data set and the image element data set, to obtain revised container type data set, revised text data set and revised image element data set. (Kikuchi − [0072] FIG. 5B illustrates an example of a link setting method. In response to a user's instruction to set a link, the application displays a link setting screen 506 including a link type setting area 507 and a link distance setting area 508. A fixed link in "Link Type" is a link having a fixed size, and a flexible link is a link having a size varying depending on content data input into each container. Further, a maximum value 510 and a minimum value 512 are set if a flexible link is selected, to define a flexible range of the link. Further, a reference value 511 is a size preset at the time of setting a link (or a size of the fixed link). A user checks the "transposition link" check box 509 through the link setting screen 506 to set the transposition link 523 and rotate a template in accordance with the content data inserted into the container. A value set on the link setting screen 506 is confirmed by clicking an OK button 513. NOTE: Size is image scaling correction.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
Regarding dependent claim 10, which depends on claim 1, MANNAR does not teach: further comprising: generating a template interface corresponding to the template data set based on the template data set, and presenting the template interface; and/or optimizing a design scheme of the page template based on the template data set.
However, Kikuchi teaches: further comprising: generating a template interface corresponding to the template data set based on the template data set, and presenting the template interface; and/or optimizing a design scheme of the page template based on the template data set. (Kikuchi − [0014] The present invention is directed to an apparatus and method of changing template layout in accordance with input data to reduce a load in generating a template when laying out data. [0121] FIG. 18 illustrates a setting screen 1801 for setting extraction conditions for the flow area and setting a sub template. A user sets a sub template used in the flow area to a field 1802. Alternatively, the user may designate a sub template name using a file open icon 1803. The arrow in FIG. 18 indicates a mouse pointer 1816.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
Regarding independent claim 11, the claim is directed to an apparatus. Claim 11 has the same or similar technical features and limitations as claim 1 and is rejected under the same rationale.
Regarding dependent claim 12, which depends on claim 11, MANNAR teaches: annotating the page image to generate the image sets corresponding to the annotated data comprises: (MANNAR − [0074] At 404, an annotated image data set may be generated to build the text extraction model. NOTE: annotated image data set is the “generate image sets”.)
annotating the page image to obtain the annotated data corresponding to the page image; (MANNAR − [0074] At 404, an annotated image data set may be generated to build the text extraction model. NOTE: annotated image data set is the “generate image sets”.)
inputting the annotated data into a position determination model to generate position information of each block corresponding to the annotated data, (MANNAR − [0063] a CNN model may be used to extract text features from the image i.e., locations which contain text. [0062] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc.)
wherein the position determination model is trained and obtained through historical related data of the annotated data; and determining the image sets corresponding to the annotated data based on the position information of each block. (MANNAR − [0057-0058] [0057] identify which sellers on the web are potentially associated with known unauthorized entities in the region. [0059] the CNN classifier (e.g., to detect increase in trends related to a specific type of pattern), and DCGAN (e.g., to detect increase in imitator and/or clone websites) to predict risks.)
Regarding dependent claim 13, which depends on claim 11, MANNAR teaches: wherein the image recognition model in the generating unit is trained and obtained by acquiring a training sample set, wherein a training sample in the training sample set comprises the first image set for recognizing the container type, the second image set for recognizing the text information, the third image set for detecting the image element, the container type data set corresponding to the first image set, the text data set corresponding to the second image set and the image element data set corresponding to the third image set; and using a deep learning method to train and obtain the image recognition model with the first image set, the second image set and the third image set that are included in training sample in the training sample set as input data and the container type data set corresponding to the first image set, the text data set corresponding to the second image set and the image element data set corresponding to the third image set as expected output data. (MANNAR − [0044] [0061-0065] [0061] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc. [0064] At 304, the deep learning based data analyzer 110 may further implement deep embedded clustering to leverage text and images together in a model. In this regard, the deep learning based data analyzer 110 may implement the approach disclosed herein with respect to FIG. 5. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories.)
Regarding dependent claim 14, which depends on claim 11, MANNAR teaches: wherein the image recognition model in the generating unit comprises a container type recognition sub-model, a text recognition sub-model and an element recognition sub-model, (MANNAR − [0062] Referring to FIG. 3, at 300, training data may be received, for example, by the data receiver 102. The training data may include data from the dark web, social media, blogs, etc.)
and inputting the image sets into the trained image recognition model to generate the container type data set corresponding to the first image set, the text data set corresponding to the second image set and the image element data set corresponding to the third image set comprises: (MANNAR − [0076] At 408 the model trained at 406 may be applied on the entire image repository to identify images that have text, and to extract the text. When the model is executed on an image, the model may generate the text identified in the image and also the confidence of the model in the identified text (e.g., from the RNN model).)
inputting the first image set into the container type recognition sub-model to generate the container type data set corresponding to the first image set, wherein the container type recognition sub-model is used to represent the container type determination for each image in the first image set; (MANNAR − [0076] [0077] At 410, the results may be stored as an output table with a key being image identification (ID; e.g., filename of image) and the value being the text extracted by the model. These results may be used in the subsequent deep embedded clustering.)
inputting the second image set into the text recognition sub-model to generate the text data set corresponding to the second image set, wherein the text recognition sub-model is used to represent the word detection and text recognition for each image in the second image set; (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
and inputting the third image set into the element recognition sub-model to generate the image element data set corresponding to the third image set, wherein the element recognition sub-model is used to represent the image element detection and recognition for each image in the third image set. (MANNAR − [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories. For example, clusters may be generated for images related to prescription drugs as shown at 1100, drugs as shown at 1102, etc.)
Regarding dependent claim 15, which depends on claim 14, MANNAR teaches: wherein the text recognition sub-model in the second generating module comprises a feature extraction sub-model and a word sequence extraction sub-model, inputting the second image set into the text recognition sub-model to generate the text data set corresponding to the second image set comprises: (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories.)
inputting the second image set into the feature extraction sub-model to obtain each feature matrix corresponding to the second image set, wherein the feature extraction sub-model is constructed based on a convolutional neural network; (MANNAR − [0044] According to examples disclosed herein, the deep learning based data analyzer 110 may analyze the ascertained data 104 by performing deep embedded clustering with respect to the ascertained text 106, the images 108, and the text extracted from the images 108 to generate the plurality of clusters 112 by analyzing, for the ascertained text 106 and the text extracted from the images 108, combine continuous bag of words (CBOW) based similarity, and analyzing, for the ascertained images 108, convolutional neural network (CNN) based similarity. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories)
inputting each feature matrix into the word sequence extraction sub-model to obtain a word sequence corresponding to each feature matrix, wherein the word sequence extraction sub-model is constructed based on a recursive neural network; (MANNAR − [0076] At 408 the model trained at 406 may be applied on the entire image repository to identify images that have text, and to extract the text. When the model is executed on an image, the model may generate the text identified in the image and also the confidence of the model in the identified text (e.g., from the RNN model).)
and determining, based on each word sequence, text information corresponding to each word sequence, and generate the text data set corresponding to each piece of text information. (MANNAR − [0113] FIG. 11 illustrates an example output of clusters generated from images related to various categories to illustrate operation of the dark web content analysis and identification apparatus 100. [0114] Referring to FIG. 11, various clusters may be generated from images related to various categories. For example, clusters may be generated for images related to prescription drugs as shown at 1100, drugs as shown at 1102, etc.)
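The three steps recited in claim 15 (feature extraction, word sequence extraction, text generation) form a simple data flow, which the following toy sketch traces. Every function name here is hypothetical, and the two stand-in functions are deliberately trivial placeholders: they are not a convolutional or recursive neural network, and nothing below reflects the applicant's or MANNAR's actual models. The sketch only shows how each stage's output feeds the next.

```python
from typing import List, Sequence

FeatureMatrix = List[List[float]]


def extract_features(image: Sequence[Sequence[int]]) -> FeatureMatrix:
    """Placeholder for the CNN-based feature extraction sub-model.

    Here it merely rescales 8-bit pixel values to the range [0, 1].
    """
    return [[pixel / 255 for pixel in row] for row in image]


def extract_word_sequence(features: FeatureMatrix, vocab: Sequence[str]) -> List[str]:
    """Placeholder for the word sequence extraction sub-model.

    Emits one word per feature row: the vocabulary entry at the index of the
    row's largest value. Assumes each row is no longer than `vocab`.
    """
    return [vocab[max(range(len(row)), key=row.__getitem__)] for row in features]


def recognise_text(images: Sequence[Sequence[Sequence[int]]], vocab: Sequence[str]) -> List[str]:
    """Run both stages per image and join each word sequence into text."""
    return [" ".join(extract_word_sequence(extract_features(img), vocab)) for img in images]


# Hypothetical second image set: one 2x3 "image" and a three-word vocabulary.
text_data_set = recognise_text([[[0, 255, 0], [255, 0, 0]]], ["buy", "now", "sale"])
print(text_data_set)  # ['now buy']
```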
Regarding dependent claim 17, which depends on claim 11, MANNAR does not explicitly teach: performing a correction on the container type data set,
However, Kikuchi teaches: wherein, before performing the conversion on the container type data set, the text data set and the image element data set based on template information of the page to generate the template data set corresponding to the page image, the method further comprises: performing a correction on the container type data set, the text data set and the image element data set to obtain corrected container type data set, corrected text data set and corrected image element data set, wherein the correction is used to represent reordering data in the container type data set, the text data set and the image element data set based on an analysis result of an image position, image order and image repeatability of each image in the image sets. (Kikuchi − [0072] FIG. 5B illustrates an example of a link setting method. In response to a user's instruction to set a link, the application displays a link setting screen 506 including a link type setting area 507 and a link distance setting area 508. A fixed link in "Link Type" is a link having a fixed size, and a flexible link is a link having a size varying depending on content data input into each container. Further, a maximum value 510 and a minimum value 512 are set if a flexible link is selected, to define a flexible range of the link. Further, a reference value 511 is a size preset at the time of setting a link (or a size of the fixed link). A user checks the "transposition link" check box 509 through the link setting screen 506 to set the transposition link 523 and rotate a template in accordance with the content data inserted into the container. A value set on the link setting screen 506 is confirmed by clicking an OK button 513.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
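For comparison with the cited passage, the correction recited in claim 17 (reordering data based on image position, image order and image repeatability) might look something like the sketch below. This is an assumption-laden illustration, not the applicant's method: the `Element` type, the `correct` function, the top-to-bottom then left-to-right ordering, and the rule that exact repeats are dropped are all hypothetical choices made here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Element:
    """Hypothetical recognised item from any of the three data sets."""
    kind: str     # "container", "text" or "image"
    content: str  # recognised value, e.g. a container type or a text string
    x: int        # left edge of the source image region
    y: int        # top edge of the source image region


def correct(elements: Iterable[Element]) -> List[Element]:
    """Reorder by position (top-to-bottom, then left-to-right) and drop repeats."""
    seen = set()
    ordered: List[Element] = []
    for element in sorted(elements, key=lambda e: (e.y, e.x)):
        if element in seen:  # assumed repeatability rule: identical items are duplicates
            continue
        seen.add(element)
        ordered.append(element)
    return ordered


raw = [
    Element("text", "Sale", 40, 100),
    Element("container", "banner", 0, 0),
    Element("text", "Sale", 40, 100),  # repeated detection of the same region
    Element("image", "logo", 10, 0),
]
corrected = correct(raw)
```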
Regarding dependent claim 18, which depends on claim 17, MANNAR does not teach: wherein the correction is accomplished based on a combination of image scaling, image graying, an image enhancement, an image noise reduction and an image edge detection on each image in the image sets.
However, Kikuchi teaches: wherein the correction is accomplished based on a combination of image scaling, image graying, an image enhancement, an image noise reduction and an image edge detection on each image in the image sets. (Kikuchi − [0072] FIG. 5B illustrates an example of a link setting method. In response to a user's instruction to set a link, the application displays a link setting screen 506 including a link type setting area 507 and a link distance setting area 508. A fixed link in "Link Type" is a link having a fixed size, and a flexible link is a link having a size varying depending on content data input into each container. Further, a maximum value 510 and a minimum value 512 are set if a flexible link is selected, to define a flexible range of the link. Further, a reference value 511 is a size preset at the time of setting a link (or a size of the fixed link). A user checks the "transposition link" check box 509 through the link setting screen 506 to set the transposition link 523 and rotate a template in accordance with the content data inserted into the container. A value set on the link setting screen 506 is confirmed by clicking an OK button 513. NOTE: Size is image scaling correction.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
Regarding dependent claim 19, which depends on claim 17, MANNAR does not teach: wherein, before performing the correction on the container type data set, the text data set and the image element data set to obtain the corrected container type data set, the corrected text data set and the corrected image element data set, the method further comprises: performing content recognition on the image sets to obtain a first data set corresponding to the first image set, a second data set corresponding to the second image set and a third data set corresponding to the third image set; and performing a revision on the data in the container type data set, the text data set and the image element data set according to a comparison result of the first data set, the second data set and the third data set with the container type data set, the text data set and the image element data set, to obtain revised container type data set, revised text data set and revised image element data set.
However, Kikuchi teaches: wherein, before performing the correction on the container type data set, the text data set and the image element data set to obtain the corrected container type data set, the corrected text data set and the corrected image element data set, the method further comprises: performing content recognition on the image sets to obtain a first data set corresponding to the first image set, a second data set corresponding to the second image set and a third data set corresponding to the third image set; and performing a revision on the data in the container type data set, the text data set and the image element data set according to a comparison result of the first data set, the second data set and the third data set with the container type data set, the text data set and the image element data set, to obtain revised container type data set, revised text data set and revised image element data set. (Kikuchi − [0072] FIG. 5B illustrates an example of a link setting method. In response to a user's instruction to set a link, the application displays a link setting screen 506 including a link type setting area 507 and a link distance setting area 508. A fixed link in "Link Type" is a link having a fixed size, and a flexible link is a link having a size varying depending on content data input into each container. Further, a maximum value 510 and a minimum value 512 are set if a flexible link is selected, to define a flexible range of the link. Further, a reference value 511 is a size preset at the time of setting a link (or a size of the fixed link). A user checks the "transposition link" check box 509 through the link setting screen 506 to set the transposition link 523 and rotate a template in accordance with the content data inserted into the container. A value set on the link setting screen 506 is confirmed by clicking an OK button 513. NOTE: Size is image scaling correction.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
Regarding dependent claim 20, which depends on claim 11, MANNAR does not teach: further comprising: generating a template interface corresponding to the template data set based on the template data set, and presenting the template interface; and/or optimizing a design scheme of the page template based on the template data set.
However, Kikuchi teaches: further comprising: generating a template interface corresponding to the template data set based on the template data set, and presenting the template interface; and/or optimizing a design scheme of the page template based on the template data set. (Kikuchi − [0014] The present invention is directed to an apparatus and method of changing template layout in accordance with input data to reduce a load in generating a template when laying out data. [0121] FIG. 18 illustrates a setting screen 1801 for setting extraction conditions for the flow area and setting a sub template. A user sets a sub template used in the flow area to a field 1802. Alternatively, the user may designate a sub template name using a file open icon 1803. The arrow in FIG. 18 indicates a mouse pointer 1816.)
Accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have combined the teachings of MANNAR and Kikuchi, as each invention relates to analyzing the textual and image data of webpage content. Adding the teaching of Kikuchi provides MANNAR with a content template generator application. One of ordinary skill in the art would have been motivated to make the combination in order to reduce processing and load time when generating a webpage template.
Regarding independent claim 22, the claim is directed to a non-transitory computer readable storage medium. Claim 22 has the same or similar technical features and limitations as claim 1 and is rejected under the same rationale.

Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
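The timing rule in the paragraph above reduces to a small date calculation. The sketch below is an illustration only and is not docketing advice: the helper names `add_months` and `reply_deadline` are hypothetical, "within TWO MONTHS" is assumed to include the two-month date itself, and weekend or holiday adjustments are not modelled. The mailing date used in the example is the May 19, 2026 date shown in the prosecution timeline.

```python
import calendar
from datetime import date
from typing import Optional


def add_months(start: date, months: int) -> date:
    """Same day-of-month `months` later, clamped to the last day of a short month."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_deadline(
    mailing: date,
    first_reply: Optional[date] = None,
    advisory_mailed: Optional[date] = None,
) -> date:
    """End of the shortened statutory period under the rule quoted above."""
    shortened = add_months(mailing, 3)  # default THREE-MONTH period
    outer = add_months(mailing, 6)      # SIX-MONTH statutory limit
    replied_early = first_reply is not None and first_reply <= add_months(mailing, 2)
    late_advisory = advisory_mailed is not None and advisory_mailed > shortened
    if replied_early and late_advisory:
        # The period runs to the advisory action's mailing date, capped at six months.
        return min(advisory_mailed, outer)
    return shortened


mailing_date = date(2026, 5, 19)
print(reply_deadline(mailing_date))                                       # 2026-08-19
print(reply_deadline(mailing_date, date(2026, 7, 10), date(2026, 9, 1)))  # 2026-09-01
```

Extension fees under 37 CFR 1.136(a) would then be counted from the returned date, up to the six-month limit.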

Any inquiry concerning this communication or earlier communications from the examiner should be directed to CARL E BARNES JR whose telephone number is (571)270-3395. The examiner can normally be reached Monday-Friday 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Stephen Hong, can be reached at (571) 272-4124. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.



/CARL E BARNES JR/ Examiner, Art Unit 2178
/STEPHEN S HONG/ Supervisory Patent Examiner, Art Unit 2178
Prosecution Timeline

May 11, 2023
Application Filed
Oct 29, 2025
Non-Final Rejection mailed — §103
Jan 26, 2026
Response Filed
May 19, 2026
Final Rejection mailed — §103 (current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/898,903
Patent 12639806
MEDICAL SYSTEM, INFORMATION PROCESSING METHOD, AND COMPUTER-READABLE MEDIUM
3y 9m to grant Granted May 26, 2026
17/289,673
Patent 12614280
SYSTEM FOR ESTIMATING PRIMARY OPEN-ANGLE GLAUCOMA LIKELIHOOD
5y 0m to grant Granted Apr 28, 2026
17/953,132
Patent 12584932
SLIDE IMAGING APPARATUS AND A METHOD FOR IMAGING A SLIDE
3y 6m to grant Granted Mar 24, 2026
16/871,512
Patent 12541640
COMPUTING DEVICE FOR MULTIPLE CELL LINKING
5y 8m to grant Granted Feb 03, 2026
16/262,443
Patent 12536464
SYSTEM FOR CONSTRUCTING EFFECTIVE MACHINE-LEARNING PIPELINES WITH OPTIMIZED OUTCOMES
7y 0m to grant Granted Jan 27, 2026
Prosecution Projections

3-4
Expected OA Rounds
32%
Grant Probability
56%
With Interview (+24.2%)
3y 10m (~10m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 205 resolved cases by this examiner. Grant probability derived from career allowance rate.
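The projected percentages can be reproduced from the examiner's career figures. The sketch below assumes that the grant probability is simply the career allowance rate (66 granted of 205 resolved cases, as reported in the examiner profile) and that the interview lift is added as percentage points; the page does not state how it computes these values, so both steps are assumptions.

```python
granted = 66
resolved = 205
interview_lift = 0.242  # assumption: +24.2% is additive, in percentage points

allowance_rate = granted / resolved               # career allowance rate
with_interview = allowance_rate + interview_lift  # projected rate after an interview

print(f"{allowance_rate:.0%}")  # 32%
print(f"{with_interview:.0%}")  # 56%
```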