Prosecution Insights
Last updated: April 19, 2026
Application No. 18/704,427

IMAGE RENDERING METHOD AND APPARATUS, DEVICE, STORAGE MEDIUM, AND PROGRAM PRODUCT

Non-Final OA (§102, §103)
Filed: Apr 24, 2024
Examiner: GUO, XILIN
Art Unit: 2616
Tech Center: 2600 — Communications
Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.
OA Round: 1 (Non-Final)
Grant Probability: 82% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 5m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 82% (above average; 374 granted / 456 resolved; +20.0% vs TC avg)
Interview Lift: +17.4% (strong; comparing resolved cases with vs. without an interview)
Typical Timeline: 2y 5m average prosecution; 18 applications currently pending
Career History: 474 total applications across all art units

Statute-Specific Performance

§101: 7.6% (-32.4% vs TC avg)
§103: 56.3% (+16.3% vs TC avg)
§102: 12.8% (-27.2% vs TC avg)
§112: 19.0% (-21.0% vs TC avg)
Comparisons are against Tech Center average estimates. Based on career data from 456 resolved cases.

Office Action

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Priority

Receipt is acknowledged of certified copies of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.

Response to Preliminary Amendment

The preliminary amendment filed on April 24, 2024 has been entered. In view of the amendment to the specification, the clean substitute specification and Abstract of the specification have been acknowledged. In view of the amendment to the claims, the amendment of claims 11, 14 and 15 has been acknowledged. Claims 13 and 16 have been canceled. New claims 17-22 have been added.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1 and 14-15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Kokemohr et al. (U.S. Patent No. 10,049,477 B1).

Regarding claim 1, Kokemohr discloses an image rendering method (Col 10, lines 27-29, FIG. 3 is a flow diagram illustrating another example method 300 for providing text and visual styling for images ...), comprising: processing an image to be rendered to determine a text area (Col 10, lines 33-38, in block 302, the method obtains an original image similarly as described above for block 202 of FIG. 2; Col 18, lines 59-67, this text area can be determined in block 306 in some implementations); determining a target character style of text based on attribute information of the text area (Col 16, lines 10-53, block 306 (and/or other blocks of method 300) can also determine one or more text attributes of the generated text, such as font ... A particular word in the text (e.g., “memories”) can be associated with a particular font, a classic or old-style font); determining a target pattern style of text based on the image to be rendered (Col 16, lines 10-18, block 306 (and/or other blocks of method 300) can also determine one or more text attributes of the generated text, such as, color, size, layout (e.g., justification, positioning of letters relative to each other, how many lines of text to form, etc.) ...; Col 16, lines 54-67 to Col 17, lines 1-8, one or more text attributes can be based on one or more image characteristics. For example, a color of generated text and/or user text can be based on color characteristics of one or more image features depicted in the image ... The text can be assigned a fill pattern, texture, or other visual effect similarly based on one or more visual characteristics ...); and rendering the image to be rendered based on the target character style of text and the target pattern style of text (Col 17, lines 48-64, in block 308, the method provides the generated text in the image ...; Col 18, line 67, the generated text area can be sized and shaped to fit the generated text ...; Col 20, lines 29-65, in block 310, the method determines a visual style modification for the image based on a set of one or more of the determined characteristics. The set of characteristics used here may or may not be the same set of characteristics used to determine the generated text in blocks 306 and/or 308 ...; Col 21, lines 50-67, in block 312, the method modifies the image and the text message with the visual style modification determined in block 310 to create a modified image. In some implementations, the generated text can be added to the image before the visual style modification is applied to the image, and the generated text can be modified along with the rest of the image by the visual style modification ...; Col 22, lines 1-9, in block 314, the method provides the modified image for display to a user).
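Editor's note: the four steps recited in claim 1 form a small, linear pipeline. The sketch below illustrates that structure only; the function names, the NumPy image representation, and every heuristic in it are placeholder assumptions, not the applicant's or Kokemohr's actual implementation.

```python
import numpy as np

def determine_text_area(image: np.ndarray) -> tuple:
    # Step 1: process the image to be rendered and choose a text area (x, y, w, h).
    h, w = image.shape[:2]
    return (w // 10, h // 10, w // 2, h // 8)          # placeholder region

def character_style(area: tuple) -> dict:
    # Step 2: pick a character style from attribute information of the text area.
    _, _, area_w, _ = area
    return {"font": "serif", "size_pt": 48 if area_w > 400 else 24}

def pattern_style(image: np.ndarray) -> dict:
    # Step 3: derive a pattern style (e.g., text color, fill) from the image itself.
    mean_rgb = image.reshape(-1, image.shape[-1]).mean(axis=0)
    return {"color": tuple(int(c) for c in mean_rgb), "fill": "solid"}

def render(image: np.ndarray, text: str) -> np.ndarray:
    # Step 4: render the text onto the image using both styles; the actual
    # drawing backend is out of scope for this sketch.
    area = determine_text_area(image)
    styles = {**character_style(area), **pattern_style(image)}
    print(f"draw {text!r} in {area} with {styles}")    # stand-in for drawing
    return image
```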
Regarding claim 14, Kokemohr discloses an electronic device (FIG. 1; Col 6, lines 36-37, client device 120; Col 33, lines 32-46, FIG. 13 is a block diagram of an example device 1300 ...; Col 35, lines 1-5, a client device can also implement and/or be used with features described herein, such as client devices 120-126 shown in FIG. 1. Example client devices can be computer devices including some similar components as the device 1300), comprising: one or more processors (Col 35, lines 1-5, processor(s) 1302); a memory device (Col 35, lines 1-5, memory 1304) for storing one or more programs that, when executed by the one or more processors (Col 34, lines 3-4, memory 1304 can store software operating on the device 1300 by the processor 1302 ...; Col 35, lines 26-52, one or more methods described herein (e.g., methods 200 and/or 300) can be implemented by computer program instructions or code, which can be executed on a computer. For example, the code can be implemented by one or more processors (e.g., microprocessors or other processing circuitry), and can be stored on a computer program product including a non-transitory computer readable medium (e.g., storage medium), e.g., a magnetic, optical, electromagnetic, or semiconductor storage medium ...), cause the one or more processors to implement an image rendering method (Col 10, lines 27-29, FIG. 3 is a flow diagram illustrating another example method 300 for providing text and visual styling for images ...) comprising: processing an image to be rendered to determine a text area (Col 10, lines 33-38, in block 302, the method obtains an original image similarly as described above for block 202 of FIG. 2; Col 18, lines 59-67, this text area can be determined in block 306 in some implementations); determining a target character style of text based on attribute information of the text area (Col 16, lines 10-53, block 306 (and/or other blocks of method 300) can also determine one or more text attributes of the generated text, such as font ... A particular word in the text (e.g., “memories”) can be associated with a particular font, a classic or old-style font); determining a target pattern style of text based on the image to be rendered (Col 16, lines 10-18, block 306 (and/or other blocks of method 300) can also determine one or more text attributes of the generated text, such as, color, size, layout (e.g., justification, positioning of letters relative to each other, how many lines of text to form, etc.) ...; Col 16, lines 54-67 to Col 17, lines 1-8, one or more text attributes can be based on one or more image characteristics. For example, a color of generated text and/or user text can be based on color characteristics of one or more image features depicted in the image ... The text can be assigned a fill pattern, texture, or other visual effect similarly based on one or more visual characteristics ...); and rendering the image to be rendered based on the target character style of text and the target pattern style of text (Col 17, lines 48-64, in block 308, the method provides the generated text in the image ...; Col 18, line 67, the generated text area can be sized and shaped to fit the generated text ...; Col 20, lines 29-65, in block 310, the method determines a visual style modification for the image based on a set of one or more of the determined characteristics. The set of characteristics used here may or may not be the same set of characteristics used to determine the generated text in blocks 306 and/or 308 ...; Col 21, lines 50-67, in block 312, the method modifies the image and the text message with the visual style modification determined in block 310 to create a modified image. In some implementations, the generated text can be added to the image before the visual style modification is applied to the image, and the generated text can be modified along with the rest of the image by the visual style modification ...; Col 22, lines 1-9, in block 314, the method provides the modified image for display to a user).

Regarding claim 15, Kokemohr discloses a non-transitory computer-readable storage medium having stored thereon a computer program (FIG. 1; Col 6, lines 36-37, client device 120; Col 33, lines 32-46, FIG. 13 is a block diagram of an example device 1300 ...; Col 35, lines 1-5, a client device can also implement and/or be used with features described herein, such as client devices 120-126 shown in FIG. 1. Example client devices can be computer devices including some similar components as the device 1300, such as processor(s) 1302, memory 1304 ...) that, when executed by a processor (Col 34, lines 3-4, memory 1304 can store software operating on the device 1300 by the processor 1302 ...; Col 35, lines 26-52, one or more methods described herein (e.g., methods 200 and/or 300) can be implemented by computer program instructions or code, which can be executed on a computer. For example, the code can be implemented by one or more processors (e.g., microprocessors or other processing circuitry), and can be stored on a computer program product including a non-transitory computer readable medium (e.g., storage medium), e.g., a magnetic, optical, electromagnetic, or semiconductor storage medium ...), implements an image rendering method (Col 10, lines 27-29, FIG. 3 is a flow diagram illustrating another example method 300 for providing text and visual styling for images ...) comprising: processing an image to be rendered to determine a text area (Col 10, lines 33-38, in block 302, the method obtains an original image similarly as described above for block 202 of FIG. 2; Col 18, lines 59-67, this text area can be determined in block 306 in some implementations); determining a target character style of text based on attribute information of the text area (Col 16, lines 10-53, block 306 (and/or other blocks of method 300) can also determine one or more text attributes of the generated text, such as font ...
A particular word in the text (e.g., “memories”) can be associated with a particular font, a classic or old-style font); determining a target pattern style of text based on the image to be rendered (Col 16, lines 10-18, block 306 (and/or other blocks of method 300) can also determine one or more text attributes of the generated text, such as, color, size, layout (e.g., justification, positioning of letters relative to each other, how many lines of text to form, etc.) ...; Col 16, lines 54-67 to Col 17, lines 1-8, one or more text attributes can be based on one or more image characteristics. For example, a color of generated text and/or user text can be based on color characteristics of one or more image features depicted in the image ... The text can be assigned a fill pattern, texture, or other visual effect similarly based on one or more visual characteristics ...); and rendering the image to be rendered based on the target character style of text and the target pattern style of text (Col 17, lines 48-64, in block 308, the method provides the generated text in the image ...; Col 18, line 67, the generated text area can be sized and shaped to fit the generated text ...; Col 20, lines 29-65, in block 310, the method determines a visual style modification for the image based on a set of one or more of the determined characteristics. The set of characteristics used here may or may not be the same set of characteristics used to determine the generated text in blocks 306 and/or 308 ...; Col 21, lines 50-67, in block 312, the method modifies the image and the text message with the visual style modification determined in block 310 to create a modified image. In some implementations, the generated text can be added to the image before the visual style modification is applied to the image, and the generated text can be modified along with the rest of the image by the visual style modification ...; Col 22, lines 1-9, in block 314, the method provides the modified image for display to a user).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-5 and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kokemohr et al. (U.S. Patent No. 10,049,477 B1) in view of HAYASAKI (U.S. Patent Application Publication 2009/0323089 A1).

Regarding claim 2, Kokemohr discloses everything claimed as applied above (see claim 1). However, Kokemohr does not specifically disclose wherein the processing the image to be rendered to determine the text area comprises: inputting the image to be rendered into a segmentation model to obtain an image mask; and determining the text area corresponding to the image to be rendered based on the image mask. In addition, HAYASAKI discloses wherein the processing the image to be rendered to determine the text area comprises: inputting the image to be rendered into a segmentation model to obtain an image mask (FIG. 2; paragraph [0061], the foreground mask generation section 21 extracts a text region as the foreground layer from an input image and generates a foreground mask); and determining the text area corresponding to the image to be rendered based on the image mask (paragraph [0061], a text region as the foreground layer from an input image and generates a foreground mask. In this process, a pixel determined as the text region in a segmentation process explained later is converted to a binary value pixel, and a text pixel is extracted ...). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.

Regarding claim 3, the combination of Kokemohr in view of HAYASAKI discloses everything claimed as applied above (see claim 2). However, Kokemohr does not specifically disclose wherein the determining the text area corresponding to the image to be rendered based on the image mask comprises: in response to a foreground region in the image mask being greater than or equal to a first threshold, arranging the text area in an area corresponding to the foreground region in the image to be rendered. In addition, HAYASAKI discloses wherein the determining the text area corresponding to the image to be rendered based on the image mask comprises: in response to a foreground region in the image mask being greater than or equal to a first threshold, arranging the text area in an area corresponding to the foreground region in the image to be rendered (paragraph [0094], a flow of the segmentation process. First, (i) the calculated maximum density difference is compared with a maximum density difference threshold, and (ii) the calculated total density busyness is compared with a total density busyness threshold. Then, if it is determined that the maximum density difference is smaller than the maximum density difference threshold and that the total density busyness is smaller than the total density busyness threshold, the target pixel is determined to be a page-background/photograph region; otherwise, the target pixel is determined to be a text/halftone dot region. Thus, if it is determined that the maximum density difference is “greater than or equal to” the maximum density difference threshold and that the total density busyness is “greater than or equal to” the total density busyness threshold, the target pixel is determined to be a text/halftone dot region). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.

Regarding claim 4, the combination of Kokemohr in view of HAYASAKI discloses everything claimed as applied above (see claim 2). However, Kokemohr does not specifically disclose further comprising: dividing the image to be rendered into a first region and a second region in response to a foreground region in the image mask being less than a first threshold; and arranging the text area in the first region or the second region. In addition, HAYASAKI discloses dividing the image to be rendered into a first region and a second region in response to a foreground region in the image mask being less than a first threshold (paragraph [0089], the segmentation process ... a maximum density difference and total density busyness are calculated. The maximum density difference is a difference between a minimum density value and a maximum density value in an n×m (e.g., 15×15) block including a target pixel, and the total density busyness is the sum total of absolute values of density differences between adjacent pixels. Then, the maximum density difference and the total density busyness are compared with a plurality of predetermined thresholds. As a result of the comparison, pixels are segmented into a page-background region, a photograph region (continuous tone region), a text edge region, and a halftone dot region); and arranging the text area in the first region or the second region (paragraph [0093], in a case where the total density busyness is smaller than a product of the maximum density difference and the text/halftone dot determining threshold, it is possible to determine that the pixel is a text edge pixel). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.

Regarding claim 5, the combination of Kokemohr in view of HAYASAKI discloses everything claimed as applied above (see claim 4). However, Kokemohr does not specifically disclose further comprising: arranging the text area at a preset position in the image to be rendered in response to the first region being less than a second threshold or the second region being less than the second threshold. In addition, HAYASAKI discloses arranging the text area at a preset position in the image to be rendered in response to the first region being less than a second threshold (paragraph [0093], the total density busyness is smaller than that in the halftone dot region. Accordingly, in a case where the total density busyness is smaller than a product of the maximum density difference and the text/halftone dot determining threshold, it is possible to determine that the pixel is a text edge pixel) or the second region being less than the second threshold. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.
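Editor's note: claims 2-5 together describe a small decision tree over a segmentation mask. A minimal sketch of that logic follows; the threshold values, the top/bottom split, and the use of background share as the "region size" measure are assumptions (the claims do not specify how region size is measured), not the applicant's implementation.

```python
import numpy as np

def place_text_area(mask: np.ndarray, t1: float = 0.3, t2: float = 0.2) -> str:
    # mask: binary foreground mask produced by a segmentation model (claim 2).
    fg_ratio = float(mask.mean())            # share of foreground pixels
    if fg_ratio >= t1:
        # Claim 3: foreground region >= first threshold, so arrange the text
        # area over the area corresponding to the foreground region.
        return "foreground"
    # Claim 4: foreground below the first threshold, so divide the image
    # into a first and a second region (a top/bottom split is assumed here).
    h = mask.shape[0]
    first, second = mask[: h // 2], mask[h // 2 :]
    first_size = 1.0 - float(first.mean())   # background share as a size proxy
    second_size = 1.0 - float(second.mean())
    if first_size < t2 or second_size < t2:
        # Claim 5: either region below the second threshold, so fall back
        # to a preset position in the image to be rendered.
        return "preset"
    # Otherwise arrange the text area in whichever region has more room.
    return "first" if first_size >= second_size else "second"
```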
Regarding claim 17, Kokemohr discloses everything claimed as applied above (see claim 14). However, Kokemohr does not specifically disclose wherein the one or more programs further cause the one or more processors to: input the image to be rendered into a segmentation model to obtain an image mask; and determine the text area corresponding to the image to be rendered based on the image mask. In addition, HAYASAKI discloses wherein the one or more programs further cause the one or more processors to: input the image to be rendered into a segmentation model to obtain an image mask (FIG. 2; paragraph [0061], the foreground mask generation section 21 extracts a text region as the foreground layer from an input image and generates a foreground mask); and determine the text area corresponding to the image to be rendered based on the image mask (paragraph [0061], a text region as the foreground layer from an input image and generates a foreground mask. In this process, a pixel determined as the text region in a segmentation process explained later is converted to a binary value pixel, and a text pixel is extracted ...). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.

Regarding claim 18, the combination of Kokemohr in view of HAYASAKI discloses everything claimed as applied above (see claim 17). However, Kokemohr does not specifically disclose wherein the one or more programs further cause the one or more processors to: in response to a foreground region in the image mask being greater than or equal to a first threshold, arrange the text area in an area corresponding to the foreground region in the image to be rendered. In addition, HAYASAKI discloses wherein the one or more programs further cause the one or more processors to: in response to a foreground region in the image mask being greater than or equal to a first threshold, arrange the text area in an area corresponding to the foreground region in the image to be rendered (paragraph [0094], a flow of the segmentation process. First, (i) the calculated maximum density difference is compared with a maximum density difference threshold, and (ii) the calculated total density busyness is compared with a total density busyness threshold. Then, if it is determined that the maximum density difference is smaller than the maximum density difference threshold and that the total density busyness is smaller than the total density busyness threshold, the target pixel is determined to be a page-background/photograph region; otherwise, the target pixel is determined to be a text/halftone dot region. Thus, if it is determined that the maximum density difference is “greater than or equal to” the maximum density difference threshold and that the total density busyness is “greater than or equal to” the total density busyness threshold, the target pixel is determined to be a text/halftone dot region). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.

Regarding claim 19, the combination of Kokemohr in view of HAYASAKI discloses everything claimed as applied above (see claim 17). However, Kokemohr does not specifically disclose wherein the one or more programs further cause the one or more processors to: divide the image to be rendered into a first region and a second region in response to a foreground region in the image mask being less than a first threshold; and arrange the text area in the first region or the second region. In addition, HAYASAKI discloses wherein the one or more programs further cause the one or more processors to: divide the image to be rendered into a first region and a second region in response to a foreground region in the image mask being less than a first threshold (paragraph [0089], the segmentation process ... a maximum density difference and total density busyness are calculated. The maximum density difference is a difference between a minimum density value and a maximum density value in an n×m (e.g., 15×15) block including a target pixel, and the total density busyness is the sum total of absolute values of density differences between adjacent pixels. Then, the maximum density difference and the total density busyness are compared with a plurality of predetermined thresholds. As a result of the comparison, pixels are segmented into a page-background region, a photograph region (continuous tone region), a text edge region, and a halftone dot region); and arrange the text area in the first region or the second region (paragraph [0093], in a case where the total density busyness is smaller than a product of the maximum density difference and the text/halftone dot determining threshold, it is possible to determine that the pixel is a text edge pixel). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.

Regarding claim 20, the combination of Kokemohr in view of HAYASAKI discloses everything claimed as applied above (see claim 19). However, Kokemohr does not specifically disclose wherein the one or more programs further cause the one or more processors to: arrange the text area at a preset position in the image to be rendered in response to the first region being less than a second threshold or the second region being less than the second threshold. In addition, HAYASAKI discloses wherein the one or more programs further cause the one or more processors to: arrange the text area at a preset position in the image to be rendered in response to the first region being less than a second threshold (paragraph [0093], the total density busyness is smaller than that in the halftone dot region. Accordingly, in a case where the total density busyness is smaller than a product of the maximum density difference and the text/halftone dot determining threshold, it is possible to determine that the pixel is a text edge pixel) or the second region being less than the second threshold. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of HAYASAKI, applying the image processing taught by HAYASAKI to perform foreground mask generation on the input image and generate a foreground mask for the text region. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of HAYASAKI to obtain the invention as specified in the claim.
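Editor's note: the HAYASAKI passages quoted repeatedly above reduce to a single per-pixel test over an n×m neighborhood. A compressed sketch of that test, assuming density values in a NumPy block and summing busyness over horizontal neighbors only (the reference sums density differences between adjacent pixels generally):

```python
import numpy as np

def classify_pixel(block: np.ndarray, max_diff_thresh: float,
                   busyness_thresh: float, text_halftone_thresh: float) -> str:
    # block: n x m (e.g., 15 x 15) density neighborhood around the target pixel.
    max_density_diff = float(block.max() - block.min())
    # Total density busyness: sum of absolute density differences between
    # adjacent pixels (horizontal neighbors only in this sketch).
    busyness = float(np.abs(np.diff(block.astype(float), axis=1)).sum())
    if max_density_diff < max_diff_thresh and busyness < busyness_thresh:
        return "page-background/photograph"
    # Otherwise a text/halftone-dot pixel; busyness that is small relative
    # to the maximum density difference indicates a text edge, not halftone.
    if busyness < max_density_diff * text_halftone_thresh:
        return "text edge"
    return "halftone dot"
```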
Claims 6 and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Kokemohr et al. (U.S. Patent No. 10,049,477 B1) in view of Foroughi (U.S. Patent Application Publication 2019/0266433 A1).

Regarding claim 6, Kokemohr discloses everything claimed as applied above (see claim 1), and Kokemohr discloses wherein the target pattern style of text comprises a target color of text, and the determining the target pattern style of text based on the image to be rendered (see claim 1). However, Kokemohr does not specifically disclose that the determining further comprises: converting the image to be rendered to a Hue, Saturation, Value (HSV) color space; for one or more pixels in the image to be rendered, obtaining hue values of the pixels in the HSV color space; and determining the target color of text based on the hue values of the one or more pixels. In addition, Foroughi discloses converting the image to be rendered to a Hue, Saturation, Value (HSV) color space (FIG. 4; paragraph [0035], the image is converted to a color space that enables separation of luminance and chromaticity. Assume, for example, that the image is provided in RGB format. A transformation may be performed, for example, to the HSV (hue, saturation, value) color space or to an alternative color space such as HSL (hue, saturation, lightness). In HSV color space, each pixel in the image is encoded by a hue component (describing the color), a saturation component (describing the intensity of the color, i.e., how much black, white or gray is added to the color), and a value component (describing the shade or brightness)); for one or more pixels in the image to be rendered (paragraph [0049], FIG. 6, a method for cropping a foreground image ...; paragraph [0054], ... regions with text ...; paragraph [0055], for each row and for each column of pixels in the image), obtaining hue values of the pixels in the HSV color space (paragraph [0055], hue values of HSV-transformed pixels); and determining the target color of text based on the hue values of the one or more pixels (paragraph [0055], in Step 650, a color variance is obtained, separately for each row and for each column of pixels in the image obtained from the execution of Step 606. The color variance may be based on hue values of HSV-transformed pixels). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of Foroughi, applying the method for clustering pixels of an image to obtain image segments taught by Foroughi to convert the input color image into HSV color space, determine hue values of pixels, and obtain a color variance based on hue values of HSV-transformed pixels, enabling the distinction of text (foreground) from non-text (background). Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of Foroughi to obtain the invention as specified in the claim.
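Editor's note: the claim 6 color step (convert to HSV, read hues, derive a text color) can be sketched with the standard library alone. The complementary-hue rule below is a placeholder assumption; the indicated-allowable claims 7 and 22 go further, averaging the hues, building a candidate color set, and selecting from it using saturation/brightness values.

```python
import colorsys

def target_text_color(pixels_rgb: list) -> tuple:
    # Convert each (r, g, b) pixel to HSV and collect hue values (claim 6).
    hues = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0]
            for r, g, b in pixels_rgb]
    avg_hue = sum(hues) / len(hues)      # claims 7/22 start from this average
    # Placeholder rule: take the complementary hue so the text color
    # contrasts with the dominant color of the image.
    r, g, b = colorsys.hsv_to_rgb((avg_hue + 0.5) % 1.0, 0.8, 0.9)
    return tuple(round(c * 255) for c in (r, g, b))

# Example: a predominantly red image yields a cyan-leaning text color.
print(target_text_color([(200, 30, 30), (180, 40, 50)]))
```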
Regarding claim 21, Kokemohr discloses everything claimed as applied above (see claim 14). However, Kokemohr does not specifically disclose wherein the one or more programs further cause the one or more processors to: convert the image to be rendered to a Hue, Saturation, Value (HSV) color space; for one or more pixels in the image to be rendered, obtain hue values of the pixels in the HSV color space; and determine the target color of text based on the hue values of the one or more pixels. In addition, Foroughi discloses wherein the one or more programs further cause the one or more processors to: convert the image to be rendered to a Hue, Saturation, Value (HSV) color space (FIG. 4; paragraph [0035], the image is converted to a color space that enables separation of luminance and chromaticity. Assume, for example, that the image is provided in RGB format. A transformation may be performed, for example, to the HSV (hue, saturation, value) color space or to an alternative color space such as HSL (hue, saturation, lightness). In HSV color space, each pixel in the image is encoded by a hue component (describing the color), a saturation component (describing the intensity of the color, i.e., how much black, white or gray is added to the color), and a value component (describing the shade or brightness)); for one or more pixels in the image to be rendered (paragraph [0049], FIG. 6, a method for cropping a foreground image ...; paragraph [0054], ... regions with text ...; paragraph [0055], for each row and for each column of pixels in the image), obtain hue values of the pixels in the HSV color space (paragraph [0055], hue values of HSV-transformed pixels); and determine the target color of text based on the hue values of the one or more pixels (paragraph [0055], in Step 650, a color variance is obtained, separately for each row and for each column of pixels in the image obtained from the execution of Step 606. The color variance may be based on hue values of HSV-transformed pixels). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of Foroughi, applying the method for clustering pixels of an image to obtain image segments taught by Foroughi to convert the input color image into HSV color space, determine hue values of pixels, and obtain a color variance based on hue values of HSV-transformed pixels, enabling the distinction of text (foreground) from non-text (background). Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of Foroughi to obtain the invention as specified in the claim.

Claims 9 and 10 are rejected under 35 U.S.C. 103 as being unpatentable over Kokemohr et al. (U.S. Patent No. 10,049,477 B1) in view of Dhanuka et al. (U.S. Patent Application Publication 2022/0335667 A1).
Regarding claim 9, Kokemohr discloses everything claimed as applied above (see claim 1). However, Kokemohr does not specifically disclose wherein the attribute information of the text area comprises a width of a bounding box of the text area, and the target character style of text comprises a target character size of text, and the determining the target character style of text based on the attribute information of the text area comprises: determining the target character size of text based on the width of the bounding box and a number of text characters. In addition, Dhanuka discloses wherein the attribute information of the text area comprises a width of a bounding box of the text area (paragraph [0022], font designers, for instance, are free to design glyphs as desired within an em-box. The em-box has a height (e.g., as a number of points) and a width defined by the designer for spacing between inline glyphs ...; paragraph [0028], a bounding box is formed by the glyph sizing system that defines actual dimensions of the glyph within the em-box, e.g., based on minimum and maximum coordinates in X and Y directions of pixels of the glyph when rendered virtually from a corresponding vector representation; paragraph [0065], FIG. 9 depicts an example implementation 900 of a unit-of-measure that is dynamic 508 based on glyphs to be rendered in the user interface 110. This example implementation is depicted using first, second, third, fourth, and fifth stages 902, 904, 906, 908, 910. In some instances, content creators have a predefined area within which to place text and wish to maximize a size of the text within that area, such as a banner 912 ...), and the target character style of text comprises a target character size of text (paragraph [0067], the size determination module 218 automatically adjusts font size internally in real time as the text is entered), and the determining the target character style of text based on the attribute information of the text area comprises: determining the target character size of text based on the width of the bounding box (paragraph [0065], content creators have a predefined area within which to place text and wish to maximize a size of the text within that area, such as a banner 912) and a number of text characters (paragraph [0067], at the fifth stage 910, a lower-case letter “e” is entered ... five letters “a, g, i, l and e” are entered). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of Dhanuka, applying the glyph sizing control techniques taught by Dhanuka to provide bounding box information for the text attribute and determine the target character size of text based on the width of the bounding box and the number of text characters. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of Dhanuka to obtain the invention as specified in the claim.

Regarding claim 10, the combination of Kokemohr in view of Dhanuka discloses everything claimed as applied above (see claim 9). However, Kokemohr does not specifically disclose wherein the determining the target character size of text based on the width of the bounding box and the number of text characters comprises: traversing each character size, starting with a largest character size; determining a text width based on a current character size and the number of text characters; and determining the current character size as the target character size of text in response to the text width being less than or equal to the width of the bounding box. In addition, Dhanuka discloses wherein the determining the target character size of text based on the width of the bounding box and the number of text characters (see claim 9) comprises: traversing each character size, starting with a largest character size (FIG. 2; paragraph [0050], a user interface module 202 outputs a user interface 110. The user interface 110 includes options to initiate an input to specify a glyph-size property 204. The glyph-size property defines a unit-of-measure within an em-box for glyph sizing along a dimension ... Other options from the drop-down menu 304 include, but are not depicted, x-height, ICF-height, dynamic (real time) height, object height, width, and other spans, e.g., based on descent; FIG. 7 shows the em-box, starting with an upper-case letter of “M”); determining a text width (paragraph [0025], for that font and font size, different units of measure are output with indications of corresponding sizes of glyphs for the respective unit-of-measure. Examples of units of measure include width; FIG. 9; paragraphs [0066]-[0067], at the first stage 902 a lower-case letter “a” is entered and the size determination module 218 determines a font size of “54 pt.” ... At the second stage 904, a lower-case letter “g” is entered, which has a descender. In response, the size determination module 218 determines a font size of “24 pt.” ... At the third stage 906, a lower-case letter “i” is entered, which has an ascender. In response, the size determination module 218 determines a font size of “22 pt.” ... At the fourth stage 908, a lower-case letter “l” is entered, which also has an ascender but which is “taller” than the ascender for the letter “i.” In response, the size determination module 218 determines a font size of “18 pt.” ... At the fifth stage 910, a lower-case letter “e” is entered, which does not cause a change, and thus the size determination module 218 keeps a font size of “18 pt.” Thus, the text width is determined based on the size of all fonts) based on a current character size (paragraph [0067], the size determination module 218 automatically adjusts font size internally in real time as the text is entered) and the number of text characters (paragraph [0067], at the fifth stage 910, a lower-case letter “e” is entered ... five letters “a, g, i, l and e” are entered); and determining the current character size as the target character size of text in response to the text width being less than or equal to the width of the bounding box (FIG. 3; paragraph [0051], the user interface 110 includes a glyph size control 306 that indicates a size of a glyph for the selected unit-of-measure and functionality to change that amount, e.g., as an up/down arrow to change a number of points. This also causes a change to an indicated font size of a font size control 308 ... Other examples include a height control 310 and width control 312 to specify height and width as a percentage of an overall value, as well as an alignment control 314 to specify alignment behaviors to a glyph (as opposed to line height), e.g., to align graphical objects to the bounding box). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of Dhanuka, applying the glyph sizing control techniques taught by Dhanuka to provide bounding box information for the text attribute and determine the target character size of text based on the width of the bounding box and the number of text characters. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of Dhanuka to obtain the invention as specified in the claim.
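Editor's note: claims 9-10 recite a greedy fit: walk candidate character sizes from largest to smallest and accept the first whose rendered text width fits the bounding box. In the sketch below, the width model (characters × point size × an average-width factor) and the candidate size list are assumptions; real text measurement is font-dependent.

```python
def target_character_size(box_width_px, num_chars,
                          sizes_pt=(72, 64, 48, 36, 24, 18, 12),
                          avg_char_width=0.6):
    # Traverse each character size, starting with the largest (claim 10).
    for size in sizes_pt:
        text_width = num_chars * size * avg_char_width   # crude width model
        if text_width <= box_width_px:
            # First size whose text width fits the bounding box width wins.
            return size
    return None   # nothing fits; the claims do not recite a fallback

print(target_character_size(box_width_px=400, num_chars=12))   # -> 48
```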
Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Kokemohr et al. (U.S. Patent No. 10,049,477 B1) in view of KANG et al. (U.S. Patent Application Publication 2019/0294652 A1).

Regarding claim 11, Kokemohr discloses everything claimed as applied above (see claim 1), and Kokemohr discloses further comprising: selecting a video frame from a video to be processed as the image to be rendered (Col 5, lines 64-67, an image, as referred to herein, can be a still image or standalone image, or can be an image in a series of images, e.g., a frame in a video sequence of frames); and after rendering the image to be rendered based on the target character style of text and the target pattern style of text (Col 27, lines 21-46, text and style modifications can be generated, added, and edited similarly as described above and applied to video data. In some examples, each (or a portion) of one or more video frames can be treated as an image to which text and style modifications are added. The content depicted in video frames can vary over time, which may cause additional constraints on the text and style modifications. For example, multiple frames of the video data can be examined to determine how long a particular area of the video display remains suitable for text placement, to determine whether to place text in that area and/or the amount of text to place in that area ...). However, Kokemohr does not specifically disclose using the rendered image as a cover of the video to be processed. In addition, KANG discloses using the rendered image as a cover of the video to be processed (FIG. 1; paragraph [0111], when a content such as an image, a video, or a music is inserted into the note, the electronic device 100 may display an associated item (for example, an icon form) indicating that the corresponding content is included, and may provide the item to the user. According to an embodiment, when a video is included in the note, the electronic device 100 may generate a cover image including an associated item (for example, an icon form associated with a video (for example, a film icon)) identifying the video). Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify the method for adding text to an image taught by Kokemohr to incorporate the teachings of KANG, applying the method for providing a cover of a note created by a user in an electronic device taught by KANG to provide cover generation so that the system renders the modified image as a cover of the video. Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the invention to modify Kokemohr according to the relied-upon teachings of KANG to obtain the invention as specified in the claim.

Regarding claim 12, the combination of Kokemohr in view of KANG discloses everything claimed as applied above (see claim 1), and Kokemohr discloses further comprising: after receiving a video uploaded by a user, generating a cover generation instruction in response to detecting the video does not have a cover (Col 27, lines 21-46, text and style modifications can be generated, added, and edited similarly as described above and applied to video data. In some examples, each (or a portion) of one or more video frames can be treated as an image to which text and style modifications are added. The content depicted in video frames can vary over time, which may cause additional constraints on the text and style modifications. For example, multiple frames of the video data can be examined to determine how long a particular area of the video display remains suitable for text placement, to determine whether to place text in that area and/or the amount of text to place in that area ...).
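Editor's note: claims 11-12 wrap the rendering method in a video-cover flow. A hedged sketch of that control flow, reusing the render() placeholder from the claim 1 sketch; the Video fields and the frame-selection policy are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Video:
    title: str
    frames: list = field(default_factory=list)    # decoded frames (e.g., arrays)
    cover: object = None

def ensure_cover(video: Video) -> None:
    # Claim 12: after a user upload, issue a cover-generation step only if
    # the video is detected to have no cover.
    if video.cover is not None:
        return
    # Claim 11: select a frame as the image to be rendered, apply the
    # styling/rendering pipeline, and use the result as the video's cover.
    frame = video.frames[0]                        # placeholder selection policy
    video.cover = render(frame, text=video.title)  # render() as sketched above
```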
Allowable Subject Matter

Claims 7-8 and 22 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.

Dependent claim 7 depends upon dependent claim 6 and recites the additional limitations of “calculating an average hue value of the hue values of the one or more pixels; determining a candidate set of color based on the average hue value; for at least one pixel in the image to be rendered, obtaining at least one of a saturation value or a brightness value of the pixel in the HSV color space; and selecting the target color of text from the candidate set of color based on at least one of saturation values or brightness values of the one or more pixels”. Dependent claim 22 depends upon dependent claim 21 and recites the additional limitations of “calculate an average hue value of the hue values of the one or more pixels; determine a candidate set of color based on the average hue value; for at least one pixel in the image to be rendered, obtain at least one of a saturation value or a brightness value of the pixel in the HSV color space; and select the target color of text from the candidate set of color based on at least one of saturation values or brightness values of the one or more pixels”. However, the search results failed to show the obviousness of the claims as a whole. None of the prior art cited, alone or in combination, provides the motivation to teach the limitations recited in claims 7 and 22. Dependent claim 8 depends from dependent claim 7 and is allowable for the same reasons as stated above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Xilin Guo, whose telephone number is (571) 272-5786. The examiner can normally be reached Monday - Friday, 9:00 AM - 5:30 PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Daniel Hajnik, can be reached at 571-272-7642. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/XILIN GUO/
Primary Examiner, Art Unit 2616

Prosecution Timeline

Apr 24, 2024
Application Filed
Jan 14, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by the same examiner involving similar technology

Patent 12602855: LIVE MODEL PROMPTING AND REAL-TIME OUTPUT OF PHOTOREAL SYNTHETIC CONTENT (granted Apr 14, 2026; 2y 5m to grant)
Patent 12597403: DISPLAY DEVICE FOR A VEHICLE (granted Apr 07, 2026; 2y 5m to grant)
Patent 12579712: ASSET CREATION USING GENERATIVE ARTIFICIAL INTELLIGENCE (granted Mar 17, 2026; 2y 5m to grant)
Patent 12579766: SYSTEM AND METHOD FOR RAPID OUTFIT VISUALIZATION (granted Mar 17, 2026; 2y 5m to grant)
Patent 12573121: Automated Generation and Presentation of Sign Language Avatars for Video Content (granted Mar 10, 2026; 2y 5m to grant)
Study what changed to get past this examiner, based on the 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 82%
With Interview (+17.4%): 99%
Median Time to Grant: 2y 5m
PTA Risk: Low
Based on 456 resolved cases by this examiner. Grant probability derived from career allow rate.
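Editor's note: the headline projections combine directly from the examiner statistics above, assuming (as the displayed figures suggest) that the interview lift is additive in percentage points:

```python
granted, resolved = 374, 456
allow_rate = granted / resolved          # 0.820... -> shown as 82%
interview_lift = 0.174                   # +17.4 points, treated as additive
with_interview = allow_rate + interview_lift
print(f"{allow_rate:.0%} base, {with_interview:.0%} with interview")
# -> "82% base, 99% with interview"
```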
