Prosecution Insights
Last updated: April 19, 2026
Application No. 18/482,878

IMAGE SYNTHESIS METHOD AND SYSTEM

Non-Final OA (§102, §103)
Filed: Oct 07, 2023
Examiner: LEE, JONATHAN S
Art Unit: 2677
Tech Center: 2600 (Communications)
Assignee: Alibaba Damo (Hangzhou) Technology Co., Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 84% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 4m
With Interview: 94%

Examiner Intelligence

Career Allow Rate: 84% (above average; 493 granted / 585 resolved; +22.3% vs TC avg)
Interview Lift: +9.5% (moderate, roughly +10% lift, among resolved cases with interview)
Avg Prosecution: 2y 4m (typical timeline); 19 applications currently pending
Total Applications: 604 (career history, across all art units)

Statute-Specific Performance

§101: 7.8% (-32.2% vs TC avg)
§103: 41.9% (+1.9% vs TC avg)
§102: 28.1% (-11.9% vs TC avg)
§112: 10.3% (-29.7% vs TC avg)
Note: Tech Center averages are estimates. Based on career data from 585 resolved cases.
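The "vs TC avg" deltas above can be sanity-checked with a few lines of arithmetic: each examiner rate equals the implied Tech Center average plus the stated delta, so the averages can be backed out directly. This is an illustrative check using only the numbers on this page; the variable names are ours, not the tool's.

```python
# Back out the implied Tech Center average from each (rate, delta) pair:
# examiner_rate = tc_average + delta  =>  tc_average = examiner_rate - delta.
stats = {                      # statute: (examiner rate %, delta vs TC avg %)
    "101": (7.8, -32.2),
    "103": (41.9, +1.9),
    "102": (28.1, -11.9),
    "112": (10.3, -29.7),
}
tc_avg = {s: rate - delta for s, (rate, delta) in stats.items()}
# Every statute implies the same 40.0% Tech Center baseline,
# so the four deltas are internally consistent.
```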

Office Action

Grounds: §102, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.

Claims 1, 2, 4, 7-16, 18, and 19 are rejected under 35 U.S.C. 102(a)(1) and (a)(2) as being anticipated by Zhang (U.S. Pub. No. 2020/0226729).

Regarding claim 1, Zhang teaches: An image synthesis method (See the Abstract.), comprising: acquiring a first image and a second image, an image quality of the first image being higher than an image quality of the second image (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher…”.); obtaining a compressed image by compressing the first image (See Fig. 10, step 226 and [0312]: “…the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.”); and synthesizing at least a part of the compressed image with at least a part of the second image (See Fig. 10 and [0125]: “At block 227, the second portrait region image and the second target background image are merged to obtain the merged image.”).

Regarding claim 2, Zhang teaches: The method according to claim 1, further comprising: performing statistical analysis on the second image to determine a quality characteristic value of the second image, wherein the compressing the first image comprises: reducing the image quality of the first image based on the quality characteristic value (See [0325]: “In a case that there are several background images having resolution differences from the resolution of the first portrait region image less than the threshold, an image having the smallest resolution difference from the resolution of the first portrait region image may be selected from the several background images as the first target background image.”).
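For readers outside image processing, the claim-1 flow the examiner maps onto Zhang (acquire two images of unequal quality, compress the higher-quality one, then composite) can be sketched in a few lines. This is an editorial illustration only: the images are toy 2-D grayscale lists and the function names are hypothetical, not drawn from the application or from Zhang.

```python
# Sketch of the claim-1 flow: acquire, compress the higher-quality
# image, then synthesize (composite) it with the lower-quality image.

def downscale(img, factor):
    """Nearest-neighbor downscaling: keep every `factor`-th row/column."""
    return [row[::factor] for row in img[::factor]]

def composite(background, foreground, top, left):
    """Paste `foreground` onto a copy of `background` at (top, left)."""
    out = [row[:] for row in background]
    for i, row in enumerate(foreground):
        for j, v in enumerate(row):
            out[top + i][left + j] = v
    return out

first = [[10] * 8 for _ in range(8)]    # "higher-quality" 8x8 image
second = [[200] * 2 for _ in range(2)]  # "lower-quality" 2x2 foreground

compressed = downscale(first, 2)        # 8x8 -> 4x4, closer to the second image
merged = composite(compressed, second, 1, 1)
```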
Regarding claim 4, Zhang teaches: The method according to claim 2, wherein reducing the image quality of the first image based on the quality characteristic value comprises: determining a compression parameter according to the first image and the quality characteristic value; and compressing the first image by using a preset compression algorithm and the compression parameter (See [0317]: “For example, for the first target background image having the resolution of 800×600 PPI, in a case that the number of pixels in the x direction of the first target background image may be reduced by half after the compression operation, in order to avoid that distortion occurs to the obtained second target background image, the number of pixels in the y direction of the first target background image may also be reduced by half after the compression.” The reduction factor (i.e., half) meets the claimed “compression parameter” and the recognition to perform compression on either a single background image or in the case of multiple background images meets the claimed “preset compression algorithm”.), wherein the compression parameter indicates a degree of an image quality decline, the degree being: negatively correlated with the image quality of the first image, and positively correlated with the quality characteristic value (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher, the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.” In the specific case that the background image resolution is higher, the claimed relationship (high quality of the background image means lower degree of quality decline and lower quality of the portrait image means higher degree of quality decline) is present.).
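The correlation structure claim 4 recites (degree of decline negatively correlated with the first image's quality, positively correlated with the quality characteristic value) is easy to express concretely. The ratio form below is one of many functions with those monotonicities; it is an editorial sketch, not the claimed algorithm.

```python
def compression_degree(first_quality, quality_value, k=1.0):
    """Degree of quality decline for the first image.

    Rises as the second image's quality characteristic value rises
    (positive correlation) and falls as the first image's own quality
    rises (negative correlation). The k * v / q form is illustrative.
    """
    return k * quality_value / first_quality
```

Any strictly increasing function of `quality_value` divided by a strictly increasing function of `first_quality` would satisfy the same recited relationship.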
Regarding claim 7, Zhang teaches: The method according to claim 1, wherein the first and second images are: a preset background image, and a foreground video frame sent by a client participating in a video conference, respectively; or video frames sent by two clients participating in a video conference, respectively; or video frames sent by two host clients participating in a microphone-connected live broadcast (See [0136]: “In some application scenarios, for example, it is desired to hide a current background while the current user is in a video chat with another, with the image processing method according to implementations of the present disclosure, the portrait region image corresponding to the current user and the target background image having the resolution that matches the resolution of the portrait region image may be merged, and the merged image is displayed to the another.”).

Regarding claim 8, Zhang teaches: An image synthesis method (See the Abstract.), comprising: acquiring a first image and a second image, an image quality of the first image being higher than an image quality of the second image (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher…”.); determining a processing parameter for reducing the image quality of the first image (See [0317]: “For example, for the first target background image having the resolution of 800×600 PPI, in a case that the number of pixels in the x direction of the first target background image may be reduced by half after the compression operation, in order to avoid that distortion occurs to the obtained second target background image, the number of pixels in the y direction of the first target background image may also be reduced by half after the compression.” The reduction factor (i.e., half) meets the claimed “processing parameter”.), and processing the first image by using the processing parameter (See Fig. 10, step 226 and [0312]: “…the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.”); and synthesizing the processed first image with the second image (See Fig. 10 and [0125]: “At block 227, the second portrait region image and the second target background image are merged to obtain the merged image.”).

Zhang teaches claim 9 for the reasons given in the treatment of claim 2.

Regarding claim 10, Zhang teaches: The method according to claim 8, wherein the processing parameter comprises a blur parameter, and the processing of the first image by using the processing parameter comprises: performing blur processing on the first image by using the blur parameter (See [0320]: “It should be noted that, in embodiments of the present disclosure, the second portrait region image may be merged with the second target background image after edges of the second portrait region image are feathered, such that the edges of the portrait region image may be smoothly and naturally transited to the second target background image, presenting a better visual effect of the merged image.” Use of a processing parameter/value to perform blurring is understood.).
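The feathering Zhang describes at [0320] (softening mask edges so the portrait transitions smoothly into the background) is conventionally done with a small blur over the matte. The box blur below is a minimal sketch of that idea, operating on a toy 2-D mask; the radius plays the role of a "blur parameter". It is illustrative only and not Zhang's or the applicant's actual implementation.

```python
def box_blur(mask, radius=1):
    """Box blur over a 2-D mask; `radius` acts as the blur parameter.

    Interior regions keep their value; hard 0/1 edges become a soft
    ramp, which is the essence of feathering a compositing matte.
    """
    h, w = len(mask), len(mask[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [mask[y][x]
                      for y in range(max(0, i - radius), min(h, i + radius + 1))
                      for x in range(max(0, j - radius), min(w, j + radius + 1))]
            out[i][j] = sum(window) / len(window)
    return out

# Hard vertical edge: columns 0-1 are foreground (1), the rest background (0).
mask = [[1, 1, 0, 0, 0] for _ in range(5)]
feathered = box_blur(mask, radius=1)
```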
Regarding claim 11, Zhang teaches: The method according to claim 8, wherein the processing parameter comprises a compression parameter, and the processing of the first image by using the processing parameter comprises: compressing the first image by using the compression parameter (See [0317]: “For example, for the first target background image having the resolution of 800×600 PPI, in a case that the number of pixels in the x direction of the first target background image may be reduced by half after the compression operation, in order to avoid that distortion occurs to the obtained second target background image, the number of pixels in the y direction of the first target background image may also be reduced by half after the compression.” The reduction factor (i.e., half) meets the claimed “compression parameter”.).

Regarding claim 12, Zhang teaches: A video conference method (See the Abstract.), comprising: receiving a foreground video frame sent by a client participating in a video conference, and acquiring a background image synthesized with the foreground video frame (See Fig. 10, steps 202, 224, and 225 and [0136]: “In some application scenarios, for example, it is desired to hide a current background while the current user is in a video chat with another, with the image processing method according to implementations of the present disclosure, the portrait region image corresponding to the current user and the target background image having the resolution that matches the resolution of the portrait region image may be merged, and the merged image is displayed to the another.”), an image quality of the foreground video frame being lower than an image quality of the background image (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher…”.); compressing the background image to obtain a compressed background image (See Fig. 10, step 226 and [0312]: “…the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.”); synthesizing the compressed background image with the foreground video frame to obtain a synthesized video frame (See Fig. 10 and [0125]: “At block 227, the second portrait region image and the second target background image are merged to obtain the merged image.”); and transmitting the synthesized video frame to the client, the synthesized video frame being used for display by the client (See [0136]: “In some application scenarios, for example, it is desired to hide a current background while the current user is in a video chat with another, with the image processing method according to implementations of the present disclosure, the portrait region image corresponding to the current user and the target background image having the resolution that matches the resolution of the portrait region image may be merged, and the merged image is displayed to the another.”).

Regarding claim 13, Zhang teaches: A video conference method (See the Abstract.), comprising: receiving video frames sent by at least two clients participating in a video conference (See Fig. 10, steps 202, 224, and 225 and [0136]: “In some application scenarios, for example, it is desired to hide a current background while the current user is in a video chat with another, with the image processing method according to implementations of the present disclosure, the portrait region image corresponding to the current user and the target background image having the resolution that matches the resolution of the portrait region image may be merged, and the merged image is displayed to the another.”); the video frames comprising a first video frame and a second video frame (See the background image and portrait image in Fig. 10.), a video frame quality of the first video frame being higher than a video frame quality of the second video frame (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher…”.); compressing the first video frame to obtain a compressed first video frame (See Fig. 10, step 226 and [0312]: “…the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.”); synthesizing the compressed first video frame with the second video frame, to generate a synthesized video frame (See Fig. 10 and [0125]: “At block 227, the second portrait region image and the second target background image are merged to obtain the merged image.”); and transmitting the synthesized video frame to the at least two clients, the synthesized video frame being used for display by the at least two clients (See [0136]: “In some application scenarios, for example, it is desired to hide a current background while the current user is in a video chat with another, with the image processing method according to implementations of the present disclosure, the portrait region image corresponding to the current user and the target background image having the resolution that matches the resolution of the portrait region image may be merged, and the merged image is displayed to the another.”).

Regarding claim 14, Zhang teaches: An electronic device (See the Abstract.), comprising: a memory storing a set of instructions (See [0360].); and one or more processors configured to execute the set of instructions to cause the device to perform (See [0360].): acquiring a first image and a second image (See the background image and portrait image in Fig. 10.), an image quality of the first image being higher than an image quality of the second image (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher…”.); obtaining a compressed image by compressing the first image (See Fig. 10, step 226 and [0312]: “…the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.”); and synthesizing at least a part of the compressed image with at least a part of the second image (See Fig. 10 and [0125]: “At block 227, the second portrait region image and the second target background image are merged to obtain the merged image.”).

Regarding claim 15, Zhang teaches: A non-transitory computer readable medium storing a set of instructions that is executable by one or more processors of an apparatus to cause the apparatus to execute an image synthesis method (See the Abstract and [0360].), the method comprising: acquiring a first image and a second image (See the background image and portrait image in Fig. 10.), an image quality of the first image being higher than an image quality of the second image (See [0312]: “For the first portrait region image and the first target background image, in a case that the [resolution] of the first target background image is higher…”.); obtaining a compressed image by compressing the first image (See Fig. 10, step 226 and [0312]: “…the compression operation may be performed on the first target background image to reduce the resolution of the first target background image.”); and synthesizing at least a part of the compressed image with at least a part of the second image (See Fig. 10 and [0125]: “At block 227, the second portrait region image and the second target background image are merged to obtain the merged image.”).

Zhang teaches claim 16 for the reasons given in the treatment of claim 2.
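Claims 2 and 16 recite "statistical analysis" of the second image to obtain a quality characteristic value, and claim 3 (treated below under §103) narrows this to an average, median, or maximum over per-pixel loss entries. A minimal sketch of collapsing such a per-pixel quality matrix into those scalar statistics, with hypothetical names and toy data:

```python
import statistics

def quality_characteristics(quality_matrix):
    """Collapse a per-pixel loss matrix into scalar characteristic values.

    Each matrix entry indicates a degree of loss for one pixel; the
    claimed statistical values are simple reductions over the entries.
    """
    entries = [e for row in quality_matrix for e in row]
    return {
        "average": statistics.mean(entries),
        "median": statistics.median(entries),
        "maximum": max(entries),
    }

q = quality_characteristics([[0.1, 0.2],
                             [0.3, 0.6]])
```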
Zhang teaches claim 18 for the reasons given in the treatment of claim 4. Zhang teaches claim 19 for the reasons given in the treatment of claim 5.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 5, 6, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (U.S. Pub. No. 2020/0226729) in view of Son et al. (Toward Real-World Super-Resolution via Adaptive Downsampling Models, 8 September 2021, arXiv, Pages 1-19), hereinafter “Son”.
Claim 5 is met by the combination of Zhang and Son, wherein Zhang teaches: The method according to claim 4, wherein determining the compression parameter according to the first image and the quality characteristic value comprises: Zhang does not disclose the following; however, Son teaches: processing the first image and the quality characteristic value based on a pre-trained compression parameter estimation model, to obtain the compression parameter (See the two-stage model (including a downsampling model D) in Fig. 2 and page 3, left column: “Thus, we develop an unsupervised learning framework to accurately simulate the LR samples ILR ∈ ILR from the unpaired HR images IHR ∈ IHR. The following SR model can then be trained to reconstruct the HR results from the given LR dataset ILR. For simplicity, we assume that the LR and HR images have spatial resolutions of H × W and sH × sW, respectively, for a downsampling factor s.” As seen on page 4, right column and page 7, left column, the filter weights and scale factors are determined to optimize the model. A downsampling factor s, which meets the claimed “compression parameter”, is therefore obtained based on a pre-trained compression parameter estimation model.), wherein the compression parameter estimation model is obtained through supervised learning using a training set comprising one or more training samples (See the trainings performed in Fig. 2 using low and high resolution samples.), each training sample comprising: a training image, a compression parameter label (See page 3, right column: “Thus, we introduce the adversarial loss Ladv to enforce the downsampled images IDown to follow a target distribution ILR without using ground-truth LR images.”), and a quality characteristic value determined according to the training image and the compression parameter label (See Eq. 3 on page 3.).

Zhang and Son together teach the limitations of claim 5.
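The supervised-learning element of claim 5 (a model trained on labeled samples to map image/quality features to a compression parameter) can be illustrated with the simplest possible estimator: a closed-form least-squares fit from a quality characteristic value to a labeled downsampling factor. This is an editorial sketch of the claim's structure, not Son's adversarial framework, and all data and names are hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b, in closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Hypothetical training set: (quality characteristic value, labeled factor).
train = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
a, b = fit_linear([x for x, _ in train], [y for _, y in train])

# Apply the "pre-trained" estimator to an unseen quality value.
predicted_factor = a * 2.5 + b
```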
Son is directed to a similar field of art (adaptive downsampling models for super-resolution). Therefore, Zhang and Son are combinable. Modifying the system and method of Zhang by adding the capability of “processing the first image and the quality characteristic value based on a pre-trained compression parameter estimation model, to obtain the compression parameter, wherein the compression parameter estimation model is obtained through supervised learning using a training set comprising one or more training samples, each training sample comprising: a training image, a compression parameter label, and a quality characteristic value determined according to the training image and the compression parameter label”, as taught by Son, would yield the expected and predictable result of improved estimation of a lower-resolution image using the downsampling model. Therefore, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Zhang and Son in this way.

Claim 6 is met by the combination of Zhang and Son, wherein Zhang teaches: The method according to claim 1, further comprising: Zhang does not disclose the following; however, Son teaches: determining the image qualities of the first and second images, based on a pre-trained image quality evaluation model, wherein the image quality evaluation model is pre-trained through supervised learning using a training set comprising one or more training samples, each training sample comprising: a distorted image, obtained by reducing an image quality of a training image (See the two-stage model (including a downsampling model D) in Fig. 2 and page 3, left column: “Thus, we develop an unsupervised learning framework to accurately simulate the LR samples ILR ∈ ILR from the unpaired HR images IHR ∈ IHR. The following SR model can then be trained to reconstruct the HR results from the given LR dataset ILR. For simplicity, we assume that the LR and HR images have spatial resolutions of H × W and sH × sW, respectively, for a downsampling factor s.”); and a quality matrix label, determined based on a difference between the training image and the distorted image (See Eq. 3 on page 3.). See the motivation to combine in the treatment of claim 5.

Claim 19 is met by the combination of Zhang and Son for the reasons given in the treatment of claim 5. Claim 20 is met by the combination of Zhang and Son for the reasons given in the treatment of claim 6.

Claims 3 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Zhang (U.S. Pub. No. 2020/0226729) in a single-reference rejection.

Claim 3 is met by Zhang, wherein Zhang teaches: The method according to claim 2, wherein: the image quality of the second image is represented by a quality matrix that comprises a plurality of entries respectively corresponding to a plurality of pixels of the second image, each of the plurality of entries indicating a degree of loss of a corresponding pixel value (See the correction matrix in [0187]-[0188].); and Zhang suggests the following under certain conditions: the quality characteristic value of the second image comprises: a statistical value of the plurality of entries, the statistical value comprising at least one of an average value, a median, or a maximum value (First see [0321]: “In some implementations, in a case that the first target background image is selected by the processor 20, in order to prevent the processing amount of the processor 20 from being too high, an image having a small resolution difference from the resolution of the first portrait region image may be selected from multiple background images to reduce a processing pressure of the processor 20.” Then see [0325]: “an image having the smallest resolution difference from the resolution of the first portrait region image may be selected from the several background images as the first target background image.” A minimum value is disclosed. However, to find that minimum value, Zhang computes the resolution differences between the several available background images and the portrait image. A maximum value is therefore computed but not selected to reduce “processing pressure of the processor”. In the case that reducing that processing pressure is not an issue, one of ordinary skill in the art would select a background image having the maximum value for other reasons, e.g., that background image having otherwise high quality in terms of even illumination.).

Zhang meets the limitations of claim 3. Since Zhang discloses computing resolution differences between the portrait image and each of multiple background images, Zhang provides enough information to enable one of ordinary skill in the art to determine the claimed “maximum value”. One of ordinary skill in the art would then select the “maximum value” in a scenario in which he/she is not limited by processing power and arrive at the claimed invention.

Claim 17 is met by Zhang for the reasons given in the treatment of claim 3.

Contact

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JONATHAN S LEE whose telephone number is (571)272-1981. The examiner can normally be reached 11:30 AM - 7:30 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee, can be reached at (571)270-5183. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center.
Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Jonathan S Lee/
Primary Examiner, Art Unit 2677

Prosecution Timeline

Oct 07, 2023
Application Filed
Jan 09, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602807: METHOD FOR SUBPIXEL DISPARITY CALCULATION (granted Apr 14, 2026; 2y 5m to grant)
Patent 12602785: TRAINING A MACHINE LEARNING MODEL TO ASSESS EMBRYO CHARACTERISTICS FROM VIDEO IMAGE DATA (granted Apr 14, 2026; 2y 5m to grant)
Patent 12597108: METHOD AND APPARATUS TO PERFORM A WIRELINE CABLE INSPECTION (granted Apr 07, 2026; 2y 5m to grant)
Patent 12597110: IMAGE RECOGNITION METHOD, APPARATUS AND DEVICE (granted Apr 07, 2026; 2y 5m to grant)
Patent 12584727: DIMENSION MEASUREMENT METHOD AND DIMENSION MEASUREMENT DEVICE (granted Mar 24, 2026; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 84% (94% with interview, +9.5%)
Median Time to Grant: 2y 4m
PTA Risk: Low
Based on 585 resolved cases by this examiner. Grant probability derived from career allow rate.
