Prosecution Insights
Last updated: April 19, 2026
Application No. 18/004,578

Videoconferencing Systems with Facial Image Rectification

Final Rejection (§103, §112)

Filed: Jan 06, 2023
Examiner: PHAM, NHUT HUY
Art Unit: 2674
Tech Center: 2600 — Communications
Assignee: HP (Chongqing) Co., Ltd.
OA Round: 3 (Final)

Grant Probability: 79% (Favorable)
OA Rounds: 4-5
To Grant: 3y 0m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 79% (above average): 42 granted / 53 resolved, +17.2% vs TC avg
Interview Lift: +26.8% (strong): resolved cases with vs. without interview
Typical Timeline: 3y 0m avg prosecution, 31 currently pending
Career History: 84 total applications across all art units
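
For transparency, the short sketch below recomputes the headline figures from raw counts. Only the 42 granted / 53 resolved numbers come from this page; the Tech Center average and the with/without-interview split are hypothetical placeholders used to illustrate the formulas, so the computed lift will not exactly match the page's +26.8%.

```python
# Recomputing the dashboard's headline examiner metrics from raw counts.
# Only 42 granted / 53 resolved come from this page; the TC average and
# the interview split below are HYPOTHETICAL placeholders.
granted, resolved = 42, 53
allow_rate = granted / resolved                    # 0.7925 -> shown as "79%"

tc_avg = 0.62                                      # assumed; consistent with +17.2%
print(f"vs TC avg: {allow_rate - tc_avg:+.1%}")    # -> +17.2%

# Interview lift = allow rate among resolved cases WITH an interview
# minus the allow rate among resolved cases WITHOUT one.
with_granted, with_total = 17, 18                  # hypothetical split
without_granted, without_total = 25, 35            # hypothetical split (totals 53)
lift = with_granted / with_total - without_granted / without_total
print(f"interview lift: {lift:+.1%}")              # -> +23.0% with these placeholders
```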

Statute-Specific Performance

§101: 9.4% (-30.6% vs TC avg)
§103: 62.2% (+22.2% vs TC avg)
§102: 11.9% (-28.1% vs TC avg)
§112: 14.5% (-25.5% vs TC avg)

Tech Center averages are estimates • Based on career data from 53 resolved cases

Office Action

§103 §112
DETAILED ACTION

The response filed on 11/10/2025 in the current application is acknowledged. The United States Patent & Trademark Office has reviewed the documents submitted and provides the comments below. Applicant submitted amendments on 11/10/2025. The Examiner acknowledges the amendment and has reviewed the claims accordingly.

Applicant's Arguments: Applicant states that the cited prior art does not teach the amended claims, specifically the limitation "receiving the second image frame at a neural processing unit; receiving the composite image frame at the neural processing unit; and forming a rectified image frame using the neural processing unit, based on the second image frame and the composite image frame."; therefore, the rejection under 35 U.S.C. 103 should be withdrawn.

Examiner's Response: Applicant's arguments and amendments, see Remarks, filed 11/10/2025, with respect to the rejection(s) of claim(s) 2 and 13 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration of the amendments, a new ground(s) of rejection is made in view of Wang et al. (US-20190122329-A1) in view of Liu et al. (Liu, Zhi-Song, Wan-Chi Siu, and Yui-Lam Chan. "Reference based face super-resolution." IEEE, published 2019, hereinafter Liu).

Claim Status

Claims 7-9 and 18-19 are rejected under 35 U.S.C. 112(b).
Claims 25-27 are rejected under 35 U.S.C. 112(a).
Claims 2-5, 7-9, 13-16 and 18-27 are rejected under 35 U.S.C. 103:
- Claims 2, 4-5, 8 and 23-27 are rejected over Wang in view of Liu.
- Claim 3 is rejected over Wang in view of Liu, further in view of WangShi.
- Claim 9 is rejected over Wang in view of Liu, further in view of Fang.
- Claims 7, 13, 15-16 and 18-22 are rejected over Wang in view of Liu, further in view of Kabata.
- Claim 14 is rejected over Wang in view of Liu and Kabata, further in view of WangShi.
Claim 10 is objected to.

Claim Rejections - 35 USC § 112

The following is a quotation of the first paragraph of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 25-27 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Claim 25 recites "… replacing a first polygon of the plurality of polygons with a second polygon of the first plurality of polygons…". This describes taking a polygon from the reference image and swapping it with another polygon in the reference image. However, the specification consistently describes transferring data from the first set of polygons to the second set of polygons. Paragraph [0039]: "At step 620, the system 300 maps 621 pixel information from polygons 610 of region 608 of the reference image frame 604 [first image] to corresponding polygons 618 determined from region 616 [second image]." Paragraph [0040]: "The system 300 replaces image data in some or all of the polygons in region 616 of the received image frame 614 [second image] with translated image data from region 608 of the reference image frame 604 [first image] …"

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 7-9 and 18-19 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention. The Examiner strongly suggests that appropriate corrections be made to clarify the claim scope.

With respect to Claim 7, the claim recites "The method of claim 6" on line 1, which renders the claim indefinite (it is unclear to what this refers, since claim 6 is cancelled; for purposes of examination, this is interpreted as "The method of claim 2").

With respect to Claim 8, the claim recites "The method of claim 6" on line 1, which renders the claim indefinite (it is unclear to what this refers, since claim 6 is cancelled; for purposes of examination, this is interpreted as "The method of claim 2").

With respect to Claim 9, the claim recites "The method of claim 6" on line 1, which renders the claim indefinite (it is unclear to what this refers, since claim 6 is cancelled; for purposes of examination, this is interpreted as "The method of claim 2").

With respect to Claim 18, the claim recites "The videoconferencing system of claim 17" on line 1, which renders the claim indefinite (it is unclear to what this refers, since claim 17 is cancelled; for purposes of examination, this is interpreted as "The videoconferencing system of claim 13").

With respect to Claim 19, the claim recites "The videoconferencing system of claim 17" on line 1, which renders the claim indefinite (it is unclear to what this refers, since claim 17 is cancelled; for purposes of examination, this is interpreted as "The videoconferencing system of claim 13").

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2, 4-5, 8 and 23-27 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al. (US-20190122329-A1, published 04/25/2019, hereinafter Wang) in view of Liu et al. (Liu, Zhi-Song, Wan-Chi Siu, and Yui-Lam Chan. "Reference based face super-resolution." IEEE, published 2019, hereinafter Liu).

CLAIM 2

In regards to Claim 2, Wang teaches a method comprising:

receiving a first image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the source face"; see FIG. 2 annotation (1). The Examiner notes a single frame from the target video, which depicts a face, is input into the model);

determining locations of first feature landmarks corresponding to the first image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks"; see FIG. 2 annotation (2));

determining a first region corresponding to the first image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the first feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the first region into a first plurality of polygons based on the locations of the first feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

receiving a second image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … target face"; see FIG. 2 annotation (3). The Examiner notes a single frame from the source video (different from the target video), which also depicts a face, is input into the model);

determining locations of second feature landmarks corresponding to the second image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks"; see FIG. 2 annotation (4));

determining a second region corresponding to the second image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the second feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the second region into a second plurality of polygons based on the locations of the second feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

translating image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons (Wang, ¶ [0034-0035], see FIG. 3: the triangulated facial region of the source image is warped based on the triangulated facial region of the target image); and

forming a composite image frame (Wang, ¶ [0037]: "Image blending results in a blend of the warped source face into the target face, and is needed to produce natural and realistic face replacement results", see FIG. 5) by replacing image data of at least one polygon in the second plurality of polygons with translated image data from the one or more polygons of the first plurality of polygons (Wang, ¶ [0037]: "The system can automatically select the facial ROI, according to the detected facial landmarks. After getting the facial ROI in source image and target image, a Poisson Image Editing technique or other picture compositing method can be used to seamlessly blend the source face into target face");

receiving the first image frame at a neural processing unit (Wang, ¶ [0008-0009]: "cascade multichannel convolutional neural network (CMC-CNN) … a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the source face"); and

receiving the composite image frame at the neural processing unit (Wang, ¶ [0008-0009]: "cascade multichannel convolutional neural network (CMC-CNN) … a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the target face").

Wang does not explicitly disclose forming a rectified image frame using the neural processing unit, based on the first image frame and the composite image frame.

Liu is in the same field of art of image synthesis using a neural network. Further, Liu teaches forming a rectified image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution". Liu teaches how to enhance a low-quality input image with a reference image and generate a super-resolution (SR) image. The network uses a rectified linear unit (ReLU) as the activation function; thus, this reads on "forming a rectified image frame") (Liu, page 129116, right col., first paragraph: "the decoder can sample from the learned generative model of P(z|R, X) to generate SR images.") using the neural processing unit (Liu, page 129115, section III, The Proposed Work: "we will introduce the proposed RefSR-VAE", see FIG. 4), based on the first image frame (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images"; see the input LR (low-resolution) image in FIG. 4) and the composite image frame (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images"; see the reference image in FIG. 4).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Wang by incorporating the method for image enhancement taught by Liu, to make an image processing system that is able to enhance image quality; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need for a post-processing method to improve image quality in an image processing system that involves an input image and a reference image (Liu, abstract: "Despite the great progress of image super-resolution in recent years, face super-resolution has still much room to explore good visual quality while preserving original facial attributes for larger up-scaling factors. This paper investigates a new research direction in face super-resolution, called Reference based face Super-Resolution (RefSR), in which a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution"). Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
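
As context for the triangulation and warping steps mapped above, the sketch below shows one conventional way to implement a landmark-driven Delaunay partition and per-triangle transfer. It is illustrative only, not code from Wang, Liu, or the application; NumPy, SciPy, and OpenCV APIs are assumed.

```python
# Illustrative sketch (not code from Wang, Liu, or the application):
# landmark-driven Delaunay partitioning plus a per-triangle affine
# transfer of image data between corresponding polygons.
import numpy as np
import cv2
from scipy.spatial import Delaunay

def partition(landmarks: np.ndarray) -> np.ndarray:
    """Partition a region into triangles from (n, 2) landmark locations.
    Reusing the returned vertex indices with both frames' landmark sets
    gives equal numbers of corresponding triangles."""
    return Delaunay(landmarks).simplices          # (n_triangles, 3) indices

def transfer(src_img, src_pts, dst_img, dst_pts, triangles):
    """Replace polygon data in dst_img with warped data from src_img."""
    out = dst_img.copy()
    h, w = out.shape[:2]
    for tri in triangles:
        s, d = np.float32(src_pts[tri]), np.float32(dst_pts[tri])
        m = cv2.getAffineTransform(s, d)          # one affine map per triangle pair
        warped = cv2.warpAffine(src_img, m, (w, h))
        mask = np.zeros((h, w), dtype=np.uint8)
        cv2.fillConvexPoly(mask, d.astype(np.int32), 255)
        out[mask > 0] = warped[mask > 0]          # swap in the translated data
    return out
```

A production pipeline would typically follow the hard per-triangle composite with seam blending, for example Poisson image editing as Wang's ¶ [0037] contemplates.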
CLAIM 4

In regards to Claim 4, the combination of Wang and Liu teaches the method of Claim 2. In addition, the combination of Wang and Liu teaches:

determining locations of first feature landmarks corresponding to the first image frame comprises determining locations of first facial feature landmarks corresponding to the first image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining the first region corresponding to the first image frame, based on the locations of the first feature landmarks comprises determining a first facial region corresponding to the first image frame, based on the locations of the first facial feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created … utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks");

partitioning the first region into the first plurality of polygons based on the locations of the first feature landmarks comprises partitioning the first facial region into the first plurality of polygons based on the locations of the first facial feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

determining locations of second feature landmarks corresponding to the second image frame comprises determining locations of second facial feature landmarks corresponding to the second image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining the second region corresponding to the second image frame, based on the locations of the second feature landmarks comprises determining a second facial region corresponding to the second image frame, based on the locations of the second facial feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created … utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks");

partitioning the second region into a second plurality of polygons based on the locations of the second feature landmarks comprises partitioning the second facial region into the second plurality of polygons based on the locations of the second facial feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3); and

translating image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons comprises mapping facial image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons (Wang, ¶ [0034-0035], see FIG. 3: the triangulated facial region of the source image is warped based on the triangulated facial region of the target image).

CLAIM 5

In regards to Claim 5, the combination of Wang and Liu teaches the method of Claim 4. In addition, the combination of Wang and Liu teaches the first image depicting a person (Wang, see face of target video in FIG. 2); and the second image frame (Wang, ¶ [0008-0009]: "source face", see FIG. 2) within a data stream initiated at a remote videoconferencing system (Wang, ¶ [0002]: "online video chats", ¶ [0018]: "a remote computer system"), the second image frame depicting the person (Wang, see face of source video in FIG. 2); the first image being of a first quality, the second image being of a second quality, wherein the second quality is inferior to the first quality (Liu, Abstract: "a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution". Liu teaches how to enhance a LR (low-resolution) input image with a reference image; the reference image contains detailed visual attributes that can be transferred to the LR image).

CLAIM 8

In regards to Claim 8, the combination of Wang and Liu teaches the method of Claim 2.
In addition, the combination of Wang and Liu teaches:

receiving a third image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the source face"), the third image frame corresponding to the rectified image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution");

determining locations of third feature landmarks corresponding to the third image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining a third region corresponding to the third image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the third feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the third region into a third plurality of polygons based on the locations of the third feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

receiving a fourth image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … target face");

determining locations of fourth feature landmarks corresponding to the fourth image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining a fourth region corresponding to the fourth image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the fourth feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the fourth region into a fourth plurality of polygons based on the locations of the fourth feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

translating image data of one or more polygons of the third plurality of polygons to one or more polygons of the fourth plurality of polygons (Wang, ¶ [0034-0035], see FIG. 3: the triangulated facial region of the source image is warped based on the triangulated facial region of the target image); and

forming a composite image frame (Wang, ¶ [0037]: "Image blending results in a blend of the warped source face into the target face, and is needed to produce natural and realistic face replacement results", see FIG. 5) by replacing image data of at least one polygon in the fourth plurality of polygons with translated image data from the one or more polygons of the third plurality of polygons (Wang, ¶ [0037]: "The system can automatically select the facial ROI, according to the detected facial landmarks. After getting the facial ROI in source image and target image, a Poisson Image Editing technique or other picture compositing method can be used to seamlessly blend the source face into target face").

CLAIM 23

In regards to Claim 23, the combination of Wang and Liu teaches the method of Claim 2. In addition, the combination of Wang and Liu teaches the neural processing unit employs circuitry optimized for matrix operations (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution". The Examiner notes convolution is fundamentally a form of matrix multiplication).

CLAIM 24

In regards to Claim 24, the combination of Wang and Liu teaches the method of Claim 2. In addition, the combination of Wang and Liu teaches one or more neural networks perform convolutional-type operations on the first image frame and the composite image frame to produce the rectified image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution").
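
The Examiner's premise in Claim 23 (relied on again for Claim 21 below), that convolution is fundamentally a form of matrix multiplication, can be verified in a few lines. The im2col construction below is illustrative only and is not taken from any cited reference.

```python
# Illustrative only (not from any cited reference): convolution expressed
# as a single matrix product via im2col. Stride 1, no padding, single
# channel, cross-correlation orientation (as CNN layers compute it).
import numpy as np

def conv2d_as_matmul(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    # im2col: unroll every kh-by-kw patch into one row of a matrix
    cols = np.stack([image[i:i + kh, j:j + kw].ravel()
                     for i in range(oh) for j in range(ow)])
    # one matrix-vector product evaluates every output pixel at once
    return (cols @ kernel.ravel()).reshape(oh, ow)

img = np.arange(16, dtype=float).reshape(4, 4)      # a simple ramp image
k = np.array([[1.0, 0.0], [0.0, -1.0]])
assert np.allclose(conv2d_as_matmul(img, k), -5.0)  # img[i,j] - img[i+1,j+1]
```

This is why hardware with circuitry optimized for matrix operations accelerates convolutional networks: each convolutional layer reduces to (batched) matrix multiplication.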
CLAIM 25

Regarding Claim 25, Wang teaches a method (Wang, ¶ [0006]: "a face replacement method for replacing a target face with a source face includes the steps of determining facial landmarks in both the target and the source face using a cascade multichannel convolutional neural network (CMC-CNN). Next, the source face is warped using its determined facial landmarks to match the determined facial landmarks of the target face". Wang teaches a method of transferring visual attributes between images).

Wang does not explicitly disclose receiving a first image of a person having a first quality, and receiving a second image of the person, the second image having a second quality inferior to the first quality.

Liu is in the same field of art of modifying the facial region of one image based on a reference image. Further, Liu teaches receiving a first image of a person having a first quality and receiving a second image of the person (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images", see FIG. 13; both the LR image and the reference image depict the same person), the second image having a second quality inferior to the first quality (Liu, Abstract: "a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution". Liu teaches an image enhancement method based on transferring visual attributes between images).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Wang by incorporating a source image and a target image of the same person as taught by Liu, to make a face replacement method that can enhance a low-quality input image using visual attributes of a high-quality reference image; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need for applying methods of transferring visual attributes between images in an image quality enhancement system (Liu, abstract: "a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution. We focus on transferring the key information extracted from reference facial images to the super-resolution process to guarantee the content similarity between the reference and super-resolution image").

The combination of Wang and Liu teaches:

determining a first facial region of the person of the first image (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on determining locations of facial landmarks in the first image (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the first facial region into a first plurality of polygons (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

determining a second facial region of the person of the second image (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on determining locations of facial landmarks in the second image (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the second facial region into a second plurality of polygons (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

replacing a first polygon of the first plurality of polygons with a second polygon of the first plurality of polygons to generate a revised second facial region, the first polygon corresponding to the second polygon (Wang, ¶ [0034-0036]: "a Delaunay triangulation, which follows the max-min criterion can be constructed to maximize the minimum angles in all triangles. Next, a linear interpolation between two triangles is made. For instance, [(X1,Y1),(x1,y1)], [(X2,Y2),(x2,y2)] and [(X3,Y3),(x3,y3)] are three corresponding control points' coordinates, for which a linear interpolation function X=f(x,y) and Y=g(x,y) that overlays the triangles can be provided", see FIG. 3. Wang teaches replacing a first triangle of a facial region with a corresponding second triangle, which warps the facial region into a warped facial region); and

forming a composite image frame (Wang, ¶ [0037]: "Image blending results in a blend of the warped source face into the target face, and is needed to produce natural and realistic face replacement results", see FIG. 5) by replacing the second facial region of the second image with the revised second facial region (Wang, ¶ [0037]: "The system can automatically select the facial ROI, according to the detected facial landmarks. After getting the facial ROI in source image and target image, a Poisson Image Editing technique or other picture compositing method can be used to seamlessly blend the source face into target face". Wang teaches further modifying the warped facial region of the source image by blending the source face into the target face).

Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

CLAIM 26

Regarding Claim 26, the combination of Wang and Liu teaches the method of Claim 25. In addition, the combination of Wang and Liu teaches determining the first facial region of the person and determining the second facial region of the person (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks") rely on the same facial landmarks (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images", see FIG. 13; both the LR image and the reference image depict the same person) (Wang, ¶ [0024-0025]: "The processing module detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) model based on deep learning in both …") (The Examiner notes that facial landmarks detected in two different images of a person are the same landmarks).

CLAIM 27

Regarding Claim 27, the combination of Wang and Liu teaches the method of Claim 25. In addition, the combination of Wang and Liu teaches providing, to one or more neural networks (Liu, page 129115, section III, The Proposed Work: "we will introduce the proposed RefSR-VAE", see FIG. 4), the composite image and the second image (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images") to output a rectified image (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution". Liu teaches how to enhance a low-quality input image with a reference image and generate a super-resolution (SR) image. The network uses a rectified linear unit (ReLU) as the activation function; thus, this reads on "forming a rectified image frame") (Liu, page 129116, right col., first paragraph: "the decoder can sample from the learned generative model of P(z|R, X) to generate SR images.").

CLAIM 3

Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Liu, and further in view of WangShi et al. (CN109410119A, published 2019; a translated copy of this document is attached; hereinafter WangShi).

In regards to Claim 3, the combination of Wang and Liu teaches the method of Claim 2. In addition, the combination of Wang and Liu teaches partitioning the first region into the first plurality of polygons based on the locations of the first feature landmarks comprises partitioning the first region into a first quantity of polygons (Wang, see FIG. 3(b)); and partitioning the second region into the second plurality of polygons based on the locations of the second feature landmarks comprises partitioning the second region into a second quantity of polygons (Wang, see FIG. 3(a)).

The combination of Wang and Liu does not explicitly disclose the second quantity of polygons is equal to the first quantity of polygons. WangShi is in the same field of art of image deformation using Delaunay triangulation. Further, WangShi teaches the second quantity of polygons is equal to the first quantity of polygons.
(WangShi, ¶ [0092]: "The mask template needs to be deformed to the face of the current posture, which is actually an affine transformation process… An arbitrary affine transformation usually includes rotation (linear transformation), translation and scaling, and usually represents a mapping relationship between two images … the selected mask and the face image are divided into equal numbers of triangles, and their deformation is to transform one shape between the triangles into another shape, and make a one-to-one mapping of the textures contained therein, as shown in FIG. 6.")

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Wang and Liu by incorporating the Delaunay triangulation process taught by WangShi, to make a Delaunay triangulation process that divides both images into an equal number of triangles; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need to improve the accuracy of the image deformation process (WangShi, ¶ [0020]: "the mask image can be deformed and accurately attached to the face of the person being tracked in real time, so that the face and mask are harmoniously and naturally integrated."). Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

CLAIM 9

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Liu, and further in view of Fang et al. (US20210097648A1, foreign priority claimed 2019, hereinafter Fang).

In regards to Claim 9, the combination of Wang and Liu teaches the method of Claim 2. The combination of Wang and Liu does not explicitly disclose receiving the first image frame at the neural processing unit comprises receiving the first image frame at a processing unit comprising a U-Net architecture; and receiving the composite image frame at the neural processing unit comprises receiving the composite image frame at the processing unit having the U-Net architecture.

Fang is in the same field of art of reference-based image enhancement systems. Further, Fang teaches receiving the first image frame at the neural processing unit comprises receiving the first image frame at a processing unit comprising a U-Net architecture; and receiving the composite image frame at the neural processing unit comprises receiving the composite image frame at the processing unit having the U-Net architecture (Fang, ¶ [0035-0036]: "synthesizing the features corresponding to the low-resolution target image and the features corresponding to the reference image, for each scale, to generate the final output may include: encoding aligned feature maps, for each scale, using an encoder, like the U-Net encoder". Fang teaches enhancing a low-resolution input image with a reference image, using a U-Net encoder).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Wang and Liu by incorporating the U-Net encoder taught by Fang, to make a reference-based image enhancement system that can combine features of two images at multiple scales; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need to improve speed and performance (Fang, ¶ [0052]: "compared with the PatchMatch method, the speed is improved, and no grid noise occurs. Furthermore, compared with other alignment and synthesis methods, the alignment in the feature domain and the direct supervisory signals allow good convergence effect and high output performance."). Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
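
To make the U-Net discussion concrete, here is a minimal toy of the general pattern the quoted Fang passage gestures at: an encoder/decoder with a skip connection that takes two frames (reference and composite) and emits a rectified frame. This is a hypothetical sketch, not Fang's network or the claimed system; PyTorch is assumed.

```python
# Minimal illustrative sketch (NOT Fang's or the application's network):
# a tiny U-Net-style encoder/decoder with one skip connection mapping a
# (reference frame, composite frame) pair to a rectified output frame.
import torch
import torch.nn as nn

class TinyRectifier(nn.Module):
    def __init__(self, ch: int = 16):
        super().__init__()
        self.enc = nn.Sequential(                      # encode, downsample 2x
            nn.Conv2d(6, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(                      # decode, upsample 2x
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU())
        self.out = nn.Conv2d(ch + 6, 3, 3, padding=1)  # skip-connected head

    def forward(self, reference: torch.Tensor, composite: torch.Tensor):
        x = torch.cat([reference, composite], dim=1)   # two RGB frames -> 6 ch
        y = self.dec(self.enc(x))
        return self.out(torch.cat([y, x], dim=1))      # rectified frame (3 ch)

frames = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
print(TinyRectifier()(*frames).shape)                  # torch.Size([1, 3, 64, 64])
```

A real U-Net stacks several such encode/decode stages with a skip connection at each scale, which is what lets multi-scale features of the two inputs be combined.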
Claims 7, 13, 15-16 and 18-22 are rejected under 35 U.S.C. 103 as being unpatentable over Wang in view of Liu, and further in view of Kabata et al. (WO2020090128A1, published 05/07/2020; a translated copy of this document is attached; hereinafter Kabata).

CLAIM 7

In regards to Claim 7, the combination of Wang and Liu teaches the method of Claim 2. The combination of Wang and Liu does not explicitly disclose displaying the image depicting the person within a predetermined period of receiving the second image frame at a videoconferencing system.

Kabata is in the same field of art of facial image processing. Further, Kabata teaches displaying the image depicting the person within a predetermined period of receiving the second image frame at a videoconferencing system (Kabata, ¶ [0046]: "This display is performed substantially in real time after the image is captured by the camera 210, preferably within 0.5 seconds.").

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Wang and Liu by incorporating the display system taught by Kabata, to make an image processing system that is able to display an image shortly after the image is captured; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need to improve user experience in a videoconferencing system by displaying the captured image in real time (Kabata, ¶ [0046]: "This display is performed substantially in real time after the image is captured by the camera 210, preferably within 0.5 seconds."). Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

CLAIM 13

In regards to Claim 13, Kabata teaches a videoconferencing system (Kabata, ¶ [0019]: "The system according to the first embodiment is a video conference system") comprising a processor (Kabata, ¶ [0020-0021]: "computer device 100 as an image processing device").

Kabata does not explicitly disclose the processor operable to: receive a first image frame; determine locations of first feature landmarks corresponding to the first image frame; determine a first region corresponding to the first image frame, based on the locations of the first feature landmarks; partition the first region into a first plurality of polygons based on the locations of the first feature landmarks; receive a second image frame; determine locations of second feature landmarks corresponding to the second image frame; determine a second region corresponding to the second image frame, based on the locations of the second feature landmarks; partition the second region into a second plurality of polygons based on the locations of the second feature landmarks; translate image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons; and form a composite image frame by replacing image data of at least one polygon in the second plurality of polygons with translated image data from the one or more polygons of the first plurality of polygons; and a neural processor, wherein the neural processor is operable to: receive the first image frame; receive the composite image frame; and form a rectified image frame based on the first image frame and the composite image frame.

Wang is in the same field of art of image processing systems. Further, Wang teaches a processor (Wang, ¶ [0051]: "a special purpose or general-purpose computer including computer hardware, such as, for example, one or more processors"), the processor operable to perform:

receiving a first image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the source face"; see FIG. 2 annotation (1). The Examiner notes a single frame from the target video, which depicts a face, is input into the model);

determining locations of first feature landmarks corresponding to the first image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks"; see FIG. 2 annotation (2));

determining a first region corresponding to the first image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the first feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the first region into a first plurality of polygons based on the locations of the first feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

receiving a second image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … target face"; see FIG. 2 annotation (3). The Examiner notes a single frame from the source video (different from the target video), which also depicts a face, is input into the model);

determining locations of second feature landmarks corresponding to the second image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks"; see FIG. 2 annotation (4));

determining a second region corresponding to the second image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the second feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the second region into a second plurality of polygons based on the locations of the second feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

translating image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons (Wang, ¶ [0034-0035], see FIG. 3: the triangulated facial region of the source image is warped based on the triangulated facial region of the target image); and

forming a composite image frame (Wang, ¶ [0037]: "Image blending results in a blend of the warped source face into the target face, and is needed to produce natural and realistic face replacement results", see FIG. 5) by replacing image data of at least one polygon in the second plurality of polygons with translated image data from the one or more polygons of the first plurality of polygons (Wang, ¶ [0037]: "The system can automatically select the facial ROI, according to the detected facial landmarks. After getting the facial ROI in source image and target image, a Poisson Image Editing technique or other picture compositing method can be used to seamlessly blend the source face into target face");

receiving the first image frame at a neural processing unit (Wang, ¶ [0008-0009]: "cascade multichannel convolutional neural network (CMC-CNN) … a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the source face"); and

receiving the composite image frame at the neural processing unit (Wang, ¶ [0008-0009]: "cascade multichannel convolutional neural network (CMC-CNN) … a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the target face").

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Kabata by incorporating the image processing system taught by Wang, to make a videoconferencing system that can modify the user's face; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need for a videoconferencing system that can modify the user's face (Wang, ¶ [0002]: "Face replacement is also applicable to social media, virtual, or direct personal interactions such as online video chats.").

The combination of Kabata and Wang does not explicitly disclose forming a rectified image frame using the neural processing unit, based on the first image frame and the composite image frame.
Liu is in the same field of art of image synthesis using a neural network. Further, Liu teaches forming a rectified image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution". Liu teaches how to enhance a low-quality input image with a reference image and generate a super-resolution (SR) image. The network uses a rectified linear unit (ReLU) as the activation function; thus, this reads on "forming a rectified image frame") (Liu, page 129116, right col., first paragraph: "the decoder can sample from the learned generative model of P(z|R, X) to generate SR images.") using the neural processing unit (Liu, page 129115, section III, The Proposed Work: "we will introduce the proposed RefSR-VAE", see FIG. 4), based on the first image frame (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images"; see the input LR (low-resolution) image in FIG. 4) and the composite image frame (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images"; see the reference image in FIG. 4).

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Kabata and Wang by incorporating the method for image enhancement taught by Liu, to make an image processing system that is able to enhance image quality; one of ordinary skill in the art would be motivated to combine the references since, among its several aspects, the present invention recognizes there is a need for a post-processing method to improve image quality in an image processing system that involves an input image and a reference image (Liu, abstract: "Despite the great progress of image super-resolution in recent years, face super-resolution has still much room to explore good visual quality while preserving original facial attributes for larger up-scaling factors. This paper investigates a new research direction in face super-resolution, called Reference based face Super-Resolution (RefSR), in which a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution"). Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.

CLAIM 15

In regards to Claim 15, the combination of Kabata, Wang and Liu teaches the system of Claim 13. In addition, the combination of Kabata, Wang and Liu teaches:

determining locations of first feature landmarks corresponding to the first image frame comprises determining locations of first facial feature landmarks corresponding to the first image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining the first region corresponding to the first image frame, based on the locations of the first feature landmarks comprises determining a first facial region corresponding to the first image frame, based on the locations of the first facial feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created … utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks");

partitioning the first region into the first plurality of polygons based on the locations of the first feature landmarks comprises partitioning the first facial region into the first plurality of polygons based on the locations of the first facial feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

determining locations of second feature landmarks corresponding to the second image frame comprises determining locations of second facial feature landmarks corresponding to the second image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining the second region corresponding to the second image frame, based on the locations of the second feature landmarks comprises determining a second facial region corresponding to the second image frame, based on the locations of the second facial feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created … utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks");

partitioning the second region into a second plurality of polygons based on the locations of the second feature landmarks comprises partitioning the second facial region into the second plurality of polygons based on the locations of the second facial feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3); and

translating image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons comprises mapping facial image data of one or more polygons of the first plurality of polygons to one or more polygons of the second plurality of polygons (Wang, ¶ [0034-0035], see FIG. 3: the triangulated facial region of the source image is warped based on the triangulated facial region of the target image).

CLAIM 16

In regards to Claim 16, the combination of Kabata, Wang and Liu teaches the system of Claim 15. In addition, the combination of Kabata, Wang and Liu teaches the first image depicting a person (Wang, see face of target video in FIG. 2); and the second image frame (Wang, ¶ [0008-0009]: "source face", see FIG. 2) within a data stream initiated at a remote videoconferencing system (Kabata, ¶ [0019]: "The system according to the first embodiment is a video conference system") (Wang, ¶ [0002]: "online video chats", ¶ [0018]: "a remote computer system"), the second image frame depicting the person (Wang, see face of source video in FIG. 2); the first image being of a first quality, the second image being of a second quality, wherein the second quality is inferior to the first quality (Liu, Abstract: "a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution". Liu teaches how to enhance a LR (low-resolution) input image with a reference image; the reference image contains detailed visual attributes that can be transferred to the LR image).

CLAIM 18

In regards to Claim 18, the combination of Kabata, Wang and Liu teaches the system of Claim 13.
In addition, the combination of Kabata, Wang and Liu teaches causing the display device to display the image depicting the person based on the rectified image frame within a predetermined period of receiving the second image frame (Kabata, ¶ [0046]: "This display is performed substantially in real time after the image is captured by the camera 210, preferably within 0.5 seconds.").

CLAIM 19

In regards to Claim 19, the combination of Kabata, Wang and Liu teaches the system of Claim 13. In addition, the combination of Kabata, Wang and Liu teaches:

receiving a third image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … the source face"), the third image frame corresponding to the rectified image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution");

determining locations of third feature landmarks corresponding to the third image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining a third region corresponding to the third image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the third feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the third region into a third plurality of polygons based on the locations of the third feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

receiving a fourth image frame (Wang, ¶ [0008-0009]: "a single video frame, an initial face shape, and a ground truth shape are input to the CMC-CNN … target face");

determining locations of fourth feature landmarks corresponding to the fourth image frame (Wang, ¶ [0024-0025]: "detects facial landmarks using a Cascade Multi-Channel Convolutional Neural Network (CMC-CNN) … denotes the 2D positions of facial landmarks");

determining a fourth region corresponding to the fourth image frame (Wang, ¶ [0037]: "utilizing the binary facial mask, an accurate facial Region of Interest (ROI) as seen in 400A. The system can automatically select the facial ROI, according to the detected facial landmarks"), based on the locations of the fourth feature landmarks (Wang, ¶ [0037]: "Based on the detected landmarks in a face such as shown in 400A, a binary facial mask 400B is created");

partitioning the fourth region into a fourth plurality of polygons based on the locations of the fourth feature landmarks (Wang, ¶ [0034-0036]: "a Delaunay triangulation", see FIG. 3);

translating image data of one or more polygons of the third plurality of polygons to one or more polygons of the fourth plurality of polygons (Wang, ¶ [0034-0035], see FIG. 3: the triangulated facial region of the source image is warped based on the triangulated facial region of the target image); and

forming a composite image frame (Wang, ¶ [0037]: "Image blending results in a blend of the warped source face into the target face, and is needed to produce natural and realistic face replacement results", see FIG. 5) by replacing image data of at least one polygon in the fourth plurality of polygons with translated image data from the one or more polygons of the third plurality of polygons (Wang, ¶ [0037]: "The system can automatically select the facial ROI, according to the detected facial landmarks. After getting the facial ROI in source image and target image, a Poisson Image Editing technique or other picture compositing method can be used to seamlessly blend the source face into target face").

CLAIM 20

In regards to Claim 20, the combination of Kabata, Wang and Liu teaches the system of Claim 19. In addition, the combination of Kabata, Wang and Liu teaches receiving the third image frame (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images"; see the input LR (low-resolution) image in FIG. 4); receiving the second composite image frame (Liu, page 129116, section B: "The input is paired Bicubic up-sampled LR images and their reference images"; see the reference image in FIG. 4); forming a second rectified image frame based on the third image frame and the second composite image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution") (Liu, page 129116, right col., first paragraph: "the decoder can sample from the learned generative model of P(z|R, X) to generate SR images."); and providing the second rectified image frame to the processor (Kabata, page 17: "a computer (for example, a desktop computer…"), wherein the processor is further operable to cause a display device to display an image depicting the person based on the second rectified image frame (Kabata, page 25: "Therefore, when a moving image based on the converted moving image data generated by the image processing device described above is displayed on some display, the target face displayed on the display basically faces forward".).

CLAIM 21

In regards to Claim 21, the combination of Kabata, Wang and Liu teaches the system of Claim 13. In addition, the combination of Kabata, Wang and Liu teaches the neural processor employs circuitry optimized for matrix operations (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution". The Examiner notes convolution is fundamentally a form of matrix multiplication).

CLAIM 22

In regards to Claim 22, the combination of Kabata, Wang and Liu teaches the system of Claim 13. In addition, the combination of Kabata, Wang and Liu teaches one or more neural networks perform convolutional-type operations on the first image frame and the composite image frame to produce the rectified image frame (Liu, page 129119, left col., second paragraph: "Inside the UBP block, we used Parametric ReLU (PReLU) for activation and 6×6 filter with stride 2 for convolution and deconvolution").

CLAIM 14

Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kabata, in view of Wang, in view of Liu, and further in view of WangShi et al. (CN109410119A, published 2019; a translated copy of this document is attached; hereinafter WangShi).

In regards to Claim 14, the combination of Kabata, Wang and Liu teaches the system of Claim 13.
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Kabata in view of Wang, in view of Liu, and further in view of WangShi et al. (CN109410119A, published 2019; a translated copy of this document is attached; hereinafter WangShi).

CLAIM 14

In regards to Claim 14, the combination of Kabata, Wang and Liu teaches the system of Claim 13. In addition, the combination of Kabata, Wang and Liu teaches that partitioning the first region into the first plurality of polygons based on the locations of the first feature landmarks comprises partitioning the first region into a first quantity of polygons (Wang, see FIG. 3(b)), and that partitioning the second region into the second plurality of polygons based on the locations of the second feature landmarks comprises partitioning the second region into a second quantity of polygons (Wang, see FIG. 3(a)).

The combination of Kabata, Wang and Liu does not explicitly disclose that the second quantity of polygons is equal to the first quantity of polygons. WangShi is in the same field of art, image deformation using Delaunay triangulation. Further, WangShi teaches that the second quantity of polygons is equal to the first quantity of polygons. (WangShi, ¶ [0092]: “The mask template needs to be deformed to the face of the current posture, which is actually an affine transformation process… An arbitrary affine transformation usually includes rotation (linear transformation), translation and scaling, and usually represents a mapping relationship between two images … the selected mask and the face image are divided into equal numbers of triangles, and their deformation is to transform one shape between the triangles into another shape, and make a one-to-one mapping of the textures contained therein, as shown in FIG. 6.”)

Therefore, it would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the invention of Kabata, Wang and Liu by incorporating the Delaunay triangulation process taught by WangShi, yielding a triangulation that divides both images into equal numbers of triangles. One of ordinary skill in the art would have been motivated to combine the references because WangShi recognizes a need to improve the accuracy of the image deformation process (WangShi, ¶ [0020]: “the mask image can be deformed and accurately attached to the face of the person being tracked in real time, so that the face and mask are harmoniously and naturally integrated.”). Thus, the claimed subject matter would have been obvious to a person having ordinary skill in the art before the effective filing date of the claimed invention.
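WangShi's requirement of equal triangle counts is what makes the quoted one-to-one texture mapping possible: each pair of corresponding triangles supplies exactly three point correspondences, which exactly determine the six parameters of an affine map x' = Ax + t. A small worked sketch in plain NumPy; the coordinates below are arbitrary illustrative values, not from any cited reference.

```python
# Why equal triangle counts matter: each corresponding triangle pair gives
# three point correspondences, which exactly determine the six parameters of
# an affine map x' = Ax + t, so textures map one-to-one triangle by triangle.
import numpy as np

def affine_from_triangles(src_tri: np.ndarray, dst_tri: np.ndarray) -> np.ndarray:
    # Solve [x y 1] @ P = [x' y'] for the 3x2 parameter matrix P
    src_h = np.hstack([src_tri, np.ones((3, 1))])   # homogeneous coordinates
    return np.linalg.solve(src_h, dst_tri)          # 6 equations, 6 unknowns

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = np.array([[2.0, 1.0], [4.0, 1.0], [2.0, 3.0]])  # scaled 2x, translated
P = affine_from_triangles(src, dst)
point = np.array([0.25, 0.25, 1.0])               # any texture coordinate
assert np.allclose(point @ P, [2.5, 1.5])         # maps one-to-one
```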
Allowable Subject Matter

Claim 10 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims. The closest prior art references for Claim 10 are:

Wang et al. (US-20190122329-A1), which is directed to a face replacement system for replacing a target face with a source face. The system can include a facial landmark determination model having a cascade multi-channel convolutional neural network (CMC-CNN) to process both the target and the source face. A face warping module is able to warp the source face using determined facial landmarks that match the determined facial landmarks of the target face, and a face selection module is able to select a facial region of interest in the source face. An image blending module is used to blend the target face with the selected source region of interest.

Liu et al. (Liu, Zhi-Song, Wan-Chi Siu, and Yui-Lam Chan, "Reference based face super-resolution," IEEE), which is directed to a face super-resolution network, called Reference-based face Super-Resolution (RefSR), in which a reference facial image containing genuine attributes is provided in addition to the low-resolution images for super-resolution.

Fang et al. (US20210097648A1), which is directed to a reference-based image enhancement network that comprises a multi-scale encoder and decoder (U-Net).

Neither Wang, Liu, nor Fang, nor their combination, teaches a reference-based image enhancement network that comprises both U-Net and VDSR architectures.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to NHUT HUY (JEREMY) PHAM, whose telephone number is (703) 756-5797. The examiner can normally be reached Monday through Friday, 8:30 AM to 6:00 PM ET. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, O'Neal Mistry, can be reached at (313) 446-4912. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

NHUT HUY (JEREMY) PHAM
Examiner, Art Unit 2674

/Ross Varndell/
Primary Examiner, Art Unit 2674

Prosecution Timeline

Jan 06, 2023
Application Filed
Mar 28, 2025
Non-Final Rejection — §103, §112
Jun 24, 2025
Response Filed
Aug 13, 2025
Non-Final Rejection — §103, §112
Nov 10, 2025
Response Filed
Feb 11, 2026
Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598397
DIRT DETECTION METHOD AND DEVICE FOR CAMERA COVER
2y 5m to grant Granted Apr 07, 2026
Patent 12598074
FACIAL RECOGNITION METHOD AND APPARATUS, DEVICE, AND MEDIUM
2y 5m to grant Granted Apr 07, 2026
Patent 12597254
TRACKING OPERATING ROOM PHASE FROM CAPTURED VIDEO OF THE OPERATING ROOM
2y 5m to grant Granted Apr 07, 2026
Patent 12592087
IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM
2y 5m to grant Granted Mar 31, 2026
Patent 12579622
METHOD AND APPARATUS FOR PROCESSING IMAGE SIGNAL, ELECTRONIC DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM
2y 5m to grant Granted Mar 17, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

4-5
Expected OA Rounds
79%
Grant Probability
99%
With Interview (+26.8%)
3y 0m
Median Time to Grant
High
PTA Risk
Based on 53 resolved cases by this examiner. Grant probability derived from career allow rate.
