DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-15 are pending.
Priority
Receipt is acknowledged of papers submitted under 35 U.S.C. 119(a)-(d).
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 05/05/2023 and 09/30/2024 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information referred to therein has been considered by the examiner.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-5 and 12-14 are rejected under 35 U.S.C. 102(a)(1) as being clearly anticipated by Lu et al. ("Global-local fusion network for face super-resolution." Neurocomputing 387 (2020): 309-320, IDS).
Regarding claim 1, Lu discloses a recognition system comprising:
a storage device for storing a learned model; and an arithmetic circuit accessible to the storage device (Abstract and sections 5.2 and 5.8: a storage device and a processor are basic components of a computer-based system such as Lu’s. Lu specifically used an “Intel Core i7-6700K CPU at 4.00 GHz and 8 GB RAM”), the learned model including:
a first model part (Fig. 3, the Reconstruction Sub-network and Section 3.2.1) learned to, in response to input of a first resolution image showing a target object at a first resolution (the low resolution input image x showing a target face), output a second resolution image (the synthetic high resolution (SHR) image PR(x)) and a difference image (equation (3), the residual image fR(x)), the second resolution image being corresponding to an image resulting from conversion of the first resolution image into a second resolution higher than the first resolution (Fig. 3, the “Reconstruction Sub-network” converts low resolution image x to high resolution image PR(x)), the difference image being corresponding to a difference between the first resolution image and the second resolution image (section 3.2.1, equation (4)); and
a second model part learned to output a feature amount of the target object in response to input of the second resolution image and the difference image (Fig. 3 and sections 3.2.2 - 3.2.4: the final high resolution image PF(x) defines a feature amount of the target face. PF(x) is generated in response to the SHR image and the residual image which is generated by the two residual enhancement sub-networks. Also see section 5.10, features used for face recognition also defines “a feature amount”), and the arithmetic circuit being configured to perform:
obtainment processing of obtaining the first resolution image as a target image; and inference processing of providing the target image obtained by the obtainment processing to the learned model to allow the learned model to calculate a feature amount of a target object shown in the target image (Section 5.10: low resolution images are obtained from the Yale-B database. The images are input to the super-resolution model (Fig. 3) to allow the learned model to calculate the final high-resolution images).
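For illustration only, the two-part structure mapped above (a first model part that emits a super-resolved image plus a residual image, and a second model part that derives a feature amount from both) can be sketched as follows. This is a toy numpy stand-in, not Lu's actual network: the upsampling, the placeholder residual, and the mean-pooled "feature amount" are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def first_model_part(lr, scale=2):
    """Toy stand-in for Lu's Reconstruction Sub-network: enlarge the
    low-resolution input (nearest-neighbor here) and emit a residual
    image alongside the synthetic high-resolution (SHR) image."""
    upsampled = np.kron(lr, np.ones((scale, scale)))  # crude interpolation
    residual = rng.normal(0.0, 0.01, upsampled.shape)  # placeholder f_R(x)
    return upsampled + residual, residual

def second_model_part(shr, residual):
    """Toy stand-in for the residual-enhancement / feature stage:
    fuse the SHR image with the residual, reduce to a feature vector."""
    fused = shr + residual
    return fused.mean(axis=0)  # placeholder "feature amount"

lr = rng.random((8, 8))              # first resolution image
shr, residual = first_model_part(lr)  # second resolution image + difference image
feature = second_model_part(shr, residual)
assert shr.shape == (16, 16) and feature.shape == (16,)
```

The point of the sketch is only the data flow the rejection relies on: one module produces both the SHR image and the residual, and a second module consumes both to produce the feature amount.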
Regarding claim 2, Lu discloses the recognition system of claim 1, wherein the inference processing includes recognizing the target object based on a feature amount of a target object shown in the target image (Section 5.10: the output of the super-resolution model is used for face recognition).
Regarding claim 3, Lu discloses the recognition system of claim 1, wherein the arithmetic circuit is configured to execute output processing of outputting a result of the inference processing (Fig. 3 and section 5.9: output/display the final high-resolution images).
Claims 4-5 and 12-14 have been analyzed and are rejected for the same reasons as outlined above in the rejection of claim 1. Lu’s model can be seen as a distillation model because it is generated by distillation of a learned model (Abstract: “a novel global-local fused network (GLFSR) to refine HF information for recovering fine details of facial images. In contrast to existing methods that often increase the depth of network, we enhance the residual HF information from local to global levels through the networks”).
Claims 12 and 13 are rejected under 35 U.S.C. 102(a)(2) as being clearly anticipated by ZHANG et al. (hereafter referred to as “ZHANG”, US 2022/0253981).
Claims 12 and 13 are drawn to a storage medium carrying a learned/distillation model, i.e., a compilation of nodes connected in layers of a neural network. Because the information content of the medium represents model-related data per se, which is non-functional descriptive material, the features of that content cannot, for prior art purposes, patentably distinguish the medium from a prior art medium capable of embodying the same content; see MPEP 2111.05 and the Interim Examination Instructions For Evaluating Subject Matter Eligibility Under 35 U.S.C. § 101, signed 24 August 2009, http://www.uspto.gov/patents/law/comments/2009-08-25_interim_101_instructions.pdf (see the first paragraph on page 4). Claims 12 and 13 are therefore rejected as being anticipated by ZHANG, who discloses a system that includes storage media (see, e.g., Fig. 4, storage 34).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 6-11 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Lu ("Global-local fusion network for face super-resolution." Neurocomputing 387 (2020)).
Regarding claim 6, Lu discloses a learning method comprising:
a preparation step of preparing a model (Section 3.2.1: "divide the training sample into two parts … The first part … is used for the reconstruction network. The remaining part … is used for the residual enhancement network". The reconstruction-network training step is a preparation step); and
a learning step of performing machine learning using the model prepared by the preparation step (Section 3.2.1, the residual enhancement networks are trained using the output of the trained reconstruction network), the model including
a first model part, a second model part, and a third model part, the first model part being a model for, in response to input of a first resolution image showing a target object at a first resolution, outputting a second resolution image and a difference image, the second resolution image being corresponding to an image resulting from conversion of the first resolution image into a second resolution higher than the first resolution, the difference image being corresponding to a difference between the first resolution image and the second resolution image (Fig. 3, the Reconstruction Sub-network and Section 3.2.1. Please refer to analysis of claim 1), the second model part being a model for outputting a feature amount of the target object in response to input of the second resolution image and the difference image from the first model part (Fig. 3 and sections 3.2.2 - 3.2.4. Please refer to analysis of claim 1), and the third model part being a model for outputting a result of recognition of the target object in response to input of a feature amount of the target object from the second model part, and the learning step including training the model to learn a relationship between the first resolution image and a feature amount of a target object shown in the first resolution image, by machine learning using a learning dataset (Section 5.10: "11 random images are used for the training task of the face recognition and the rest for testing. We use the model, which is trained with the CASIA-Webface database, to test those test images. Then we use the classic algorithm kernel partial-least-squares discrimination (KPLSD) [57] to verify the identity information of SR facial test images").
Lu does not expressly disclose that training the third model part uses the first resolution image as input and a result of recognition of a target object shown in the first resolution image as ground truth.
However, it is well known and common practice in the art to use end-to-end training to improve performance and accuracy (see e.g., Lu, section 6). Lu’s face recognition system is composed of the reconstruction model, the residual enhancement models and the recognition model. To improve face recognition performance, it would have been obvious to a person having ordinary skill in the art to choose end-to-end learning, with the low-resolution images as input and the corresponding face recognition result as output.
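For illustration only, the distinction the rejection draws between Lu's stage-wise training and the proposed end-to-end training can be sketched with toy linear stand-ins. All weights, function names, and loss forms below are illustrative assumptions, not Lu's training code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear stand-ins for the three model parts discussed above.
W_sr = rng.normal(size=(4, 4))    # reconstruction ("first model part")
W_feat = rng.normal(size=(4, 4))  # residual enhancement / feature stage
W_cls = rng.normal(size=(2, 4))   # recognition head ("third model part")

def forward(x):
    """Chain all three parts: LR input -> recognition logits."""
    return W_cls @ (W_feat @ (W_sr @ x))

def staged_loss(x, y_hr):
    # Stage-wise training: each sub-network is optimized against its
    # own intermediate target (here, stage 1 against the HR image).
    sr = W_sr @ x
    return np.sum((sr - y_hr) ** 2)

def end_to_end_loss(x, label):
    # End-to-end training: a single recognition loss is backpropagated
    # through all three parts, with the LR image as input and the
    # recognition result as the supervised output.
    logits = forward(x)
    return -logits[label] + np.log(np.sum(np.exp(logits)))  # cross-entropy

x = rng.random(4)       # low-resolution input
y_hr = rng.random(4)    # high-resolution target
l_staged = staged_loss(x, y_hr)
l_e2e = end_to_end_loss(x, 0)
assert l_staged >= 0.0 and np.isfinite(l_e2e)
```

The sketch shows only the difference in supervision: the staged objective never sees the recognition label, whereas the end-to-end objective supervises the full chain with the label directly, which is the modification the rejection asserts would have been obvious.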
Therefore, before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to which the claimed invention pertains to yield the invention as described in claim 6 from the teachings of Lu.
Regarding claim 7, Lu discloses the learning method of claim 6, wherein the preparation step includes:
a generation step of generating a learning dataset including the first resolution image as input and a set of the second resolution image and the difference image as ground truth; and a pre-learning step of training the first model part to learn a relationship between the first resolution image and the set of the second resolution image and the difference image, by using the learning dataset generated by the generation step (see analysis of claim 6).
Regarding claim 8, Lu discloses the learning method of claim 7, wherein the generation step includes: a first step of obtaining the second resolution image; a second step of generating the first resolution image by converting the second resolution image obtained by the first step into an image at the first resolution; a third step of generating the difference image from the second resolution image obtained by the first step and the first resolution image generated by the second step; and a fourth step of generating a learning dataset including the first resolution image generated by the second step as input and the set of the second resolution image prepared by the first step and the difference image generated by the third step as ground truth (Section 2.1: "the observed LR images are generated by the following model: xi = DByi + v , where B is the blurring operation … D is the downsampling operation…"; Section 2.2 and section 3.1 "we define the residual image ri = yi - xi". In Lu, a low resolution image is generated by blurring, down-sampling and adding noise to a high resolution image and a difference image is generated using the high resolution image and an image generated by interpolating the low resolution image. Please also refer to analysis of claim 6 on end-to-end learning. It would have been obvious to a person having ordinary skill in the art to choose end-to-end learning of the reconstruction model and the residual enhancement models, with the low-resolution image as input and the corresponding high-resolution image and the difference image as output).
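For illustration only, the dataset-generation steps quoted from Lu's sections 2.1, 2.2, and 3.1 (degrade a HR image to a LR image via blur, downsampling, and noise; form the residual against the interpolated LR image) can be sketched in numpy. The box blur, nearest-neighbor interpolation, and noise level are assumptions standing in for Lu's actual operators B, D, and v.

```python
import numpy as np

rng = np.random.default_rng(2)

def degrade(y, scale=2, sigma=0.01):
    """Lu's observation model x_i = D B y_i + v: blur (B), downsample (D),
    then add noise (v). A 2x2 box blur stands in for B here."""
    b = (y + np.roll(y, 1, 0) + np.roll(y, 1, 1)
         + np.roll(np.roll(y, 1, 0), 1, 1)) / 4.0   # B: blurring
    x = b[::scale, ::scale]                          # D: downsampling
    return x + rng.normal(0.0, sigma, x.shape)       # v: additive noise

def residual(y, x, scale=2):
    """r_i = y_i - x~_i, where x~_i is the interpolated (enlarged) LR image."""
    x_up = np.kron(x, np.ones((scale, scale)))       # interpolation of x
    return y - x_up

y = rng.random((8, 8))   # first step: obtain the second resolution image
x = degrade(y)           # second step: generate the first resolution image
r = residual(y, x)       # third step: generate the difference image
# Fourth step: the dataset pairs input x with ground truth (y, r).
assert x.shape == (4, 4) and r.shape == (8, 8)
# As Lu's Section 2.2 notes, residual pixel values are "mostly zero or small":
assert np.abs(r).mean() < np.abs(y).mean()
```

The final assertion illustrates the point carried into claims 9 and 10: because the residual is a difference against an interpolated version of the same image, its pixel values occupy a much narrower range than either image.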
Regarding claim 9, Lu discloses the learning method of claim 8, wherein the third step enlarges the first resolution image generated by the second step to a size same as the second resolution image obtained by the first step and generates the difference image based on differences of pixels between the first resolution image enlarged and the second resolution image obtained by the first step (Section 2.2: "Considering that the input and output images are very similar, we define the residual image ri = yi – x~i, and its pixel value is mostly zero or small. Here x~i is the interpolation version of xi").
Regarding claim 10, Lu discloses the learning method of claim 9, wherein a range of each pixel in the difference image is narrower than a range of each pixel of the first resolution image enlarged and the second resolution image obtained by the first step (Section 2.2: "Considering that the input and output images are very similar, we define the residual image ri = yi – x~i, and its pixel value is mostly zero or small. Here x~i is the interpolation version of xi").
Regarding claim 11, Lu discloses the learning method of claim 6, but fails to further teach the remaining limitations of claim 11, which relate to using a discriminator to discriminate between real and generated high-resolution images.
However, using generative adversarial networks (GANs) for face super-resolution is well known in the art (see e.g., Lu, section 1). Using a discriminator to improve the accuracy of Lu’s generator (i.e., the GLFSR model that generates a HR image from a LR image) would be obvious for a person of ordinary skill in the art.
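For illustration only, the proposed modification (attaching a discriminator to a super-resolution generator) can be sketched as follows. None of this is Lu's code: the logistic score over a flattened image is a hypothetical stand-in for a real discriminator network.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical discriminator weights over a flattened 8x8 image.
w = rng.normal(0.0, 0.1, 64)

def discriminator(img):
    """Score in (0, 1): estimated probability that the 8x8 image is a
    real high-resolution image rather than a generated one."""
    return 1.0 / (1.0 + np.exp(-(w @ img.ravel())))

def adversarial_loss(fake_hr):
    # Generator-side GAN loss term: pushing D(fake) toward 1 drives the
    # generator's outputs to become indistinguishable from real HR
    # images, the accuracy-improving role argued for claim 11.
    return -np.log(discriminator(fake_hr))

fake = rng.random((8, 8))  # stand-in for a generated HR image
loss = adversarial_loss(fake)
assert np.isfinite(loss) and loss > 0.0
```

In a full GAN setup the discriminator is trained in alternation with the generator on real versus generated HR images; the sketch shows only the generator-side term the rejection relies on.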
Therefore, before the effective filing date of the claimed invention, it would have been obvious to a person having ordinary skill in the art to which the claimed invention pertains to yield the invention as described in claim 11 from the teachings of Lu.
Claim 15 has been analyzed and is rejected for the same reasons as outlined above in the rejection of claim 8.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LI LIU whose telephone number is (571)270-5363. The examiner can normally be reached on Monday-Friday, 8:00AM-4:30PM, EST.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Emily Terrell can be reached on (571)270-3717. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/LI LIU/Primary Examiner, Art Unit 2666