DETAILED ACTION
This action is responsive to the amendment filed on 11/21/2025. Claims 1, 2, 4 and 9-20 are pending in the case. Claims 1 and 11 are independent claims. Claims 1, 2, 4, 9, 11, 12 and 17-19 have been amended.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/21/2025 has been entered.
Response to Arguments
Applicant’s arguments, filed 11/21/2025, with respect to the prior art rejections have been fully considered and are not persuasive.
Applicant begins by summarizing the cited references. Applicant appears to suggest the cited art does not teach the following seven features required by the training/verification limitations of claim 1:
1. Image is translated forward
2. Image is translated backward
3. Perceptual loss is computed between this double-translated image and original image
4. The loss value is checked to verify it is below the threshold
5. If step (4) is true, this establishes that the transformation function is verified
6. Training continues until both transformation functions are verified
7. Training is stopped when verification criteria are met
Examiner disagrees.
Features 1-2 are covered entirely by Deng, which describes a CycleGAN.
Feature 3 is not required by the claims. The claims recite “applying a perceptual loss on the transformed image…and the original…image”. The claim does not describe computing a loss “between a double-translated image and an original image”; the transformed image in the claim is translated once. The claim simply describes “applying on” and does not limit the loss to being computed between any two particular features.
Feature 4 is likewise not recited by the claim. The claims describe “applying a loss function”; the claim makes no mention of any loss values, much less checking that a value is below a threshold.
Feature 5: see above.
Feature 6: the claim notes training stops when the claimed images match “within a threshold”. The cited art describes matching the claimed types of images and terminating upon reaching a threshold; thus, the functions are verified within a threshold. See the rejection for details.
Feature 7: see above.
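For clarity of the record, the following is a minimal illustrative sketch (in Python) of the double-translation flow that Applicant's enumeration of features 1-7 describes. It is offered for illustration only; the function names (translate_forward, translate_backward, perceptual_loss) are hypothetical placeholders and appear in neither the claims nor the cited references.

    def verify_transformations(images_a, images_b,
                               translate_forward, translate_backward,
                               perceptual_loss, threshold_a, threshold_b):
        """Return True when both transformation functions are verified."""
        for img_a in images_a:
            fake_b = translate_forward(img_a)     # feature 1: forward translation
            recon_a = translate_backward(fake_b)  # feature 2: backward translation
            # features 3-4: perceptual loss between the reconstruction and the
            # original, checked against a threshold
            if perceptual_loss(recon_a, img_a) >= threshold_a:
                return False                      # feature 5: not yet verified
        for img_b in images_b:
            fake_a = translate_backward(img_b)
            recon_b = translate_forward(fake_a)
            if perceptual_loss(recon_b, img_b) >= threshold_b:
                return False
        return True                               # features 6-7: stop training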
Applicant argues that “Specifically, the claim recites applying the perceptual loss to a reconstructed image, comparing the loss to a threshold, and using this threshold-based determination to verify each transformation function and to control and terminate training.”
Examiner disagrees. The claims do not recite such features. See the response above and the updated rejection.
Applicant goes on to argue specific disputes which are reproduced and addressed below:
Applicant notes “Qu teaches perceptual loss only as a training objective. Hoffman teaches cycle consistency loss only as a training constraint. Neither reference discloses or suggests using perceptual loss as a verification requirement or a training stop condition.”
Examiner disagrees. The references, particularly Deng, describe a CycleGAN system that is trained to be “cycle consistent”; indeed, this training continues until a maximum number of iterations. Braun describes training until a threshold is met. Qu describes perceptual loss as a training objective specifically for image-to-image domain translation. The cycle consistency loss in Hoffman is indeed a training constraint, which the claims recite using by claiming a cycle GAN. The claims do not in any way require that the perceptual loss characterize a “verification requirement” or a “training stop condition”. At most, the claims note that training stops when the transformation functions are verified and that verification is based in part on the perceptual loss. The combination of references teaches perceptual loss used to verify and train the generator and discriminator networks until a stopping condition is met.
Applicant argues that none of the references stops training and then generates pairs, synthetic pairs, and mismatched pairs to train a different machine learning model.
Examiner disagrees. Deng explicitly describes using the SPGAN to generate said pairs, then stops training. Deng notes “Given G and F, we define two positive pairs… On the other hand, for generators G and F, we also define two types of negative training pairs… We train the SPGAN until the convergence or the maximum iterations… Feature learning is the second step of the “learning via translation” framework. Once we have style-transferred dataset G(S) composed of the translated images and their associated labels, the feature learning step is the same as supervised methods… We employ ResNet-50 [17] as the base model… Specifically, ResNet-50 [17] pretrained on ImageNet is used for fine-tuning on the translated training set” (Section 3.2-Section 4.1). The generated translated training set is used in a different model, after new data is generated from a GAN-trained model. Further, Subramaniam teaches using a different model on explicitly matched and unmatched images: “To do so, they can exploit the training set, consisting of matched and unmatched image pairs.” (Section 4.1). Examiner highlights that the rejection is based on the teachings of a combination of references. Applicant appears to dispute the references in isolation, whereas the rejection is based on the combination.
Applicant argues “Additionally, none of the cited art teaches creating augmented dataset by combining these new labeled synthetic training pairs with at least one mismatch pair for training a machine learning model that is different from the neural network (e.g., GAN).”
Examiner disagrees. As previously noted, Deng describes generating new synthetic pairs used for training a different machine learning model. Subramaniam teaches both data set augmentation for paired images and using the generated pairs for an ML model.
Applicant argues the combination would not have been obvious and would change the principle of operation of Subramaniam.
Examiner disagrees. The network of Subramaniam works in exactly the same way after the proposed combination as before it: it processes paired training images to match images. The principle of operation of the neural network and its training is not modified at all by the suggestion to use a further augmented data set. Subramaniam even notes that data augmentation is necessary for almost all data sets: “For almost all the datasets, the number of negative pairs far outnumbers the number of positive pairs in the training set. This poses a serious challenge to deep neural nets, which can overfit and get biased in the process. Further, the positive samples may not have all the variations likely to be encountered in a real scenario. We therefore, hallucinate positive pairs and enrich the training corpus” (Section 4.2). Further, Deng points out that “when models trained on one dataset are directly used on another, the re-ID accuracy drops dramatically due to dataset bias [41]. Therefore, supervised, single-domain re-ID methods may be limited in real-world scenarios, where domain-specific labels are not available… A common strategy for this problem is unsupervised domain adaptation (UDA). But this line of methods assume that the source and target domains contain the same set of classes. Such assumption does not hold for person re-ID because different re-ID datasets usually contain entirely different persons (classes)… In literature, commonly used style transfer methods include [27, 22, 48, 57]. In this paper, we use CycleGAN [57] following the practice in [27, 18].” Both systems are used for person re-identification, and Deng explicitly points out a limitation of single-domain data sets, noting that out-of-domain data sets of different classes require methods such as style transfer. Examiner highlights that nothing in the references or in Applicant’s arguments suggests that using a CycleGAN to generate training data would change the principle of operation of Subramaniam.
Applicant finally argues the combination is based on impermissible hindsight.
Examiner disagrees. As made clear above and in the updated rejection, the cited art clearly describes the motivation to use methods such as style-transfer-based CycleGANs to bolster a person re-identification network so that it is suitable for real-world data sets. The reason to combine is based on cited sections of disclosure in the references, as demonstrated in the rejection. Applicant has provided no specific explanation for their conclusion.
Claim Rejections - 35 U.S.C. § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. §§ 102 and 103 (or as subject to pre-AIA 35 U.S.C. §§ 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. § 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 4, 10, 11, 13, and 14 are rejected under 35 U.S.C. § 103 as being unpatentable over Subramaniam, “Deep Neural Networks with Inexact Matching for Person Re-Identification”, in view of Deng, “Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification”, further in view of Qu, “Perceptual-DualGAN: Perceptual Losses for Image to Image Translation with Generative Adversarial Nets”, and further in view of Braun et al. (US PGPUB 2019/0128989 A1).
Claim 1
Subramaniam teaches,
A computer-implemented method, comprising: receiving, by a data processing computer, input data comprising a first input image and a second input image; (Section 3.1 pg 3 “In this work, we propose two architectures for Person Re-Identification. Both of our architectures are a type of “Siamese”-CNN model which take as input two images for matching and outputs the likelihood that the two images contain the same person”) providing, by the data processing computer, the first input image and the second input image as an input to a machine-learning model to determine whether the first input image matches the second input image, the machine-learning model formed by, (abstract “Person Re-Identification is the task of matching images of a person across multiple camera views…we propose two CNN-based architectures for Person Re-Identification. In the first, given a pair of images, we extract feature maps from these images via multiple stages of convolution and pooling” pg 3 “This is the first layer that captures the similarity of the two input images; subsequent layers build on the output of this layer to finally arrive at a decision as to whether the two images are of the same person or not.” The final layer of the machine-learning model determines a match or not between the images.) obtaining, by the data processing computer, an initial training set comprising a first set of original first images of a first type and a second set of original second images of a second type (Section 4.1 “The CUHK03 dataset is a large collection of 13,164 images of 1360 people captured from 6 different surveillance cameras, with each person observed by 2 cameras with disjoint views” The images are different types as they are from different cameras and different views.) generating, by the data processing computer … an augmented training data set (Section 4.2 pg 7 “We therefore, hallucinate positive pairs and enrich the training corpus”) wherein each of the first pairs and each of the second pairs is labeled as match examples, associating at least one second image of the second type and at least one first image of the first type that do not match, to create at least one mismatched pair of images, wherein the at least one mismatched pair is labeled as a mismatch example, and generating the augmented training data set to include the first pairs, the second pairs, and the at least one mismatched pair, (pg 3 Section 3.1 “Both of our architectures are a type of “Siamese”-CNN model which take as input two images for matching and outputs the likelihood that the two images contain the same person.” Pg 6 Section 4.1 “We conducted experiments on the large CUHK03 dataset [3], the mid-sized CUHK01 Dataset [23], and the small QMUL GRID dataset…To do so, they can exploit the training set, consisting of matched and unmatched image pairs.” Pg 7 Section 4.2 “We also augment the data with images reflected on a vertical mirror… Table 1 summarizes the results of the experiments on the CUHK03 Labeled dataset” The training data includes matched and unmatched pairs; the additional examples are merely those additional to the first initial training set. The data set is augmented with additional created pairs via reflection. The data is labeled as matching or non-matching.)
training the machine-learning model to identify whether two input images match or not, using the augmented training data set comprising the first pairs, the second pairs, and the at least one mismatched pair (Section 4.1 “The dataset comes with manual and algorithmically labeled pedestrian bounding boxes…For our models, we use mini-batch sizes of 128 and train our models for about 200,000 iterations…” The matching model uses labeled datasets to train and is thus a supervised learning model. It determines whether two input images match through training. Pg 4 “This is the first layer that captures the similarity of the two input images; subsequent layers build on the output of this layer to finally arrive at a decision as to whether the two images are of the same person or not.” Pg 6 Section 4.1 “We conducted experiments on the large CUHK03 dataset [3], the mid-sized CUHK01 Dataset [23], and the small QMUL GRID dataset…To do so, they can exploit the training set, consisting of matched and unmatched image pairs.”) and executing, by the data processing computer, at least one operation in response to receiving output of the machine-learning model indicating the first input image matches the second input image (pg 3 “This is the first layer that captures the similarity of the two input images; subsequent layers build on the output of this layer to finally arrive at a decision as to whether the two images are of the same person or not.”)
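For illustration of the mapping above, the following is a minimal sketch (Python, standard library only) of assembling a training set of labeled matched and mismatched pairs of the kind Subramaniam describes for a Siamese matcher. The data layout and the name images_by_person are hypothetical assumptions, not Subramaniam's code.

    import itertools
    import random

    def build_pair_dataset(images_by_person):
        """images_by_person: dict mapping person_id -> list of that person's images."""
        pairs = []
        # Positive (match) pairs: two views of the same person, labeled 1.
        for imgs in images_by_person.values():
            for a, b in itertools.combinations(imgs, 2):
                pairs.append((a, b, 1))
        # Negative (mismatch) pairs: views of different persons, labeled 0.
        for pid_a, pid_b in itertools.combinations(list(images_by_person), 2):
            pairs.append((random.choice(images_by_person[pid_a]),
                          random.choice(images_by_person[pid_b]), 0))
        return pairs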
Subramaniam does not explicitly teach, training a neural network using the initial training set by: determining a first transformation function for generating images of the second type from the first original images of the first type, determining a second transformation function for generating images of the first type from the second original images of the second type, verifying the first transformation function by (a) using the first transformation function to transform an original first image of the first type to a transformed image of the second type, (b) providing the transformed image of the second type as an input to the second transformation function to transform the transformed image of the second type to a transformed image of the first type, (c) applying a perceptual loss function on the transformed image of the first type and the original first image of the first type, and (d) determining whether the transformed image of the first type matches the original first image of the first type within a first threshold, verifying the second transformation function by (e) using the second transformation function to transform an original second image of the second type to a transformed image of the first type, (f) providing the transformed image of the first type as an input to the first transformation function to transform the transformed image of the first type to a transformed image of the second type, (g) applying a perceptual loss function on the transformed image of the second type and the original second image of the second type, and (h) determining whether the transformed image of the second type matches the original second image of the second type within a second threshold, wherein the first transformation function and the second transformation function are verified when (i) the transformed image of the first type matches the original first image of the first type within the first threshold and (j) the transformed image of the second type matches the original second image of the second type within the second threshold, and stopping the training the neural network when the first transformation function and the second transformation function are verified, wherein the neural network is different from the machine-learning model, in response to the first transformation function and the second transformation function being verified, … [generating an augmented training data set] using the neural network that is trained… by applying the first transformation function on the first original images from the first set to generate second generated images of the second type, applying the second transformation function on the original second images from the second set to generate first generated images of the first type, associating the second generated images with matching first original images from the first set, to create first pairs of matching images,
Deng, however, when addressing CycleGANs for generating labeled pairs of augmented data sets, teaches, training a neural network using the initial training set by: determining a first transformation function for generating images of the second type from the first original images of the first type, determining a second transformation function for generating images of the first type from the second original images of the second type, verifying the first transformation function by (a) using the first transformation function to transform an original first image of the first type to a transformed image of the second type, (b) providing the transformed image of the second type as an input to the second transformation function to transform the transformed image of the second type to a transformed image of the first type,… verifying the second transformation function by (e) using the second transformation function to transform an original second image of the second type to a transformed image of the first type, (f) providing the transformed image of the first type as an input to the first transformation function to transform the transformed image of the first type to a transformed image of the second type, (Section 3.2.2 pg 4 “Formally, CycleGAN has two generators, i.e., generator G which maps source-domain images to the style of the target domain, and generator F which maps target-domain images to the style of the source domain. Suppose two samples denoted as xS and xT come from the source domain and target domain…In this manner, the network pushes two dissimilar images away. Training pairs are shown in Fig. 1….In the training phase, SPGAN are divided into three components which are learned alternately, the generators, discriminators and SiaNet” The CycleGAN neural network is trained to generate images of both types from both types via the two generators, which are transformation functions. Section 3.1 pg 3 “CycleGAN introduces a cycle-consistent loss, which attempts to recover the original image after a cycle of translation and reverse translation” The CycleGAN also performs reverse translation, meaning the transformed image is provided back through the transformation function as claimed.) and stopping the training the neural network when the first transformation function and the second transformation function are verified, (pg 4 Section 3.2.2 “We train the SPGAN until the convergence or the maximum iterations.” The transformation functions are verified when the training terminates, because training according to the cycle-consistent loss is training to verify the cycle-consistent transformation functions.) wherein the neural network is different from the machine-learning model, in response to the first transformation function and the second transformation function being verified, [generating an augmented training data set] using the neural network that is trained (pg 4 Section 3.3 “Feature learning is the second step of the “learning via translation” framework. Once we have style-transferred dataset G(S) composed of the translated images and their associated labels,… We employ ResNet-50 [17] as the base model” The base model is trained/fine-tuned after verifying and generating the style-transferred data.
The ResNet model is the different machine learning model.) by applying the first transformation function on the first original images from the first set to generate second generated images of the second type, applying the second transformation function on the original second images from the second set to generate first generated images of the first type, (pg 4 Figure 3
[Image: Deng Figure 3, pg 4 (media_image1.png)]
pg 4 Section 3.2.2 “Formally, CycleGAN has two generators, i.e., generator G which maps source-domain images to the style of the target domain, and generator F which maps target-domain images to the style of the source domain… Training pairs are shown in Fig. 1. Some positive pairs are also shown in (a) and (d) of each column in Fig. 4. Overall objective function. The final SPGAN objective can be written as” The CycleGAN generates the images of both types using the two transformation functions.) associating the second generated images with matching first original images from the first set, to create first pairs of matching images, associating the first generated images with matching second original images from the second set, to create second pairs of matching images, (pg 4 Section 3.2.2 “Similarity preserving loss function. We utilize the contrastive loss to train SiaNet…where x1 and x2 are a pair of input vectors, d denotes the Euclidean distance between normalized embeddings of two input vectors, and i represents the binary label of the pair. i = 1 if x1 and x2 are positive pair; i = 0 if x1 and x2 are negative pair…Training pairs are shown in Fig. 1. Some positive pairs are also shown in (a) and (d) of each column in Fig. 4” The labeled pairs used to train the SiaNet include positive and negative pairs, i.e., matching and mismatching example images. The entire dataset is paired, thus including pairs of the source and target domain types as shown in the figure.)
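For illustration of the mapping above, the following is a minimal sketch of using two trained CycleGAN generators (the claimed transformation functions) to create synthetic matched pairs, in the manner attributed to Deng. G and F are hypothetical placeholders for the two trained generators.

    def generate_synthetic_pairs(first_images, second_images, G, F):
        """G: first type -> second type; F: second type -> first type."""
        pairs = []
        for img_a in first_images:
            pairs.append((img_a, G(img_a), 1))  # original + its translation: match
        for img_b in second_images:
            pairs.append((F(img_b), img_b, 1))  # translation + its original: match
        return pairs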
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the person re-identification and matching neural network of Subramaniam to utilize a CycleGAN-based method for augmented data set creation for supervised matching person re-identification as described by Deng. One would have been motivated to make such a combination because both Subramaniam and Deng address dataset augmentation for person re-identification. Subramaniam highlights the need for augmenting data sets: “For almost all the datasets, the number of negative pairs far outnumbers the number of positive pairs in the training set. … Further, the positive samples may not have all the variations likely to be encountered in a real scenario. We therefore, hallucinate positive pairs and enrich the training corpus” (Section 4.2 Subramaniam). Deng further points out that a “real scenario” (i.e., person matching like that described by Subramaniam) may encounter untrained styles of images outside the domain, so it is necessary to produce those images, noting “when models trained on one dataset are directly used on another, the re-ID accuracy drops dramatically due to dataset bias … Therefore, supervised, single-domain re-ID methods may be limited in real-world scenarios, where domain-specific labels are not available… In literature, commonly used style transfer methods include [27, 22, 48, 57]. In this paper, we use CycleGAN [57]” (Introduction Deng).
Subramaniam/Deng does not explicitly teach, (c) applying a perceptual loss function on the transformed image of the first type and the original first image of the first type, and (d) determining whether the transformed image of the first type matches the original first image of the first type within a first threshold,… (g) applying a perceptual loss function on the transformed image of the second type and the original second image of the second type, and (h) determining whether the transformed image of the second type matches the original second image of the second type within a second threshold, … wherein the first transformation function and the second transformation function are verified when (i) the transformed image of the first type matches the original first image of the first type within the first threshold and (j) the transformed image of the second type matches the original second image of the second type within the second threshold,
Qu, however, when addressing the use of perceptual loss functions in image-translation GANs, teaches, (c) applying a perceptual loss function on the transformed image of the first type and the original first image of the first type… applying a perceptual loss function on the transformed image of the second type and the original second image of the second type (pg 1 “In this paper, we introduce perceptual reconstruction losses…We consider perceptual reconstruction losses consist of feature reconstruction loss and style reconstruction loss” pg 4 “So the feature reconstruction loss can be defined as:
[Equation image: feature reconstruction loss, Qu pg 4 (media_image2.png)]
…ℓ_feature^{Φ,j}(x, x̂) means feature reconstruction loss between the original image x and the reconstruction image x̂ in the jth layer of the loss network…Therefore, the objective of generators (Gu and Gv) can be defined as
[Equation image: generator objective, Qu pg 4 (media_image3.png)]
” The feature loss is a function of the original and the transformed/reconstructed image. The loss is defined for both transformations: for the first type, u, and the second type, v.)
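For illustration, the following is a minimal numpy sketch of a feature reconstruction loss of the general form used in perceptual-loss work such as Qu: the squared, size-normalized distance between activations of a fixed loss network at layer j. phi_j is a hypothetical placeholder for that layer's feature extractor; Qu's exact formulation appears in the equation reproduced above.

    import numpy as np

    def feature_reconstruction_loss(phi_j, x, x_hat):
        """phi_j maps an image to a (C_j, H_j, W_j) activation tensor."""
        fx, fx_hat = phi_j(x), phi_j(x_hat)
        c, h, w = fx.shape
        # Squared Euclidean distance between activations, normalized by size.
        return np.sum((fx_hat - fx) ** 2) / (c * h * w)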
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the data augmentation system which uses CycleGAN described by Subramaniam/Deng to utilize the perceptual loss taught by Qu. One would have been motivated to make such a combination at least because, as noted by Qu, “We have made an improvement in reconstruction losses on the basis of DualGAN… We replace naive pixel-level reconstruction losses with perceptual reconstruction losses… Therefore, the perceptual-DualGAN is a flexible framework for cross domain image-to-image translation…. using perceptual DualGAN, showing our framework performs better than baseline models.” (pg 2 Qu).
Subramaniam/Deng/Qu does not explicitly teach (d) determining whether the transformed image of the first type matches the original first image of the first type within a first threshold,…, and (h) determining whether the transformed image of the second type matches the original second image of the second type within a second threshold, … wherein the first transformation function and the second transformation function are verified when (i) the transformed image of the first type matches the original first image of the first type within the first threshold and (j) the transformed image of the second type matches the original second image of the second type within the second threshold,
Braun, however, when combined with Subramaniam/Deng/Qu, addresses determining that the generated image and original image of a GAN match within thresholds, (d) determining whether the transformed image of the first type matches the original first image of the first type within a first threshold,…, and (h) determining whether the transformed image of the second type matches the original second image of the second type within a second threshold, … wherein the first transformation function and the second transformation function are verified when (i) the transformed image of the first type matches the original first image of the first type within the first threshold and (j) the transformed image of the second type matches the original second image of the second type within the second threshold, (Braun paragraph 0061 “In the GAN process, the training of the discriminator network 411 may be done simultaneously with training the generator network 401. The training may be accomplished by performing small gradient steps in both the generator network 401 and discriminator network 411 weights. In an embodiment, the discriminator network 411 may be locked while the generator network 401 is trained so as to lower the accuracy of the discriminator network 411. If the generator network distribution is able to match the real data distribution perfectly or within a threshold amount, then the discriminator network 411 will be maximally confused….the discriminator network 411 is trained until optimal” Matching the real and generated distributions within a threshold amounts to determining that the original and transformed images match as claimed. While Braun describes a single threshold for a single generator, when combined with the teaching of Subramaniam/Deng/Qu, which describe two generators (i.e., two transformation functions), one of ordinary skill would understand the use of a threshold for both generators. Further, Examiner notes that Deng pg 4 states “The final SPGAN objective can be written as
[Equation image: overall SPGAN objective, Deng pg 4 (media_image4.png)]
… We train the SPGAN until the convergence or the maximum iterations”. It is clear that training to convergence necessarily involves training until a threshold condition is met, where the degree of matching is a function of how closely, within some threshold, prior iterations match present iterations. Further still, Examiner notes that the claim does not set limits on what the “threshold” refers to; the determination that images match or not, which occurs within a threshold number of maximum iterations, corresponds to the claimed “within a threshold”.)
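For illustration, the following is a minimal sketch of a training loop that terminates either at a maximum iteration count or when per-direction losses fall within thresholds, combining Deng's "convergence or the maximum iterations" with Braun's within-a-threshold matching. train_step and eval_losses are hypothetical placeholders for one GAN update step and for evaluating the two reconstruction losses.

    def train_until_verified(train_step, eval_losses,
                             threshold_1, threshold_2, max_iters=100_000):
        for it in range(max_iters):
            train_step()                    # one gradient step on the GAN
            loss_1, loss_2 = eval_losses()  # reconstruction loss per direction
            # Both transformation functions verified within their thresholds.
            if loss_1 < threshold_1 and loss_2 < threshold_2:
                return it, True             # stopped before maximum iterations
        return max_iters, False             # stopped at maximum iterations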
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the data augmentation system which uses CycleGAN described by Subramaniam/Deng/Qu to determine that the training of the GAN produces matching data within a distribution as disclosed by Braun. One would have been motivated to make such a combination because both Subramaniam/Deng/Qu and Braun are concerned with training GAN systems capable of emulating real data of a certain type via a transformation function or generator. Braun notes that, in order to train the system optimally, “the discriminator network 411 will be maximally confused, predicting real images for all inputs. In an embodiment, the discriminator network 411 is trained until optimal with respect to the current state of the generator network 401; then, the generator network 401 is again trained and updated” (Braun paragraph 0061).
Claim 2
Subramaniam/Deng/Qu/Braun teaches claim 1
Further Deng teaches, training a first neural network to generate the second generated images of the second type from the first original images of the first set; and training a second neural network to generate the first generated images of the first type from the second original images of the second set (pg 4 Section 3.2.2 “Formally, CycleGAN has two generators, i.e., generator G which maps source-domain images to the style of the target domain, and generator F which maps target-domain images to the style of the source domain… In the training phase, SPGAN are divided into three components which are learned alternately, the generators,” The generators are neural networks that are trained.)
Claim 4
Subramaniam/Deng/Qu/Braun teaches claim 1
Further Deng teaches, wherein in the initial training set the first original images of the first type are unpaired with the second original images of the second type (Section 1 pg 1 “In the baseline approach, two steps are involved. First, labeled images from the source domain are transferred to the target domain, so that the transferred image has a similar style with the target. Second, the style-transferred images and their associated labels are used in supervised learning in the target domain” Section 2 pg 2 “A representative method is the conditional GAN … which using paired training data produces impressive transition results. However, the paired training data is often difficult to acquire. Unpaired image-image translation is thus more applicable. To tackle unpaired settings, a cycle consistency loss is introduced” Figure 6 caption pg 5 “Market images translated to Duke style. We use SPGAN for unpaired image-image translation.” The CycleGAN is introduced and used when the training data is unpaired. The CycleGAN, as described in the baseline approach, is what creates “associated labels” for images that previously did not have target domain labels.)
Claim 10
Subramaniam/Deng/Qu/Braun teaches claim 1
Further Deng teaches, wherein the neural network is a cycle-consistent generative adversarial network. (pg 2 “This paper introduces the Similarity Preserving cycle-consistent Generative Adversarial Network”)
Claim 11
Subramaniam teaches, A data processing computer, comprising: one or more processors; and one or more memories storing computer-executable instructions, wherein executing the computer-executable instructions by the one or more processors causes the data processing computer to: (Section 4.1 pg 6 “The implementation was done in a machine with NVIDIA Titan GPUs and the code was implemented using Torch and is available online”)
Subramaniam/Deng/Qu/Braun teaches the remaining limitations of claim 11 for the reasons set forth in the rejection of claim 1
Claim 13
Subramaniam/Deng/Qu/Braun teaches claim 11
Further Deng teaches, wherein training the neural network includes applying an adversarial loss function. (Section 3.2.1 pg 3 “For generator G and its associated discriminator DT , the adversarial loss is…” Section 3.2.2 “Overall objective function. The final SPGAN objective can be written as… The first three losses belong to the CycleGAN formulation… SPGAN training procedure. In the training phase, SPGAN are divided into three components which are learned alternately, the generators, discriminators and SiaNet” Training according to a loss function corresponds to training via applying a loss function, as known by those of ordinary skill in the art.)
Claim 14
Subramaniam/Deng/Qu/Braun teaches claim 11
Further Deng teaches, wherein the neural network comprises at least two generative networks and at least two corresponding discriminator networks. (Section 3.2.2 “SPGAN training procedure. In the training phase, SPGAN are divided into three components which are learned alternately, the generators, discriminators”)
Claims 9, 12, and 15-20 are rejected under 35 U.S.C. § 103 as being unpatentable over Subramaniam/Deng/Qu/Braun, further in view of Shi et al., “DocFace: Matching ID Document Photos to Selfies”, hereinafter Shi.
Claim 9
Subramaniam/Deng/Qu/Braun teaches claim 1
Subramaniam/Deng/Qu/Braun does not explicitly teach, the first original images comprise
Shi, however, when addressing the matching of images using a classification model, teaches, the first original images comprise (Section 3.2 “Our first ID document-selfie dataset is a private dataset composed of 10,000 pairs of ID Cards photo and selfies….The ID card photos are read from chips in the Chinese Resident Identity Cards. The selfies are from a stationary camera.”)
Accordingly, it would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to modify the neural network system of Subramaniam/Deng/Qu/Braun to be used by the multi-domain facial classification system described by Shi. One would have been motivated to make such a combination because both Subramaniam/Deng/Qu/Braun and Shi address solutions to classification problems when the dataset is limited in size. Shi, in particular, notes the improvements when used for the face matching problem. Shi notes that “by employing the transfer learning technique, …, to train a domain-specific network for ID document photo matching without a large dataset.” (Shi abstract pg 1). Further, Shi notes, “Experiment results show that general face matchers perform poorly on this problem because it involves many different difficulties and DocFace significantly improves the performance of current face matchers on this problem.” (Shi pg 8).
Claim 12
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, causes the data processing computer to collect the first original images utilizing a web crawler. (Section 3.1 “It contains 8,456,240 images of 99,892 subjects (mostly celebrities) downloaded from internet” A web crawler is a tool for downloading information from the internet.)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Claim 15
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, wherein the input data is received from an interface provided by the data processing computer. (Caption Figure 3 pg 2 “Figure 3: An application scenario of the ID document matching system. The kiosk scans the ID document or reads its chip for the face photo and the camera takes another photo of the holder’s live face (selfie) Then, through face recognition, the system decides whether the holder is indeed the owner of the ID document.”)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Claim 16
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, wherein the input data is received from a computing device different from the data processing computer. (Caption Figure 3 pg 2 “Figure 3: An application scenario of the ID document matching system. The kiosk scans the ID document or reads its chip for the face photo and the camera takes another photo of the holder’s live face (selfie) Then, through face recognition, the system decides whether the holder is indeed the owner of the ID document.” The kiosk scanner is different from the neural network recognition system or data processing computer)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Claim 17
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, wherein the first type corresponds to a portrait image, and wherein the first original images are portrait images. (Section 3.2 “Our first ID document-selfie dataset is a private dataset composed of 10,000 pairs of ID Cards photo and selfies. The ID card photos are read from chips in the Chinese Resident Identity Cards”)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Claim 18
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, wherein the second type corresponds to an ID document image, and wherein the second original images are ID document images. (Section 3.2 “Our first ID document-selfie dataset is a private dataset composed of 10,000 pairs of ID Cards photo and selfies. The ID card photos are read from chips in the Chinese Resident Identity Cards”)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Claim 19
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, wherein each of the first original images and each of the second original images comprises at least some portion of a subject's face (Figure 4 caption pg 4 “Example images in each dataset. The left image in each pair in (b) and each row in (c) is the ID photo and on its right are the corresponding selfies” the first and second set of images each include at least a portion of a subject’s face.)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Claim 20
Subramaniam/Deng/Qu/Braun teaches claim 11
Shi teaches, executing the at least one operation in response to receiving output of the machine-learning model indicating the first input image matches the second input image comprises at least one of: approving a transaction or enabling access to a resource or location. (pg 2 “To use the SmartGate, travelers only need to let a machine read their ePassport chips containing their digital photos and then capture their face images using a camera mounted at the SmartGate. After verifying a traveler’s identity by face comparison, the gate is automatically opened for the traveler to enter Australia”)
Subramaniam/Deng/Qu/Braun and Shi are combined for the reasons set forth in the rejection of claim 9
Conclusion
Prior art not relied upon:
Li et al. “BeautyGAN: Instance-level Facial Makeup Transfer with Deep Generative Adversarial Network” describes a GAN system for converting between domains using perceptual and cycle-consistent loss.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOHNATHAN R GERMICK whose telephone number is (571)272-8363. The examiner can normally be reached M-F 7:30-4:30.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached on 571-272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.R.G./
Examiner, Art Unit 2122
/KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122