Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 01/26/2026 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Response to Amendment
Applicant’s amendments filed on 01/26/2026 have been entered and made of record.
Currently pending Claim(s): 1, 2, 4–8 and 11–20
Independent Claim(s): 1 and 20
Cancelled Claim(s): 3, 9 and 10
Response to Applicant’s Arguments
This office action is responsive to Applicant’s Arguments/Remarks Made in an Amendment received on 01/26/2026.
In view of the amendments filed on 01/26/2026 to the specification, the specification objections are withdrawn.
In view of the amendments filed on 01/26/2026 to the drawing, the drawing objections are withdrawn.
In view of Applicant’s Arguments/Remarks and amendments filed on 01/26/2026 with respect to independent claims 1 and 20 under 35 U.S.C. 103, the claim rejections have been fully considered, but the arguments are found not persuasive (see pages 8–10); therefore, the claim rejections under 35 U.S.C. 103 still apply.
Applicant argues, in summary, that the applied prior art (Chui) does not disclose or suggest (see pages 9 and 10):
“receiving as an input to the trained machine learning model a combination of: image characteristics of an input image, wherein the image characteristics, …, a first classification output for the input image”
However, the Examiner respectfully disagrees with Applicant’s line of reasoning. The Examiner has thoroughly reviewed Applicant’s arguments but respectfully maintains that the cited references reasonably and properly meet the claimed limitations.
Applicant argues that Chui does not use image characteristics and a classification output as inputs to the trained model. The Examiner respectfully disagrees. Chui uses images with labels, including classification labels (see Chui, ¶ [0024], “The set of labels may include one or more hierarchical high level labels that provide classification of the object(s) and/or scene comprising an image”) and image-characteristic labels such as noise statistics and contrast, which the Examiner interprets as the claimed image characteristics input, since a noise characteristic is part of the image characteristics (see Chui, ¶ [0024], “Other labels that are not scene-based but instead image-based such as noise statistics (e.g., the number of samples of rays used in ray tracing when rendering) may be associated with an image”; ¶ [0029], “Examples of attributes 306 that may be detected for the set of images 302 include object/scene types and geometries, materials and textures, camera characteristics, lighting characteristics, noise statistics, contrast, etc, …, Attributes 306 identified for the set of images 302 may be employed to automatically label or tag images 302 that do not already have such labels or tags”). The attributes can be used as tags or labels for the other images.
Applicant further argues that Chui does not apply the input to a trained model. The Examiner respectfully disagrees: if a model is trained using multiple sets of training data, then after the first epoch or first iteration of training the model is, by definition, trained when it processes the second dataset; hence, the second epoch or iteration uses a trained model (see the illustrative sketch below). Therefore, Chui uses multiple training sets and inputs image characteristics and a classification output into a trained model. See Chui, ¶ [0029], “In some cases, machine learning based framework 304 is trained on large labeled image datasets”. See also Chui, ¶ [0040], “In some embodiments, training datasets for such a denoising application comprise ray traced snapshots of images at different sampling intervals, with each snapshot labeled with an attribute specifying the number of samples for that snapshot in addition to being labeled with other image attributes”.
Applicant further argues that Chui does not teach the combination of two losses and that Sharma is pertinent to CNN-based image enhancement rather than the limitations of Chui. Respectfully, the Examiner relies on Sharma, not Chui, to teach the combination of two losses: Chui is used to teach the above model, and Sharma is used to teach the combination of losses.
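For illustration only, the Examiner's point that a model is already trained after its first epoch may be sketched as follows; this is a minimal, hypothetical training loop (all names, shapes, and datasets are the Examiner's illustration and are not taken from Chui):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

dataset_a = [(torch.randn(16), torch.randn(4)) for _ in range(32)]
dataset_b = [(torch.randn(16), torch.randn(4)) for _ in range(32)]

# After the pass over dataset_a, `model` has been trained once; the pass over
# dataset_b therefore feeds its inputs to an already-trained model.
for dataset in (dataset_a, dataset_b):
    for x, target in dataset:
        optimizer.zero_grad()
        loss = loss_fn(model(x), target)
        loss.backward()
        optimizer.step()
```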
Therefore, with this broad interpretation, Chui in combination with Sharma teaches, discloses or suggests the Applicant’s invention of inputting image characteristics and a classification output into a machine learning model to produce a visual parameter and training the model using a combination of two losses. Thus, due to the Applicant’s broad claim language, Applicant’s invention is not far removed from the art of record. Accordingly, these limitations do not render the claims patentably distinct over the prior art of record. As a result, it is respectfully submitted that the present application is not in condition for allowance.
Thus, the Examiner maintains that the limitations as presented and as rejected were properly and adequately met. The rejection as presented in the non-final rejection is maintained with regard to the above limitations. Additional citations and/or modified citations may be presented to more concisely address the limitations; however, the grounds of rejection remain the same.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claim(s) 1, 5, 7–8, 11, 15 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma").
Regarding claim 1, Chui teaches an image processing method (See Chui, [Abstract], In some embodiments, a machine learning framework is trained to learn low level image attributes such as object/scene types, geometries, placements, materials and textures, camera characteristics, lighting characteristics, contrast, noise statistics, etc), the method including:
by a computer processing system implementing a trained machine learning model (See Chui, ¶ [0023], FIG. 1 is a high level block diagram of an embodiment of a machine learning based image processing framework 100 for learning attributes associated with datasets.):
receiving as an input to the trained machine learning model (See Chui, ¶ [0023], FIG. 1 is a high level block diagram of an embodiment of a machine learning based image processing framework 100 for learning attributes associated with datasets.) a combination of:
image characteristics of an input image (See Chui, ¶ [0024], Images comprising datasets 104 are tagged with comprehensive sets of labels or metadata. A set of labels defined and/or selected for images of a prescribed dataset may at least in part be application dependent. The set of labels may include one or more hierarchical high level labels that provide classification of the object(s) and/or scene comprising an image, …, Other labels that are not scene-based but instead image-based such as noise statistics (e.g., the number of samples of rays used in ray tracing when rendering) may be associated with an image. Note: Examiner is interpreting the noise statistics as the image characteristics), wherein the image characteristics include variables that change between an image before processing by the trained machine learning model and after the image has been processed by the trained machine learning model (See Chui, ¶ [0038], Generally, image processing applications 508 rely on attribute detection using machine learning framework 501. That is, actual attributes used to generate images 510 are detected using machine learning framework 501 and modified to generate output images 512 having modified attributes. Note: the attributes are modified, which implies the characteristics change before and after processing. The Examiner interprets the model as being trained after one iteration of training; hence it is already trained when the input is applied again); and
a first classification output for the input image, the first classification output relating the input image to a set of image classes (See Chui, ¶ [0024], The set of labels may include one or more hierarchical high level labels that provide classification of the object(s) and/or scene comprising an image); and
generating, through application of the trained machine learning model, at least one visual parameter usable to generate a processed image relative to the input image (See Chui, ¶ [0038], Generally, image processing applications 508 rely on attribute detection using machine learning framework 501. That is, actual attributes used to generate images 510 are detected using machine learning framework 501 and modified to generate output images 512 having modified attributes. Note: Examiner is interpreting the visual parameters as the attributes);
wherein:
[a machine learning model was trained to form the trained machine learning model by a process comprising utilising as a learning objective a reduction or minimisation of a combination of both: i) a first loss, wherein the first loss is a loss between an output image of the machine learning model that applies the at least one visual parameter and a target training image and ii) a second loss, wherein the second loss is a loss between a second classification output, different from the first classification output, and a known classification of the target training image].
However, Chui fail(s) to teach a machine learning model was trained to form the trained machine learning model by a process comprising utilising as a learning objective a reduction or minimisation of a combination of both: i) a first loss, wherein the first loss is a loss between an output image of the machine learning model that applies the at least one visual parameter and a target training image and ii) a second loss, wherein the second loss is a loss between a second classification output, different from the first classification output, and a known classification of the target training image.
Sharma, working in the same field of endeavor, teaches: a machine learning model was trained to form the trained machine learning model by a process comprising utilising as a learning objective a reduction or minimization (See Sharma, [Pg. 4037, Col. 1, ln. 12–19], We believe training our network in this manner, offers a natural way to encourage the filters to apply a transformation that enhances the image structures for an accurate classification, as the classification network is regularized via enhancement networks. Moreover, joint optimization helps minimize the overall cost function of the whole architecture, hence leading to better results) of a combination of both: i) a first loss, wherein the first loss is a loss between an output image of the machine learning model that applies the at least one visual parameter and a target training image (See Sharma, [Pg. 4036, Col. 1, ln. 3–4]. The Stage 1-2 cascade with two loss functions - MSE (enhancement). [Pg. 4035, Col. 2, ln. 24–28], To compare the reconstruction image Y′ with the ideal T, we use MSE loss as a measure of image quality, although we note that more complex loss functions could be used [10]) and ii) a second loss, wherein the second loss is a loss between a second classification output, different from the first classification output, and a known classification of the target training image (See Sharma, [Pg. 4036, Col. 1, ln. 3–7]. The Stage 1-2 cascade with two loss functions - MSE (enhancement) and softmax-loss L (classification) - enables joint optimization by end-to-end propagation of gradients in both ClassNet and EnhanceNet using SGD optimizer. [Pg. 4036, Col. 1, ln. 17–19], ClassNet that is fed to a C-way softmax function, y is the vector of true labels for image I, and C is the number of classes).
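As an illustration of the two-loss joint optimization cited from Sharma, the following minimal sketch trains an enhancement network and a classification network end-to-end with an MSE loss plus a softmax (cross entropy) loss; the module definitions, shapes, and hyperparameters are hypothetical stand-ins and are not taken from the reference:

```python
import torch
import torch.nn as nn

enhance_net = nn.Conv2d(3, 3, kernel_size=3, padding=1)              # stand-in for EnhanceNet
class_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in for ClassNet

mse = nn.MSELoss()                    # first loss: output image vs. target training image
softmax_loss = nn.CrossEntropyLoss()  # second loss: classification output vs. true label

params = list(enhance_net.parameters()) + list(class_net.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)

image = torch.randn(1, 3, 32, 32)         # input image
target_image = torch.randn(1, 3, 32, 32)  # ideal/target image T
true_label = torch.tensor([4])            # known classification of the target image

enhanced = enhance_net(image)
loss = mse(enhanced, target_image) + softmax_loss(class_net(enhanced), true_label)
loss.backward()   # end-to-end propagation of gradients through both networks
optimizer.step()
```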
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference such that the machine learning model was trained utilising as a learning objective a reduction or minimisation of a combination of both the first loss and the second loss, based on the method of Sharma’s reference. The suggestion/motivation would have been to accurately enhance the visual features of the image and more accurately classify the image (See Sharma, [Pg. 4040, Col. 2, ln. 27–45] and [Table 5]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Sharma with Chui to obtain the invention as specified in claim 1.
Regarding claim 5, Chui teaches the method of claim 1, wherein the trained machine learning model is a first trained machine learning model (See Chui, ¶ [0032], image processing architecture 500 includes many components that have separately been described in detail with respect to FIGS. 1-4. Machine learning framework 501 (which may comprise framework 100 of FIG. 1) is the foundation of image processing architecture 500 and trains on large datasets, e.g., which may at least in part be generated from available three-dimensional models, to learn attributes associated with the datasets) and the first classification output comprises an output of a second trained machine learning model, different to the first trained machine learning model, wherein the second trained machine learning model is trained to classify images into one of a plurality of scene classes (See Chui, ¶ [0025], Training 106 on a dataset 104, for example, using any combination of one or more appropriate machine learning techniques such as deep neural networks and convolutional neural networks, results in a set of one or more low level properties or attributes 110 associated with the dataset 104 to be learned, …, In various embodiments, different training models may be used to learn different attributes. Note: the use of different training models for different attributes implies a first and a second model).
Regarding claim 7, Chui teaches the method of claim 1, wherein the at least one visual parameter comprises one or more of: (i) brightness, (ii) contrast, (iii) saturation, (iv) vibrance, (v) whites, (vi) blacks, (vii) shadows and (viii) highlights (See Chui, ¶ [0023], FIG. 1 is a high level block diagram of an embodiment of a machine learning based image processing framework 100 for learning attributes associated with datasets. ¶ [0025], Such attributes may be derived or inferred from labels of the dataset 104. Examples of attributes that may be learned include attributes associated with object/scene types and geometries, materials and textures, camera characteristics, lighting characteristics, noise statistics, contrast (e.g., global and/or local image contrast defined by prescribed metrics, which may be based on, for instance, maximum and minimum pixel intensity values), etc. Note: Examiner is interpreting one of the attributes (e.g., object/scene type) as a classification, and more than one attribute can be learned, such as a saturation enhancement parameter.).
Regarding claim 8, Chui in view of Sharma teaches the method of claim 1, [wherein the first loss is a mean square error loss between the output image and the target training image, and/or the second loss is a multi-class cross entropy loss between the second classification output and the known classification of the target training image].
However, Chui fail(s) to teach wherein the first loss is a mean square error loss between the output image and the target training image, and/or the second loss is a multi-class cross entropy loss between the second classification output and the known classification of the target training image.
Sharma, working in the same field of endeavor, teaches: wherein the first loss is a mean square error loss between the output image and the target training image (See Sharma, [Pg. 4036, Col. 1, ln. 3–4]. The Stage 1-2 cascade with two loss functions - MSE (enhancement). [Pg. 4035, Col. 2, ln. 24–28], To compare the reconstruction image Y′ with the ideal T, we use MSE loss as a measure of image quality, although we note that more complex loss functions could be used [10]), and/or the second loss is a multi-class cross entropy loss between the second classification output and the known classification of the target training image (See Sharma, [Pg. 4036, Col. 1, ln. 3–7]. The Stage 1-2 cascade with two loss functions - MSE (enhancement) and softmax-loss L (classification) - enables joint optimization by end-to-end propagation of gradients in both ClassNet and EnhanceNet using SGD optimizer. [Pg. 4036, Col. 1, ln. 17–19], ClassNet that is fed to a C-way softmax function, y is the vector of true labels for image I, and C is the number of classes. Note: softmax loss is known as a combination of cross entropy and softmax activation function. The examiner is interpreting the softmax loss as the cross entropy loss).
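For clarity of the interpretation above, the two losses cited from Sharma may be written as follows; the notation is the Examiner's illustration, where Y′ is the output/reconstructed image, T the target image over N pixels, y the one-hot vector of true labels, z the classifier logits, and C the number of classes:

```latex
\mathcal{L}_{\mathrm{MSE}} = \frac{1}{N}\sum_{i=1}^{N}\left\lVert Y'_i - T_i \right\rVert^{2},
\qquad
\mathcal{L}_{\mathrm{softmax}} = -\sum_{c=1}^{C} y_{c}\,\log\frac{e^{z_{c}}}{\sum_{k=1}^{C} e^{z_{k}}}
```

The second expression is the multi-class cross entropy applied to softmax-normalized outputs, consistent with the interpretation of the softmax loss above.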
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the first loss is a mean square error loss between the output image and the target training image, and/or the second loss is a multi-class cross entropy loss between the second classification output and the known classification of the target training image, based on the method of Sharma’s reference. The suggestion/motivation would have been to accurately enhance the visual features of the image and more accurately classify the image (See Sharma, [Pg. 4040, Col. 2, ln. 27–45] and [Table 5]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Sharma with Chui to obtain the invention as specified in claim 8.
Regarding claim 11, Chui teaches the method of claim 1, wherein the machine learning model comprises a first multilayer perceptron configured to provide the at least one visual parameter and a second multilayer perceptron, configured to provide the second classification output (See Chui, ¶ [0025], Training 106 on a dataset 104, for example, using any combination of one or more appropriate machine learning techniques such as deep neural networks and convolutional neural networks, results in a set of one or more low level properties or attributes 110 associated with the dataset 104 to be learned. Such attributes may be derived or inferred from labels of the dataset 104. Examples of attributes that may be learned include attributes associated with object/scene types and geometries, materials and textures, camera characteristics, lighting characteristics, noise statistics, contrast (e.g., global and/or local image contrast defined by prescribed metrics, which may be based on, for instance, maximum and minimum pixel intensity values), etc, …, In various embodiments, different training models may be used to learn different attributes).
Regarding claim 15, Chui teaches the method of claim 1, further comprising, by the computer processing system, applying the at least one visual parameter to generate a processed image relative to the input image (See Chui, ¶ [0038], Generally, image processing applications 508 rely on attribute detection using machine learning framework 501. That is, actual attributes used to generate images 510 are detected using machine learning framework 501 and modified to generate output images 512 having modified attributes. Note: Examiner is interpreting the visual parameters as the attributes).
Regarding claim 20, claim 20 is rejected on the same grounds as claim 1, and the arguments presented above for claim 1 are equally applicable to claim 20; all of the other limitations similar to claim 1 are not repeated herein but are incorporated by reference. Furthermore, Chui teaches non-transitory computer-readable storage storing instructions for a computer processing system, wherein the instructions, when executed by the computer processing system, cause the computer processing system to perform a method comprising (See Chui, ¶ [0017], [T]he invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor).
Claim(s) 2 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma") and further in view of Chen et al. (CN 113763296 A, hereafter, "Chen").
Regarding claim 2, Chui in view of Sharma teaches the method of claim 1, [wherein the image characteristics define colour and brightness semantics of the input image, or comprise data representing a colour histogram of the input image, for example an RGB histogram].
However, Chui and Sharma fail(s) to teach wherein the image characteristics define colour and brightness semantics of the input image, or comprise data representing a colour histogram of the input image, for example an RGB histogram.
Chen, working in the same field of endeavor, teaches: wherein the image characteristics define colour and brightness semantics of the input image, or comprise data representing a colour histogram of the input image, for example an RGB histogram (See Chen, ¶ [0115], Optionally, the computer device obtains the to-be-processed video frame corresponding to the global colour characteristic, obtaining the to-be-processed video frame corresponding to the image semantic characteristic of the specific way may include: adjusting the size of the video frame to be processed; obtaining the candidate video frame with the target size; according to the color histogram corresponding to the candidate video frame, obtaining the global color characteristic corresponding to the video frame to be processed).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the image characteristics define colour and brightness semantics of the input image, or comprise data representing a colour histogram of the input image, for example an RGB histogram, based on the method of Chen’s reference. The suggestion/motivation would have been to accurately manipulate the features of the image (See Chen, ¶ [0002–0004]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Chen with Chui and Sharma to obtain the invention as specified in claim 2.
Regarding claim 6, Chui in view of Sharma teaches the method of claim 1, [wherein the combination of image characteristics of the input image and the first classification output for the input image is a concatenation of data defining the image characteristics of the input image and the first classification output for the input image].
However, Chui and Sharma fail(s) to teach wherein the combination of image characteristics of the input image and the first classification output for the input image is a concatenation of data defining the image characteristics of the input image and the first classification output for the input image.
Chen, working in the same field of endeavor, teaches: wherein the combination of image characteristics of the input image and the first classification output for the input image is a concatenation of data defining the image characteristics of the input image and the first classification output for the input image (See Chen, ¶ [0131], The global color characteristic extracted by the Lab color histogram has 8000 dimensions; the image semantic feature extracted by the mobileNetV2 has 1280 dimensions; after splicing the global color feature and the image semantic feature, it can generate 9280-dimensional target image feature).
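As an illustration of the splicing (concatenation) cited from Chen, the following minimal sketch concatenates a global colour feature with an image semantic feature; the dimensions follow Chen's example, while the variable names are hypothetical:

```python
import numpy as np

color_histogram = np.zeros(8000)   # global colour characteristic (Lab colour histogram)
semantic_feature = np.zeros(1280)  # image semantic feature (e.g., from mobileNetV2)

# Splicing the two features yields the 9280-dimensional target image feature.
target_image_feature = np.concatenate([color_histogram, semantic_feature])
assert target_image_feature.shape == (9280,)  # 8000 + 1280
```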
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the combination of image characteristics of the input image and the first classification output for the input image is a concatenation of data defining the image characteristics of the input image and the first classification output for the input image, based on the method of Chen’s reference. The suggestion/motivation would have been to accurately manipulate the features of the image (See Chen, ¶ [0002–0004]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Chen with Chui and Sharma to obtain the invention as specified in claim 6.
Claim(s) 4 is rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma"), further in view of Chen et al. (CN 113763296 A, hereafter, "Chen"), and further in view of Wang et al. (US 20190139262 A1, hereafter, "Wang").
Regarding claim 4, Chui in view of Sharma further in view of Chen teaches the method of claim 1, [wherein the first classification output comprises a feature vector determined by another trained machine learning model].
However, Chui, Sharma and Chen fail(s) to teach wherein the first classification output comprises a feature vector determined by another trained machine learning model.
Wang, working in the same field of endeavor, teaches: wherein the first classification output comprises a feature vector determined by another trained machine learning model (See Wang, ¶ [0052], Step 2, extracting a vehicle type feature x.sub.1 and a color feature x.sub.2, particularly, in this embodiment, 128 dimensional vectors output by a vehicle model classification depth convolution neural network output layer are taken as the extracted vehicle type feature).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the first classification output comprises a feature vector determined by another trained machine learning model, based on the method of Wang’s reference. The suggestion/motivation would have been to extract more features to accurately process and analyze (See Wang, ¶ [0003–0011]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Wang with Chui, Sharma and Chen to obtain the invention as specified in claim 4.
Claim(s) 12 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma") and further in view of Gao et al. (See NPL attached, "NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction", hereafter, "Gao").
Regarding claim 12, Chui in view of Sharma teaches the method of claim 11, [wherein the machine learning model comprises a third multilayer perceptron, the third multilayer perceptron configured to reduce the dimensionality of the input to the trained machine learning model, wherein the first multilayer perceptron and the second multilayer perceptron are both attached to the third multilayer perceptron].
However, Chui and Sharma fail(s) to teach wherein the machine learning model comprises a third multilayer perceptron, the third multilayer perceptron configured to reduce the dimensionality of the input to the trained machine learning model, wherein the first multilayer perceptron and the second multilayer perceptron are both attached to the third multilayer perceptron.
Gao, working in the same field of endeavor, teaches: wherein the machine learning model comprises a third multilayer perceptron, the third multilayer perceptron configured to reduce the dimensionality of the input to the trained machine learning model, wherein the first multilayer perceptron and the second multilayer perceptron are both attached to the third multilayer perceptron (See Gao, [Pg. 3208, Col. 1, ln. 22-30], Figure 1 shows the NDDR-CNN network structure for two tasks. It can easily be extended to K-task problems. Let the number of channels for the single-task features be D. Then NDDR-CNN for K tasks can be constructed by: 1) concatenating the features from K tasks according to the channel dimension, and 2) using 1 × 1 convolution with (filters × 1 × 1 × channels) = (C × 1 × 1 × KC) to conduct dimensionality reduction, where C is the channel dimension size of the output features from each task. See also [Figure 1], Single-Task Network, NDDR layers. Note: Examiner is interpreting the NDDR reduction layer as the third multilayer perceptron).
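As an illustration of the NDDR-style fusion cited from Gao (K = 2 tasks), the following minimal sketch concatenates the per-task features along the channel dimension and applies a 1 × 1 convolution with C filters to reduce the KC channels back to C; the tensor sizes are hypothetical, and only the concatenate-then-1 × 1-convolution pattern follows the reference:

```python
import torch
import torch.nn as nn

C, K = 64, 2                                # channels per task, number of tasks
task1_features = torch.randn(1, C, 16, 16)  # features from the task 1 network
task2_features = torch.randn(1, C, 16, 16)  # features from the task 2 network

fused = torch.cat([task1_features, task2_features], dim=1)  # K*C channels
nddr = nn.Conv2d(K * C, C, kernel_size=1)   # (C x 1 x 1 x KC) dimensionality reduction
reduced = nddr(fused)
assert reduced.shape == (1, C, 16, 16)
```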
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the machine learning model comprises a third multilayer perceptron, the third multilayer perceptron configured to reduce the dimensionality of the input to the trained machine learning model, wherein the first multilayer perceptron and the second multilayer perceptron are both attached to the third multilayer perceptron, based on the method of Gao’s reference. The suggestion/motivation would have been to improve the processing of multiple tasks while maintaining high accuracy (See Gao, [Pg. 3212, Col. 2, ln. 20–43] and [Table 7]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Gao with Chui and Sharma to obtain the invention as specified in claim 12.
Regarding claim 13, Chui teaches the method of claim 12, wherein the first multilayer perceptron and the second multilayer perceptron and the third multilayer perceptron comprise a convolutional neural network (See Chui, ¶ [0025], Training 106 on a dataset 104, for example, using any combination of one or more appropriate machine learning techniques such as deep neural networks and convolutional neural networks, results in a set of one or more low level properties or attributes 110 associated with the dataset 104 to be learned, …, In various embodiments, different training models may be used to learn different attributes).
Claim(s) 14 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma") and further in view of Pettigrew et al. (US 20160284070 A1, hereafter, "Pettigrew").
Regarding claim 14, Chui in view of Sharma teaches the method of claim 1, [wherein the at least one visual parameter corresponds to a visual parameter that is adjustable by a slider in a photo editing application].
However, Chui and Sharma fail(s) to teach wherein the at least one visual parameter corresponds to a visual parameter that is adjustable by a slider in a photo editing application.
Pettigrew, working in the same field of endeavor, teaches: wherein the at least one visual parameter corresponds to a visual parameter that is adjustable by a slider in a photo editing application (See Pettigrew, ¶ [0040], The example system or method receives user input via one or more sliders (a user either moves the slider to the left or to the right), analyzes the image, and enhances the image based on the user input and the analysis. The enhancement is determined based on the image analysis and the details of the image (e.g. light, shadows, colors, contrast, etc.)).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the at least one visual parameter corresponds to a visual parameter that is adjustable by a slider in a photo editing application, based on the method of Pettigrew’s reference. The suggestion/motivation would have been to provide more control and a more desirable visualization (See Pettigrew, ¶ [0014]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Pettigrew with Chui and Sharma to obtain the invention as specified in claim 14.
Regarding claim 16, Chui in view of Sharma teaches the method of claim 15, further comprising, by the computer processing system, [causing display on a display device a graphical user interface, wherein the graphical user interface is configured to allow the user to further adjust at least one said visual parameter of the processed image].
However, Chui and Sharma fail(s) to teach causing display on a display device a graphical user interface, wherein the graphical user interface is configured to allow the user to further adjust at least one said visual parameter of the processed image.
Pettigrew, working in the same field of endeavor, teaches: causing display on a display device a graphical user interface, wherein the graphical user interface is configured to allow the user to further adjust at least one said visual parameter of the processed image (See Pettigrew, ¶ [0040], The example system or method receives user input via one or more sliders (a user either moves the slider to the left or to the right), analyzes the image, and enhances the image based on the user input and the analysis. The enhancement is determined based on the image analysis and the details of the image (e.g. light, shadows, colors, contrast, etc.)).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference by causing display on a display device a graphical user interface, wherein the graphical user interface is configured to allow the user to further adjust at least one said visual parameter of the processed image, based on the method of Pettigrew’s reference. The suggestion/motivation would have been to provide more control and a more desirable visualization (See Pettigrew, ¶ [0014]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Pettigrew with Chui and Sharma to obtain the invention as specified in claim 16.
Claim(s) 17 is rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma") and further in view of Saeed Rad et al. (See NPL attached, "Benefiting from multitask learning to improve single image super-resolution", hereafter, "Saeed Rad").
Regarding claim 17, Chui in view of Sharma teaches the method of claim 1, [wherein the machine learning was trained based on a plurality of image pairs, each image pair comprising a target training image and a degraded image, the degraded image used to generate the output image of the machine learning model during training].
However, Chui and Sharma fail(s) to teach wherein the machine learning was trained based on a plurality of image pairs, each image pair comprising a target training image and a degraded image, the degraded image used to generate the output image of the machine learning model during training.
Saeed Rad, working in the same field of endeavor, teaches: wherein the machine learning was trained based on a plurality of image pairs, each image pair comprising a target training image and a degraded image, the degraded image used to generate the output image of the machine learning model during training (See Saeed Rad, [Pg. 307, Col. 1, ln. 38-40], Training the proposed network in a supervised manner requires a considerable number of training examples with ground-truth for both semantic segmentation and super resolution tasks. Note: The examiner is interpreting the ground truth as the target training image and training examples are the degraded (lower quality) image).
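Consistent with the interpretation above, the following minimal sketch forms a (degraded, ground-truth) training pair; the particular degradation used here (downsampling plus additive noise) is an assumption for illustration only and is not a teaching of Saeed Rad:

```python
import torch
import torch.nn.functional as F

def make_pair(target_image: torch.Tensor):
    # Degrade the ground-truth image: downsample, upsample back, add noise.
    small = F.interpolate(target_image, scale_factor=0.5, mode="bilinear")
    degraded = F.interpolate(small, size=target_image.shape[-2:], mode="bilinear")
    degraded = degraded + 0.05 * torch.randn_like(degraded)
    return degraded, target_image  # (training example, ground truth)

ground_truth = torch.rand(1, 3, 64, 64)
degraded, target = make_pair(ground_truth)
```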
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the machine learning was trained based on a plurality of image pairs, each image pair comprising a target training image and a degraded image, the degraded image used to generate the output image of the machine learning model during training, based on the method of Saeed Rad’s reference. The suggestion/motivation would have been to more accurately produce high quality images of known and unknown categories (See Saeed Rad, [Pg. 310, Col. 2, 4.5 Results on standard benchmarks]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Saeed Rad with Chui and Sharma to obtain the invention as specified in claim 17.
Claim(s) 18 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chui et al. (US 20190043210 A1, hereafter, "Chui") in view of Sharma et al. (See NPL attached, "Classification-Driven Dynamic Image Enhancement", hereafter, "Sharma"), further in view of Saeed Rad et al. (See NPL attached, "Benefiting from multitask learning to improve single image super-resolution", hereafter, "Saeed Rad"), and further in view of Barbosa et al. (US 11,341,367 B1, hereafter, "Barbosa").
Regarding claim 18, Chui in view of Sharma further in view of Saeed Rad teaches the method of claim 17, wherein:
[a first image pair of the plurality of image pairs is associated with a first class and the degraded image of the first image pair was generated by applying a first degradation model to the target training image of the first image pair;
a second image pair of the plurality of image pairs is associated with a second class and the degraded image of the second image pair was generated by applying a second degradation model to the target training image of the second image pair;
the first image pair is different to the second image pair and the first degradation model is different to the second degradation model].
However, Chui and Sharma fail(s) to teach a first image pair of the plurality of image pairs is associated with a first class and the degraded image of the first image pair; a second image pair of the plurality of image pairs is associated with a second class and the degraded image of the second image pair.
Saeed Rad, working in the same field of endeavor, teaches: a first image pair of the plurality of image pairs is associated with a first class and the degraded image of the first image pair (See Saeed Rad, [Pg. 307, Col. 1, ln. 38-40], Training the proposed network in a supervised manner requires a considerable number of training examples with ground-truth for both semantic segmentation and super resolution tasks. Note: The examiner is interpreting the ground truth as the target training image and training examples are the degraded (lower quality) image);
a second image pair of the plurality of image pairs is associated with a second class and the degraded image of the second image pair (See Saeed Rad, [Pg. 307, Col. 1, ln. 38-40], Training the proposed network in a supervised manner requires a considerable number of training examples with ground-truth for both semantic segmentation and super resolution tasks. Note: The examiner is interpreting the ground truth as the target training image and training examples are the degraded (lower quality) image and there are more than one examples).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference such that a first image pair of the plurality of image pairs is associated with a first class and the degraded image of the first image pair, and a second image pair of the plurality of image pairs is associated with a second class and the degraded image of the second image pair, based on the method of Saeed Rad’s reference. The suggestion/motivation would have been to more accurately produce high quality images of known and unknown categories (See Saeed Rad, [Pg. 310, Col. 2, 4.5 Results on standard benchmarks]).
However, Chui, Sharma and Saeed Rad fail(s) to teach was generated by applying a first degradation model to the target training image of the first image pair; was generated by applying a second degradation model to the target training image of the second image pair; the first image pair is different to the second image pair and the first degradation model is different to the second degradation model.
Barbosa, working in the same field of endeavor, teaches: was generated by applying a first degradation model to the target training image of the first image pair (See Barbosa, [Pg. 15, Col. 8, ln. 15-23], The synthetic training data generator then generates many synthetic images 132 (and corresponding labels 134 and/or algorithm metadata 133) by overlaying one or more of the object images 204A-204M (or variants thereof, for example images resulting from being transformed or filtered in some manner as described herein) over ones of the background images 254A-204N (or variants thereof, again resulting from being transformed or filtered in some manner as described herein). Note: the synthetic images are generated using class labels, and the Examiner is interpreting the combination of the images as a degradation);
was generated by applying a second degradation model to the target training image of the second image pair (See Barbosa, [Pg. 15, Col. 8, ln. 15-23], The synthetic training data generator then generates many synthetic images 132 (and corresponding labels 134 and/or algorithm metadata 133) by overlaying one or more of the object images 204A-204M (or variants thereof, for example images resulting from being transformed or filtered in some manner as described herein) over ones of the background images 254A-204N (or variants thereof, again resulting from being transformed or filtered in some manner as described herein). Note: the synthetic images are generated using class labels, and the Examiner is interpreting the combination of the images as a degradation);
the first image pair is different to the second image pair and the first degradation model is different to the second degradation model (See Barbosa, [Pg. 15, Col. 8, ln. 15-23], The synthetic training data generator then generates many synthetic images 132 (and corresponding labels 134 and/or algorithm metadata 133) by overlaying one or more of the object images 204A-204M (or variants thereof, for example images resulting from being transformed or filtered in some manner as described herein) over ones of the background images 254A-204N (or variants thereof, again resulting from being transformed or filtered in some manner as described herein). Note: different combinations are used to generate different images and degradations).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference such that the degraded image was generated by applying a first degradation model to the target training image of the first image pair and by applying a second degradation model to the target training image of the second image pair, wherein the first image pair is different to the second image pair and the first degradation model is different to the second degradation model, based on the method of Barbosa’s reference. The suggestion/motivation would have been to provide time-efficient data generation and increased training performance (See Barbosa, [Pg. 12, Col. 1, ln. 6–34]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Saeed Rad and Barbosa with Chui and Sharma to obtain the invention as specified in claim 18.
Regarding claim 19, Chui in view of Sharma, further in view of Saeed Rad, and further in view of Barbosa teaches the method of claim 18, [wherein the first degradation model and not the second degradation model was selected for the first image pair due to the association of the first image pair with the first class and not the second class and the second degradation model and not the first degradation model was selected for the second image pair due to the association of the second image pair with the second class and not the first class].
However, Chui, Sharma and Saeed Rad fail(s) to teach wherein the first degradation model and not the second degradation model was selected for the first image pair due to the association of the first image pair with the first class and not the second class and the second degradation model and not the first degradation model was selected for the second image pair due to the association of the second image pair with the second class and not the first class.
Barbosa, working in the same field of endeavor, teaches: wherein the first degradation model and not the second degradation model was selected for the first image pair due to the association of the first image pair with the first class and not the second class and the second degradation model and not the first degradation model was selected for the second image pair due to the association of the second image pair with the second class and not the first class (See Barbosa, [Pg. 15, Col. 8, ln. 15-23], The synthetic training data generator then generates many synthetic images 132 (and corresponding labels 134 and/or algorithm metadata 133) by overlaying one or more of the object images 204A-204M (or variants thereof, for example images resulting from being transformed or filtered in some manner as described herein) over ones of the background images 254A-204N (or variants thereof, again resulting from being transformed or filtered in some manner as described herein). [Pg. 12, Col. 2, ln. 53-57], Accordingly, in some embodiments a synthetic training data generator generates high-quality synthetic training data by merging images from a set of background images with images from a set of user-provided or user-specified images (depicting objects of interest). Note: the user-specified classes imply that the images are inherently degraded based on the classes).
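Consistent with the interpretation above, the following minimal sketch selects a degradation model according to the class associated with an image pair; the specific degradations and class names are hypothetical and are not teachings of Barbosa:

```python
import torch
import torch.nn.functional as F

def blur(img):       # first degradation model
    return F.avg_pool2d(img, kernel_size=3, stride=1, padding=1)

def add_noise(img):  # second degradation model
    return img + 0.05 * torch.randn_like(img)

degradation_by_class = {"first_class": blur, "second_class": add_noise}

def make_class_pair(target_image, image_class):
    # The degradation model is selected due to the pair's class association.
    degrade = degradation_by_class[image_class]
    return degrade(target_image), target_image

pair_one = make_class_pair(torch.rand(1, 3, 64, 64), "first_class")
pair_two = make_class_pair(torch.rand(1, 3, 64, 64), "second_class")
```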
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Chui’s reference wherein the first degradation model and not the second degradation model was selected for the first image pair due to the association of the first image pair with the first class and not the second class, and the second degradation model and not the first degradation model was selected for the second image pair due to the association of the second image pair with the second class and not the first class, based on the method of Barbosa’s reference. The suggestion/motivation would have been to provide time-efficient data generation and increased training performance (See Barbosa, [Pg. 12, Col. 1, ln. 6–34]).
Further, one skilled in the art could have combined the elements as described above by known method with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Saeed Rad and Barbosa with Chui and Sharma to obtain the invention as specified in claim 19.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Kuo et al. (US 20210150764 A1) teaches an object detection method, an electronic apparatus and an object detection system are provided. The method is adapted to the electronic apparatus and includes the following steps. A first image is obtained. A geometric transformation operation is performed on the first image to obtain at least one second image. The first image and the at least one second image are combined to generate a combination image. The combination image including the first image and the at least one second image is inputted into a trained deep learning model to detect a target object.
Dhua et al. (US 10049308 B1) teaches training images can be synthesized in order to obtain enough data to train a convolutional neural network to recognize various classes of a type of item. Images can be synthesized by blending images of items labeled using those classes into selected background images. Catalog images can represent items against a solid background, which can be identified using connected components or other such approaches. Removing the background using such approaches can result in edge artifacts proximate the item region. To improve the results, one or more operations are performed, such as a morphological erosion operation followed by an opening operation. The isolated item portion then can be blended into a randomly selected background region in order to generate a synthesized training image. The training images can be used with real world images to train the neural network.
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DION J SATCHER whose telephone number is (703)756-5849. The examiner can normally be reached Monday - Thursday 5:30 am - 2:30 pm, Friday 5:30 am - 9:30 am PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henok Shiferaw can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DION J SATCHER/Patent Examiner, Art Unit 2676
/Henok Shiferaw/Supervisory Patent Examiner, Art Unit 2676