Prosecution Insights
Last updated: April 19, 2026
Application No. 18/484,122

INFRARED AND OTHER COLORIZATION USING GENERATIVE NEURAL NETWORKS

Final Rejection §103

Filed: Oct 10, 2023
Examiner: OMETZ, RACHEL ANNE
Art Unit: 2668
Tech Center: 2600 — Communications
Assignee: Nvidia Corporation
OA Round: 2 (Final)

Grant Probability: 69% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 11m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 69% (above average; 18 granted / 26 resolved; +7.2% vs TC avg)
Interview Lift: +30.1% (strong), across resolved cases with interview
Avg Prosecution: 2y 11m typical; 24 currently pending
Total Applications: 50, across all art units
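The headline figures above are simple ratios over this examiner's resolved cases. Assuming the dashboard computes them the obvious way (the counts come from this page; the formulas and the Tech Center average are inferences, not documented by the tool), the arithmetic works out as:

```python
# Reconstructing the examiner metrics from the displayed counts.
# The Tech Center average (62.0%) is back-derived from the +7.2% delta
# shown on the page, not an independently known figure.
granted = 18       # applications granted by this examiner
resolved = 26      # resolved cases (granted + abandoned)
tc_avg = 62.0      # assumed Tech Center average allow rate, in percent

allow_rate = 100 * granted / resolved   # career allow rate
delta_vs_tc = allow_rate - tc_avg       # comparison to TC average

print(f"Career allow rate: {allow_rate:.0f}%")  # Career allow rate: 69%
print(f"vs TC avg: {delta_vs_tc:+.1f}%")        # vs TC avg: +7.2%
```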

Statute-Specific Performance

§101: 3.1% (-36.9% vs TC avg)
§103: 62.1% (+22.1% vs TC avg)
§102: 18.8% (-21.2% vs TC avg)
§112: 14.7% (-25.3% vs TC avg)
Tech Center averages are estimates • Based on career data from 26 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Status

Claims 1-20 were pending for examination in Application No. 18/484,122, filed October 10th, 2023. In the remarks and amendments received on December 9th, 2025, claims 1-5, 9, 11-15, and 19 are amended, no claims are cancelled, and no claims are added. Accordingly, claims 1-20 are currently pending for examination in the application.

Response to Amendment

Applicant’s amendments filed December 9th, 2025, have overcome some of the objections previously set forth in the Non-Final Office Action mailed October 10th, 2025. Accordingly, the objection is partially withdrawn.

Response to Arguments

Applicant’s arguments filed December 9th, 2025, with respect to the rejection of claims 1, 11, and 19, have been fully considered but are moot because the arguments do not apply to the new combination of references, facilitated by Applicant’s newly submitted amendments, being used in the current rejection.

Claim Objections

Claims 5, 15, and 19 are objected to because of the following informalities: the acronym "RGB" should be spelled out completely at its first mention in each claim (or changed to a different phrase, such as “color” or “visible light”). Claims 3, 4, 9, 13, and 14 are objected to because of the following informalities: “generate a fill” or “generate the fill” should be “modify a fill” or “modify the fill” (as claimed in claim 1) for consistency. Appropriate correction is required.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1, 5, 10-11, 15, and 18-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, and further in view of Klomp et al. (DE-102018216806-A1).

Regarding claim 1, Li teaches a processor comprising: one or more processing units to: generate infrared image data (“an infrared-to-visible (I2V) video translation method,” using a GAN network, i.e., processing units, see Abstract); generate, based at least on applying a representation of the infrared image data to a generator of a generative adversarial network (Fig. 1, “I2V-GAN”), RGB image data corresponding to the infrared image data (Fig. 1, “translate videos from the source domain X to the target domain Y”, where X is the infrared domain and Y is the RGB/visible light domain); and modify a fill for one or more segmented regions (Fig. 1, first row; the sky, a building, the ground, etc.) of the synthesized color image data (Fig. 1, color in the “infrared” (first row) domain is modified to enter the “visible” (second row) domain). Li fails to teach the following limitations as further claimed.
However, Klomp further teaches generat[ing] infrared image data representing an interior space (“taking infrared images of a user, in particular a driver of a motor vehicle,” Para [0017]) of an ego-machine (“main use case for such a camera is in automated driving, where it checks whether the driver can resume the driving task,” Para [0003]). Klomp is considered to be analogous to the claimed invention because they are both in the same field of colorizing infrared images representing cabin spaces of vehicles. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Klomp into Li for the benefit of better detecting the driver in a vehicle regardless of the lighting inside or outside of the vehicle.

Regarding claim 5, Li in view of Klomp teaches the processor of claim 1, the one or more processing units further to train the generative adversarial network (Fig. 3, “I2V-GAN network architecture”) based at least on training a first branch that chains the generator (Fig. 3, “GY”) of RGB image data (Fig. 3, “ȳt+1”) followed by a generator of infrared image data (Fig. 3, “GX”), and a second branch that chains the generator of infrared image data followed by the generator of RGB image data (Fig. 3 caption, “The opposite direction Y → X is similar”, i.e., the positions of the generators can be swapped).

Regarding claim 10, Li teaches the processor of claim 1, wherein the processor is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations (Fig. 8, the flower life cycle in images is created using I2V-GAN, where GANs are a type of deep learning architecture); a system for performing real-time streaming; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.

(Examiner’s Note: Claim 10 as recited is treated as a “field of use” or “intended use” limitation and therefore carries no patentable weight, although it has been examined in view of Li above. The processor as recited has been examined as evidenced in claim 1 above. With respect to the enumerated environments that said processor is “comprised in”, the specification as disclosed merely mentions these environments as preferred intended use environments, without specific details showing that said processor comprised in these environments resulted in a novel and non-obvious structural change to the processor. Reference to MPEP 2112.01 is also made for applicant’s attention.)

Claims 11 and 15 are system claims that correspond to processor claims 1 and 5. Claims 11 and 15 are rejected for the same reasons as claims 1 and 5. Claims 18-20 are method claims that correspond to processor claims 1 or 10. Claims 18-20 are rejected for the same reasons as claims 1 and 10.

Claim(s) 2, 9, and 12 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, and Klomp et al. (DE-102018216806-A1) as applied to claims 1 and 11, and further in view of Herman et al. (US-20230043536-A1).
Regarding claim 2, Li in view of Klomp teaches the processor of claim 1, but fails to teach the following limitations as further claimed. Herman, however, further teaches the one or more processing units further to generate the synthesized color image data (“one or more steps may be incorporated into a neural network which may receive one or more input parameters described in the present disclosure, such as raw image data and/or known values”, Para [0022]; that is, a step illustrated in Fig. 5, such as “color manipulation”) based at least on conditioning the generator on a representation of at least one of a color, lighting, or weather condition outside the ego-machine (“When most of the image-affecting variables are known and/or solved, ground-truth color of an image can be estimated,” Para [0017], the image-affecting variables being created “when exiting and entering tunnels, parking garages, underpasses,” Para [0016]).

Herman is considered to be analogous to the claimed invention because they are both in the same field of color correction for images that represent the interior of vehicles. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Herman into Li and Klomp for the benefit of accurate coloring of an image that represents an interior of a vehicle, regardless of outside lighting.

Regarding claim 9, Li in view of Klomp teaches the processor of claim 1, but fails to teach the following limitations as further claimed. Herman, however, further teaches the one or more processing units further to generate the fill for the one or more segmented regions of the synthesized color image data based at least on applying one or more predetermined colors of the interior space (“one or more steps may be incorporated into a neural network which may receive one or more input parameters described in the present disclosure, such as raw image data and/or known values, such as the color space of objects permanent to the vehicle cabin,” Para [0022]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Herman into Li and Klomp for the benefit of more accurately coloring the image of the interior of the vehicle.

Claim 12 is a system claim that corresponds to processor claim 2. Claim 12 is rejected for the same reason as claim 2.

Claim(s) 3 and 13 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, and Klomp et al. (DE-102018216806-A1) as applied to claims 1 and 11, and further in view of Levin et al., “Colorization using Optimization”, ACM Transactions on Graphics (TOG), Volume 23, Issue 3, pp. 689-694, 2004, hereinafter referred to as Levin.

Regarding claim 3, Li teaches the processor of claim 1, but fails to teach the following limitations as further claimed. Levin, however, further teaches the one or more processing units further to generate the fill for the one or more segmented regions of the synthesized color image data (pg. 689, Fig. 1, the child’s shirt, skin, etc.) based at least on an energy function that encourages spatial color continuity (pg. 691, Section 3, Results, “First, for pixels covered by the user’s scribbles, the final color should be the color of the scribble.
Second, for pixels outside the mask, the color should be the same as the original color. All other colors are automatically determined by the optimization process”). Levin is considered to be analogous to the claimed invention because they are both in the same field of colorizing infrared or black and white images using computer vision techniques. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Levin into Li and Klomp for the benefit of smooth and seamless colorization.

Claim 13 is a system claim that corresponds to processor claim 3. Claim 13 is rejected for the same reason as claim 3.

Claim(s) 4 and 14 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, and Klomp et al. (DE-102018216806-A1) as applied to claims 1 and 11, and further in view of Zhang et al., “TV-GAN: Generative Adversarial Network Based Thermal to Visible Face Recognition”, arXiv:1712.02514v1, 2017, hereinafter referred to as Zhang.

Regarding claim 4, Li in view of Klomp teaches the processor of claim 1, but fails to teach the following limitations as further claimed. Zhang, however, further teaches the one or more processing units further to generate the fill for the one or more segmented regions of the synthesized color image data using one or more colors selected from a predetermined range of candidate colors (Fig. 3, the generator learns realistic human skin tones from Y image(s) input into the discriminator) corresponding to a person of at least one of [remainder of limitation omitted; cited images not reproduced]. Zhang is considered to be analogous to the claimed invention because they are both in the same field of generating an image with realistic colors from black-and-white images. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Zhang into Li and Klomp for the benefit of more accurate colorization from the GAN generator.

Claim 14 is a system claim that corresponds to processor claim 4. Claim 14 is rejected for the same reason as claim 4.

Claim(s) 6 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, in view of Klomp et al. (DE-102018216806-A1), as applied to claims 1 and 11 above, and further in view of Niu et al., "Electrical Equipment Identification Method With Synthetic Data Using Edge-Oriented Generative Adversarial Network", in IEEE Access, vol. 8, pp. 136487-136497, 2020, hereinafter referred to as Niu, and further in view of Gafni et al. (US-20240221235-A1).

Regarding claim 6, Li in view of Klomp teaches the processor of claim 1, but fails to teach the following limitations as further claimed. Niu and Gafni, however, further teach the one or more processing units further to train the generative adversarial network (Niu, Fig. 1, “edge-oriented GAN training”) based at least on emphasizing loss for detected edge pixels (Niu, Fig. 1, identified in “edge feature data”) using higher weights than for detected non-edge pixels (Gafni, “employ a weighted binary cross-entropy face loss over the segmentation face parts classes, emphasizing higher importance for face parts, and (2) include the face parts edges as part of the semantic segmentation edge map,” Para [0039]). Niu is considered to be analogous to the claimed invention because they are in the same field of GAN networks that use edge maps to create sharper output images.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Niu into Li and Klomp for the benefit of sharper output images. Gafni is considered to be analogous to the claimed invention because they are in the same field of feature emphasis using machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Gafni into Li and Klomp for the benefit of sharper output images.

Claim 16 is a system claim that corresponds to processor claim 6. Claim 16 is rejected for the same reason as claim 6.

Claim(s) 7 and 17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, in view of Klomp et al. (DE-102018216806-A1), as applied to claims 1 and 11 above, and further in view of Niu et al., "Electrical Equipment Identification Method With Synthetic Data Using Edge-Oriented Generative Adversarial Network", in IEEE Access, vol. 8, pp. 136487-136497, 2020, hereinafter referred to as Niu.

Regarding claim 7, Li in view of Klomp teaches the processor of claim 1, but fails to teach the following limitations as further claimed. Niu, however, further teaches wherein the applying of the representation of the infrared image data to the generator of the generative adversarial network (Niu, Fig. 1, “edge-oriented GAN training”) comprises applying an edge map (Niu, Fig. 1, “edge feature data”) extracted from the infrared image data (Niu, Fig. 1, “real infrared image data”) to the generator (Niu, Fig. 1, “edge feature data” is input into “Edge-oriented GAN training”). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Niu into Li and Klomp for the benefit of sharper output images.

Claim 17 is a system claim that corresponds to processor claim 7. Claim 17 is rejected for the same reason as claim 7.

Claim(s) 8 is/are rejected under 35 U.S.C. 103 as being unpatentable over Li et al., "I2V-GAN: Unpaired Infrared-to-Visible Video Translation", arXiv:2108.00913, 2021, hereinafter referred to as Li, in view of Klomp et al. (DE-102018216806-A1), as applied to claim 1 above, and further in view of Niu et al., "Electrical Equipment Identification Method With Synthetic Data Using Edge-Oriented Generative Adversarial Network", in IEEE Access, vol. 8, pp. 136487-136497, 2020, hereinafter referred to as Niu, and further in view of Stein (US-20170154225-A1).

Regarding claim 8, Li in view of Klomp teaches the processor of claim 1, but fails to teach the following limitations as further claimed. However, Niu and Stein further teach the one or more processing units further to extract an edge map from the infrared image data (Niu, Fig. 1, “Edge feature data” from the “Real infrared image data”) and pass the edge map over a wireless communication channel to the generator (Stein, “the neural network (or aspects of the neural network) may be provided via one or more servers located remotely from vehicle 200 and accessible over a network via wireless transceiver 172,” Para [0172]). It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Niu into Li and Klomp for the benefit of sharper output images. Stein is considered analogous to the claimed invention because they are in the same field of using wireless machine learning to determine outputs from an input image.
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have incorporated the teachings of Stein into Li, Klomp, and Niu for the benefit of less bandwidth consumed by the machine learning model.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Hu et al., “Joint Image-to-Image Translation for Traffic Monitoring Driver Face Image Enhancement”, in IEEE Transactions on Intelligent Transportation Systems, vol. 24, no. 8, pp. 7961-7973, Aug. 2023, teaches a method for improved driver face recognition using I2I translation.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to RACHEL A OMETZ, whose telephone number is (571) 272-2535. The examiner can normally be reached 6:45am-4:00pm ET Monday-Thursday, and 6:45am-1:00pm ET every other Friday. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le, can be reached at 571-272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Rachel Anne Ometz/
Examiner, Art Unit 2668
1/12/26

/VU LE/
Supervisory Patent Examiner, Art Unit 2668
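The claim 5 mapping to Li reads on cycle-consistent training: one branch chains the IR→RGB generator (GY) with the RGB→IR generator (GX) and asks that the round trip reproduce the input, and the second branch swaps the generator order. A minimal sketch of that structure, using toy invertible scalar functions as stand-ins for the generators (this is illustrative only, not Li's I2V-GAN code):

```python
# Toy cycle-consistency check with scalar stand-ins for the two
# generators: g_y maps infrared to RGB, g_x maps RGB back to infrared.
def g_y(x):
    # Stand-in IR -> RGB generator (any invertible map works for the demo).
    return 2.0 * x + 1.0

def g_x(y):
    # Stand-in RGB -> IR generator, the exact inverse of g_y.
    return (y - 1.0) / 2.0

def cycle_loss(pixels, first, second):
    # Mean L1 penalty between each input and its round-trip reconstruction.
    return sum(abs(v - second(first(v))) for v in pixels) / len(pixels)

ir = [0.1, 0.5, 0.9]
forward_branch = cycle_loss(ir, g_y, g_x)   # X -> Y -> X branch
backward_branch = cycle_loss(ir, g_x, g_y)  # Y -> X -> Y branch (swapped)
# With exact inverses both branch losses are ~0; real generators are
# trained so that minimizing these terms forces approximate inverses.
```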
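Similarly, the claim 6 rejection combines Niu's edge-oriented GAN training with Gafni's weighted cross-entropy to reach a loss that weights detected edge pixels more heavily than non-edge pixels. The weighting scheme alone can be sketched as follows (a plain-Python toy under that assumption, not code from Niu, Gafni, or the application; the 5.0 weight is an arbitrary illustrative value):

```python
import math

def edge_weighted_bce(preds, targets, edge_mask, edge_weight=5.0):
    """Binary cross-entropy with heavier weights on edge pixels.

    preds are per-pixel probabilities in (0, 1); targets are 0/1 labels;
    edge_mask flags pixels marked in an edge map.
    """
    total = 0.0
    for p, t, is_edge in zip(preds, targets, edge_mask):
        w = edge_weight if is_edge else 1.0
        total += -w * (t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(preds)

# The same prediction error costs edge_weight times more on an edge
# pixel than on a non-edge pixel, steering training toward sharp edges.
sharp_penalty = edge_weighted_bce([0.2], [1], [True])
flat_penalty = edge_weighted_bce([0.2], [1], [False])
```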

Prosecution Timeline

Oct 10, 2023: Application Filed
Oct 01, 2025: Non-Final Rejection (§103)
Dec 08, 2025: Examiner Interview Summary
Dec 09, 2025: Response Filed
Jan 12, 2026: Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602925: HYPERSPECTRAL IMAGE ANALYSIS USING MACHINE LEARNING (2y 5m to grant; granted Apr 14, 2026)
Patent 12555255: ABSOLUTE DEPTH ESTIMATION FROM A SINGLE IMAGE USING ONLINE DEPTH SCALE TRANSFER (2y 5m to grant; granted Feb 17, 2026)
Patent 12548354: METHOD FOR PROCESSING CELL IMAGE, ELECTRONIC DEVICE, AND STORAGE MEDIUM (2y 5m to grant; granted Feb 10, 2026)
Patent 12541970: SYSTEM AND METHOD FOR ESTIMATING THE POSE OF A LOCALIZING APPARATUS USING REFLECTIVE LANDMARKS AND OTHER FEATURES (2y 5m to grant; granted Feb 03, 2026)
Patent 12530735: IMAGE PROCESSING APPARATUS THAT IMPROVES COMPRESSION EFFICIENCY OF IMAGE DATA, METHOD OF CONTROLLING SAME, AND STORAGE MEDIUM (2y 5m to grant; granted Jan 20, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 69%
With Interview: 99% (+30.1%)
Median Time to Grant: 2y 11m
PTA Risk: Moderate
Based on 26 resolved cases by this examiner. Grant probability derived from career allow rate.
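The "With Interview" figure is consistent with simply adding the interview lift to the base grant probability and capping at 100%. This formula is reconstructed from the displayed numbers (69% + 30.1% ≈ 99%), not documented by the tool:

```python
def with_interview(base_pct, lift_pct):
    # Assumed projection formula: base grant probability plus the
    # examiner's interview lift, capped at 100%.
    return min(base_pct + lift_pct, 100.0)

# 69% base probability with a +30.1% interview lift.
print(f"{with_interview(69.0, 30.1):.0f}%")  # 99%
```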
