Prosecution Insights
Last updated: April 19, 2026
Application No. 18/491,843

DATA GENERATION APPARATUS AND CONTROL METHOD

Status: Non-Final OA — §103, §112
Filed: Oct 23, 2023
Examiner: DANG, PHILIP
Art Unit: 2488
Tech Center: 2400 — Computer Networks
Assignee: Canon Kabushiki Kaisha
OA Round: 3 (Non-Final)

Outlook: Favorable
Grant Probability: 77% (99% with interview)
Expected OA Rounds: 3-4
Time to Grant: 2y 10m

Examiner Intelligence

Career Allow Rate: 77% (363 granted / 470 resolved) — +19.2% vs TC avg, above average
Interview Lift: +33.2% higher allow rate on resolved cases with an interview
Typical Timeline: 2y 10m average prosecution; 49 applications currently pending
Career History: 519 total applications across all art units
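These card metrics are simple subset statistics over the examiner's resolved cases. As a minimal sketch of how a tool like this could compute them — assuming a hypothetical per-case record with `allowed` and `had_interview` flags; the dashboard's actual data model is not shown — consider:

```python
from dataclasses import dataclass

@dataclass
class ResolvedCase:
    allowed: bool        # disposed as a grant (vs. abandonment/final rejection)
    had_interview: bool  # at least one examiner interview on record

def allow_rate(cases: list[ResolvedCase]) -> float:
    """Fraction of resolved cases that ended in a grant."""
    return sum(c.allowed for c in cases) / len(cases)

def interview_lift(cases: list[ResolvedCase]) -> float:
    """Allow-rate gap (in points) between interviewed and non-interviewed cases."""
    with_iv = [c for c in cases if c.had_interview]
    without_iv = [c for c in cases if not c.had_interview]
    return allow_rate(with_iv) - allow_rate(without_iv)

# Synthetic docket: 363 grants out of 470 resolved cases reproduces
# the 77% career allow rate shown on the card.
cases = [ResolvedCase(allowed=i < 363, had_interview=i % 2 == 0) for i in range(470)]
print(f"career allow rate: {allow_rate(cases):.0%}")  # 77%
```

The "+33.2% interview lift" is presumably this gap computed on the examiner's real docket: with interviewed cases allowing at the quoted 99%, the arithmetic implies a rate in the mid-60s for cases without an interview.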

Statute-Specific Performance

Statute   Rejection Rate   vs TC Avg
§101      4.5%             -35.5%
§103      48.6%            +8.6%
§102      11.1%            -28.9%
§112      25.5%            -14.5%

Deltas are measured against an estimated Tech Center average. Based on career data from 470 resolved cases.
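The per-statute figures reduce to rejection frequencies over the same case history. A sketch under the assumption that each office-action record carries the set of statutes it asserts (the field layout is hypothetical); as it happens, every delta on the card is consistent with a flat 40% Tech Center baseline, though that baseline is an inference, not a published number:

```python
from collections import Counter

STATUTES = ("101", "102", "103", "112")

def statute_rates(actions: list[set[str]]) -> dict[str, float]:
    """Share of office actions asserting each rejection statute."""
    hits = Counter(s for oa in actions for s in oa if s in STATUTES)
    return {s: hits[s] / len(actions) for s in STATUTES}

def delta_vs_tc(examiner: dict[str, float], tc: dict[str, float]) -> dict[str, float]:
    """Percentage-point gap between the examiner and the Tech Center average."""
    return {s: examiner[s] - tc[s] for s in STATUTES}

# Rates echo the card above; the flat 40% baseline is inferred from the deltas.
examiner = {"101": 0.045, "102": 0.111, "103": 0.486, "112": 0.255}
tc_avg = {s: 0.40 for s in STATUTES}
for s, d in delta_vs_tc(examiner, tc_avg).items():
    print(f"§{s}: {d:+.1%}")  # §101: -35.5%  §102: -28.9%  §103: +8.6%  §112: -14.5%
```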

Office Action

Grounds: §103, §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114

A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/21/2025 has been entered.

Examiner's Note

The instant application has a lengthy prosecution history, and the examiner encourages the applicant to have a telephonic interview with the examiner prior to filing a response to the instant office action. Also, prior to the interview the examiner encourages the applicant to present multiple possible claim amendments, so as to enable the examiner to identify claim amendments that will advance prosecution in a meaningful manner.

Acknowledgment

Claims 1 and 12-13, amended on 11/21/2025, are acknowledged by the examiner.

Response to Arguments

Presented arguments with respect to claims 1, 12, 13, and their dependent claims have been fully considered, but some are rendered moot in view of the new ground of rejection necessitated by the amendments initiated by the applicants. The examiner addresses the main arguments of the Applicant below.

Claim Rejections – 35 U.S.C. § 112

The following is a quotation of 35 U.S.C. 112(a):

(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:

The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 1-13 are rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, because of new matter. The amended independent claims 1, 12, and 13 include “automatically detects/detecting … without user intervention and outputs”. It is noted that nowhere does the specification use the word “automatically” or “automatic”. Therefore the amended limitations “automatically detects/detecting … without user intervention and outputs” are new matter, not described in the application as originally filed. The new matter is required to be canceled from the claims (see MPEP 608.04). In this Office action, the claim limitation “automatically … without user intervention” is given no patentable weight.

Claims 1-13 are further rejected under 35 U.S.C. 112(a) or pre-AIA 35 U.S.C. 112, first paragraph, because of new matter. The amended independent claims 1, 12, and 13 include “automatically outputs”. It is noted that nowhere does the specification use the word “automatically” or “automatic”. Therefore the amended limitation “automatically outputs” is new matter, not described in the application as originally filed. The new matter is required to be canceled from the claims (see MPEP 608.04). In this Office action, the claim limitation “automatically” is given no patentable weight.

Claim Rejections – 35 U.S.C. § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under pre-AIA 35 U.S.C. 103(a) are summarized as follows:

1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims under pre-AIA 35 U.S.C. 103(a), the examiner presumes that the subject matter of the various claims was commonly owned at the time any inventions covered therein were made absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was not commonly owned at the time a later invention was made in order for the examiner to consider the applicability of pre-AIA 35 U.S.C. 103(c) and potential pre-AIA 35 U.S.C. 102(e), (f) or (g) prior art under pre-AIA 35 U.S.C. 103(a).

Claims 1-13 are rejected under 35 U.S.C. 103 as being unpatentable over Birnhack (US Patent Application Publication 2021/0217212 A1) (“Birnhack”), in view of 健太 米倉 et al. (JP Patent JP7595421B2) (“健太 米倉”).

Regarding claim 1, Birnhack meets the claim limitations as follows:

A data generation apparatus (a system for automatically colorizing night-vision images) [Birnhack: para. 0002] comprising: a memory and at least one processor which function as (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]:

a first image acquiring unit ((a color camera) [Birnhack: para. 0096]; (several cameras or imaging sensors are helpful for capturing the different color and night-vision channels) [Birnhack: para. 0014]) that acquires a visible light image (a color image, preferably taken under daylight or using visible illumination) [Birnhack: para. 0076];

a second image acquiring unit ((an NIR camera) [Birnhack: para. 0096]; (several cameras or imaging sensors are helpful for capturing the different color and night-vision channels) [Birnhack: para. 0014]) that acquires an invisible light image (Night-vision images capture invisible radiation of objects) [Birnhack: para. 0003] that corresponds temporally to the visible light image and has an identical field of view to the visible light image ((When capturing training images, at least the color images need to be captured with visible light. One option for capturing the pairs of training images, therefore, is to simultaneously capture a night-vision image and a color image under day-light conditions, for example using a night-vision camera and a "regular" color camera, which are positioned next to each other, with the two cameras having parallel optical axes. Preferably, the distance between the two cameras is small, such as between 1 cm and 10 cm, preferable under 5 cm. Thus, the night-vision image and the corresponding color image captured this way contain substantially identical regions of interest) [Birnhack: para. 0059] – Note: Birnhack teaches that the night-vision camera and the "regular" color camera in his system can be only 1 cm apart. They can simultaneously take images. As a result, they can capture a substantially identical field of view);

a subject detection unit (a processor) [Birnhack: para. 0096] that automatically detects at least one subject in the visible light image without user intervention ((detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1]; (Each of the selected prediction functions f1Pred, f2Pred then determines chrominance values for the input luminance values by scanning S602-1, S602-2, using a grid or tiling of the input image or respective luminance, the luminance values of the input image in order to detect objects) [Birnhack: para. 0121] – Note: Birnhack teaches that the prediction functions detect the objects without user intervention) and outputs (output color image by combining the luminance values of the first night-vision image with the predicted chrominance values) [Birnhack: para. 0025], as a subject detection result (The processing unit is, in this case, furthermore configured for determining the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values and to display the resulting first color image on a display within the vehicle) [Birnhack: para. 0083], positional and/or class information of the detected subject (Identifying features such as cars, people, trees, traffic signs, general road-side signs, different vehicle classes, animals etc. in a night-vision input image may be beneficial, as features often have specific associated color schemes.) [Birnhack: para. 0066]; and

a supervisory data generation unit (a processor) [Birnhack: para. 0096] that generates supervisory data (This method has the advantageous effect that the generated first color image retains the sharpness of the first night-vision image, as only the chrominance is predicted but the luminance is kept from the first night-vision image.) [Birnhack: para. 0029] including a pair of ((an NIR camera 201-1 as well as a color camera 201-2 are provided and connected to a processor 203. The processor 203 is configured for receiving and images from either of the cameras) [Birnhack: para. 0096]; (Alternatively, the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060]) (i) the invisible light image as input data (the prediction function comprises identifying feature maps based on characteristic shapes and shading of the pairs of training images and wherein the predicted chrominance values are determined by the prediction function based on the first night-vision image taking into account said feature maps) [Birnhack: claim 9] and (ii) the subject detection result (detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1] as output data (The processing unit is, in this case, furthermore configured for determining the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values and to display the resulting first color image on a display within the vehicle) [Birnhack: para. 0083], the supervisory data being adapted to train a learning model such that ((training a machine learning model on the training images to predict the chrominance of the color images from the luminance of the night-vision images) [Birnhack: claim 3]; (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]) when the invisible light image is input (The prediction function is, preferably, trained to predict the predicted chrominance values based on the luminance values of the first night-vision image without taking into account color and/or chrominance values) [Birnhack: para. 0050], the learning model automatically outputs ((training a machine learning model on the training images to predict the chrominance of the color images from the luminance of the night-vision images) [Birnhack: claim 3]; (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]) a subject detection result for the invisible light image ((In particular, the night-vision images which capture energy and/or distance distributions are transformed into a visible monochromic image) [Birnhack: para. 0096]; (This method has the advantageous effect that the generated first color image retains the sharpness of the first night-vision image, as only the chrominance is predicted but the luminance is kept from the first night-vision image.) [Birnhack: para. 0029]; (a processing unit configured for determining, by using a prediction function which maps luminance values of night-vision images to chrominance values of corresponding color images, predicted chrominance values for the luminance values of the first night-vision image) [Birnhack: claim 10]).

Birnhack does not explicitly disclose the following claim limitations (emphasis added): a supervisory data generation unit that generates supervisory data to train a learning model so that the learning model for subject detection of the invisible light image outputs a subject detection result by inputting the invisible light image based on the invisible light image.

However, in the same field of endeavor, 健太 米倉 further discloses the deficient claim limitations as follows: a supervisory data generation unit that generates supervisory data to train a learning model (a learning data creation unit that acquires a visible light image in a state where the visible light is irradiated onto the object, and an invisible light image in a state where the invisible light is irradiated onto the object and the light emitting unit is emitting light) [健太 米倉: page 10] so that the learning model for subject detection of the invisible light image outputs a subject detection result by inputting the invisible light image based on the invisible light image (a trained model that is trained using as learning data a training visible light image in which a training object is captured under visible light and a training invisible light image in which the training object is irradiated under invisible light and a light-emitting unit that is recognizable under invisible light and is provided on the training object is emitting light) [健太 米倉: page 10].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Birnhack with 健太 米倉 to program the system to implement 健太 米倉's method. Therefore, the combination of Birnhack with 健太 米倉 will enable the system to improve the estimation ability of the trained model and improve the accuracy [健太 米倉: page 5].

Regarding claim 2, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: wherein the visible light image and the invisible light image are simultaneously (the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060, 0059] or consecutively captured images (the two respective images of a training pair are taken right after another, e.g. within a few seconds, preferably within less than one second, for example within just a few milliseconds) [Birnhack: para. 0061] of the same angle of view ((The pictures depicting an overlapping and/or substantially identical region of interest essentially means that the night-vision image and the corresponding color image show the same scenery. The two corresponding images can, for example, be taken simultaneously and/or from the same or almost the same position) [Birnhack: para. 0040]; (wherein the night-vision image and the color image depict an overlapping and/or substantially identical region of interest) [Birnhack: para. 0039]; (In order to determine the prediction function used for determining the predicted chrominance values based on luminance values of a night-vision image, as discussed with respect to the foregoing embodiments, a training data set of pairs of, on the one hand, NIR images and, on the other hand, corresponding color images, needs to be collected. This can be done by using an NIR camera as well as a color camera which are placed right next to each other and which simultaneously capture images of the same setting. The two cameras have parallel optical axes and are, preferably, placed less than 5 cm apart. For capturing the training images, it is possible to use two identical color cameras which are each equipped with corresponding filters for blocking either the NIR components or the visual components, so that one camera effectively captures NIR images while the other camera effectively captures color images) [Birnhack: para. 0110, 0059] – Note: Two cameras set in parallel, 5 cm apart, should provide the same angle of view).

Regarding claim 3, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: wherein the at least one processor functions as (a processor) [Birnhack: para. 0096] a supervisory data generation determination unit (a processor) [Birnhack: para. 0096] that determines whether the supervisory data can be generated ((training images is obtained by simultaneously, using daylight or artificial light, capturing a night-vision image and a color image containing overlapping regions of interest) [Birnhack: claim 6]; (an NIR camera 201-1 as well as a color camera 201-2 are provided and connected to a processor 203. The processor 203 is configured for receiving and images from either of the cameras, to determine which image to display, to process the received image, if necessary and to display an output image on the display 204. The processor is furthermore connected to a transceiver 202, via which additional data or information may be sent or received. The transmitter/receiver can also be used for data transfer to a server and a decentralized data processing could be established. Furthermore, with the uploaded images an online-training of the DNNs is possible. The transmission of increased trained models to the processing unit (203) is possible as the transmission of uploaded and online colorized NIR images back to the system.) [Birnhack: para. 0096]).

Regarding claim 4, Birnhack meets the claim limitations as set forth in claim 3. Birnhack further meets the claim limitations as follows: the supervisory data generation determination unit (a processor) [Birnhack: para. 0096] performs the determination based on the invisible light image ((training images is obtained by simultaneously, using daylight or artificial light, capturing a night-vision image and a color image containing overlapping regions of interest) [Birnhack: claim 6]; (The NIR camera 201-1 then captures NIR images. The processor 203 receives these images from the NIR camera 201 and determines, based on a prediction function stored within the processor or associated memory, chrominance values for the NIR images and generates a colorized output image by combining the predicted chrominance values with the luminance values of the received NIR image. The colorized output image is then displayed on the display 204.) [Birnhack: para. 0099]).

Regarding claim 5, Birnhack meets the claim limitations as set forth in claim 4. Birnhack further meets the claim limitations as follows: the supervisory data generation determination unit (a processor) [Birnhack: para. 0096] performs the determination based on luminance values of the invisible light image in the same region as a region of the subject detected in the visible light image ((training images is obtained by simultaneously, using daylight or artificial light, capturing a night-vision image and a color image containing overlapping regions of interest) [Birnhack: claim 6]; (The NIR camera 201-1 then captures NIR images. The processor 203 receives these images from the NIR camera 201 and determines, based on a prediction function stored within the processor or associated memory, chrominance values for the NIR images and generates a colorized output image by combining the predicted chrominance values with the luminance values of the received NIR image. The colorized output image is then displayed on the display 204.) [Birnhack: para. 0099, 0059] – Note: Combining the predicted chrominance values, which derive from the visible image, with the luminance values of the received NIR image would cover the same region); (Each of the selected prediction functions f1Pred, f2Pred then determines chrominance values for the input luminance values by scanning S602-1, S602-2, using a grid or tiling of the input image or respective luminance, the luminance values of the input image in order to detect objects) [Birnhack: para. 0121]).

Regarding claim 6, Birnhack meets the claim limitations as set forth in claim 3. Birnhack further meets the claim limitations as follows: the supervisory data generation determination unit (a processor) [Birnhack: para. 0096] performs the determination based on a predesignated type of subject ((detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1]; (Each of the selected prediction functions f1Pred, f2Pred then determines chrominance values for the input luminance values by scanning S602-1, S602-2, using a grid or tiling of the input image or respective luminance, the luminance values of the input image in order to detect objects) [Birnhack: para. 0121]; (Each feature map 104 is trained to detect a specific pattern, with the weights of each feature map being fixed, so that the same pattern may be detected at different locations in the input image. Applying a feature map 104 as a filter across the input image 101 in order to detect specific local patterns, therefore, amounts to a convolution of the respective weights with the input image.) [Birnhack: para. 0019; Fig. 1]).

Regarding claim 7, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: the supervisory data generation unit (a processor) [Birnhack: para. 0096] that performs at least one of processing for creating bokeh, processing for creating blurring (A convolutional neural network (CNN) is trained on these image pairs and the trained CNN is then used to simultaneously predict all three color channels of the target RGB image from input NIR image. The result is rather blurry) [Birnhack: para. 0022], and processing for correcting luminance on the invisible light image (post-processing is used in order to enhance and sharpen the raw output) [Birnhack: para. 0022].

Regarding claim 8, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: a display unit that displays the subject detection result (a display for displaying the first color image to a user) [Birnhack: para. 0096] acquired by the subject detection unit (The NIR camera 201-1 then captures NIR images. The processor 203 receives these images from the NIR camera 201 and determines, based on a prediction function stored within the processor or associated memory, chrominance values for the NIR images and generates a colorized output image by combining the predicted chrominance values with the luminance values of the received NIR image. The colorized output image is then displayed on the display 204.) [Birnhack: para. 0099] and a type of detected subject in a superimposed manner on one or both (alternatively, colorized sub-images or objects may be added successively into the original first night-vision image by overlaying the predicted chrominance values over the respective object or sub-image of the first night-vision image) [Birnhack: para. 0072] of the visible light image and the invisible light image (The system also comprises a processing unit which is configured for determining, by using a prediction function, predicted chrominance values for the luminance values of the first night-vision image and for generating the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values. The prediction function is a, preferably predetermined, mapping which maps luminance values of night vision images to chrominance values of corresponding color images. The system furthermore comprises a display for displaying the first color image to a user.) [Birnhack: para. 0071].

Regarding claim 9, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: wherein the first image acquiring unit (a color camera) [Birnhack: para. 0096] and the second image acquiring unit (an NIR camera) [Birnhack: para. 0096] acquire a visible light image and an invisible light image (several cameras or imaging sensors are helpful for capturing the different color and night-vision channels) [Birnhack: para. 0014] of the same angle of view that have been simultaneously (the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060] or consecutively captured (the two respective images of a training pair are taken right after another, e.g. within a few seconds, preferably within less than one second, for example within just a few milliseconds) [Birnhack: para. 0061] by an image capture apparatus ((The pictures depicting an overlapping and/or substantially identical region of interest essentially means that the night-vision image and the corresponding color image show the same scenery. The two corresponding images can, for example, be taken simultaneously and/or from the same or almost the same position) [Birnhack: para. 0040]; (wherein the night-vision image and the color image depict an overlapping and/or substantially identical region of interest) [Birnhack: para. 0039]; (In order to determine the prediction function used for determining the predicted chrominance values based on luminance values of a night-vision image, as discussed with respect to the foregoing embodiments, a training data set of pairs of, on the one hand, NIR images and, on the other hand, corresponding color images, needs to be collected. This can be done by using an NIR camera as well as a color camera which are placed right next to each other and which simultaneously capture images of the same setting. The two cameras have parallel optical axes and are, preferably, placed less than 5 cm apart. For capturing the training images, it is possible to use two identical color cameras which are each equipped with corresponding filters for blocking either the NIR components or the visual components, so that one camera effectively captures NIR images while the other camera effectively captures color images) [Birnhack: para. 0110, 0059] – Note: Two cameras set in parallel, 5 cm apart, should provide the same angle of view).

Regarding claim 10, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: wherein the first image acquiring unit is a first image capturing unit (a color camera) [Birnhack: para. 0096] for capturing the visible light image (a color image, preferably taken under daylight or using visible illumination) [Birnhack: para. 0076] and the second image acquiring unit is a second image capturing unit (an NIR camera) [Birnhack: para. 0096] for capturing the invisible light image (Night-vision images capture invisible radiation of objects) [Birnhack: para. 0003], and wherein the first image capturing unit and the second image capturing unit capture a simultaneously ((the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens) [Birnhack: para. 0060]; (When capturing training images, at least the color images need to be captured with visible light. One option for capturing the pairs of training images, therefore, is to simultaneously capture a night-vision image and a color image under day-light conditions, for example using a night-vision camera and a "regular" color camera, which are positioned next to each other, with the two cameras having parallel optical axes. Preferably, the distance between the two cameras is small, such as between 1 cm and 10 cm, preferable under 5 cm) [Birnhack: para. 0059]) or consecutively (the two respective images of a training pair are taken right after another, e.g. within a few seconds, preferably within less than one second, for example within just a few milliseconds) [Birnhack: para. 0061] captured visible light image and invisible light image of the same angle of view ((The pictures depicting an overlapping and/or substantially identical region of interest essentially means that the night-vision image and the corresponding color image show the same scenery. The two corresponding images can, for example, be taken simultaneously and/or from the same or almost the same position) [Birnhack: para. 0040, 0059]; (the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060]; (wherein the night-vision image and the color image depict an overlapping and/or substantially identical region of interest) [Birnhack: para. 0039]; (In order to determine the prediction function used for determining the predicted chrominance values based on luminance values of a night-vision image, as discussed with respect to the foregoing embodiments, a training data set of pairs of, on the one hand, NIR images and, on the other hand, corresponding color images, needs to be collected. This can be done by using an NIR camera as well as a color camera which are placed right next to each other and which simultaneously capture images of the same setting. The two cameras have parallel optical axes and are, preferably, placed less than 5 cm apart. For capturing the training images, it is possible to use two identical color cameras which are each equipped with corresponding filters for blocking either the NIR components or the visual components, so that one camera effectively captures NIR images while the other camera effectively captures color images) [Birnhack: para. 0110, 0059] – Note: Two cameras set in parallel, 5 cm apart, should provide the same angle of view).

Regarding claim 11, Birnhack meets the claim limitations as set forth in claim 1. Birnhack further meets the claim limitations as follows: wherein the invisible light image is one of a near-infrared image, a far-infrared image, and an ultraviolet image (The first night-vision image may, preferably, be captured using a sensor or camera capable of detecting invisible radiation based, e.g. on infrared, near infrared (NIR), radar, LiDAR, ultrasound etc.) [Birnhack: para. 0027].

Regarding claim 12, Birnhack meets the claim limitations as follows:

A method (a method) [Birnhack: para. 0002] of controlling a data generation apparatus (FIG. 5 shows a detailed control-flow diagram of a preprocessing method in accordance with the described method) [Birnhack: para. 0093; Fig. 5], the method (a method) [Birnhack: para. 0002] comprising:

acquiring a visible light image (a color image, preferably taken under daylight or using visible illumination) [Birnhack: para. 0076] and an invisible light image (Night-vision images capture invisible radiation of objects) [Birnhack: para. 0003] that corresponds temporally to the visible light image and has an identical field of view to the visible light image ((When capturing training images, at least the color images need to be captured with visible light. One option for capturing the pairs of training images, therefore, is to simultaneously capture a night-vision image and a color image under day-light conditions, for example using a night-vision camera and a "regular" color camera, which are positioned next to each other, with the two cameras having parallel optical axes. Preferably, the distance between the two cameras is small, such as between 1 cm and 10 cm, preferable under 5 cm. Thus, the night-vision image and the corresponding color image captured this way contain substantially identical regions of interest) [Birnhack: para. 0059] – Note: Birnhack teaches that the night-vision camera and the "regular" color camera in his system can be only 1 cm apart. They can simultaneously take images. As a result, they can capture a substantially identical field of view);

automatically detecting at least one subject in the visible light image without user intervention ((detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1]; (Each of the selected prediction functions f1Pred, f2Pred then determines chrominance values for the input luminance values by scanning S602-1, S602-2, using a grid or tiling of the input image or respective luminance, the luminance values of the input image in order to detect objects) [Birnhack: para. 0121] – Note: Birnhack teaches that the prediction functions detect the objects without user intervention) and outputting (output color image by combining the luminance values of the first night-vision image with the predicted chrominance values) [Birnhack: para. 0025], as a subject detection result (The processing unit is, in this case, furthermore configured for determining the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values and to display the resulting first color image on a display within the vehicle) [Birnhack: para. 0083], positional and/or class information of the detected subject (Identifying features such as cars, people, trees, traffic signs, general road-side signs, different vehicle classes, animals etc. in a night-vision input image may be beneficial, as features often have specific associated color schemes.) [Birnhack: para. 0066]; and

generating supervisory data ((This method has the advantageous effect that the generated first color image retains the sharpness of the first night-vision image, as only the chrominance is predicted but the luminance is kept from the first night-vision image.) [Birnhack: para. 0029]; (training images is obtained by simultaneously, using daylight or artificial light, capturing a night-vision image and a color image containing overlapping regions of interest) [Birnhack: claim 6]; (the two respective images of a training pair are taken right after another, e.g. within a few seconds, preferably within less than one second, for example within just a few milliseconds) [Birnhack: para. 0061]; (Alternatively, the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060]) including a pair of ((an NIR camera 201-1 as well as a color camera 201-2 are provided and connected to a processor 203. The processor 203 is configured for receiving and images from either of the cameras) [Birnhack: para. 0096]; (Alternatively, the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060]) (i) the invisible light image as input data (the prediction function comprises identifying feature maps based on characteristic shapes and shading of the pairs of training images and wherein the predicted chrominance values are determined by the prediction function based on the first night-vision image taking into account said feature maps) [Birnhack: claim 9] and (ii) the subject detection result (detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1] as output data (The processing unit is, in this case, furthermore configured for determining the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values and to display the resulting first color image on a display within the vehicle) [Birnhack: para. 0083], the supervisory data being adapted to train a learning model such that ((training a machine learning model on the training images to predict the chrominance of the color images from the luminance of the night-vision images) [Birnhack: claim 3]; (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]) when the invisible light image is input (The prediction function is, preferably, trained to predict the predicted chrominance values based on the luminance values of the first night-vision image without taking into account color and/or chrominance values) [Birnhack: para. 0050], the learning model automatically outputs ((training a machine learning model on the training images to predict the chrominance of the color images from the luminance of the night-vision images) [Birnhack: claim 3]; (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]) a subject detection result for the invisible light image ((In particular, the night-vision images which capture energy and/or distance distributions are transformed into a visible monochromic image) [Birnhack: para. 0096]; (This method has the advantageous effect that the generated first color image retains the sharpness of the first night-vision image, as only the chrominance is predicted but the luminance is kept from the first night-vision image.) [Birnhack: para. 0029]; (a processing unit configured for determining, by using a prediction function which maps luminance values of night-vision images to chrominance values of corresponding color images, predicted chrominance values for the luminance values of the first night-vision image) [Birnhack: claim 10]).

Birnhack does not explicitly disclose the following claim limitations (emphasis added): supervisory data to train a learning model so that the learning model for subject detection of the invisible light image outputs a subject detection result by inputting the invisible light image based on the invisible light image.

However, in the same field of endeavor, 健太 米倉 further discloses the deficient claim limitations as follows: supervisory data to train a learning model (a learning data creation unit that acquires a visible light image in a state where the visible light is irradiated onto the object, and an invisible light image in a state where the invisible light is irradiated onto the object and the light emitting unit is emitting light) [健太 米倉: page 10] so that the learning model for subject detection of the invisible light image outputs a subject detection result by inputting the invisible light image based on the invisible light image (a trained model that is trained using as learning data a training visible light image in which a training object is captured under visible light and a training invisible light image in which the training object is irradiated under invisible light and a light-emitting unit that is recognizable under invisible light and is provided on the training object is emitting light) [健太 米倉: page 10].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Birnhack with 健太 米倉 to program the system to implement 健太 米倉's method. Therefore, the combination of Birnhack with 健太 米倉 will enable the system to improve the estimation ability of the trained model and improve the accuracy [健太 米倉: page 5].

Regarding claim 13, Birnhack meets the claim limitations as follows:

A non-transitory computer-readable storage medium storing a program for causing a computer to function (prediction function is loaded into the memory associated with the processor) [Birnhack: para. 0115] as a data generation apparatus (a method as well as a system for automatically colorizing night-vision images) [Birnhack: para. 0002] comprising:

a first image acquiring unit ((a color camera) [Birnhack: para. 0096]; (several cameras or imaging sensors are helpful for capturing the different color and night-vision channels) [Birnhack: para. 0014]) that acquires a visible light image (a color image, preferably taken under daylight or using visible illumination) [Birnhack: para. 0076];

a second image acquiring unit ((an NIR camera) [Birnhack: para. 0096]; (several cameras or imaging sensors are helpful for capturing the different color and night-vision channels) [Birnhack: para. 0014]) that acquires an invisible light image (Night-vision images capture invisible radiation of objects) [Birnhack: para. 0003] that corresponds temporally to the visible light image and has an identical field of view to the visible light image ((When capturing training images, at least the color images need to be captured with visible light. One option for capturing the pairs of training images, therefore, is to simultaneously capture a night-vision image and a color image under day-light conditions, for example using a night-vision camera and a "regular" color camera, which are positioned next to each other, with the two cameras having parallel optical axes. Preferably, the distance between the two cameras is small, such as between 1 cm and 10 cm, preferable under 5 cm. Thus, the night-vision image and the corresponding color image captured this way contain substantially identical regions of interest) [Birnhack: para. 0059] – Note: Birnhack teaches that the night-vision camera and the "regular" color camera in his system can be only 1 cm apart. They can simultaneously take images. As a result, they can capture a substantially identical field of view);

a subject detection unit (a processor) [Birnhack: para. 0096] that automatically detects at least one subject in the visible light image without user intervention ((detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1]; (Each of the selected prediction functions f1Pred, f2Pred then determines chrominance values for the input luminance values by scanning S602-1, S602-2, using a grid or tiling of the input image or respective luminance, the luminance values of the input image in order to detect objects) [Birnhack: para. 0121] – Note: Birnhack teaches that the prediction functions detect the objects without user intervention) and outputs (output color image by combining the luminance values of the first night-vision image with the predicted chrominance values) [Birnhack: para. 0025], as a subject detection result (The processing unit is, in this case, furthermore configured for determining the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values and to display the resulting first color image on a display within the vehicle) [Birnhack: para. 0083], positional and/or class information of the detected subject (Identifying features such as cars, people, trees, traffic signs, general road-side signs, different vehicle classes, animals etc. in a night-vision input image may be beneficial, as features often have specific associated color schemes.) [Birnhack: para. 0066]; and

a supervisory data generation unit (a processor) [Birnhack: para. 0096] that generates (This method has the advantageous effect that the generated first color image retains the sharpness of the first night-vision image, as only the chrominance is predicted but the luminance is kept from the first night-vision image.) [Birnhack: para. 0029] supervisory data including a pair of ((an NIR camera 201-1 as well as a color camera 201-2 are provided and connected to a processor 203. The processor 203 is configured for receiving and images from either of the cameras) [Birnhack: para. 0096]; (training images is obtained by simultaneously, using daylight or artificial light, capturing a night-vision image and a color image containing overlapping regions of interest) [Birnhack: claim 6]; (the two respective images of a training pair are taken right after another, e.g. within a few seconds, preferably within less than one second, for example within just a few milliseconds) [Birnhack: para. 0061]; (Alternatively, the night-vision image and the color image of a training pair may be captured simultaneously using a multi-chip camera, which can simultaneously capture, e.g., an infrared and a color image using the same camera lens.) [Birnhack: para. 0060]) (i) the invisible light image as input data (the prediction function comprises identifying feature maps based on characteristic shapes and shading of the pairs of training images and wherein the predicted chrominance values are determined by the prediction function based on the first night-vision image taking into account said feature maps) [Birnhack: claim 9] and (ii) the subject detection result (detect local patterns in the input image) [Birnhack: para. 0095; Fig. 1] as output data (The processing unit is, in this case, furthermore configured for determining the first color image by combining, or "stitching", the luminance values of the first night-vision image with the predicted chrominance values and to display the resulting first color image on a display within the vehicle) [Birnhack: para. 0083], the supervisory data being adapted to train a learning model such that ((training a machine learning model on the training images to predict the chrominance of the color images from the luminance of the night-vision images) [Birnhack: claim 3]; (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]) when the invisible light image is input (The prediction function is, preferably, trained to predict the predicted chrominance values based on the luminance values of the first night-vision image without taking into account color and/or chrominance values) [Birnhack: para. 0050], the learning model automatically outputs ((training a machine learning model on the training images to predict the chrominance of the color images from the luminance of the night-vision images) [Birnhack: claim 3]; (Each of the prediction functions stored in the memory associated with the processor can be adaptively re-trained) [Birnhack: para. 0123]) a subject detection result for the invisible light image ((In particular, the night-vision images which capture energy and/or distance distributions are transformed into a visible monochromic image) [Birnhack: para. 0096]; (This method has the advantageous effect that the generated first color image retains the sharpness of the first night-vision image, as only the chrominance is predicted but the luminance is kept from the first night-vision image.) [Birnhack: para. 0029]; (a processing unit configured for determining, by using a prediction function which maps luminance values of night-vision images to chrominance values of corresponding color images, predicted chrominance values for the luminance values of the first night-vision image) [Birnhack: claim 10]).

Birnhack does not explicitly disclose the following claim limitations (emphasis added): a supervisory data generation unit that generates supervisory data to train a learning model so that the learning model for subject detection of the invisible light image outputs a subject detection result by inputting the invisible light image based on the invisible light image.

However, in the same field of endeavor, 健太 米倉 further discloses the deficient claim limitations as follows: a supervisory data generation unit that generates supervisory data to train a learning model (a learning data creation unit that acquires a visible light image in a state where the visible light is irradiated onto the object, and an invisible light image in a state where the invisible light is irradiated onto the object and the light emitting unit is emitting light) [健太 米倉: page 10] so that the learning model for subject detection of the invisible light image outputs a subject detection result by inputting the invisible light image based on the invisible light image (a trained model that is trained using as learning data a training visible light image in which a training object is captured under visible light and a training invisible light image in which the training object is irradiated under invisible light and a light-emitting unit that is recognizable under invisible light and is provided on the training object is emitting light) [健太 米倉: page 10].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Birnhack with 健太 米倉 to program the system to implement 健太 米倉's method. Therefore, the combination of Birnhack with 健太 米倉 will enable the system to improve the estimation ability of the trained model and improve the accuracy [健太 米倉: page 5].

Reference Notice

Additional prior art, included in the Notice of References Cited, made of record and not relied upon, is considered pertinent to applicant's disclosure.

Contact Information

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Philip Dang, whose telephone number is (408) 918-7529. The examiner can normally be reached Monday-Thursday between 8:30 am and 5:00 pm (PST).

Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Sath Perungavoor, can be reached on 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Philip P. Dang/
Primary Examiner, Art Unit 2488
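Stripped of the claim-mapping apparatus, the claimed data-generation loop is simple enough to sketch. The following is a minimal illustration of claim 1's flow, not Canon's implementation: a hypothetical `detect_subjects` model runs on the visible frame, and the temporally corresponding invisible-light frame is paired with that detection result as supervisory data, with a luminance gate loosely in the spirit of claims 3-5 (the threshold and field names are assumptions for illustration).

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class Detection:
    box: tuple[int, int, int, int]  # x, y, w, h in image coordinates
    label: str                      # class information, e.g. "person", "car"

@dataclass
class TrainingPair:
    input_image: np.ndarray  # invisible-light (e.g. NIR) frame
    target: list[Detection]  # detection result transferred from the visible frame

def generate_supervisory_data(
    visible: np.ndarray,
    invisible: np.ndarray,
    detect_subjects: Callable[[np.ndarray], list[Detection]],
    min_nir_luminance: float = 10.0,
) -> TrainingPair | None:
    """Claim-1-style supervisory data generation (sketch).

    The two frames are assumed temporally corresponding with an identical
    field of view, so boxes found in the visible frame can be reused as
    labels for the invisible frame. The luminance gate mirrors the claim-5
    idea of checking invisible-image luminance in the detected region
    before deciding the pair is usable (claim 3's determination unit).
    """
    detections = detect_subjects(visible)  # automatic, no user intervention
    if not detections:
        return None
    for d in detections:
        x, y, w, h = d.box
        region = invisible[y:y + h, x:x + w]
        if region.size == 0 or region.mean() < min_nir_luminance:
            return None  # detected region too dark in NIR; skip this pair
    return TrainingPair(input_image=invisible, target=detections)
```

A model trained on many such pairs can then, as the amended claims put it, output a subject detection result when only the invisible light image is input.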
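For contrast, Birnhack's colorization (the primary reference) keeps the night-vision luminance and predicts only the chrominance, stitching the two into a color image; that is why the examiner turns to 健太 米倉 for the supervisory-data limitation. A sketch of that stitching, with `predict_chroma` as a hypothetical stand-in for the trained prediction function (assumed to map an HxW luminance array to HxWx2 Cb/Cr values in 8-bit full range):

```python
import numpy as np

def colorize_night_vision(nir_luma: np.ndarray, predict_chroma) -> np.ndarray:
    """Birnhack-style colorization (sketch): keep the NIR luminance,
    predict only the chrominance, and stitch them into a YCbCr image.
    This is why the output retains the sharpness of the NIR input."""
    chroma = predict_chroma(nir_luma)  # predicted Cb, Cr planes
    ycbcr = np.dstack([nir_luma, chroma[..., 0], chroma[..., 1]])
    return ycbcr_to_rgb(ycbcr)

def ycbcr_to_rgb(ycbcr: np.ndarray) -> np.ndarray:
    """Standard full-range YCbCr -> RGB conversion (BT.601 coefficients)."""
    y = ycbcr[..., 0]
    cb = ycbcr[..., 1] - 128.0
    cr = ycbcr[..., 2] - 128.0
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.clip(np.dstack([r, g, b]), 0, 255).astype(np.uint8)
```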

Prosecution Timeline

Oct 23, 2023: Application Filed
Apr 24, 2025: Non-Final Rejection — §103, §112
Jul 29, 2025: Response Filed
Sep 30, 2025: Final Rejection — §103, §112
Nov 21, 2025: Response after Non-Final Action
Dec 11, 2025: Request for Continued Examination
Dec 19, 2025: Response after Non-Final Action
Jan 09, 2026: Examiner Interview (Telephonic)
Jan 14, 2026: Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602837: ON SUB-DIVISION OF MESH SEQUENCES (granted Apr 14, 2026; 2y 5m to grant)
Patent 12593116: IMAGING MEASUREMENT DEVICE USING GAS ABSORPTION IN THE MID-INFRARED BAND AND OPERATING METHOD OF IMAGING MEASUREMENT DEVICE (granted Mar 31, 2026; 2y 5m to grant)
Patent 12581069: METHOD FOR ENCODING/DECODING VIDEO SIGNAL, AND APPARATUS THEREFOR (granted Mar 17, 2026; 2y 5m to grant)
Patent 12581106: IMAGE DECODING METHOD AND DEVICE THEREFOR (granted Mar 17, 2026; 2y 5m to grant)
Patent 12574557: SCALABLE VIDEO CODING USING BASE-LAYER HINTS FOR ENHANCEMENT LAYER MOTION PARAMETERS (granted Mar 10, 2026; 2y 5m to grant)

Based on this examiner's 5 most recent grants; study what changed to get past this examiner.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 77% (99% with interview, +33.2%)
Median Time to Grant: 2y 10m
PTA Risk: High

Based on 470 resolved cases by this examiner. Grant probability is derived from the career allow rate.
