DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This is an initial Office action in response to the communication filed on March 19, 2024.
Claims 1-20 are pending.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claim(s) 1, 4, 6-8, 11, 13-15, 18 and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by McDonagh et al. (U.S. Pub. No. 2021/0006760 A1, hereinafter “McDonagh”).
With regard to claim 1, the claim is drawn to a computer-implemented method for white balancing images (see McDonagh, e.g., fig. 5 and paras. 72-78, which disclose an example of an architecture including a camera configured to perform AWB according to the present disclosure), the method comprising:
obtaining labeled red, green, and blue (RGB) image samples captured by a plurality of cameras (see McDonagh, e.g., fig. 4, step 401, which discloses “acquiring a set of images, each image having been captured by a respective camera, the set of images as a whole comprising images captured by multiple cameras”, and para. 46, which discloses that “[0046] Consider an RGB image I that has been captured with camera C under a light source of unknown colour. The objective is to estimate a global illuminant correction vector ρ=[r, g, b] such that the corrected image I* appears identical to a canonical image (i.e. an image captured under a white light source). While a scene may contain multiple illuminants, the standard simplifying assumption is followed and a single global illuminant correction is determined per image.”);
generating a plurality of training tasks, wherein a respective training task is associated with RGB image samples captured by a corresponding camera (see McDonagh, e.g., fig. 4, step 402, and para. 70, which disclose that “a set of tasks are formed by assigning each image of the set of images to a respective task such that images in the same task have in common that a property of those images lies in a predetermined range.”);
performing meta-training over the plurality of training tasks to obtain a meta model, wherein parameters of the meta model are optimized based on a global loss function (see McDonagh, e.g., para. 72, which discloses that “The camera also comprises a memory 3, a processor 4 and a transceiver 5. The memory stores in non-transient form code that can be run by the processor 4. In some implementations, that code may include a meta-learning algorithm as described above. The algorithm may include code that is directly executable by the processor and/or parameters such as neural network weightings which are not directly executable instructions but serve to configure other executable code that is stored in the memory 3. The transceiver 5 may be capable of transmitting and receiving data over either or both of wired and wireless communication channels. For example, it may support Ethernet, IEEE 802.11B and/or a cellular protocol such as 4G or 5G”; see also para. 65 for teachings of the loss function);
obtaining an image captured by a first camera (see McDonagh, e.g., para. 15, which discloses “[0015] According to a first aspect of the present disclosure, there is provided a processing entity configured to generate a model for estimating scene illumination colour for a source image captured by a camera, by the steps of: acquiring a set of images, each image having been captured by a respective camera …”);
fine-tuning the meta model using labeled RGB image samples captured by the first camera to obtain a fine-tuned model specific to the first camera (see McDonagh, e.g., para. 15, which discloses “[0015] According to a first aspect of the present disclosure, there is provided a processing entity configured to generate a model for estimating scene illumination colour for a source image captured by a camera, by the steps of: acquiring a set of images, each image having been captured by a respective camera, the set of images as a whole comprising images captured by multiple cameras; forming a set of tasks by assigning each image of the set of images to a respective task such that images in the same task have in common that a property of those images lies in a predetermined range; training parameters of the model by repeatedly: selecting at least one of the tasks, forming an interim set of model parameters in dependence on a first subset of the images of that task, estimating the quality of the interim set of model parameters against a second subset of the images of that task and updating the parameters of the model in dependence on the interim set of parameters and the estimated quality. This may allow scene illuminants to be inferred accurately for image sensors without access to large training data by enabling the generation of models capable of fast task-adaption”); and
implementing the fine-tuned model to white balance the image (see McDonagh, e.g., para. 57, which discloses that “[0057] In one possible approach for generating the task distribution, the camera type is defined as a task, as described above. This would normally require a substantial amount of camera specific data to provide enough task diversity for training. In addition, it would be expected to observe large variability in illuminant correction within one camera dataset, due to both scenes and light source diversity. Achieving good performance on tasks containing too much diversity is difficult, especially when each camera specific model will be fine-tuned in a few-shot setting. Therefore, in a preferred example, each camera task may also be associated with a set of subtasks in which the RGB illuminant corrections are clustered. Gamut based colour constancy methods assume that the colour of the illuminant is constrained by the colors observed in the image. A similar hypothesis is used when defining the subtasks and it is aimed to regroup images with similar dominant colors in the same task.”).
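To aid the reader in following the meta-training and fine-tuning limitations mapped above, the following is a minimal first-order MAML-style sketch for illuminant regression. It is offered only as an illustration of the general technique discussed in the cited portions of McDonagh; every name, hyperparameter, and the toy linear model are hypothetical and are not asserted to be the applicant's or McDonagh's actual implementation.

    # Minimal sketch of MAML-style meta-training and per-camera fine-tuning
    # for illuminant regression, loosely following the scheme quoted from
    # McDonagh para. 65. All names (model, tasks, alpha, beta) are
    # hypothetical illustrations, not the implementation of record.
    import numpy as np

    def angular_loss(pred, gt):
        # Angle between predicted and ground-truth illuminant vectors (radians).
        cos = np.dot(pred, gt) / (np.linalg.norm(pred) * np.linalg.norm(gt))
        return np.arccos(np.clip(cos, -1.0, 1.0))

    class LinearIlluminantModel:
        # Toy linear regressor: features -> RGB illuminant correction vector.
        def __init__(self, n_features, rng):
            self.theta = rng.normal(scale=0.01, size=(n_features, 3))

        def predict(self, x, theta=None):
            w = self.theta if theta is None else theta
            return x @ w

    def inner_update(model, x, y, alpha=0.1):
        # One gradient step on a task's support set (squared error for simplicity).
        pred = model.predict(x)
        grad = 2.0 * x.T @ (pred - y) / len(x)
        return model.theta - alpha * grad          # task-specific parameters

    def meta_train(model, tasks, beta=0.01, iters=100):
        # Outer loop: adapt on each task, evaluate on its query set, and move
        # the shared initialization toward parameters that adapt well
        # (first-order approximation of the MAML meta-gradient).
        for _ in range(iters):
            meta_grad = np.zeros_like(model.theta)
            for (x_s, y_s), (x_q, y_q) in tasks:
                theta_i = inner_update(model, x_s, y_s)
                pred_q = x_q @ theta_i
                meta_grad += 2.0 * x_q.T @ (pred_q - y_q) / len(x_q)
            model.theta -= beta * meta_grad / len(tasks)

    rng = np.random.default_rng(0)

    def make_task():
        # Two synthetic "camera" tasks: (support set, query set) of feature/label pairs.
        x = rng.normal(size=(8, 4))
        w = rng.normal(size=(4, 3))
        return (x[:4], x[:4] @ w), (x[4:], x[4:] @ w)

    tasks = [make_task() for _ in range(2)]
    model = LinearIlluminantModel(4, rng)
    meta_train(model, tasks)
    # Few-shot fine-tuning on a "new camera": one inner update on its labeled samples.
    (x_new, y_new), _ = make_task()
    theta_new = inner_update(model, x_new, y_new)
    print(angular_loss(model.predict(x_new[:1], theta_new)[0], y_new[0]))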
With regard to claim 4, the claim is drawn to the computer-implemented method of claim 1, wherein the respective training task comprises a regression task based on a camera-specific loss function (see McDonagh, e.g., para. 65, which discloses that “…where α is the learning rate parameter and L_{τ_i}(ƒ_θ) is the regression loss function as described in Equation (3). Finally, a new set of meta-test images are sampled from the same task τ_i. For each task in the batch, the meta-test loss function L_{τ_i}(ƒ_{θ_i}) is computed using the task-specific updated parameters.”).
With regard to claim 6, the claim is drawn to the computer-implemented method of claim 1, wherein the first camera comprises a new camera not included in the plurality of cameras (see McDonagh, e.g., para. 80, which discloses that “[0080] The present disclosure solves the problem of inferring scene illuminants accurately for image sensors without access to large training data. It allows for the generation of models capable of fast task-adaption, allowing illuminant inference for new camera sensors using very few training images, typically 1+ orders of magnitude fewer than typical imagery for this task.”).
With regard to claim 7, the claim is drawn to the computer-implemented method of claim 1, wherein performing the meta-training comprises batch training, and wherein a respective batch comprises multiple randomly selected training tasks (see McDonagh, e.g., para. 65, which discloses that “[0065] Considering the set of tasks τ as defined in Equation (5), each MAML iteration samples a batch of tasks τ_i. As shown at 24 in FIG. 2(b) (the ‘inner update’), for each task, K meta-training images are randomly sampled and used to train model ƒ_θ with original parameters θ for n standard gradient descent updates. The model's parameters θ are updated to be task-specific parameters θ_i:”).
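The quotation above ends where the reference's equation would appear. For reference only, the generic MAML inner and outer updates (after Finn et al., which the quoted passage paraphrases) take the following standard form; this is the textbook formulation, not necessarily the reference's exact equation:

    θ_i = θ − α ∇_θ L_{τ_i}(ƒ_θ)                (inner, task-specific update)
    θ ← θ − β ∇_θ Σ_{τ_i} L_{τ_i}(ƒ_{θ_i})      (outer, meta update)

where α and β are the inner and outer learning rates, respectively.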
With regard to claim 8, the claim is drawn to a non-transitory computer readable storage medium storing instructions which, when executed by a processor, cause the processor to perform a method for white balancing images (see McDonagh, e.g., fig. 5 and paras. 72-78, which disclose an example of an architecture including a camera configured to perform AWB according to the present disclosure; and further para. 72, which discloses that “[0072] FIG. 5 shows an example of an architecture including a camera that uses the model to perform AWB. A camera 1 is connected to a communications network. Camera 1 comprises an image sensor 2. The camera also comprises a memory 3, a processor 4 and a transceiver 5. The memory stores in non-transient form code that can be run by the processor 4. In some implementations, that code may include a meta-learning algorithm as described above. The algorithm may include code that is directly executable by the processor and/or parameters such as neural network weightings which are not directly executable instructions but serve to configure other executable code that is stored in the memory 3. The transceiver 5 may be capable of transmitting and receiving data over either or both of wired and wireless communication channels. For example, it may support Ethernet, IEEE 802.11B and/or a cellular protocol such as 4G or 5G…”), the method comprising:
obtaining labeled red, green, and blue (RGB) image samples captured by a plurality of cameras (see McDonagh, e.g., fig. 4, step 401, which discloses “acquiring a set of images, each image having been captured by a respective camera, the set of images as a whole comprising images captured by multiple cameras”, and para. 46, which discloses that “[0046] Consider an RGB image I that has been captured with camera C under a light source of unknown colour. The objective is to estimate a global illuminant correction vector ρ=[r, g, b] such that the corrected image I* appears identical to a canonical image (i.e. an image captured under a white light source). While a scene may contain multiple illuminants, the standard simplifying assumption is followed and a single global illuminant correction is determined per image.”);
generating a plurality of training tasks, wherein a respective training task is associated with RGB image samples captured by a corresponding camera (see McDonagh, e.g., fig. 4, step 402, and para. 70, which disclose that “a set of tasks are formed by assigning each image of the set of images to a respective task such that images in the same task have in common that a property of those images lies in a predetermined range.”);
performing meta-training over the plurality of training tasks to obtain a meta model, wherein parameters of the meta model are optimized based on a global loss function (see McDonagh, e.g., para. 72, which discloses that “The camera also comprises a memory 3, a processor 4 and a transceiver 5. The memory stores in non-transient form code that can be run by the processor 4. In some implementations, that code may include a meta-learning algorithm as described above. The algorithm may include code that is directly executable by the processor and/or parameters such as neural network weightings which are not directly executable instructions but serve to configure other executable code that is stored in the memory 3. The transceiver 5 may be capable of transmitting and receiving data over either or both of wired and wireless communication channels. For example, it may support Ethernet, IEEE 802.11B and/or a cellular protocol such as 4G or 5G”; see also para. 65 for teachings of the loss function);
obtaining an image captured by a first camera (see McDonagh, e.g., para. 15, which discloses “[0015] According to a first aspect of the present disclosure, there is provided a processing entity configured to generate a model for estimating scene illumination colour for a source image captured by a camera, by the steps of: acquiring a set of images, each image having been captured by a respective camera …”);
fine-tuning the meta model using labeled RGB image samples captured by the first camera to obtain a fine-tuned model specific to the first camera (see McDonagh, e.g., para. 15, which discloses “[0015] According to a first aspect of the present disclosure, there is provided a processing entity configured to generate a model for estimating scene illumination colour for a source image captured by a camera, by the steps of: acquiring a set of images, each image having been captured by a respective camera, the set of images as a whole comprising images captured by multiple cameras; forming a set of tasks by assigning each image of the set of images to a respective task such that images in the same task have in common that a property of those images lies in a predetermined range; training parameters of the model by repeatedly: selecting at least one of the tasks, forming an interim set of model parameters in dependence on a first subset of the images of that task, estimating the quality of the interim set of model parameters against a second subset of the images of that task and updating the parameters of the model in dependence on the interim set of parameters and the estimated quality. This may allow scene illuminants to be inferred accurately for image sensors without access to large training data by enabling the generation of models capable of fast task-adaption”); and
implementing the fine-tuned model to white balance the image (see McDonagh, e.g., para. 57, which discloses that “[0057] In one possible approach for generating the task distribution, the camera type is defined as a task, as described above. This would normally require a substantial amount of camera specific data to provide enough task diversity for training. In addition, it would be expected to observe large variability in illuminant correction within one camera dataset, due to both scenes and light source diversity. Achieving good performance on tasks containing too much diversity is difficult, especially when each camera specific model will be fine-tuned in a few-shot setting. Therefore, in a preferred example, each camera task may also be associated with a set of subtasks in which the RGB illuminant corrections are clustered. Gamut based colour constancy methods assume that the colour of the illuminant is constrained by the colors observed in the image. A similar hypothesis is used when defining the subtasks and it is aimed to regroup images with similar dominant colors in the same task.”).
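As a further aid, the global illuminant correction described in the McDonagh passage quoted above at para. 46 amounts to a per-channel (von Kries-style) scaling of the image by the inverse of the estimated illuminant vector ρ. A minimal hedged sketch of that correction step follows; the names are illustrative only and are not asserted to be the reference's exact implementation.

    # Hedged sketch of applying a global illuminant correction vector
    # ρ = [r, g, b] to an RGB image, in the von Kries spirit of McDonagh
    # para. 46: each channel is scaled so that the corrected image
    # approximates a canonical (white-light) capture. Names illustrative only.
    import numpy as np

    def apply_white_balance(image, rho):
        """image: (H, W, 3) linear RGB; rho: estimated illuminant [r, g, b]."""
        gains = 1.0 / np.asarray(rho, dtype=float)   # invert the illuminant colour
        gains /= gains[1]                            # normalize to the green channel
        return np.clip(image * gains, 0.0, 1.0)

    img = np.full((2, 2, 3), 0.5)
    corrected = apply_white_balance(img, rho=[0.8, 1.0, 1.3])  # warm illuminant
    print(corrected[0, 0])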
With regard to claim 11, the claim is drawn to the non-transitory computer readable storage medium of claim 8, wherein the respective training task comprises a regression task based on a camera-specific loss function (see McDonagh, e.g., para. 65, which discloses that “…where α is the learning rate parameter and L_{τ_i}(ƒ_θ) is the regression loss function as described in Equation (3). Finally, a new set of meta-test images are sampled from the same task τ_i. For each task in the batch, the meta-test loss function L_{τ_i}(ƒ_{θ_i}) is computed using the task-specific updated parameters.”).
With regard to claim 13, the claim is drawn to the non-transitory computer readable storage medium of claim 8, wherein the first camera comprises a new camera not included in the plurality of cameras (see McDonagh, e.g., para. 80, which discloses that “[0080] The present disclosure solves the problem of inferring scene illuminants accurately for image sensors without access to large training data. It allows for the generation of models capable of fast task-adaption, allowing illuminant inference for new camera sensors using very few training images, typically 1+ orders of magnitude fewer than typical imagery for this task.”).
With regard to claim 14, the claim is drawn to the non-transitory computer readable storage medium of claim 8, wherein performing the meta-training comprises batch training, and wherein a respective batch comprises multiple randomly selected training tasks (see McDonagh, e.g., para. 65, which discloses that “[0065] Considering the set of tasks τ as defined in Equation (5), each MAML iteration samples a batch of tasks τ_i. As shown at 24 in FIG. 2(b) (the ‘inner update’), for each task, K meta-training images are randomly sampled and used to train model ƒ_θ with original parameters θ for n standard gradient descent updates. The model's parameters θ are updated to be task-specific parameters θ_i:”).
With regard to claim 15, the claim is drawn to a computer system (see McDonagh, e.g., fig. 5 and paras. 72-78, which disclose an example of an architecture including a camera configured to perform AWB according to the present disclosure), comprising:
a processor (see McDonagh, e.g., fig. 5 and para. 72, processor 4); and
a storage device coupled to the processor, wherein the storage device stores instructions which, when executed by the processor, cause the processor to perform a method for white balancing images (see McDonagh, e.g., fig. 5 and para. 72, memory 3, and “[0072] FIG. 5 shows an example of an architecture including a camera that uses the model to perform AWB. A camera 1 is connected to a communications network. Camera 1 comprises an image sensor 2. The camera also comprises a memory 3, a processor 4 and a transceiver 5. The memory stores in non-transient form code that can be run by the processor 4. In some implementations, that code may include a meta-learning algorithm as described above. The algorithm may include code that is directly executable by the processor and/or parameters such as neural network weightings which are not directly executable instructions but serve to configure other executable code that is stored in the memory 3. The transceiver 5 may be capable of transmitting and receiving data over either or both of wired and wireless communication channels. For example, it may support Ethernet, IEEE 802.11B and/or a cellular protocol such as 4G or 5G.”), the method comprising:
obtaining labeled red, green, and blue (RGB) image samples captured by a plurality of cameras (see McDonagh, e.g., fig. 4, step 401, which discloses “acquiring a set of images, each image having been captured by a respective camera, the set of images as a whole comprising images captured by multiple cameras”, and para. 46, which discloses that “[0046] Consider an RGB image I that has been captured with camera C under a light source of unknown colour. The objective is to estimate a global illuminant correction vector ρ=[r, g, b] such that the corrected image I* appears identical to a canonical image (i.e. an image captured under a white light source). While a scene may contain multiple illuminants, the standard simplifying assumption is followed and a single global illuminant correction is determined per image.”);
generating a plurality of training tasks, wherein a respective training task is associated with RGB image samples captured by a corresponding camera (see McDonagh, e.g., fig. 4, step 402, and para. 70, which disclose that “a set of tasks are formed by assigning each image of the set of images to a respective task such that images in the same task have in common that a property of those images lies in a predetermined range.”);
performing meta-training over the plurality of training tasks to obtain a meta model, wherein parameters of the meta model are optimized based on a global loss function (see McDonagh, e.g., para. 72, which discloses that “The camera also comprises a memory 3, a processor 4 and a transceiver 5. The memory stores in non-transient form code that can be run by the processor 4. In some implementations, that code may include a meta-learning algorithm as described above. The algorithm may include code that is directly executable by the processor and/or parameters such as neural network weightings which are not directly executable instructions but serve to configure other executable code that is stored in the memory 3. The transceiver 5 may be capable of transmitting and receiving data over either or both of wired and wireless communication channels. For example, it may support Ethernet, IEEE 802.11B and/or a cellular protocol such as 4G or 5G”; see also para. 65 for teachings of the loss function);
obtaining an image captured by a first camera (see McDonagh, e.g., para. 15, which discloses “[0015] According to a first aspect of the present disclosure, there is provided a processing entity configured to generate a model for estimating scene illumination colour for a source image captured by a camera, by the steps of: acquiring a set of images, each image having been captured by a respective camera …”);
fine-tuning the meta model using labeled RGB image samples captured by the first camera to obtain a fine-tuned model specific to the first camera (see McDonagh, e.g., para. 15, which discloses “[0015] According to a first aspect of the present disclosure, there is provided a processing entity configured to generate a model for estimating scene illumination colour for a source image captured by a camera, by the steps of: acquiring a set of images, each image having been captured by a respective camera, the set of images as a whole comprising images captured by multiple cameras; forming a set of tasks by assigning each image of the set of images to a respective task such that images in the same task have in common that a property of those images lies in a predetermined range; training parameters of the model by repeatedly: selecting at least one of the tasks, forming an interim set of model parameters in dependence on a first subset of the images of that task, estimating the quality of the interim set of model parameters against a second subset of the images of that task and updating the parameters of the model in dependence on the interim set of parameters and the estimated quality. This may allow scene illuminants to be inferred accurately for image sensors without access to large training data by enabling the generation of models capable of fast task-adaption”); and
implementing the fine-tuned model to white balance the image (see McDonagh, e.g., para. 57, which discloses that “[0057] In one possible approach for generating the task distribution, the camera type is defined as a task, as described above. This would normally require a substantial amount of camera specific data to provide enough task diversity for training. In addition, it would be expected to observe large variability in illuminant correction within one camera dataset, due to both scenes and light source diversity. Achieving good performance on tasks containing too much diversity is difficult, especially when each camera specific model will be fine-tuned in a few-shot setting. Therefore, in a preferred example, each camera task may also be associated with a set of subtasks in which the RGB illuminant corrections are clustered. Gamut based colour constancy methods assume that the colour of the illuminant is constrained by the colors observed in the image. A similar hypothesis is used when defining the subtasks and it is aimed to regroup images with similar dominant colors in the same task.”).
With regard to claim 18, the claim is drawn to the computer system of claim 15, wherein the respective training task comprises a regression task based on a camera-specific loss function (see McDonagh, e.g., para. 65, which discloses that “…where α is the learning rate parameter and L_{τ_i}(ƒ_θ) is the regression loss function as described in Equation (3). Finally, a new set of meta-test images are sampled from the same task τ_i. For each task in the batch, the meta-test loss function L_{τ_i}(ƒ_{θ_i}) is computed using the task-specific updated parameters.”).
With regard to claim 20, the claim is drawn to the computer system of claim 15, wherein performing the meta-training comprises batch training, and wherein a respective batch comprises multiple randomly selected training tasks (see McDonagh, e.g., para. 65, which discloses that “[0065] Considering the set of tasks τ as defined in Equation (5), each MAML iteration samples a batch of tasks τ_i. As shown at 24 in FIG. 2(b) (the ‘inner update’), for each task, K meta-training images are randomly sampled and used to train model ƒ_θ with original parameters θ for n standard gradient descent updates. The model's parameters θ are updated to be task-specific parameters θ_i:”).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 2-3, 9-10 and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over McDonagh as applied to claims 1, 8 and 15 above, and further in view of Caron et al. (U.S. Pub. No. 2021/0250565 A1, hereinafter “Caron”).
With regard to claim 2, the claim is drawn to the computer-implemented method of claim 1, further comprising extracting features from each RGB image sample by computing a two-dimensional (2D) log-chrominance histogram (see McDonagh, e.g., para. 59, which discloses that a histogram H is computed containing M bins of CCT values for camera s, and that each task is defined as the set of images in each histogram bin).
McDonagh, however, does not explicitly disclose the aspect of a “two-dimensional (2D) log-chrominance histogram”.
However, Caron discloses an analogous invention relating to image capture devices and image processing (see Caron, e.g., para. 2). More specifically, Caron, e.g., in paras. 19-20, discloses that “[0019] Implementations of this disclosure address problems using AWB processing to improve AWB performance, speed, and flexibility by using scene information. The embodiments disclosed herein include a learning step using labeled images with reference AWB results and additional information, including, for example, metadata, scene information, or both. The additional information may be obtained from the same frame or past frames. Learned parameters may include two-dimensional (2D) filters that may be used to find the log-chrominance representation of the light cast on the scene in a raw image. This position may be used for image white balance correction. [0020] The scene information (i.e., scene classification output) may be obtained using a scene classification algorithm that may be separate or a part of a learning-based AWB algorithm. The scene classification output may be used to improve the accuracy of the AWB correction. The scene classification output may be used to learn scene-specific filters. The scene classification output may disambiguate some cases of metamerisms frequently encountered in photography and videography. The scene-based AWB algorithm may be implemented in an image capture device. Computational complexity may be kept low by combining filters before convolution with log-chrominance histograms.”
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified McDonagh to incorporate the limitation(s) discussed above and taught by Caron, as the cited references are analogous art, if not in the same field of endeavor, both relating to automatic white balance and image processing. One of ordinary skill in the art would have been motivated to make this modification “…to improve AWB performance, speed, and flexibility by using scene information” (see Caron, e.g., para. 19).
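As context for the CCT-histogram task definition cited from McDonagh at para. 59 above, a minimal sketch of binning images by correlated colour temperature (CCT) to form tasks might look as follows. The function and variable names are hypothetical illustrations, not the reference's actual implementation.

    # Hypothetical sketch of task formation by binning images on correlated
    # colour temperature (CCT), as the cited para. 59 describes: a histogram
    # H with M bins of CCT values, each task being the set of images in one bin.
    import numpy as np

    def form_cct_tasks(cct_values, image_ids, n_bins=8):
        """Group image ids into tasks by which CCT histogram bin they fall in."""
        edges = np.histogram_bin_edges(cct_values, bins=n_bins)
        bin_idx = np.clip(np.digitize(cct_values, edges) - 1, 0, n_bins - 1)
        tasks = {b: [] for b in range(n_bins)}
        for img, b in zip(image_ids, bin_idx):
            tasks[b].append(img)
        return tasks

    ccts = np.array([2850.0, 3200.0, 5000.0, 6500.0, 6600.0, 7500.0])
    print(form_cct_tasks(ccts, ["a", "b", "c", "d", "e", "f"], n_bins=4))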
With regard to claim 3, the claim is drawn to the computer-implemented method of claim 2, wherein computing the 2D log-chrominance histogram comprises computing chrominance components of each pixel in the RGB image sample based on RGB values of the pixel (see Caron, e.g., paras. 19-20, which disclose that “[0019] Implementations of this disclosure address problems using AWB processing to improve AWB performance, speed, and flexibility by using scene information. The embodiments disclosed herein include a learning step using labeled images with reference AWB results and additional information, including, for example, metadata, scene information, or both. The additional information may be obtained from the same frame or past frames. Learned parameters may include two-dimensional (2D) filters that may be used to find the log-chrominance representation of the light cast on the scene in a raw image. This position may be used for image white balance correction. [0020] The scene information (i.e., scene classification output) may be obtained using a scene classification algorithm that may be separate or a part of a learning-based AWB algorithm. The scene classification output may be used to improve the accuracy of the AWB correction. The scene classification output may be used to learn scene-specific filters. The scene classification output may disambiguate some cases of metamerisms frequently encountered in photography and videography. The scene-based AWB algorithm may be implemented in an image capture device. Computational complexity may be kept low by combining filters before convolution with log-chrominance histograms.”).
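For the reader's convenience, the 2D log-chrominance representation referenced in claims 2-3 is commonly computed in the colour constancy literature as u = log(G/R), v = log(G/B) per pixel, with the image summarized as a 2D histogram over (u, v). A minimal sketch under that common formulation follows; the names are hypothetical, and this is not asserted to be the applicant's or Caron's exact formulation.

    # Hedged sketch: per-pixel log-chrominance components and a 2D histogram,
    # using the common u = log(g/r), v = log(g/b) convention from the colour
    # constancy literature. Not asserted to be Caron's exact formulation.
    import numpy as np

    def log_chrominance_histogram(rgb, n_bins=64, lim=2.0):
        """rgb: (H, W, 3) float array of positive linear RGB values."""
        eps = 1e-6
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        u = np.log(g + eps) - np.log(r + eps)   # chrominance component 1
        v = np.log(g + eps) - np.log(b + eps)   # chrominance component 2
        hist, _, _ = np.histogram2d(u.ravel(), v.ravel(),
                                    bins=n_bins, range=[[-lim, lim], [-lim, lim]])
        return hist / hist.sum()                # normalized 2D histogram

    rng = np.random.default_rng(1)
    image = rng.uniform(0.05, 1.0, size=(32, 32, 3))
    H = log_chrominance_histogram(image)
    print(H.shape, round(float(H.sum()), 3))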
With regard to claim 9, the claim is drawn to the non-transitory computer readable storage medium of claim 8, wherein the method further comprises extracting features from each RGB image sample by computing a two-dimensional (2D) log-chrominance histogram (see McDonagh, i.e. in para. 59, disclose that a histogram H is computed containing M bins of CCT values for camera s and each task is defined as the set of images in each histogram bin; also see the discussion of claim 2 above, also incorporated by reference herein).
With regard to claim 10, the claim is drawn to the non-transitory computer readable storage medium of claim 9, wherein computing the 2D log-chrominance histogram comprises computing chrominance components of each pixel in the RGB image sample based on RGB values of the pixel (see Carson, in Carson, i.e. in para. 19-20 and etc., disclose that “[0019] Implementations of this disclosure address problems using AWB processing to improve AWB performance, speed, and flexibility by using scene information. The embodiments disclosed herein include a learning step using labeled images with reference AWB results and additional information, including, for example, metadata, scene information, or both. The additional information may be obtained from the same frame or past frames. Learned parameters may include two-dimensional (2D) filters that may be used to find the log-chrominance representation of the light cast on the scene in a raw image. This position may be used for image white balance correction. [0020] The scene information (i.e., scene classification output) may be obtained using a scene classification algorithm that may be separate or a part of a learning-based AWB algorithm. The scene classification output may be used to improve the accuracy of the AWB correction. The scene classification output may be used to learn scene-specific filters. The scene classification output may disambiguate some cases of metamerisms frequently encountered in photography and videography. The scene-based AWB algorithm may be implemented in an image capture device. Computational complexity may be kept low by combining filters before convolution with log-chrominance histograms.”).
With regard to claim 16, the claim is drawn to the computer system of claim 15, wherein the method further comprises extracting features from each RGB image sample by computing a two-dimensional (2D) log-chrominance histogram (see McDonagh, i.e. in para. 59, disclose that a histogram H is computed containing M bins of CCT values for camera s and each task is defined as the set of images in each histogram bin; also see the discussion of claim 2 above, also incorporated by reference herein).
With regard to claim 17, the claim is drawn to the computer system of claim 16, wherein computing the 2D log-chrominance histogram comprises computing chrominance components of each pixel in the RGB image sample based on RGB values of the pixel (see Carson, in Carson, i.e. in para. 19-20 and etc., disclose that “[0019] Implementations of this disclosure address problems using AWB processing to improve AWB performance, speed, and flexibility by using scene information. The embodiments disclosed herein include a learning step using labeled images with reference AWB results and additional information, including, for example, metadata, scene information, or both. The additional information may be obtained from the same frame or past frames. Learned parameters may include two-dimensional (2D) filters that may be used to find the log-chrominance representation of the light cast on the scene in a raw image. This position may be used for image white balance correction. [0020] The scene information (i.e., scene classification output) may be obtained using a scene classification algorithm that may be separate or a part of a learning-based AWB algorithm. The scene classification output may be used to improve the accuracy of the AWB correction. The scene classification output may be used to learn scene-specific filters. The scene classification output may disambiguate some cases of metamerisms frequently encountered in photography and videography. The scene-based AWB algorithm may be implemented in an image capture device. Computational complexity may be kept low by combining filters before convolution with log-chrominance histograms.”).
Allowable Subject Matter
Claims 5, 12, and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims and overcoming the corresponding rejections and/or objections (if any) set forth in the Office action above.
The following is a statement of reasons for the indication of allowable subject matter:
With regard to claim 5, the closest prior art of record, McDonagh and Caron, does not disclose or suggest, among the other limitations, the additional required limitation of “the computer-implemented method of claim 4, wherein each image sample is labeled with ground truth illumination, and wherein the camera-specific loss function measures angular loss between the ground truth illumination and estimated illumination”. These additional features, in combination with all the other features required by the claimed invention, are neither taught nor suggested by the prior art of record.
With regard to claim 12, the closest prior art of record, McDonagh and Caron, does not disclose or suggest, among the other limitations, the additional required limitation of “the non-transitory computer readable storage medium of claim 11, wherein each image sample is labeled with ground truth illumination, and wherein the camera-specific loss function measures angular loss between the ground truth illumination and estimated illumination”. These additional features, in combination with all the other features required by the claimed invention, are neither taught nor suggested by the prior art of record.
With regard to claim 19, the closest prior art of record, McDonagh and Caron, does not disclose or suggest, among the other limitations, the additional required limitation of “the computer system of claim 18, wherein each image sample is labeled with ground truth illumination, and wherein the camera-specific loss function measures angular loss between the ground truth illumination and estimated illumination”. These additional features, in combination with all the other features required by the claimed invention, are neither taught nor suggested by the prior art of record.
Therefore, claims 5, 12 and 19 are objected to.
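For completeness, the “angular loss” recited in claims 5, 12, and 19 is conventionally defined in the colour constancy literature as the angle between the ground-truth and estimated illuminant vectors:

    err(ρ_gt, ρ_est) = arccos( (ρ_gt · ρ_est) / (‖ρ_gt‖ ‖ρ_est‖) )

This standard formulation is provided for reference only; the controlling definition is that set out in the claims and specification.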
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Carson et al. (U.S. Pat/Pub No. 200) discloses an invention relating to image capture devices and image processing (see Carson, e.g., para. 2).
The Art Unit (or Workgroup) location of your application in the USPTO has changed. To aid in correlating any papers for this application, all further correspondence regarding this application should be directed to Art Unit 2681.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Jacky X. Zheng, whose telephone number is (571) 270-1122. The examiner can normally be reached Monday through Friday, 9:00 am to 5:00 pm, with alternate Fridays off.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Akwasi Sarpong can be reached on (571) 272-3438. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACKY X ZHENG/Primary Examiner, Art Unit 2681