Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 11/03/2025 has been entered.
Response to Amendment
Applicant’s amendments filed on 11/03/2025 have been entered and made of record.
Currently pending Claim(s): 1–20
Independent Claim(s): 1 and 11
Amended Claim(s): 1 and 11
Response to Applicant’s Arguments
This Office action is responsive to Applicant’s Arguments/Remarks Made in an Amendment received on 11/03/2025.
Applicant’s Reply (November 3, 2025) includes substantive amendments to the claims. This Office action has been updated with new grounds of rejection addressing those amendments. Further, Applicant’s arguments/remarks with respect to independent claims 1 and 11 have been considered but are moot because the arguments do not apply to the combination of references used in the current rejection; the amended limitations are addressed by the newly cited art, Won (KR 20200079162 A) and Stafford et al. (US 2019/0384381 A1), as explained in the body of the rejection below.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or non-obviousness.
Claim(s) 1 and 11 are rejected under 35 U.S.C. 103 as being unpatentable over Kim (US 2022/0318606 A1, hereafter, "Kim") in view of Won (KR 20200079162 A, hereafter, “Won”) further in view of Stafford et al. (US 2019/0384381 A1, hereafter, “Stafford”).
Regarding claim 1, Kim discloses an imaging system (See Kim, [Abstract], According to the disclosure of the present specification, a neural processing unit (NPU) is proposed) comprising:
an image sensor (See Kim, ¶ [0182], The input unit 1020 may include a camera 1021 for inputting a video signal); and
at least one processor configured to (See Kim, ¶ [0215], The central processing unit 1080 may control the overall operation of the edge device 1000. For example, the central processing unit 1080 may be a central processing unit (CPU), an application processor (AP), or a digital signal processor (DSP)):
obtain image data read out by the image sensor (See Kim, ¶ [0182], The input unit 1020 may include a camera 1021 for inputting a video signal);
obtain information indicative of a gaze direction of a given user (See Kim, ¶ [0408], The user's motion operation may include at least one of a user's eye gaze direction (e.g., a position of a user's pupil), a user's head direction and a head slope. In order to detect the user's eye gaze direction, a plurality of cameras 1021 may be provided); and
utilise at least one neural network to (See Kim, ¶ [0076], For example, NPU, as an abbreviation of a neural processing unit, may mean a processor that is specialized for computations of an artificial neural network model separately from a central processing unit (CPU)):
perform demosaicking on an entirety of the image data (See Kim, ¶ [0473], For example, the first layer of the ANN model may be for applying the demosaicing method for the video, and the second layer may be for applying the deblur method);
identify a gaze region and a peripheral region of the image data, based on the gaze direction of the given user (See Kim, ¶ [0453], For example, the ANN adapted to improve video quality may determine the ranking of the ROI of the user in order of a region 780 where the ROIs of the left eye and the right eye overlap with each other, the ROIs 760 and 770 of the left eye and the right eye, and the ROI 752 based on the head direction and the head slope. [FIG. 16A], 760, 770, 780, 752);
[apply a first image restoration technique solely to the gaze region of the image data, wherein the first image restoration technique comprises a deblurring technique; and
apply a second image restoration technique solely to the peripheral region of the image data, wherein the second image restoration technique comprises a denoising technique].
However, Kim fails to teach: apply a first image restoration technique solely to the gaze region of the image data, wherein the first image restoration technique comprises a deblurring technique; and apply a second image restoration technique solely to the peripheral region of the image data, wherein the second image restoration technique comprises a denoising technique.
Won, working in the same field of endeavor, teaches: apply a first image restoration technique solely to the gaze region of the image data, wherein the first image restoration technique comprises a deblurring technique (See Won, ¶ [0041], The ROI image extraction module (264) can determine the position of the user's gaze on the display device (250) that the user is looking at based on the pupil position information, and can determine the region of interest of the user based on the determined position of the user's gaze. ¶ [0045], For example, the image preprocessing operation module (272) can adjust any image parameter representing the image corresponding to the region of interest. Here, the technique for adjusting image parameters may include a technique for preprocessing an image, and may include at least one of a demosaicing technique, a wide dynamic range (WDR) or high dynamic range (HDR) technique, a deblur technique. Note: the filtering or deblurring is applied solely to the region of interest (i.e. gaze region)).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference to apply a first image restoration technique solely to the gaze region of the image data, wherein the first image restoration technique comprises a deblurring technique, based on the method of Won’s reference. The suggestion/motivation would have been to allow images to move smoothly without afterimages according to the user’s perspective and to maintain high quality in the gaze region (See Won, ¶ [0002–0011]).
However, Kim and Won fail to teach: apply a second image restoration technique solely to the peripheral region of the image data, wherein the second image restoration technique comprises a denoising technique.
Stafford, working in the same field of endeavor, teaches: apply a second image restoration technique solely to the peripheral region of the image data, wherein the second image restoration technique comprises a denoising technique (See Stafford, ¶ [0023], There are a number of approaches to filtering to get rid of high contrast aliasing and pixilation. One approach is to change the color mapping used in the low resolution peripheral region to compress the contrast in the color space in the periphery. Another way is to keep the color mapping but to filter the image with a standard bilinear or Gaussian filter to reduce the contrast in the periphery. The idea behind this approach is to generate a high resolution image but with reduced computation to generate the peripheral regions of the image. Both filtering approaches are effective at reducing the contrast and getting rid of the aliasing. ¶ [0093], In addition, filtering may be selectively applied so that a greater degree of filtering is applied to the peripheral region 483, a lesser degree of filtering in the transition region 482 and little to no filtering in the regions of interest 480. Note: the filtering and getting rid of the aliasing does reduce the noise of the image, and it is selectively applied to the area outside the gaze region (i.e. peripheral region)).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference to apply a second image restoration technique solely to the peripheral region of the image data, wherein the second image restoration technique comprises a denoising technique, based on the method of Stafford’s reference. The suggestion/motivation would have been to reduce the excitation of the peripheral region to prevent distraction (See Stafford, ¶ [0022–0023]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Won and Stafford with Kim to obtain the invention as specified in claim 1.
Regarding claim 11, claim 11 is rejected on the same grounds as claim 1; the arguments presented above for claim 1 are equally applicable to claim 11, and the limitations similar to those of claim 1 are not repeated herein but are incorporated by reference.
Claim(s) 2–6 and 12–16 are rejected under 35 U.S.C. 103 as being unpatentable over Kim (US 2022/0318606 A1, hereafter, "Kim") in view of Won (KR 20200079162 A, hereafter, “Won”) further in view of Stafford et al. (US 2019/0384381 A1, hereafter, “Stafford”) and further in view of Xu et al. (See attached NPL, “Joint Demosaicing and Super-Resolution (JDSR): Network Design and Perceptual Optimization”, hereafter, “Xu”).
Regarding claim 2, Kim in view of Won and further in view of Stafford teaches the imaging system of claim 1, [wherein the at least one neural network comprises a single neural network comprising a plurality of sub-networks at different levels], wherein an entirety of the single neural network is utilized to perform demosaicking on the one of the gaze region and the peripheral region of the image data (See Kim, ¶ [0473], For example, the first layer of the ANN model may be for applying the demosaicing method for the video, and the second layer may be for applying the deblur method. Note: Demosaicing the image will demosaic the peripheral and gaze region) and to apply the at least one image restoration technique to the one of the gaze region and the peripheral region of the image data, to obtain a region of an output image that corresponds to the one of the gaze region and the peripheral region of the image data (See Kim, ¶ [0420], Thereafter, the XR device 1000 may determine the ROI based on at least one of the detected motion and gaze (S305). Then, the XR device 1000 may perform video improvement processing for the ROI (S307). Finally, the XR device 1000 may output the video subjected to the video improvement processing on the display (S309). ¶ [0417], The video improvement may include, as described below with reference to FIG. 18, a decompressing/decoding process (S401), a video preprocessing process (S403), and a super resolution process (S405)), and wherein a sub-network from amongst the plurality of sub-networks is utilized to perform demosaicking on another of the gaze region and the peripheral region of the image data, to obtain another region of the output image that corresponds to the another of the gaze region and the peripheral region of the image data (See Kim, ¶ [0473], For example, the first layer of the ANN model may be for applying the demosaicing method for the video, and the second layer may be for applying the deblur method. ¶ [0284], If the NPU 100 infers video data in real time, image data of the next frame may be input to the x1 and x2 input nodes of the input layer 110-11. ¶ [0223], FIG. 5 illustrates an exemplary ANN model. [FIG. 5], 110-11. Note: Examiner is interpreting that demosaicing is being applied to the current frame and then the next frame since it’s applied to a video).
However, Kim, Won and Stafford fail to teach wherein the at least one neural network comprises a single neural network comprising a plurality of sub-networks at different levels.
Xu, working in the same field of endeavor, teaches: wherein the at least one neural network comprises a single neural network comprising a plurality of sub-networks at different levels (See Xu, [A. Pre-Demosaicing Network], we have designed a pre-demosaicing network (PDNet) for initially demosaicing the Bayer pattern as a pre-processing step to reduce the gap between LR CFA data and HR color image. As shown in Fig. 2 before the RDSEN module, we have adopted a model-based demosaicing method called iterative-residual interpolation (IRI) [32] to generate an intermediate demosaicing result, which will be used as the input to the refinement module. This intermediate demosaicing results will be refined by PDNet as shown in Fig. 3 (conceptually similar to ResNet [21]). [A. Implementation Details], In our proposed RDSEN networks, we set the number of RDSEB blocks as 16; and each block includes 6 residual-dense SE modules. Most kernel size of Conv layers is 3×3 with 64 filters (C=64) except those described in particular: the Conv layers in CA modules and Conv layers marked as ‘1×1’ with a 1×1 kernel size. The reduction ratio is r=16. The upscale module we have used is the same as [49]. The last layer filter is set to 3 in order to output super-resolved color images. [Fig. 2], PDNet, RDSEN. Note: The PDNet is a module/subnetwork used to demosaic the image and the RDSEN is a module/subnetwork used to perform super-resolution processing on the image).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference wherein the at least one neural network comprises a single neural network comprising a plurality of sub-networks at different levels, based on the method of Xu’s reference. The suggestion/motivation would have been to produce high-quality images from real-world Bayer pattern data (See Xu, [Abstract], [Table I, II and III]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Xu with Kim, Won and Stafford to obtain the invention as specified in claim 2.
Regarding claim 3, Kim teaches the imaging system of claim 2, wherein the sub-network is also utilized to apply the at least one image restoration technique to the another of the gaze region and the peripheral region of the image data (See Kim, ¶ [0452], Further, the ANN adapted to extract ROI computed by the NPU may detect a point (1, m) on the display which meets a gaze direction 730 of a left eye 710 at a position 712 of the pupil of the left eye 710 and a gaze direction 740 of a right eye 720 at a position 722 of the pupil of the right eye 720 to determine ROIs 760 and 770 of the left eye 710 and the right eye 720, respectively. ¶ [0453], The ANN adapted to improve video quality computed by the NPU determines a ranking of each ROI and may perform a video improvement processing (e.g., super resolution computation, compression decoding computation, preprocessing computation, etc.) for each ROI stepwise based on the determined ranking. ¶ [0448], According to an embodiment, the ANN adapted to improve video quality may perform the video improvement on a part or all of the video, if necessary, like performing the video improvement processing on the entire video, without performing the video improvement processing only on the target video corresponding to the ROI. ¶ [0476], The super resolution process may also be performed for the entire video. Note: Super resolution is applied to the gaze region of the left and right eye ROI along with other ROIs and frames).
Regarding claim 4, Kim in view of Won further in view of Stafford and further in view of Xu teaches the imaging system of claim 2, [wherein another sub-network from amongst the plurality of sub-networks is utilized to perform demosaicking on an intermediate region of the image data and to apply the at least one image restoration technique to the intermediate region of the image data], to obtain an intermediate region of the output image that corresponds to the intermediate region of the image data, wherein the intermediate region lies between the gaze region and the peripheral region of the image data (See Kim, ¶ [0454], As illustrated in FIG. 16B, based on the determined ranking, the ANN adapted to improve video quality may render the resolution of the region 780 where the ROIs of the left eye and the right eye overlap with each other at the highest quality (e.g., 8K). In addition, the ROIs 760 and 770 of the left eye and the right eye may be rendered at high quality (e.g., 4K) lower than the resolution of the region 780 where the ROIs overlap with each other. In addition, the resolution of the ROI 752 based on the head direction and the head slope may be rendered at high quality (e.g., 4K) much lower than the resolution of the ROIs 760 and 770 of the left eye and the right eye), [further wherein the another sub-network is at a higher level than the sub-network].
However, Kim, Won and Stafford fail to teach wherein another sub-network from amongst the plurality of sub-networks is utilized to perform demosaicking on an intermediate region of the image data and to apply the at least one image restoration technique to the intermediate region of the image data.
Xu, working in the same field of endeavor, teaches: wherein another sub-network from amongst the plurality of sub-networks is utilized to perform demosaicking on an intermediate region of the image data and to apply the at least one image restoration technique to the intermediate region of the image data, further wherein the another sub-network is at a higher level than the sub-network (See Xu, [A. Pre-Demosaicing Network], we have designed a pre-demosaicing network (PDNet) for initially demosaicing the Bayer pattern as a pre-processing step to reduce the gap between LR CFA data and HR color image. As shown in Fig. 2 before the RDSEN module, we have adopted a model-based demosaicing method called iterative-residual interpolation (IRI) [32] to generate an intermediate demosaicing result, which will be used as the input to the refinement module. This intermediate demosaicing results will be refined by PDNet as shown in Fig. 3 (conceptually similar to ResNet [21]). [A. Implementation Details], In our proposed RDSEN networks, we set the number of RDSEB blocks as 16; and each block includes 6 residual-dense SE modules. Most kernel size of Conv layers is 3×3 with 64 filters (C=64) except those described in particular: the Conv layers in CA modules and Conv layers marked as ‘1×1’ with a 1×1 kernel size. The reduction ratio is r=16. The upscale module we have used is the same as [49]. The last layer filter is set to 3 in order to output super-resolved color images. [Fig. 2], PDNet, RDSEN. Note: The PDNet is a module/subnetwork used to demosaic the image and the RDSEN is a module/subnetwork used to perform super-resolution processing on the image).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference wherein another sub-network from amongst the plurality of sub-networks is utilized to perform demosaicking on an intermediate region of the image data and to apply the at least one image restoration technique to the intermediate region of the image data, further wherein the another sub-network is at a higher level than the sub-network, based on the method of Xu’s reference. The suggestion/motivation would have been to produce high-quality images from real-world Bayer pattern data (See Xu, [Abstract], [Table I, II and III]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Xu with Kim, Won and Stafford to obtain the invention as specified in claim 4.
Regarding claim 5, Kim in view of Won and further in view of Stafford teaches the imaging system of claim 1, [wherein the at least one neural network comprises a first neural network and a second neural network, wherein the first neural network is utilized to perform demosaicking on the entirety of the image data, to obtain a first intermediate image as an output], and wherein an input of the second neural network comprises the first intermediate image, further wherein the second neural network is utilized to apply the at least one image restoration technique to a region of the first intermediate image that corresponds to the one of the gaze region and the peripheral region of the image data, to obtain an output image (See Kim, ¶ [0477], Alternatively, the super resolution process may be performed only for the video in the ROI. Specifically, the resolution of the ROI in which the user's gaze is positioned may be rendered at high quality (e.g., 4K or 8K) and may be rendered at normal quality (e.g., full HD) when out of the user's gaze. That is, if the preprocessing process (S403) is performed on the video corresponding to the ROI to improve the video quality, the super resolution process (S405) may be performed on the video corresponding to the ROI in which the preprocessing process has been performed. According to an embodiment, the ANN adapted to improve video quality may perform the video improvement on a part or all of the video, if necessary, like performing the video improvement processing on the entire video, without performing the video improvement processing only on the target video corresponding to the ROI. Note: Examiner is interpreting super-resolution as an image restoration technique).
However, Kim, Won and Stafford fail to teach wherein the at least one neural network comprises a first neural network and a second neural network, wherein the first neural network is utilized to perform demosaicking on the entirety of the image data, to obtain a first intermediate image as an output.
Xu, working in the same field of endeavor, teaches: wherein the at least one neural network comprises a first neural network and a second neural network, wherein the first neural network is utilized to perform demosaicking on the entirety of the image data, to obtain a first intermediate image as an output (See Xu, [A. Pre-Demosaicing Network], we have designed a pre-demosaicing network (PDNet) for initially demosaicing the Bayer pattern as a pre-processing step to reduce the gap between LR CFA data and HR color image. As shown in Fig. 2 before the RDSEN module, we have adopted a model-based demosaicing method called iterative-residual interpolation (IRI) [32] to generate an intermediate demosaicing result, which will be used as the input to the refinement module. This intermediate demosaicing results will be refined by PDNet as shown in Fig. 3 (conceptually similar to ResNet [21]). [A. Implementation Details], In our proposed RDSEN networks, we set the number of RDSEB blocks as 16; and each block includes 6 residual-dense SE modules. Most kernel size of Conv layers is 3×3 with 64 filters (C=64) except those described in particular: the Conv layers in CA modules and Conv layers marked as ‘1×1’ with a 1×1 kernel size. The reduction ratio is r=16. The upscale module we have used is the same as [49]. The last layer filter is set to 3 in order to output super-resolved color images. [Fig. 2], PDNet, RDSEN. Note: The PDNet is a module/subnetwork/neural network used to demosaic the image and the RDSEN is a module/subnetwork/neural network used to perform super-resolution processing on the image).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference wherein the at least one neural network comprises a first neural network and a second neural network, wherein the first neural network is utilized to perform demosaicking on the entirety of the image data, to obtain a first intermediate image as an output, based on the method of Xu’s reference. The suggestion/motivation would have been to produce high-quality images from real-world Bayer pattern data (See Xu, [Abstract], [Table I, II and III]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Xu with Kim, Won and Stafford to obtain the invention as specified in claim 5.
Regarding claim 6, Kim teaches the imaging system of claim 5, wherein the first neural network is also utilized to perform the at least one image restoration technique to the entirety of the image data at a coarse level (See Kim, ¶ [0473], For example, the first layer of the ANN model may be for applying the demosaicing method for the video, and the second layer may be for applying the deblur method. Note: Examiner is interpreting deblur as an image restoration technique).
Regarding claim 12, claim 12 is rejected on the same grounds as claim 2; the arguments presented above for claim 2 are equally applicable to claim 12, and the limitations similar to those of claim 2 are not repeated herein but are incorporated by reference.
Regarding claim 13, claim 13 is rejected on the same grounds as claim 3; the arguments presented above for claim 3 are equally applicable to claim 13, and the limitations similar to those of claim 3 are not repeated herein but are incorporated by reference.
Regarding claim 14, claim 14 is rejected on the same grounds as claim 4; the arguments presented above for claim 4 are equally applicable to claim 14, and the limitations similar to those of claim 4 are not repeated herein but are incorporated by reference.
Regarding claim 15, claim 15 is rejected on the same grounds as claim 5; the arguments presented above for claim 5 are equally applicable to claim 15, and the limitations similar to those of claim 5 are not repeated herein but are incorporated by reference.
Regarding claim 16, claim 16 is rejected on the same grounds as claim 6; the arguments presented above for claim 6 are equally applicable to claim 16, and the limitations similar to those of claim 6 are not repeated herein but are incorporated by reference.
Claim(s) 7–9 and 17–19 are rejected under 35 U.S.C. 103 as being unpatentable over Kim (US 2022/0318606 A1, hereafter, "Kim") in view of Won (KR 20200079162 A, hereafter, “Won”) further in view of Stafford et al. (US 2019/0384381 A1, hereafter, “Stafford”) and further in view of Xu et al. (See attached NPL, “Joint Demosaicing and Super-Resolution (JDSR): Network Design and Perceptual Optimization”, hereafter, “Xu”) further in view of Chen et al. (US 2019/0026864 A1, hereafter, "Chen").
Regarding claim 7, Kim in view of Won further in view of Stafford and further in view of Xu teaches the imaging system of claim 1, [wherein the at least one neural network comprises a third neural network and a fourth neural network that are to be utilized in parallel, wherein the third neural network is utilized to perform demosaicking on the entirety of the image data to obtain a third intermediate image], and wherein the fourth neural network is utilized to apply the at least one image restoration technique to the one of the gaze region and the peripheral region of the image data, to obtain a fourth intermediate image (See Kim, ¶ [0420], Thereafter, the XR device 1000 may determine the ROI based on at least one of the detected motion and gaze (S305). Then, the XR device 1000 may perform video improvement processing for the ROI (S307). Finally, the XR device 1000 may output the video subjected to the video improvement processing on the display (S309). ¶ [0417], The video improvement may include, as described below with reference to FIG. 18, a decompressing/decoding process (S401), a video preprocessing process (S403), and a super resolution process (S405). According to an embodiment, the ANN adapted to improve video quality may perform the video improvement on a part or all of the video, if necessary, like performing the video improvement processing on the entire video, without performing the video improvement processing only on the target video corresponding to the ROI), [further wherein the third intermediate image is combined with the fourth intermediate image to generate an output image].
However, Kim, Won and Stafford fail to teach wherein the at least one neural network comprises a third neural network and a fourth neural network that are to be utilized in parallel, wherein the third neural network is utilized to perform demosaicking on the entirety of the image data to obtain a third intermediate image.
Xu, working in the same field of endeavor, teaches: wherein the at least one neural network comprises a third neural network and a fourth neural network that are to be utilized in parallel, wherein the third neural network is utilized to perform demosaicking on the entirety of the image data to obtain a third intermediate image (See Xu, [A. Pre-Demosaicing Network], we have designed a pre-demosaicing network (PDNet) for initially demosaicing the Bayer pattern as a pre-processing step to reduce the gap between LR CFA data and HR color image. As shown in Fig. 2 before the RDSEN module, we have adopted a model-based demosaicing method called iterative-residual interpolation (IRI) [32] to generate an intermediate demosaicing result, which will be used as the input to the refinement module. This intermediate demosaicing results will be refined by PDNet as shown in Fig. 3 (conceptually similar to ResNet [21]). [A. Implementation Details], In our proposed RDSEN networks, we set the number of RDSEB blocks as 16; and each block includes 6 residual-dense SE modules. Most kernel size of Conv layers is 3×3 with 64 filters (C=64) except those described in particular: the Conv layers in CA modules and Conv layers marked as ‘1×1’ with a 1×1 kernel size. The reduction ratio is r=16. The upscale module we have used is the same as [49]. The last layer filter is set to 3 in order to output super-resolved color images. [Fig. 2], PDNet, RDSEN. Note: The PDNet is a module/subnetwork/neural network used to demosaic the image and the RDSEN is a module/subnetwork/neural network used to perform super-resolution processing on the image).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference such that the at least one neural network comprises a third neural network and a fourth neural network that are to be utilized in parallel, wherein the third neural network is utilized to perform demosaicking on the entirety of the image data to obtain a third intermediate image, based on the method of Xu’s reference. The suggestion/motivation would have been to produce high-quality images from real-world Bayer pattern data (See Xu, [Abstract], [Table I, II and III]).
However, Kim, Won, Stafford and Xu fail(s) to teach further wherein the third intermediate image is combined with the fourth intermediate image to generate an output image.
Chen, working in the same field of endeavor, teaches: further wherein the third intermediate image is combined with the fourth intermediate image to generate an output image (See Chen, ¶ [0019], For example, the render pipeline may perform most of its render operations on medium quality images. Advantageously, working with medium quality images may reduce the computational intensity, memory requirements, and/or local/network transmission bandwidth for the graphics subsystem 13. For example, the region of interest portion of the medium quality images may be identified based on the focus information and provided to a super-resolution network to generate super-resolution enhanced images. The medium quality images may then be up-sampled and combined with the super-resolution enhanced images to provide foveated images).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference such that the third intermediate image is combined with the fourth intermediate image to generate an output image, based on the method of Chen’s reference. The suggestion/motivation would have been to improve content quality and save network bandwidth and/or rendering cost (See Chen, ¶ [0049]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Chen and Xu with Kim, Won and Stafford to obtain the invention as specified in claim 7.
Regarding claim 8, Kim teaches the imaging system of claim 7, wherein the third neural network is also utilized to perform the at least one image restoration technique to the entirety of the image data at a coarse level (See Kim, ¶ [0473], For example, the first layer of the ANN model may be for applying the demosaicing method for the video, and the second layer may be for applying the deblur method. Note: Examiner is interpreting deblur as an image restoration technique).
Regarding claim 9, Kim teaches the imaging system of claim 7, wherein the at least one neural network further comprises at least one other neural network that is to be utilized in parallel with the third neural network and the fourth neural network, wherein the at least one other neural network is utilized to perform at least one other image restoration technique to the another of the gaze region and the peripheral region of the image data, to obtain at least one other intermediate image, further wherein the at least one other intermediate image is also combined with the third intermediate image and the fourth intermediate image to generate the output image (See Kim, ¶ [0420], Thereafter, the XR device 1000 may determine the ROI based on at least one of the detected motion and gaze (S305). Then, the XR device 1000 may perform video improvement processing for the ROI (S307). Finally, the XR device 1000 may output the video subjected to the video improvement processing on the display (S309). ¶ [0474], when the video is input to the first layer of the learned ANN model, the video output from the first layer may be a video applied with the demosaicing method and the deblur method. ¶ [0475], The super resolution process (S405) may be performed to increase the resolution of the video. Note: The input is preprocessed using demosaicing, then deblurring and then subjected to SR. Deblurring and SR are being interpreted as image restoration techniques).
Regarding claim 17, claim 17 is rejected on the same grounds as claim 7, and the arguments presented above for claim 7 are equally applicable to claim 17; the limitations of claim 17 similar to those of claim 7 are not repeated herein, but are incorporated by reference.
Regarding claim 18, claim 18 is rejected on the same grounds as claim 8, and the arguments presented above for claim 8 are equally applicable to claim 18; the limitations of claim 18 similar to those of claim 8 are not repeated herein, but are incorporated by reference.
Regarding claim 19, claim 19 is rejected on the same grounds as claim 9, and the arguments presented above for claim 9 are equally applicable to claim 19; the limitations of claim 19 similar to those of claim 9 are not repeated herein, but are incorporated by reference.
Claim(s) 10 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Kim (US 2022/0318606 A1, hereafter, “Kim”) in view of Won (KR 20200079162 A, hereafter, “Won”) further in view of Stafford et al. (US 20190384381 A1, hereafter, “Stafford”) further in view of Xu et al. (See attached NPL, “Joint Demosaicing and Super-Resolution (JDSR): Network Design and Perceptual Optimization”, hereafter, “Xu”) further in view of Chen et al. (US 2019/0026864 A1, hereafter, “Chen”) and further in view of Lee (US 2019/0331914 A1, hereafter, "Lee").
Regarding claim 10, Kim in view of Won further in view of Stafford further in view of Xu and further in view of Chen teaches the imaging system of claim 7, [wherein pixels of the third intermediate image are combined with corresponding pixels of the fourth intermediate image by using at least one of: a maximum pixel value, a minimum pixel value, a simple block replacement, a max-min pixel value, a guided filtering, an average pixel value, a weighted average pixel value, a median pixel value].
However, Kim, Won, Stafford, Xu and Chen fail(s) to teach wherein pixels of the third intermediate image are combined with corresponding pixels of the fourth intermediate image by using at least one of: a maximum pixel value, a minimum pixel value, a simple block replacement, a max-min pixel value, a guided filtering, an average pixel value, a weighted average pixel value, a median pixel value.
Lee, working in the same field of endeavor, teaches: wherein pixels of the third intermediate image are combined with corresponding pixels of the fourth intermediate image by using at least one of: a maximum pixel value, a minimum pixel value, a simple block replacement, a max-min pixel value, a guided filtering, an average pixel value, a weighted average pixel value, a median pixel value (See Lee, ¶ [0166], Then, image1 can be combined with image2 to generate a panoramic image of an eleven-meter wide by four-meter high area by either (i) aligning images image1 and image2 and then combining the aligned images using an average or median of the pixel data from each image).
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Kim’s reference such that pixels of the third intermediate image are combined with corresponding pixels of the fourth intermediate image by using at least one of: a maximum pixel value, a minimum pixel value, a simple block replacement, a max-min pixel value, a guided filtering, an average pixel value, a weighted average pixel value, a median pixel value, based on the method of Lee’s reference. The suggestion/motivation would have been to enhance the region of interest quality (See Lee, ¶ [0163–0167]).
Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to combine Lee with Kim, Won, Stafford, Xu and Chen to obtain the invention as specified in claim 10.
Regarding claim 20, claim 20 is rejected on the same grounds as claim 10, and the arguments presented above for claim 10 are equally applicable to claim 20; the limitations of claim 20 similar to those of claim 10 are not repeated herein, but are incorporated by reference.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Byers (US 20120274734 A1) teaches an apparatus that, in one example, includes first and second cameras configured to capture image data associated with an end user involved in a video session. The apparatus can further include a display configured to interface with the cameras, and a shaft coupled to a rotor. The cameras are secured to the shaft, and the shaft receives a rotational force such that, during rotation of the shaft, the cameras pass over the display in order to capture particular image data associated with the end user's face in such a way as to improve eye gaze alignment.
Watanabe (US 20170024866 A1) teaches an image processing apparatus (600) that includes a determiner (607), which determines correction data for each of a plurality of positions in an image based on a point spread function (PSF) relating to each of the plurality of positions, and an image restorer (608), which repeats predetermined image processing by using the correction data to perform image restoration processing. The image restorer is configured to repeat the predetermined image processing N times (N is a positive integer) to perform the image restoration processing. The predetermined image processing includes processing of generating an (n+1)-th intermediate image based on an n-th image (1<n≦N), and processing of generating an (n+1)-th image based on the (n+1)-th intermediate image, the n-th image, and the correction data, where the correction data are coefficient data for a difference between the (n+1)-th intermediate image and the n-th image.
Feng et al. (US 20240144717 A1) teaches a method of processing image data that includes determining a first region of interest (ROI) in an image. The first ROI is associated with a first object. The method can include determining one or more image characteristics of the first ROI. The method can further include determining whether to perform an upsampling process on image data in the first ROI based on the one or more image characteristics of the first ROI.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DION J SATCHER whose telephone number is (703)756-5849. The examiner can normally be reached Monday - Thursday 5:30 am - 2:30 pm, Friday 5:30 am - 9:30 am PST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Henok Shiferaw can be reached at (571) 272-4637. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DION J SATCHER/Patent Examiner, Art Unit 2676
/Henok Shiferaw/Supervisory Patent Examiner, Art Unit 2676