DETAILED ACTION
Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Notice to Applicants
2. This communication is in response to the application filed on 09/20/2023.
3. Claims 1-20 are pending.
4. Limitations appearing inside {} indicate the limitations not taught by the cited prior art reference(s)/combination(s).
Information Disclosure Statement
5. The information disclosure statement (IDS) submitted on 09/20/2023 has been considered by the examiner.
Claim Rejections - 35 USC § 112
6. The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
7. Claim 8 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention. Specifically, claim 8 recites “…for first and second images of an object having an infinite depth…”. As currently recited, the limitation indicates that the object itself has an infinite depth, which is physically impossible. It should therefore be clarified that the infinite depth is an assumption (i.e., a mathematical or model assumption) about the object rather than a physical property of the object being imaged, analogous to the limitation recited in claim 18, which is claimed as “…under the assumption…”.
Claim Rejections - 35 USC § 103
8. The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
9. Claims 1-2 and 11-14 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Publication No. 2022/0028039 to Lee et al. (hereinafter Lee) and further in view of U.S. Publication No. 2022/013848 to El-Khamy et al. (hereinafter El-Khamy).
10. Regarding Claim 1, Lee discloses a processor-implemented method ([par. 0022, ln. 1-4] “A non-transitory computer-readable storage medium may store instructions that are executable by a processor to perform the image restoration method…”), comprising:
generating a first optical flow between a first partial image and a second partial image ([par. 0082, ln. 1-19] “… estimates disparity information for each pixel with respect to each of the viewpoints by performing convolution filtering one or more times on the pooling data that is pooled from the input data 510… the image restoration device performs convolution filtering twice at 521 and 522 on the pooling data downsampled from the input data 510… FIG. 5, a result of the convolution filtering is feature data of an H/4×W/4×128 dimension… calculates the disparity information for each pixel with a same resolution as that of the input data 510 while upsampling 523 the result of the convolution filtering… In… FIG. 5, V=9, and thus the dimension of the disparity information is H×W×18. In this example, a total of two disparity values including a disparity on a horizontal axis (e.g., an x axis) and a disparity on a vertical axis (e.g., a y axis) for each of H×W pixels are calculated for each of V viewpoints.”, [par. 0086, ln. 1-11] “…an image restoration device may generate transformed image information by transforming input image information for each of viewpoints into a pixel coordinate system of target image information corresponding to a target viewpoint, using a global transformation parameter… may correct a disparity of the transformed image information with respect to the target image information, using disparity information… global transformation and a disparity correction will be described with reference to FIG. 6.”, [par. 0087, ln. 1-16] “… perform backward warping to warp backward all pixels of the input image information to a pixel coordinate system corresponding to a target image, using a single depth corresponding to a reference disparity… may calculate a coordinate in the input image information corresponding to a position that is based on the pixel coordinate system of the target image information, using a rotation parameter, a translation parameter, and a scale parameter that are included in the global transformation parameter… may determine a pixel value of the coordinate calculated in the input image information to be a pixel value at the position in the transformed image information…”, [par. 0088, ln. 1-9] “A disparity may indicate a difference between two images in terms of a position of a same target point, for example, a difference in pixel coordinate… a disparity with the target image may be set to be the reference disparity for each input image. The reference disparity may be set as an arbitrary value. Based on the reference disparity, a virtual distance (e.g., a depth value) from an image sensor to a target point may be determined.”, [par. 0089, ln. 1-14] “… in FIG. 6, image information of an ith viewpoint image among N viewpoint images may be warped to a pixel coordinate system corresponding to image information of a target viewpoint image, and thereby warped image information may be generated… may generate a warped feature map from the warped image information by warping a feature map extracted from the ith viewpoint image. In this case, the image information of the ith viewpoint image may indicate an ith feature map extracted from the ith viewpoint image by an ith sensing unit
C_i 612, and the image information of the target viewpoint image may indicate a target feature map extracted from the target viewpoint image by a target sensing unit C_T 611.”, [par. 0091, ln. 1-17] “…a world coordinate of a target point separated from an image sensor may be (X, Y, Z)… a pixel coordinate corresponding to the target point that is sensed by the ith sensing unit C_i 612 among the N sensing units may be (u′, v′), and a pixel coordinate corresponding to the target point that is sensed by the target sensing unit C_T 611 may be (u, v). However, an accurate distance to the target point may not be readily determined only with a pixel value sensed by each sensing unit… the image restoration device may assume that the input image information already has the reference disparity with respect to the target image information, and may warp the input image information to the pixel coordinate system of the target image information using a distance value corresponding to the assumed disparity. Here, backward warping may be performed.”), which are captured by respective first and second cameras using an optical flow estimation model ([Fig. 1, see 110-122 and multi-image/viewpoints for each lens/sensing units 111], [Fig. 5, see warping model], [Fig. 6, see views 611 and 612], [par. 0061, ln. 1-27] “…the imaging device 110 may capture a plurality of viewpoint images through lenses arranged at different positions… 110 may generate a viewpoint image from sensed information obtained for each sensing region corresponding to each lens. That is, each of the sensing units may obtain a viewpoint image… sensing units may obtain different LF information, and thus viewpoint images captured by the sensing units may include slightly different scene… 110 may include N lenses corresponding to N sensing units. The N sensing units may individually capture viewpoint images, and thus… 110 may obtain N viewpoint images. Here, N denotes an integer greater than or equal to 2… in FIG. 1, an MLA may include 25 (N=5×5=25) lenses, and an image sensor may capture 25 low-resolution viewpoint images 120… a multi-lens image may include 36 (N=6×6=36) viewpoint images… a sensing unit may be an independent image sensing module… a camera sensor… each sensing unit may be arranged at a position different from that of another sensing unit…”, [par. 0079, ln. 1-5] “…FIG. 5, a warping model includes a global transformation parameter estimator 530, a disparity estimator 520, and a warping operation 540. The warping model may include a plurality of convolution operations and a plurality of pooling operations.”);
generating disparity information between the first partial image and a third partial image, captured by a third camera ([par. 0082, ln. 1-19], [par. 0086, ln. 1-11], [par. 0087, ln. 1-16], [par. 0088, ln. 1-9], [par. 0089, ln. 1-14], [par. 0091, ln. 1-17]), {based on depth information of the first partial image generated using the first optical flow}; and
estimating a second optical flow between the first partial image and the third partial image based on the generated disparity information for generating a registration image ([par. 0082, ln. 1-19], [par. 0086, ln. 1-11], [par. 0087, ln. 1-16], [par. 0088, ln. 1-9], [par. 0089, ln. 1-14], [par. 0091, ln. 1-17], [par. 0069, ln. 1-23] “…operation 250, the image restoration device generates an output image 390 for the viewpoints using an image restoration model 340 from the generated warped image information 330… may generate image information that is realigned according to a single viewpoint by performing a pixel shuffle on the generated warped image information 330. The pixel shuffle may refer to an operation of realigning or rearranging pixels indicating same and/or similar points in feature maps of the viewpoints and/or viewpoint images of the viewpoints to be near to one another. By the pixel shuffle, the image information that is realigned according to the single viewpoint may be generated. Then, the image restoration device may generate the output image 390 having a target resolution by applying the image restoration model 340 to the realigned image information. The output image 390 may be, for example, an image obtained through image registration of pixels of the viewpoint images based on the target image, and the target resolution of the output image 390 may be greater than or equal to a resolution of each of the individual viewpoint images. The output image 390 may be an image 391 obtained as the viewpoint images are integrated according to the single viewpoint.”).
Specifically, one of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize that Lee discloses an optical flow between each of a multitude of partial images and a target image. Specifically, the broadest reasonable interpretation (BRI) of the term “optical flow” would encompass any determination as to the displacement of a pixel between two images, and thus the disparity determination and information disclosed in Lee between a target and multiple images would disclose an optical flow determination between partial images. The examiner specifically notes that while Lee does not specifically describe the “third camera” or “third partial image”, one of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize that disparity information is generated for each viewpoint to a target image and/or viewpoint ([par. 0082, ln. 1-19], [par. 0086, ln. 1-11], [par. 0087, ln. 1-16], [par. 0088, ln. 1-9], [par. 0089, ln. 1-14], [par. 0091, ln. 1-17]), and since there can be more than three viewpoints ([Fig. 1]), it would necessarily involve determining a disparity between a first viewpoint acquired by a first camera ([Fig. 1, target image 121]) and a third viewpoint acquired by a third camera, and an optical flow for warping. However, Lee does not specifically disclose wherein the disparity information between the first and third camera is based on depth information of the first partial image generated using the first optical flow. Specifically, while Lee discloses using a “reference disparity” to determine optical flow ([par. 0087, ln. 1-16], [par. 0088, ln. 1-9], [par. 0091, ln. 1-17]), and specifically determining a depth value to a target point based on said reference disparity and using said single depth value ([par. 0087, ln. 1-16], [par. 0088, ln. 1-9]), Lee does not disclose that said depth is determined using the first optical flow between a first and second partial image.
However, El-Khamy teaches generating disparity information between the first partial image and a third partial image based on depth information of the first partial image generated using the first optical flow ([par. 0030, ln. 1-6] “The present system and method provides estimation of the real world depth of elements in a scene captured by two cameras with different fields of view (FOVs). Accurate estimation of depth from two stereo rectified images can be obtained by calculating the disparity (e.g., the horizontal displacement) between pixels in both images.”, [par. 0032, ln. 1-3] “The present system and method may be extended to multiple (e.g., more than two) cameras to determine the disparity from multiple stereo cameras.”, [par. 0036, ln. 1-11] “FIG. 2 is a diagram of a stereo matching system 200, according to an embodiment. Systems with no accurate depth estimation can rely on stereo matching between two rectified images captured from two cameras 201 and 202 with same FOV to determine the disparity d (horizontal shift as the difference between the horizontal distance x1 of point P with respect to camera 201 and the horizontal distance x2 of point P with respect to camera 202) between two corresponding pixels. For each pixel, the disparity can then be converted to a measure of the depth z of the subject P by knowledge of the camera baseline b and focal length f.”, [par. 0072, ln. 1-21] “A straightforward approach is that each reference image is rectified and stereo matched with (N_cameras-1) rectified images, respectively. A very accurate depth estimate can be obtained for the FOV 1604 which is overlapped across cameras, by using a deep learning approach. Because the locations of the cameras with respect to each other are fixed, a disparity between any pair of rectified images should translate to certain values between the remaining pairs, which can be used to get a more accurate result for the overlapping FOV 1604. Parts of the union FOV 1606, will be overlapping between two cameras, but not all the cameras. SM between these camera pairs can be used to get a good estimate of the disparity in this region. Regions in the union FOV 1606, which are only seen by one camera will utilize single image disparity estimation. Alternatively, the union FOV 1606 can utilize all input images, as well disparity estimates for parts in the union FOV 1606 which are overlapping between at least cameras.”, [par. 0073, ln. 1-14] “An alternative example with respect to FIG. 16 is a system in which three cameras are utilized. The first camera may have a tele FOV, the second camera may have a wide FOV, and the third camera may have an ultra-wide FOV. As described above with respect to FIG. 5, the union FOV disparity estimate and the overlapping FOV disparity estimate may be merged for an image from the tele FOV and an image from the wide FOV. This method may be repeated recursively to generate a depth estimate for the ultra-wide FOV by SM between the ultra-wide FOV and the wide FOV, using the previously estimated depth for the wide FOV. In such examples, the ultra-wide FOV may be utilized as the union FOV, and the wide FOV may be utilized as the overlapping FOV.”). One of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize Lee and El-Khamy as within the same field of multi-view stereo matching for image registration, and as analogous to the claimed invention.
Specifically, one of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize that El-Khamy specifically teaches using the previously estimated depth from a first optical flow to determine disparity information of a third partial image ([par. 0072, ln. 1-21], [par. 0073, ln. 1-14]), the first optical flow in this case being the disparity between the tele FOV (second camera/partial image) and the wide FOV (first camera/partial image). The motivation to combine is disclosed in El-Khamy, wherein an image quality of a registered image can be improved via the recursive depth estimation ([par. 0034, ln. 1-15] “…a deep neural network that can perform depth estimation for the union of the FOVs from 2 or more cameras, rather than the overlapped intersection of FOVs only, a method for training the unified architecture on multiple tasks concurrently, and a method for fusion of results from single image depth estimation and stereo depth estimation algorithms/processes. Advantages include depth estimation for the entire FOV spanning all cameras rather than from the overlapped intersection of FOVs only, and generation of aesthetically better images which span the whole wide FOV by applying Bokeh on the entire wide FOV rather than on the intersection FOV, which is the narrower telephoto FOV, in case of dual cameras with fixed preset zoom, as wide 1x zoom, and tele photo 2× fixed zoom.”). One of ordinary skill in the art, before the effective filing date of the claimed invention, would have combined the method of Lee with the recursive depth information of El-Khamy, through known means, with no change to their respective function, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy to obtain the invention as specified in claim 1.
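For clarity of the record, the disparity-to-depth relation quoted from El-Khamy ([par. 0036, ln. 1-11]) and its recursive reuse for a further camera ([par. 0073, ln. 1-14]) can be sketched as follows. This is an illustrative sketch only; the function names, parameter names, and numeric values are the examiner's own and appear in neither reference.

def depth_from_disparity(d_px, baseline_m, focal_px):
    # z = f * b / d: depth of a point from the pixel disparity between
    # two rectified views (per El-Khamy, par. 0036)
    return focal_px * baseline_m / d_px

def disparity_from_depth(z_m, baseline_m, focal_px):
    # d = f * b / z: the inverse relation, used here to predict the
    # first-to-third camera disparity from depth already recovered
    # from the first optical flow (cameras 1 and 2)
    return focal_px * baseline_m / z_m

# Hypothetical numbers: an 8 px disparity between cameras 1 and 2,
# baseline 0.02 m, focal length 1600 px
z = depth_from_disparity(8.0, 0.02, 1600.0)       # z = 4.0 m
d_13 = disparity_from_depth(z, 0.05, 1600.0)      # 20 px for a 0.05 m baseline

Under this reading, the depth recovered from the first optical flow directly predicts the camera 1 to camera 3 disparity, which is the recursion relied upon in the combination above.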
11. Regarding Claim 2, Lee further discloses wherein the generating of the first optical flow comprises: performing local refinement on the generated first optical flow by updating a motion vector of a candidate pixel among pixels in the generated first optical flow ([par. 0096, ln. 1-6, Equation 3] “The image restoration device may transform the 3D camera coordinate of the target image information that is transformed using the disparity as represented by Equation 2 above into a 3D camera coordinate (X_i, Y_i, Z_i) of the ith sensing unit C_i 612, as represented by Equation 3 below. [X_i, Y_i, Z_i]^T = R_i [X_T, Y_T, Z_T]^T + T_i [Equation 3]”, [par. 0097, ln. 1-14] “In Equation 3, R_i denotes rotation information between the target sensing unit C_T 611 and the ith sensing unit C_i 612 in the world coordinate system. T_i denotes parallel translation information between the target sensing unit C_T 611 and the ith sensing unit C_i 612 in the world coordinate system. The rotation information may also be referred to as a rotation parameter, and the translation information may also be referred to as a translation parameter. As represented by Equation 3 above, the image restoration device may calculate the 3D camera coordinate (X_i, Y_i, Z_i) of an ith sensing unit C_i 612 corresponding to each pixel of the target sensing unit C_T 611 by transforming the 3D camera coordinate (X_T, Y_T, Z_T) using a rotation parameter R_i and a translation parameter T_i.”, [par. 0104, ln. 1-11] “Through Equations 1 through 7 above, the image restoration device may transform the pixel coordinate (u, v) of the targeting sensing unit C_T 611 into the pixel coordinate (u′, v′) of the ith sensing unit C_i 612. The image restoration device may determine a pixel value of a pixel coordinate (u, v) of warped image information B_i 613 to be the pixel value of the pixel coordinate (u′, v′) of the ith image information. In other words, the warped image information B_i 613 may associate the pixel value of the pixel coordinate (u′, v′) of the ith image information with the pixel coordinate (u, v) of the target sensing unit C_T 611.”, [par. 0105, ln. 3-8, Equation 8] “… device may additionally perform disparity correction in addition to the global transform (e.g., global warping)… may correct a disparity in each pixel with respect to the x axis and y axis, as represented by Equation 8 below. d_{T→i}([x_T, y_T]^T) = [d_{x,T→i}([x_T, y_T]^T), d_{y,T→i}([x_T, y_T]^T)] [Equation 8]”, [par. 0106, ln. 1-5, Equation 9] “In Equation 8, d_{T→i}([x_T, y_T]^T) denotes a disparity value from a current coordinate (x_T, y_T) to an ith sensing unit. An entire warping operation including Equation 8 above may be represented by Equation 9 below. [x_i, y_i]^T = G([x_T, y_T]^T) + d_{T→i}([x_T, y_T]^T) [Equation 9]”, [par. 0107, ln. 1-2] “In Equation 9, G([x_T, y_T]^T) indicates a series of operations of Equation 1 through 7”). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy to obtain the invention as specified in claim 2.
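For clarity of the record, the global transformation and disparity correction quoted above (Lee, Equations 1 through 9) may be sketched as follows. This is a minimal illustration in the examiner's own notation; the function and parameter names (e.g., fx_i, d_corr) are not symbols used by Lee.

import numpy as np

def global_warp_with_disparity_correction(x_T, y_T, z, R_i, T_i,
                                          fx_i, fy_i, cx_i, cy_i, d_corr):
    # Back-project the normalized target coordinate using the single depth z
    # tied to the reference disparity (Lee, par. 0094)
    P_T = np.array([x_T * z, y_T * z, z])
    # Equation 3: rotate and translate into the ith sensing unit's coordinates
    P_i = R_i @ P_T + T_i
    # Equations 5-7: perspective divide, then apply the ith intrinsics
    u_prime = fx_i * (P_i[0] / P_i[2]) + cx_i
    v_prime = fy_i * (P_i[1] / P_i[2]) + cy_i
    # Equations 8-9: add the per-pixel disparity correction (dx, dy)
    return u_prime + d_corr[0], v_prime + d_corr[1]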
12. Regarding Claim 11, a combination of Lee and El-Khamy teaches the method of claim 1. Lee further discloses rearranging pixels in the second partial image to correspond to the first partial image based on the first optical flow ([Fig. 7], [par. 0069, ln. 1-23]), and rearranging pixels in the third partial image to correspond to the second partial image based on the second optical flow ([Fig. 7], [par. 0069, ln. 1-23], [par. 0115, ln. 1-12] “The training device may generate a temporary output image 704 having a target resolution from the temporary warped feature maps 702. For example, the training device may perform a pixel shuffle 730 on the temporary warped feature maps 702 to obtain the target resolution, and generate the temporary output image 704 using the temporary image restoration model 740 of the U-net structure. The training device may calculate a high-resolution loss l_HR 793 between the restored temporary output image 704 and the ground truth value image, as represented by Equation 12 below. l_HR = ||I_SR - I_HR||_2 [Equation 12]”); and generating a high-resolution image, as the registration image, by performing image registration using the first partial image, an image generated by rearranging the pixels in the second partial image, and an image generated by rearranging the pixels in the third partial image, wherein the high-resolution image has a resolution that is greater than each of the first partial image, the second partial image, and the third partial image ([Fig. 7], [par. 0069, ln. 1-23], [par. 0115, ln. 1-12]). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy to obtain the invention as specified in claim 11.
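For clarity of the record, the rearranging of pixels per an optical flow to which the pixel shuffle of Lee is mapped may be illustrated as follows. This nearest-neighbor backward-warping sketch is the examiner's own and is not asserted to be Lee's implementation.

import numpy as np

def rearrange_by_flow(src, flow):
    # flow[v, u] = (du, dv): displacement of pixel (u, v) of the reference
    # partial image toward the source partial image; pixels of src are
    # pulled back onto the reference grid (backward warping)
    h, w = src.shape[:2]
    out = np.zeros_like(src)
    for v in range(h):
        for u in range(w):
            du, dv = flow[v, u]
            u2, v2 = int(round(u + du)), int(round(v + dv))
            if 0 <= u2 < w and 0 <= v2 < h:
                out[v, u] = src[v2, u2]
    return out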
13. Regarding Claim 12, a combination of Lee and El-Khamy teaches the method of claim 1. Lee further discloses a non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method ([par. 0022, ln. 1-4]). Rejections analogous to claim 1 are further applicable to claim 12. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the non-transitory computer-readable storage medium of Lee with the recursive depth information usage of El-Khamy to obtain the invention as specified in claim 12.
14. Regarding Claim 13, the claim language is analogous to claim 1, with the exception of “An electronic device, comprising: a camera array comprising a plurality of cameras; and a processor configured to…”. Lee discloses an electronic device, comprising: a camera array comprising a plurality of cameras ([Fig. 1, see 110-122 and multi-image/viewpoints for each lens/sensing units 111], [par. 0061, ln. 1-27]); and a processor configured to perform the method ([Fig. 9, see 920], [par. 0126, ln. 1-4] “…an image restoration device 900 includes an image sensor 910, a processor 920, and a memory 930.”). Rejections analogous to claim 1 are further applicable to the remainder of claim 13. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy to obtain the invention as specified in claim 13.
15. Regarding Claim 14, a combination of Lee and El-Khamy teaches the electronic device of claim 13. The claim language is analogous to claim 2. Rejections analogous to claim 2 are further applicable to claim 14 in view of the electronic device of Lee. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy to obtain the invention as specified in claim 14.
16. Claims 3-4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Publication No. 2022/0028039 to Lee, in view of U.S. Publication No. 2022/013848 to El-Khamy, and further in view of U.S. Patent No. 9,916,646 to Bouzaraa et al. (hereinafter Bouzaraa).
17. Regarding Claim 3, a combination of Lee and El-Khamy teaches the method of claim 2. Lee discloses wherein the performing of the local refinement comprises: generating a warped image by performing image warping on the second partial image based on the generated first optical flow ([par. 0096, ln. 1-6, Equation 3], [par. 0097, ln. 1-14], [par. 0104, ln. 1-11], [par. 0105, ln. 3-8, Equation 8], [par. 0106, ln. 1-5, Equation 9], [par. 0107, ln. 1-2]); and selecting the candidate pixel ([par. 0105, ln. 3-8, Equation 8], [par. 0106, ln. 1-5, Equation 9], [par. 0107, ln. 1-2]), {based on a difference in intensity value between corresponding pixels in the first partial image and the warped image.} Lee does not specifically disclose selecting the candidate pixel based on a difference in intensity value between corresponding pixels in the first partial image and the warped image, though one of ordinary skill in the art would recognize that a correction/selection from an initial candidate is performed ([par. 0105, ln. 3-8, Equation 8], [par. 0106, ln. 1-5, Equation 9], [par. 0107, ln. 1-2]). Likewise, El-Khamy does not specifically disclose selecting the candidate pixel based on a difference in intensity value between corresponding pixels in the first partial image and the warped image.
However, Bouzaraa teaches selecting the candidate pixel for local refinement, based on a difference in intensity value between corresponding pixels in the first partial image and the warped image ([col. 3, ln. 60 to col. 4, ln. 4] “…the first error detector comprises a difference unit adapted to compute intensity differences of each pixel of the first warped image to a corresponding pixel of the reference image. Also it comprises a histogram unit adapted to compute a histogram of the determined intensity differences. Moreover it comprises a threshold determination unit adapted to determine an error threshold from the histogram and an error deciding unit adapted to determine geometric warping errors by comparing the intensity differences to the threshold resulting in a binary map of geometric warping error pixels. A very high accuracy of detected errors can thereby be reached.”, [col. 10, ln. 10-32] “The difference unit 30 is provided with the reference image and the first warped image… is adapted to compute intensity differences of each pixel of the first warped image to a corresponding pixel of the reference image. The resulting intensity differences are handed on to the histogram unit 31, which is adapted to compute a histogram of the determined intensity differences. The histogram is handed on to the threshold determination unit 32 which is adapted to determine an error threshold from the histogram. This error threshold is selected so that intensity differences below the threshold are considered photometric errors and intensity differences above the threshold are considered geometric errors. The determined error threshold is handed on to the error deciding unit 33, which decides for each calculated intensity difference, if a photometric error or a geometric error is present… 33 then allows the identification of geometric warping error pixels, for example creating a binary map, where different values are assigned to geometric warping error pixels or photometric errors. In order to limit memory consumption, an indexed table, or a look-up-table can be created, indicating the indexes of the pixel positions where geometric warping errors have been detected.”, [col. 11, ln. 20-41] “The search window setting unit 41 is adapted to set search windows of predetermined or user-defined size around each pixel of the first warped image, for which geometric warping errors were detected. Information regarding these search windows is handed on to the valid pixel detection unit 42, which determines, using the information of the binary map, at which pixels within each search window no geometric warping error was detected resulting in valid pixels for each search window. The information regarding the valid pixels is handed on to the correction value determination unit 43, which calculates correction values for the pixels of the first warped image, at which geometric warping errors were detected by calculating an average of the intensities of at least some of the valid pixels of the search window. The weighted map provided by the weighting unit 40 is used for weighting the valid pixels of the search windows before calculating the correction values. The resulting correction values are handed on to the correction value replacement unit 44, which replaces the pixels of the first warped image, at which geometric warping errors were detected by the according calculated correction values resulting a first error corrected image.”). 
One of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize Lee, El-Khamy, and Bouzaraa as within the same field of multi-view stereo matching for image registration, and as analogous to the claimed invention. Specifically, one of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize the pixel error detection and correction as taught in Bouzaraa to be analogous to a local refinement. The motivation to combine is disclosed in Bouzaraa, wherein warping accuracy is significantly improved ([col. 3, ln. 60 to col. 4, ln. 4]). One of ordinary skill in the art, before the effective filing date of the claimed invention, would have combined the method of Lee with the recursive depth information usage of El-Khamy, and further combined the method of the combination of Lee and El-Khamy with the candidate pixel selection based on intensity value as disclosed in Bouzaraa, through known means, with no change to their respective function, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy and the candidate pixel selection based on intensity value of Bouzaraa to obtain the invention as specified in claim 3.
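For clarity of the record, the candidate-pixel selection of Bouzaraa ([col. 10, ln. 10-32]) may be sketched as follows. The percentile rule below is the examiner's own stand-in for Bouzaraa's histogram-derived threshold, and the function and parameter names are the examiner's own.

import numpy as np

def candidate_pixel_map(ref, warped, pct=90.0):
    # Per-pixel intensity differences between the reference image and the
    # warped image; pixels whose difference meets the threshold are flagged
    # as candidates for local refinement (a binary map of warping errors)
    diff = np.abs(ref.astype(np.float64) - warped.astype(np.float64))
    threshold = np.percentile(diff, pct)  # stand-in for the histogram analysis
    return diff >= threshold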
18. Regarding Claim 4, a combination of Lee, El-Khamy, and Bouzaraa teaches the method of claim 3. Lee teaches wherein the selecting of the candidate pixel comprises: {in response to a difference in intensity value} between a first pixel in the first partial image and a second pixel in the warped image disposed at a position corresponding to the first pixel {being greater than or equal to a threshold intensity}, determining that a third pixel in the generated first optical flow disposed at a position corresponding to the first pixel corresponds to the candidate pixel ([par. 0096, ln. 1-6, Equation 3], [par. 0097, ln. 1-14], [par. 0104, ln. 1-11], [par. 0105, ln. 3-8, Equation 8], [par. 0106, ln. 1-5, Equation 9], [par. 0107, ln. 1-2]). Specifically, one of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize that in Equation 8 and Equation 9 of Lee, there is an adjustment of an initial correspondence between a first pixel in a partial image and a second pixel in the warped image at a corresponding position to the first pixel, to a third pixel disposed at a position corresponding to the first pixel at the corrected disparity position. Therefore, Lee discloses the determining that a third pixel in the generated first optical flow disposed at a position corresponding to the first pixel corresponds to the candidate pixel (i.e., the “correct” pixel to match the first pixel to), but does not specifically disclose that this selecting is made in response to a difference in intensity value being greater than or equal to a threshold intensity. Likewise, El-Khamy fails to disclose selecting a candidate pixel in response to a difference in intensity value.
However, Bouzaraa teaches wherein the candidate pixel is selected in response to a difference in intensity value between a first pixel and a second pixel in the warped image being greater than or equal to a threshold intensity ([col. 3, ln. 60 to col. 4, ln. 4], [col. 10, ln. 10-32], [col. 11, ln. 20-41]). The motivation to combine the candidate pixel selection based on intensity value of Bouzaraa remains analogous to claim 3, wherein it improves the accuracy of the warping. Specifically, in combining the candidate pixel selection based on intensity value of Bouzaraa with the method of the combination of Lee and El-Khamy, one of ordinary skill in the art would recognize that the corrected candidate pixel would be a third pixel selected by an intensity value as disclosed in Bouzaraa. One of ordinary skill in the art, before the effective filing date of the claimed invention, would have combined the method of Lee with the recursive depth information usage of El-Khamy, and further combined the method of the combination of Lee and El-Khamy with the candidate pixel selection based on intensity value as disclosed in Bouzaraa, through known means, with no change to their respective function, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy and the candidate pixel selection based on intensity value of Bouzaraa to obtain the invention as specified in claim 4.
19. Regarding Claim 15, a combination of Lee and El-Khamy teaches the electronic device of claim 14. The claim language is analogous to claim 3. Rejections analogous to claim 3 are further applicable to claim 15 in view of the electronic device of Lee. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy and the candidate pixel selection based on intensity value of Bouzaraa to obtain the invention as specified in claim 15.
20. Claims 6-7, 9-10, 16-17, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Publication No. 2022/0028039 to Lee, in view of U.S. Publication No. 2022/013848 to El-Khamy, and further in view of U.S. Patent No. 9,858,673 to Ciuera et al. (hereinafter Ciuera).
21. Regarding Claim 6, a combination of Lee and El-Khamy teaches the method of claim 1. Lee further discloses wherein the calculating of the depth information comprises: {generating} a first corrected {image} and a second corrected {image} by correcting lens distortions for the first partial image and the second partial image, respectively ([par. 0092, ln. 1-5] “The image restoration device may calculate a normalized coordinate (x_T, y_T) of the target sensing unit C_T 611 by normalizing a pixel coordinate (u_T, v_T) of each pixel of the target image information, as represented by Equation 1 below. x_T = (u_T - c_{x,T}) / f_{x,T}, y_T = (v_T - c_{y,T}) / f_{y,T} [Equation 1]”, [par. 0093, ln. 1-12] “In Equation 1, (c_{x,T}, c_{y,T}) denotes a principal point of the target sensing unit C_T 611 on the x axis and the y axis, respectively. f_{x,T}, f_{y,T} denotes a focal length of the target sensing unit C_T 611 with respect to the x axis and the y axis, respectively. As represented by Equation 1, the image restoration device may normalize each pixel of the target sensing unit C_T 611 by using a principal point of the target sensing unit C_T 611 as an original point and dividing it by a focal length. Here, when principal point and focal length information of each sensing unit are unknown, a center position of an image may be used as a principal point and an arbitrary value may be used as a focal length.”, [par. 0094, ln. 1-6] “…may calculate a 3D camera coordinate (X_T, Y_T, Z_T) of the target sensing unit C_T 611 using a single depth z corresponding to the reference disparity with respect to the normalized coordinate (x_T, y_T)…”, [par. 0101, ln. 1-9] “As represented by Equation 5 above, the image restoration device may divide X_i and Y_i by a depth Z_i in the 3D camera coordinate that is based on the ith sensing unit C_i 612, and thus obtain a normalized coordinate (x_i, y_i) for the ith sensing unit C_i 612. Here, the image restoration device may consider a case in which sensing units have different focal lengths, and may multiply the normalized coordinate (x_i, y_i) by a scale parameter as represented by Equation 6 below. u′ = f_{x,i} · x_i + c_{x,i}, v′ = f_{y,i} · y_i + c_{y,i}”, [par. 0103, ln. 1-5] “In Equation 7, c_{x,i} and c_{y,i} denote principal points of the ith sensing unit C_i 612 with respect to the x axis and the y axis, respectively. In addition, f_{x,i} and f_{y,i} denote focal lengths of the ith sensing unit C_i 612 with respect to the x axis and the y axis, respectively.”). Specifically, one of ordinary skill in the art, before the effective filing date of the claimed invention, would recognize that Lee discloses correcting for lens distortions for each image mathematically during disparity calculation and depth estimation, but does not specifically disclose generating an image based on these corrections. Likewise, El-Khamy does not specifically disclose correcting distortions for the depth estimation.
However, Ciuera specifically teaches generating an image to correct lens distortions ([col. 23, ln. 37-57] “Prior to determining a depth map, the raw image data is normalized (406) to increase the similarity of corresponding pixels in the captured images…. normalization involves utilizing calibration information to correct for variations in the images captured by the cameras including (but not limited to) photometric variations and scene-independent geometric distortions introduced by each camera's lens stack… normalization of the raw image data also involves pre-filtering to reduce the effects of aliasing and noise on the similarity of corresponding pixels in the images, and/or rectification of the image data to simplify the geometry of the parallax search. The filter can be a Gaussian filter or an edge-preserving filter, a fixed-coefficient filter (box) and/or any other appropriate filter… normalization also includes resampling the captured images to increase the similarity of corresponding pixels in the captured images by correcting for geometric lens distortion, for example.”). The motivation to combine the combination of Lee and El-Khamy with the corrected images of Ciuera is disclosed in Ciuera, wherein it provides an improvement in similarity correspondence between pixels of the captured images of the array ([col. 24, ln. 21-25] “A normalization process involving resampling the raw image data to reduce scene-independent geometric differences can reduce errors by correcting linear and/or non-linear lens distortion which might otherwise compromise the ability to match corresponding pixels in each of the captured images.”). One of ordinary skill in the art, before the effective filing date of the claimed invention, would have combined the method of Lee with the recursive depth information usage of El-Khamy, and further combined the combination of Lee and El-Khamy with the corrected image generation of Ciuera, through known means, with no change to their respective function, and the combination would have yielded nothing more than predictable results.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 6.
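For clarity of the record, the distinction drawn above between the mathematical normalization of Lee (Equation 1) and the generation of a distortion-corrected image per Ciuera may be illustrated as follows. The one-parameter radial model is the examiner's own stand-in and is recited in neither reference.

def normalize_pixel(u, v, cx, cy, fx, fy):
    # Lee, Equation 1: shift by the principal point, divide by the focal length
    return (u - cx) / fx, (v - cy) / fy

def undistort_normalized(x, y, k1):
    # First-order radial model (examiner's illustration only): approximate
    # inverse of the forward distortion x_d = x * (1 + k1 * r^2) by dividing
    # by the radius term; resampling an image at these corrected coordinates
    # would yield a "corrected image" in the sense mapped to Ciuera
    r2 = x * x + y * y
    return x / (1.0 + k1 * r2), y / (1.0 + k1 * r2)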
22. Regarding Claim 7, a combination of Lee, El-Khamy, and Ciuera teaches the method of claim 6. Lee further discloses calculating the depth information, comprising: generating, from the first optical flow, a transformed optical flow comprising information about motion vectors of respective pixels in the first corrected image, based on the first corrected image and the second corrected image ([par. 0092, ln. 1-5], [par. 0093, ln. 1-12], [par. 0094, ln. 1-6], [par. 0101, ln. 1-9], [par. 0103, ln. 1-5]); extracting a motion vector of a target pixel from the motion vectors of the transformed optical flow ([par. 0092, ln. 1-5], [par. 0093, ln. 1-12], [par. 0094, ln. 1-6], [par. 0101, ln. 1-9], [par. 0103, ln. 1-5], [par. 0105, ln. 3-8, Equation 8], [par. 0106, ln. 1-5, Equation 9], [par. 0107, ln. 1-2]), and estimating the extracted motion vector as a first disparity vector between the target pixel and a pixel in the second corrected image corresponding to the target pixel ([par. 0105, ln. 3-8, Equation 8], [par. 0106, ln. 1-5, Equation 9], [par. 0107, ln. 1-2]); and calculating a depth value of the target pixel based on the estimated first disparity vector ([par. 0087, ln. 1-16]). Specifically, one of ordinary skill in the art, before the effective filing date of the claimed invention, in combining the method of Lee and El-Khamy with the corrected image generation of Ciuera, would recognize that a transformed optical flow would be generated from the first optical flow using the corrected first and second partial images as disclosed in Ciuera ([col. 23, ln. 37-57]). Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 7.
23. Regarding Claim 9, a combination of Lee, El-Khamy, and Ciuera teaches the method of claim 7. Rejections analogous to claims 1 and 6 are further applicable to claim 9. Specifically, Lee discloses wherein the estimating of the second optical flow comprises: {generating} a third corrected {image} by correcting lens distortion of the third partial image ([par. 0092, ln. 1-5], [par. 0093, ln. 1-12], [par. 0094, ln. 1-6], [par. 0101, ln. 1-9], [par. 0103, ln. 1-5]), and estimating, {from the depth information of the first partial image}, a second disparity vector between the target pixel and a pixel in the third corrected image corresponding to the target pixel ([par. 0082, ln. 1-19], [par. 0086, ln. 1-11], [par. 0087, ln. 1-16], [par. 0088, ln. 1-9], [par. 0089, ln. 1-14], [par. 0091, ln. 1-17], [par. 0069, ln. 1-23]). Specifically, Lee discloses a correction, but does not specifically disclose generating an image as a result of the correction, and Lee likewise does not specifically disclose that the estimating of the second disparity vector is performed from the depth information of the first partial image ([par. 0087, ln. 1-16], [par. 0088, ln. 1-9]).
However, El-Khamy teaches wherein the depth information of the first partial image can be used to determine a disparity for a third or second partial image ([par. 0030, ln. 1-6], [par. 0032, ln. 1-3], [par. 0036, ln. 1-11], [par. 0072, ln. 1-21], [par. 0073, ln. 1-14]). The motivation to combine remains analogous to claim 1. One of ordinary skill in the art, before the effective filing date of the claimed invention, would have combined the method of Lee with the recursive depth information usage of El-Khamy, through known means, with no change to their respective function, and the combination would have yielded nothing more than predictable results. A combination of Lee and El-Khamy does not specifically teach generating an image as a result of the correction to obtain a third corrected image.
However, Ciuera teaches to generate a third corrected image by correcting lens distortions of the third partial image ([col. 23, ln. 37-57]). Specifically, the motivation to combine remains analogous to claim 6, wherein it improves similarity correspondence and thus can result in more accurate depth ([col. 24, ln. 21-25]). One of ordinary skill in the art, before the effective filing date of the claimed invention, would have combined the method of Lee with the recursive depth information usage of El-Khamy, and further combined the combination of Lee and El-Khamy with the corrected image generation of Ciuera, through known means, with no change to their respective function, and the combination would have yielded nothing more than predictable results. Specifically, in combining Ciuera with the method of the combination of Lee and El-Khamy, it would have been obvious to one of ordinary skill in the art to further apply the corrected image generation of Ciuera to the third partial image of the method of Lee and El-Khamy to obtain a corrected third image.
Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 9.
24. Regarding Claim 10, a combination of Lee, El-Khamy, and Ciuera teaches the method of claim 9. Rejections analogous to claims 7 and 9 are further applicable to claim 10. Specifically, Lee discloses wherein the estimating of the second optical flow comprises: based on an applying of lens distortion to each of the first corrected image and the third corrected image, estimating the second optical flow using the estimated second disparity vector ([par. 0082, ln. 1-19], [par. 0086, ln. 1-11], [par. 0087, ln. 1-16], [par. 0088, ln. 1-9], [par. 0089, ln. 1-14], [par. 0091, ln. 1-17], [par. 0069, ln. 1-23]). Specifically, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, that in combining Lee, El-Khamy, and Ciuera to obtain the invention as specified in claim 9, the estimation of the second optical flow would use the second disparity vector computed between the corrected first image and third image. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the method of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 10.
25. Regarding Claim 16, a combination of Lee and El-Khamy teaches the electronic device of claim 13. The claim language is analogous to claim 6. Rejections analogous to claim 6 are further applicable to claim 16 in view of the electronic device of Lee. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 16.
26. Regarding Claim 17, a combination of Lee, El-Khamy, and Ciuera teaches the electronic device of claim 16. The claim language is analogous to claim 7. Rejections analogous to claim 7 are further applicable to claim 17 in view of the electronic device of Lee. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 17.
27. Regarding Claim 19, a combination of Lee, El-Khamy, and Ciuera teaches the electronic device of claim 17. The claim language is analogous to claim 9. Rejections analogous to claim 9 are further applicable to claim 19 in view of the electronic device of Lee. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 19.
28. Regarding Claim 20, a combination of Lee, El-Khamy, and Ciuera teaches the electronic device of claim 19. The claim language is analogous to claim 10. Rejections analogous to claim 10 are further applicable to claim 20 in view of the electronic device of Lee. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to combine the electronic device of Lee with the recursive depth information usage of El-Khamy and the corrected image generation of Ciuera to obtain the invention as specified in claim 20.
Allowable Subject Matter
29. Claims 5, 8, and 18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
30. The following is a statement of reasons for the indication of allowable subject matter:
While Bouzaraa teaches what one of ordinary skill in the art would recognize as a patch search, Bouzaraa fails to disclose calculating an updated motion vector for the third pixel based on the patches. The examiner also highlights that the patch search performed in Bouzaraa is performed between two non-warped images, as opposed to between a warped image and a non-warped image. Therefore, while Bouzaraa teaches selecting patches, Bouzaraa selects said patches before the warping, as opposed to the intensity determination of Bouzaraa, which is performed after having warped the image, and there is no motivation in the references of record to have modified the patch search of Bouzaraa to operate between a warped and a non-warped image. Likewise, in claims 8 and 18, the references of record fail to teach or suggest wherein, for calculating a depth value of the target pixel, the value is inversely proportional to the magnitude of a vector generated by subtracting a preset disparity vector from the estimated first disparity vector.
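Expressed in the examiner's own notation (the symbols below do not appear in the claims or the references of record), one reading of the claims 8 and 18 limitation is:

z(p) \propto \frac{1}{\lVert \hat{\mathbf{d}}_{1}(p) - \mathbf{d}_{\mathrm{preset}} \rVert}

where \hat{\mathbf{d}}_{1}(p) denotes the estimated first disparity vector at target pixel p and \mathbf{d}_{\mathrm{preset}} denotes the preset disparity vector; no reference of record teaches or suggests this inverse proportionality.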
Conclusion
31. The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. See PTO-892.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAULO ANDRES GARCIA whose telephone number is (703)756-5493. The examiner can normally be reached Mon-Fri, 8-4:30PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached on (571)272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAULO ANDRES GARCIA/
Examiner, Art Unit 2669

/CHAN S PARK/
Supervisory Patent Examiner, Art Unit 2669