DETAILED ACTION
This action is in response to the application filed on January 19, 2024. Claims 1-20 are pending and have been examined.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on January 19, 2024 has been considered by the examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
The following is a quotation of the first paragraph of pre-AIA 35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA 35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention.
Independent claims 1, 9, and 17 recite “generating refined disparity estimates for a predefined range of disparities”. “Range of disparities” is not a common term in the art, and it is therefore interpreted according to its plain English meaning, which in this context is understood to mean a range of positional differences between corresponding points in the images of a stereo camera pair. The specification does not detail what is meant by a range of disparities. The most relevant passage the examiner has found is paragraph [0036], which states: “The predefined number of shifting iterations and the shifting step size may be defined by the user or a connected system as discussed above based on a desired range (e.g., depth range of interest, such as close field depths, middle field depths, or far field depths). Since the learned stereo architecture 100 is trained across a range of baselines and comprehensively at near and far field depth estimates, in some embodiments, the learned stereo architecture 100 can be parameterized in an online environment to deliver depth information for features located at particular depth ranges.” The examiner best understands this paragraph to say that the learned stereo architecture can focus on any of several depth ranges (e.g., close, middle, or far) by using a parameter to select the given range. It is unclear whether this passage further describes the range of disparities or instead describes another parameter not yet claimed. Claims 2-8, 10-16, and 18-20 are also rejected due to their dependency on a rejected independent claim.
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Independent claims 1, 9, and 17 recite “generating refined disparity estimates for a predefined range of disparities”. As discussed above with respect to the 112(a) rejections, the metes and bounds of “range of disparities” are not clear. The specification does not provide enough detail to clearly define what is encompassed by this term. Claims 2-8, 10-16, and 18-20 are also rejected due to their dependency on a rejected independent claim.
For the purposes of examination, “range of disparities” will be treated as meaning a range of positional differences between corresponding points in the images captured by a left and right stereo camera pair.
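The interpretation applied above can be illustrated with a minimal sketch (this sketch is the examiner-interpretation restated as code, not material from the application's specification; the names and the 0-63 pixel range are hypothetical):

```python
# Illustrative only: under the interpretation above, a "disparity" is the
# horizontal pixel offset between a feature's position in the left image
# and its position in the right image, and a "range of disparities" is the
# set of candidate offsets under consideration.

def disparity(x_left: int, x_right: int) -> int:
    """Horizontal offset of a matched feature between the stereo pair."""
    return x_left - x_right

# A hypothetical predefined range of disparities (offsets 0..63 pixels),
# with a narrower "range parameter" subset corresponding to one depth
# range of interest, as the claims appear to contemplate.
predefined_range = range(0, 64)
range_parameter = range(16, 33)  # hypothetical subset

assert disparity(120, 100) == 20
assert disparity(120, 100) in range_parameter
```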
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4, 7-10, 12, 15-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over US20200273192 (hereinafter referred to by its first named inventor, Cheng) in view of “Adaptive Camera Control Method for Efficient Stereoscopic Photography” (hereinafter referred to by its primary author, Lei Wang), US20220198694 (hereinafter referred to by its first named inventor, Zhong), and US20200084427 (hereinafter referred to by its first named inventor, Sun).
In regards to claim 1, Cheng teaches a method for controlling generation of a refined disparity estimate with a learned stereo architecture, the method comprising: implementing, with a computing device having one or more processors and one or more memories, a learned stereo architecture capable of generating refined disparity estimates for a predefined range of disparities (Cheng Figure 6 Part 604; Paragraph [0082] “In embodiments, the full network architecture in FIG. 6 comprises a stereo image pair (e.g., left and right) images 602 that are input into two weight-sharing CNNs 604 yielding corresponding feature maps, a spatial pooling module 606 for feature harvesting, e.g., by concatenating representations from sub-regions with different sizes.”); receiving, from a first stereo system having a first baseline, a first stereo image pair (Cheng Figure 6 Part 602); generating, with a cost volume stage of the learned stereo architecture, a first disparity estimate, wherein disparities estimated in the first disparity estimate correspond to the first range parameter; and outputting the disparity estimate (Cheng Figure 6 Part 608 and “Left Disparity”).
Cheng does not teach a learned stereo architecture capable of generating refined disparity estimates for a predefined range of disparities based on training with fully synthetic image data; inputting a first range parameter into the learned stereo architecture, wherein the first range parameter is a subset of the predefined range of disparities; upsampling the first disparity estimate to a resolution corresponding to a resolution of the first stereo image pair to form a first full resolution disparity estimate; and refining the first full resolution disparity estimate with a first disparity residual thereby generating a first refined full resolution disparity estimate. However, Lei Wang, Zhong, and Sun remedy these deficiencies.
Lei Wang teaches inputting a first range parameter into the learned stereo architecture, wherein the first range parameter is a subset of the predefined range of disparities (Lei Wang Figure 2; Section II A “When we compare the images from the left and right camera, the difference between the positions of the object in the images is the parallax of the object.”; Section III A “For a scene without awareness objects, such as a landscape scene, making the parallax range of the image coincide with the desired parallax bounds can provide a better 3D experience. As shown in Fig. 2, the upper and lower limit of the specified parallax range is denoted by p_upper and p_lower respectively, while the maximum and minimum parallax value on the image are denoted by p_max and p_min respectively. By the parallax range adjustment method, interaxial distance and convergence can be controlled to make p_max coincide with p_upper while p_min coincide with p_lower.” Examiner note: Section II A teaches that parallax is analogous to disparity, as they both represent the difference between the positions of the objects in the left and right stereoscopic images. Section III A teaches a parallax range parameter, where the desired maximum and minimum parallax are specified and the actual parallax is adjusted to match these bounds.).
Lei Wang is considered to be analogous to the claimed invention because they are both in the same field of stereo camera imaging. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system of Cheng to include the teachings of Lei Wang, to provide the advantage of automatic adjustment of camera baselines to provide a realistic viewing effect (Lei Wang Section I “In this paper, we propose an adaptive camera control method for stereoscopic photography. Operators simply need to set the desired 3D effect by assigning some target parallax values, such as parallax bounds and parallax of the awareness object, and the method will calculate the parallax distribution dynamically, and set a proper interaxial distance and convergence angle automatically.”)
Zhong teaches upsampling the first disparity estimate to a resolution corresponding to a resolution of the first stereo image pair to form a first full resolution disparity estimate (Zhong Figure 1; Paragraph [0038] “A disparity map with one resolution higher is obtained from the disparity map with minimum resolution by a propagation upsampling module and an exact rematching module, and the process is repeated until the original resolution is restored.”).
Zhong is considered to be analogous to the claimed invention because they are both in the same field of disparity estimation. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system of Cheng in view of Lei Wang to include the teachings of Zhong, to provide the advantage of more accurate disparity values when performing upsampling (Zhong Paragraph [0006] “The present invention aims to overcome the defects of the existing deep learning methods and provides a disparity estimation optimization method based on upsampling and exact rematching, which conducts exact rematching within a small range in an optimized network, improves previous upsampling methods such as neighbor interpolation and bilinear interpolation for disparity maps or cost maps, and works out a propagation-based upsampling method by the way of network so that accurate disparity values can be better restored from disparity maps in the upsampling process”)
Sun teaches a learned stereo architecture capable of generating refined disparity estimates for a predefined range of disparities based on training with fully synthetic image data (Sun Paragraph [0068] “Adding extra supervision for occlusion estimation is helpful for the optical flow decoder 160 and disparity decoder 165 to extrapolate optical flow and disparity estimations to regions where ground-truth annotations are missing, yielding visually appealing results. In an embodiment, a model that is pre-trained on synthetic data is used to provide the occlusion estimations.”); and refining the first full resolution disparity estimate with a first disparity residual thereby generating a first refined full resolution disparity estimate (Sun Paragraph [0061] “When used for disparity estimation refinement, the hourglass model receives an input that is a concatenation of upsampled disparity (by a factor of 2), the feature map of the first image (128-dimensional), and the warped feature map of the second image (128-dimensional). The output is a residual disparity estimation that is added to the twice upsampled disparity estimate produced by disparity estimator layer(s) corresponding to the optical flow estimator layer(s) 140.”)
Sun is considered to be analogous to the claimed invention because they are both in the same field of disparity estimation. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system of Cheng in view of Lei Wang and Zhong to include the teachings of Sun, to provide the advantage of disparity estimates that extrapolate to regions where ground-truth annotations are missing, yielding visually appealing results (Sun Paragraph [0068], quoted above).
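The combined upsample-then-refine teaching attributed to Zhong and Sun can be sketched as follows (a hedged illustration only: the references use learned propagation/rematching and hourglass modules, whereas this sketch substitutes simple nearest-neighbor upsampling and a hypothetical precomputed residual):

```python
import numpy as np

def upsample_disparity(disp: np.ndarray, factor: int) -> np.ndarray:
    """Upsample a coarse disparity map to a higher resolution.

    Disparity values are pixel offsets, so they are scaled by the same
    factor as the spatial dimensions (an assumption of this sketch).
    """
    up = np.repeat(np.repeat(disp, factor, axis=0), factor, axis=1)
    return up * factor

def refine(full_res_disp: np.ndarray, residual: np.ndarray) -> np.ndarray:
    """Add a residual (predicted by a network in Sun) to the estimate."""
    return full_res_disp + residual

coarse = np.array([[2.0, 3.0], [4.0, 5.0]])      # half-resolution estimate
full = upsample_disparity(coarse, 2)             # now 4x4, values doubled
residual = np.full_like(full, 0.5)               # hypothetical residual
refined = refine(full, residual)                 # refined full-res estimate
assert full.shape == (4, 4) and full[0, 0] == 4.0
assert refined[0, 0] == 4.5
```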
In regards to claim 2, Cheng in view of Lei Wang, Zhong, and Sun teach the method of claim 1, further comprising receiving, from a second stereo system having a second baseline, a second stereo image pair, wherein the second baseline is different than the first baseline (Lei Wang Section III A “For a scene without awareness objects, such as a landscape scene, making the parallax range of the image coincide with the desired parallax bounds can provide a better 3D experience. As shown in Fig. 2, the upper and lower limit of the specified parallax range is denoted by p_upper and p_lower respectively, while the maximum and minimum parallax value on the image are denoted by p_max and p_min respectively. By the parallax range adjustment method, interaxial distance and convergence can be controlled to make p_max coincide with p_upper while p_min coincide with p_lower.” Examiner note: This reference teaches that the interaxial distance and the convergence between the camera pair can be changed to control the parallax. This is analogous to a second camera pair with a second baseline, as the baseline changes when the interaxial distance between the camera changes).
In regards to claim 4, Cheng in view of Lei Wang, Zhong, and Sun teach the method of claim 2, further comprising: inputting a second range parameter into the learned stereo architecture, wherein the second range parameter is a subset of the predefined range of disparities (Lei Wang Figure 6; Section IV “The target parallax bounds values (p_upper, p_lower) were set as [-1%, 1%], [0%, 1%], and [-1%, -0.1%], and a simulation of the Parallax Range Adjustment Method was conducted to validate it. Initially, the parallax range of image was [1%, 2.2%], interaxial distance is 70 mm, and convergence angle is 2.0 degree. It was found that the parallax range computed by the interaxial distance and convergence control system coincides with the target parallax range (Fig. 6).” Examiner note: This reference teaches that a second target parallax value can be chosen (in this case [0%, 1%]). This second parallax target is then used to change the baseline of the system, which in turn changes the parallax, until the actual parallax matches the target parallax); generating, with the cost volume stage of the learned stereo architecture, a second disparity estimate, wherein disparities estimated in the second disparity estimate correspond to the second range parameter (Cheng Part 608; Figure 6 “Left Disparity”); upsampling the second disparity estimate to a resolution corresponding to a resolution of the second stereo image pair to form a second full resolution disparity estimate (Zhong Figure 1; Paragraph [0038] “A disparity map with one resolution higher is obtained from the disparity map with minimum resolution by a propagation upsampling module and an exact rematching module, and the process is repeated until the original resolution is restored.”); refining the second full resolution disparity estimate with a second disparity residual thereby generating a second refined full resolution disparity estimate (Sun Paragraph [0061] “When used for disparity estimation refinement, the hourglass model receives an input that is a concatenation of upsampled disparity (by a factor of 2), the feature map of the first image (128-dimensional), and the warped feature map of the second image (128-dimensional). The output is a residual disparity estimation that is added to the twice upsampled disparity estimate produced by disparity estimator layer(s) corresponding to the optical flow estimator layer(s) 140.”); and outputting the second refined full resolution disparity estimate.
In regards to claim 7, Cheng in view of Lei Wang, Zhong, and Sun teaches the method of claim 1, wherein the cost volume stage of the learned stereo architecture further comprises a cross-correlation cost volume to create a cost volume comprising a 4D feature volume at a configurable number of disparities for input into one or more 3D convolution networks to generate the first disparity estimate (Cheng Figure 6; Paragraph [0082] “The produced feature maps may be used to form 4D cost volume 608 that, in embodiments, may be fed into 3D module 610 for disparity regression.”).
In regards to claim 8, Cheng in view of Lei Wang, Zhong, and Sun teaches the method of claim 7, wherein the cost volume is created through one or more shifting operations of a first feature map corresponding to a first image of the first stereo image pair with respect to a second feature map corresponding to a second image of the first stereo image pair (Cheng Paragraph [0089] “In embodiments, the output spatial pyramid features may be concatenated into a 4D volume with size l×h×w×c and learn a transformation kernel with size of l×3×3, yielding a fused feature map with size h×w×c. In embodiments, this may be fed to a cost volume computation at later stages, as mentioned with respect to FIG. 6.” Examiner note: This reference teaches that the features from the respective left and right images may be concatenated to form a 4D volume, which would then be fed into the cost volume computation to obtain a disparity estimate. This is analogous to a shifting operation, as one of the left and right images is fused into the other, which can be seen in Figure 6, where the right image is fused into the left image to obtain the left disparity estimate).
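For reference, a shift-based cost volume of the kind the claim recites can be sketched as follows (a hedged illustration of the general technique only, not Cheng's concatenation-based construction; the correlation-style matching cost and the array shapes are assumptions of this sketch):

```python
import numpy as np

def cost_volume(left: np.ndarray, right: np.ndarray, max_disp: int) -> np.ndarray:
    """Build a cost volume by shifting the right feature map.

    left, right: (H, W, C) feature maps.
    Returns a (max_disp, H, W) volume: one matching-cost slice per
    candidate disparity, computed here as a per-pixel correlation.
    """
    h, w, _ = left.shape
    vol = np.zeros((max_disp, h, w))
    for d in range(max_disp):
        shifted = np.zeros_like(right)
        shifted[:, d:, :] = right[:, : w - d, :]   # shift right features by d
        vol[d] = (left * shifted).sum(axis=-1)     # correlate with left
    return vol

rng = np.random.default_rng(0)
l = rng.standard_normal((4, 8, 3))
r = rng.standard_normal((4, 8, 3))
v = cost_volume(l, r, max_disp=4)
assert v.shape == (4, 4, 8)
```

Restricting the loop to a subset of candidate disparities corresponds to the claimed "range parameter" limiting the search to a band of the predefined range.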
In regards to claim 9, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 1.
In regards to claim 10, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 2.
In regards to claim 12, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 4.
In regards to claim 15, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 7.
In regards to claim 16, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 8.
In regards to claim 17, Cheng in view of Lei Wang, Zhong, and Sun teaches a non-transitory computer-readable medium comprising processor-executable instructions that, when executed by one or more processors of an apparatus, causes the apparatus to perform a method (Cheng Paragraph [0124] “Aspects of the present disclosure may be encoded upon one or more non-transitory computer-readable media with instructions for one or more processors or processing units to cause steps to be performed.”) and renders obvious the remaining claim language as in the consideration of claim 1.
In regards to claim 18, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 2.
In regards to claim 20, Cheng in view of Lei Wang, Zhong, and Sun renders obvious the claim language as in the consideration of claim 4.
Claims 5-6 and 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Cheng in view of Lei Wang, Zhong, and Sun, and further in view of “Anytime Stereo Image Depth Estimation on Mobile Devices” (hereinafter referred to by its primary author, Yan Wang).
In regards to claim 5, Cheng in view of Lei Wang, Zhong, and Sun teaches the method of claim 1, but fails to teach receiving, from a system configured to interface with the learned stereo architecture, a first input resolution; and preprocessing the first stereo image pair based on the first input resolution, wherein the first input resolution is less than a full resolution of the first stereo image pair.
However, Yan Wang teaches receiving, from a system configured to interface with the learned stereo architecture, a first input resolution (Yan Wang Figure 2 “1/16”); and preprocessing the first stereo image pair based on the first input resolution, wherein the first input resolution is less than a full resolution of the first stereo image pair (Yan Wang Figure 2 “Disparity Stage 1” Examiner note: This reference teaches that the stereo image pair could be initially downsampled to 1/16, 1/8, or 1/4 of the original resolution and preprocessed based on this resolution).
Yan Wang is considered to be analogous to the claimed invention because they are both in the same field of disparity estimation. Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the system of Cheng in view of Lei Wang, Zhong, and Sun to include the teachings of Yan Wang, to provide the advantage of a flexible depth estimation system which can be accurate when needed or quick when needed (Yan Wang Section I “For example, an autonomous drone flying at high speed can poll our 3D depth estimation method at a high frequency. If an object appears in its flight path, it will be able to perceive it rapidly and react accordingly by lowering its speed or performing an evasive maneuver. When flying at low speed, latency is not as detrimental, and the same drone could compute a higher resolution and more accurate 3D depth map, enabling tasks such as high precision navigation”).
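The resolution preprocessing Yan Wang is cited for can be sketched as follows (a hedged illustration only: simple strided subsampling is assumed here, whereas the reference uses learned downsampling stages within its network):

```python
import numpy as np

def preprocess(pair, scale: int):
    """Downsample each image of a stereo pair by an integer factor,
    reducing the input resolution before disparity estimation and
    trading accuracy for speed."""
    left, right = pair
    return left[::scale, ::scale], right[::scale, ::scale]

full = (np.ones((64, 64)), np.ones((64, 64)))
left_small, right_small = preprocess(full, scale=4)  # 1/4 input resolution
assert left_small.shape == (16, 16)
```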
In regards to claim 6, Cheng in view of Lei Wang, Zhong, Sun, and Yan Wang teaches the method of claim 5, wherein the system configured to interface with the learned stereo architecture comprises at least one of a robot system or a vehicle system (Yan Wang Abstract “Many applications of stereo depth estimation in robotics require the generation of accurate disparity maps in real time under significant computational constraints.”).
In regards to claim 13, Cheng in view of Lei Wang, Zhong, Sun, and Yan Wang renders obvious the claim language as in the consideration of claim 5.
In regards to claim 14, Cheng in view of Lei Wang, Zhong, Sun, and Yan Wang renders obvious the claim language as in the consideration of claim 6.
Allowable Subject Matter
Claims 3, 11, and 19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Claims 3, 11, and 19 recite “generating, with the cost volume stage of the learned stereo architecture, a second disparity estimate, wherein disparities estimated in the second disparity estimate correspond to the first range parameter; upsampling the second disparity estimate to a resolution corresponding to a resolution of the second stereo image pair to form a second full resolution disparity estimate.” This limitation requires a second stereo system having a second baseline (as defined in claim 2, upon which claim 3 depends) that is used to estimate disparities within the first range parameter, i.e., the same range parameter used in independent claim 1. Cheng in view of Lei Wang, Zhong, and Sun does not teach these limitations. Instead, Cheng in view of Lei Wang, Zhong, and Sun teaches that a second stereo camera system with a second baseline would be focused on a second range parameter. This is shown best in Lei Wang Equation (2), where the parallax (analogous to disparity) is approximated based on the interaxial distance and the convergence angle between the two cameras of the stereo system, which suggests that as the interaxial distance or convergence of the cameras changes, the parallax/disparity of the cameras would also change. Therefore, the combination used to reject claim 2 could not be used to reject claim 3, as it teaches that when the baseline changes, the range parameter would also be affected. This can be seen in Lei Wang Figures 1(a) and 1(b), where as the interaxial distance and convergence of the cameras change, the p_max and p_min bounds would change (while p_max and p_min are not explicitly shown in this figure, it can be appreciated that these measurements correspond to the top and bottom tangent lines of the soccer ball and apple image).
This conclusion is reinforced by the reference “MorphEyes: Variable Baseline Stereo For Quadrotor Navigation,” which teaches a drone with a variable baseline camera. This variable baseline is used to change the perception of the drone, which in turn changes the range on which the drone focuses. This reference likewise suggests that as the baseline changes, the range parameter (in this reference, a range of depths) would change.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
“Moving Object Distance Estimation Method Based on Target Extraction with a Stereo Camera” teaches a method of estimating distances using disparity. The disparity map in this reference is refined as a final step.
US20210264632 teaches performing depth estimation using a plurality of resolutions of the input images.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CALEB LOGAN ESQUINO whose telephone number is (703)756-1462. The examiner can normally be reached M-Fr 8:00AM-4:00PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Andrew Bee can be reached at (571) 270-5183. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CALEB L ESQUINO/ Examiner, Art Unit 2677
/ANDREW W BEE/ Supervisory Patent Examiner, Art Unit 2677