DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 10 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 10 recites “wherein a difference between a maximum value and a minimum value of the one or more items of distance information is less than a first value.” Distance information represents a single distance value, so it is unclear what constitutes “one or more items” of a distance value. Clarification is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-4, 6-9, and 14-18 are rejected under 35 U.S.C. 103 as being unpatentable over Benhimante et al. (US 2014/0293016 A1).
Regarding claim 1, Benhimante teaches:
A three-dimensional model generation method executed by an information processing device, the three-dimensional model generation method comprising:
obtaining subject information including a plurality of positions on a subject in a three-dimensional space; (FIG. 8, depth values are obtained. Abstract: “providing a set of reference two-dimensional imaged points captured by the camera at a first camera pose and reference depth samples; providing a set of current two-dimensional imaged points captured by the camera at a second camera pose and current depth samples associated to the set of current two-dimensional imaged points”)
obtaining a first camera image of the subject shot from a first viewpoint and a second camera image of the subject shot from a second viewpoint; (Abstract: “providing a set of reference two-dimensional imaged points captured by the camera at a first camera pose and reference depth samples; providing a set of current two-dimensional imaged points captured by the camera at a second camera pose and current depth samples associated to the set of current two-dimensional imaged points”)
determining a search range in the three-dimensional space, based on the subject information and without using map information, the search range including a first three-dimensional point on the subject, the first three-dimensional point corresponding to a first point in the first camera image, ([0059], “According to an embodiment of the invention, a threshold for the distance of the three-dimensional point to the camera may be defined and only three-dimensional points having a distance below this threshold are selected for the meshing. In other words, at least one of the current depth samples may be discarded if it is determined that the depth value is above a defined threshold. That way, only points that are not farther than a certain distance from the depth sensor are retained in the current three-dimensional model. This helps to improve the quality of the current three-dimensional model in case the uncertainty of the depth measurement increases significantly with the depth.” According to the published specification, the map is defined as 3D points: “the map information including three-dimensional points each indicating a position on the subject in the three-dimensional space”) the map information being generated by camera calibration executed by causing one or more cameras to shoot the subject from a plurality of viewpoints including the first viewpoint and the second viewpoint, the map information including three-dimensional points each indicating a position on the subject in the three-dimensional space; ([0061], “In FIG. 2b, the simultaneous tracking and three-dimensional reconstruction of an unknown environment according to an embodiment of the present invention is depicted, wherein at least one reference intensity image, reference depth samples associated to the at least one reference intensity image and the at least one associated camera pose as well as an initial estimate of the three-dimensional model of the environment (determined as described above) serve as input. According to the depicted embodiment, a current intensity image and current depth samples associated to that current intensity image are provided. Then a current three-dimensional point cloud, i.e. a current three-dimensional model, is determined using the current depth samples associated to the current intensity image and the intrinsic parameters (S3).” Also, FIG. 2a teaches using the intrinsic parameters of the capturing device.)
searching for a similar point that is similar to the first point, in a range in the second camera image, the range corresponding to the search range; ([0066], “In the following step S7, it is thus determined depending on the similarity measure whether to update the estimate of the three-dimensional model of the environment using at least one point of the three-dimensional point cloud determined from the current depth samples associated to the current intensity image (S8a) and whether the current intensity image including its determined associated camera pose is added to the set of reference images (S8b). For example, if the similarity measure does not meet a first condition, for example does not exceed a defined threshold, the current intensity image and the current depth samples associated to the current intensity image would be discarded. In this case, steps S3 to S7 would be repeated for a new current intensity image and current depth samples associated to that new current intensity image based on the non-updated three-dimensional model and the set of reference images including at least the reference image. If the similarity measure does exceed the threshold, step 8 is processed and steps S3 to S7 are then repeated based on the updated reference three-dimensional model of the environment and the updated set of reference images including at least the reference image and the current intensity image of the previous cycle as additional reference intensity image.”) and
generating a three-dimensional model using a search result in the searching. (FIG. 2b, [0066], “The updating of the three-dimensional model of the environment (S8a) may comprise concatenating two three-dimensional point clouds which may be achieved by transforming the three-dimensional points with the inverse of the camera pose associated to the current intensity image.”)
The above citations are from different embodiments of Benhimante. However, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine these embodiments in order to provide a more accurate three-dimensional model generation method.
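For illustration of the technique relied upon above: the depth-threshold selection in Benhimante's paragraph [0059] amounts to discarding depth samples that lie beyond a set distance from the camera. The following is a minimal sketch of that operation; the array layout, function name, and the 5-meter threshold are illustrative assumptions, not details taken from the reference.

```python
import numpy as np

def filter_depth_samples(points_3d, camera_center, max_distance=5.0):
    """Retain only 3D points within max_distance of the camera, as in
    Benhimante [0059]: depth samples above a defined threshold are
    discarded before meshing. The threshold value is illustrative."""
    distances = np.linalg.norm(points_3d - camera_center, axis=1)
    return points_3d[distances < max_distance]

# Example: the point about 6.7 m from the camera is discarded.
points = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 4.5], [0.0, 3.0, 6.0]])
print(filter_depth_samples(points, camera_center=np.zeros(3)))
```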
Regarding claim 2, Benhimante teaches:
The three-dimensional model generation method according to claim 1, wherein in the searching, an epipolar line in the second camera image is limited to a length that is in accordance with the search range, and the similar point that is similar to the first point is searched for on the epipolar line in the second camera image, the epipolar line corresponding to the first point. ([0081]-[0082], “At least two cameras, recording intensities with known relative pose and ideally known intrinsic parameters, can capture images at approximately the same time or, when not moving, at different times. Correspondences can be found in both images and the relative pose and intrinsic of the cameras can be used to calculate the correspondences depth in either image coordinate system. It is advantageous to retrieve the relative pose and intrinsic parameters before trying to find correspondences, because they can be used to simplify the creation of correspondences through introducing additional constrains (e.g. epipolar geometry). For example, the finding of correspondences based on point features can be implemented as follows: To match a 2D feature from one image to another, the patch around the 2D feature of specified size is searched in the other image. For instance, the sum-of-square-differences (SSD) or the normalized cross-correlation (NCC) can be used as distance or similarity measure, respectively. To reduce the number of comparisons needed to search the corresponding patch, it is only searched along the epipolar line of the feature point in the other image. To simplify the search along the epipolar line to a 1D-search, the images are first rectified. The two patches with the highest similarity are set into relation. If the one with the highest similarity is significantly more similar than the second highest similarity, the former one will be considered as matching correspondence.”)
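The rectified-stereo correspondence search quoted from [0081]-[0082] can be made concrete with a short sketch: after rectification, the epipolar line is an image row, candidate patches along that row are scored with normalized cross-correlation (NCC), and the best match is accepted only when it is significantly more similar than the runner-up. The patch size, the acceptance margin, and all names below are illustrative assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search_epipolar(left, right, x, y, half=3, margin=0.05):
    """1D search along row y of the rectified right image for the patch
    around (x, y) in the left image ([0082]); the match is kept only if
    it clearly beats the second-best candidate (the significance test
    described in the quoted passage, realized here as a margin)."""
    patch = left[y - half:y + half + 1, x - half:x + half + 1]
    scores = sorted(
        (ncc(patch, right[y - half:y + half + 1, xr - half:xr + half + 1]), xr)
        for xr in range(half, right.shape[1] - half)
    )
    (second, _), (best, x_best) = scores[-2], scores[-1]
    return x_best if best - second > margin else None
```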
Regarding claim 3, Benhimante teaches:
The three-dimensional model generation method according to claim 1, wherein the subject information includes a distance image generated according to measurement performed by a distance image sensor, the distance image includes a plurality of pixels each including distance information indicating distance from the distance image sensor to the subject, (FIG. 8, depth values are obtained. Abstract: “providing a set of reference two-dimensional imaged points captured by the camera at a first camera pose and reference depth samples; providing a set of current two-dimensional imaged points captured by the camera at a second camera pose and current depth samples associated to the set of current two-dimensional imaged points” [0096], “According to another embodiment to determine a depth of at least one element in an intensity image, there is provided at least one sensor for retrieving depth information or range data and at least a relative position and/or orientation of the at least one sensor with respect to the capturing device, wherein the depth information or range data is used to calculate a depth of at least one element in the intensity image. Preferably, the pose (position and orientation) and intrinsic parameters of, both, the sensor and the capturing device are known.”) and in the determining, the search range is determined based on distance information included in a pixel, in the distance image, that corresponds to the first point. ([0059], “According to an embodiment of the invention, a threshold for the distance of the three-dimensional point to the camera may be defined and only three-dimensional points having a distance below this threshold are selected for the meshing. In other words, at least one of the current depth samples may be discarded if it is determined that the depth value is above a defined threshold. That way, only points that are not farther than a certain distance from the depth sensor are retained in the current three-dimensional model. This helps to improve the quality of the current three-dimensional model in case the uncertainty of the depth measurement increases significantly with the depth.”)
Regarding claim 4, Benhimante teaches:
The three-dimensional model generation method according to claim 1, wherein the subject information includes a plurality of distance images each generated according to measurement by a corresponding one of a plurality of distance image sensors, each of the plurality of distance images includes a plurality of pixels each including distance information indicating distance from the distance image sensor that generated the distance image to the subject, the plurality of pixels included in each of the plurality of distance images are each associated with a corresponding one of a plurality of pixels included in, among a plurality of camera images, a camera image corresponding to the distance image, the plurality of camera images include the first camera image and the second camera image, (FIG. 8, depth values are obtained. Abstract: “providing a set of reference two-dimensional imaged points captured by the camera at a first camera pose and reference depth samples; providing a set of current two-dimensional imaged points captured by the camera at a second camera pose and current depth samples associated to the set of current two-dimensional imaged points” [0096], “According to another embodiment to determine a depth of at least one element in an intensity image, there is provided at least one sensor for retrieving depth information or range data and at least a relative position and/or orientation of the at least one sensor with respect to the capturing device, wherein the depth information or range data is used to calculate a depth of at least one element in the intensity image. Preferably, the pose (position and orientation) and intrinsic parameters of, both, the sensor and the capturing device are known.” FIG. 6) and in the determining, the search range is determined based on one or more items of distance information included in one or more pixels in one or more distance images among the plurality of distance images, the one or more pixels each corresponding to the first point. ([0059], “According to an embodiment of the invention, a threshold for the distance of the three-dimensional point to the camera may be defined and only three-dimensional points having a distance below this threshold are selected for the meshing. In other words, at least one of the current depth samples may be discarded if it is determined that the depth value is above a defined threshold. That way, only points that are not farther than a certain distance from the depth sensor are retained in the current three-dimensional model. This helps to improve the quality of the current three-dimensional model in case the uncertainty of the depth measurement increases significantly with the depth.”)
Regarding claim 6, Benhimante teaches:
The three-dimensional model generation method according to claim 4, wherein a position and an orientation of each of the plurality of distance image sensors corresponds to a position and an orientation of a corresponding one of a plurality of cameras including the one or more cameras, and the determining includes identifying, using positions and orientations of the plurality of cameras obtained through the camera calibration, the one or more pixels, in the one or more distance images, that each correspond to the first point. (FIG. 6, [0073], “Optionally, in S8, the estimated camera motion based on the at least one reference intensity image and the current intensity image may be used to refine the current three-dimensional point cloud and/or the reference three-dimensional model. This may be achieved by determining based on the estimated camera motion the depth of a three-dimensional point in the three-dimensional model and the depth of that three-dimensional point from the current depth samples associated to the set of current two-dimensional imaged points; determining a refined depth of that three-dimensional point from the depth of the three-dimensional point in the three-dimensional model and the depth of the associated three-dimensional point in the current three-dimensional model; and updating the depth of the three-dimensional point in the current three-dimensional point cloud and/or in the three-dimensional model with the determined refined depth (see FIG. 6). For example, by means of an image matching method, the feature point F1 from the reference intensity image is determined in the current intensity image. Then, a refined depth value may be determined from the depth value associated to the feature point F1 in the reference intensity image and the depth value associated to the determined matched feature point F1 in the current intensity image.”)
Regarding claim 7, Benhimante teaches:
The three-dimensional model generation method according to claim 6, wherein the one or more distance images include a first distance image corresponding to the first camera image ([0057], “A more detailed view on the invention is presented in the embodiment depicted in FIG. 2a and FIG. 2b. FIG. 2a shows the creation of the initial three-dimensional reference model from the reference depth samples associated to the at least one reference intensity image. In S1, the reference three-dimensional point cloud is computed using the depth samples associated to the reference intensity image.”) and a second distance image corresponding to the second camera image, and the second camera image is determined from the plurality of camera images in feature point matching in the camera calibration, based on a total number of feature points between the first camera image and each of the plurality of camera images other than the first camera image. (FIG. 6 and corresponding paragraph. [0059], “According to an embodiment of the invention, a threshold for the distance of the three-dimensional point to the camera may be defined and only three-dimensional points having a distance below this threshold are selected for the meshing. In other words, at least one of the current depth samples may be discarded if it is determined that the depth value is above a defined threshold. That way, only points that are not farther than a certain distance from the depth sensor are retained in the current three-dimensional model. This helps to improve the quality of the current three-dimensional model in case the uncertainty of the depth measurement increases significantly with the depth.”)
Regarding claim 8, Benhimante teaches:
The three-dimensional model generation method according to claim 6, wherein the second camera image is determined based on a difference in shooting orientation calculated from a first position-and-orientation of a camera that shot the first camera image at a time the first camera image was shot and a second position-and-orientation of a camera that shot the second camera image at a time the second camera image was shot. (FIG. 6, [0073], “Optionally, in S8, the estimated camera motion based on the at least one reference intensity image and the current intensity image may be used to refine the current three-dimensional point cloud and/or the reference three-dimensional model. This may be achieved by determining based on the estimated camera motion the depth of a three-dimensional point in the three-dimensional model and the depth of that three-dimensional point from the current depth samples associated to the set of current two-dimensional imaged points; determining a refined depth of that three-dimensional point from the depth of the three-dimensional point in the three-dimensional model and the depth of the associated three-dimensional point in the current three-dimensional model; and updating the depth of the three-dimensional point in the current three-dimensional point cloud and/or in the three-dimensional model with the determined refined depth (see FIG. 6). For example, by means of an image matching method, the feature point F1 from the reference intensity image is determined in the current intensity image. Then, a refined depth value may be determined from the depth value associated to the feature point F1 in the reference intensity image and the depth value associated to the determined matched feature point F1 in the current intensity image.”)
Regarding claim 9, Benhimante teaches:
The three-dimensional model generation method according to claim 6, wherein the second camera image is determined based on a difference in shooting position calculated from a first position-and-orientation of a camera that shot the first camera image at a time the first camera image was shot and a second position-and-orientation of a camera that shot the second camera image at a time the second camera image was shot. (FIG. 6, [0073], “Optionally, in S8, the estimated camera motion based on the at least one reference intensity image and the current intensity image may be used to refine the current three-dimensional point cloud and/or the reference three-dimensional model. This may be achieved by determining based on the estimated camera motion the depth of a three-dimensional point in the three-dimensional model and the depth of that three-dimensional point from the current depth samples associated to the set of current two-dimensional imaged points; determining a refined depth of that three-dimensional point from the depth of the three-dimensional point in the three-dimensional model and the depth of the associated three-dimensional point in the current three-dimensional model; and updating the depth of the three-dimensional point in the current three-dimensional point cloud and/or in the three-dimensional model with the determined refined depth (see FIG. 6). For example, by means of an image matching method, the feature point F1 from the reference intensity image is determined in the current intensity image. Then, a refined depth value may be determined from the depth value associated to the feature point F1 in the reference intensity image and the depth value associated to the determined matched feature point F1 in the current intensity image.”)
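Claims 8 and 9 both select the second camera image from a comparison of two position-and-orientations: claim 8 by the difference in shooting orientation, claim 9 by the difference in shooting position. A generic sketch of such pose comparisons follows, offered only to clarify the claim language; it is not taken from Benhimante, and the rotation-matrix convention, the angle cap, and the helper names are assumptions.

```python
import numpy as np

def pose_difference(R1, t1, R2, t2):
    """Return the relative rotation angle (radians) and the baseline
    (difference in shooting position) between two camera poses, each
    given as a rotation matrix R and a camera center t."""
    R_rel = R2 @ R1.T
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return np.arccos(cos_theta), float(np.linalg.norm(t2 - t1))

def pick_second_image(first_pose, candidate_poses, max_angle=np.deg2rad(30)):
    """Choose the candidate whose shooting orientation differs least
    from the first camera's, subject to an illustrative angle cap."""
    best = None
    for idx, (R, t) in enumerate(candidate_poses):
        angle, _ = pose_difference(first_pose[0], first_pose[1], R, t)
        if angle <= max_angle and (best is None or angle < best[0]):
            best = (angle, idx)
    return None if best is None else best[1]
```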
Regarding claim 14, Benhimante teaches:
The three-dimensional model generation method according to claim 1, wherein the subject information is generated based on sensor information of two or more types. (Abstract and [0097]-[0098]: “providing a set of current two-dimensional imaged points captured by the camera”; “Particularly, a method to retrieve depth information is using special sensors, specialized on retrieving depth information or range data. That can for example be a time of flight mechanism, like a laser scanner or a time of flight camera. Another example are sensors, which project a known pattern of light into the environment and retrieve the pattern after it was reflected by the environment with a sensor”)
Regarding claim 15, Benhimante teaches:
The three-dimensional model generation method according to claim 14, wherein the sensor information of two or more types includes a plurality of two-dimensional images obtained from a stereo camera (Abstract: “providing a set of current two-dimensional imaged points captured by the camera”) and three-dimensional data obtained from a measuring device that emits an electromagnetic wave and obtains a reflected wave which is the electromagnetic wave reflected by the subject. ([0097]-[0098], “Particularly, a method to retrieve depth information is using special sensors, specialized on retrieving depth information or range data. That can for example be a time of flight mechanism, like a laser scanner or a time of flight camera. Another example are sensors, which project a known pattern of light into the environment and retrieve the pattern after it was reflected by the environment with a sensor. By matching the projected information and the received pattern and by knowing the pose of the projector towards the retrieving sensor and by knowing the intrinsic parameters of both projector and sensor, depth can be calculated. Another sensor allowing the retrieval of depth data is a plenoptic camera;” [0086] teaches using GPS)
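The time-of-flight mechanism cited from [0097]-[0098] reduces to simple arithmetic: the distance to the subject is half the round-trip travel time of the emitted electromagnetic wave multiplied by its propagation speed. A minimal sketch follows; the 20 ns round trip is an illustrative figure.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def tof_distance(round_trip_seconds):
    """Distance to the subject from the round-trip time of an emitted
    electromagnetic wave and its reflection (time of flight)."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

print(tof_distance(20e-9))  # a 20 ns round trip is roughly 3 m
```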
Regarding claim 16, Benhimante teaches:
A three-dimensional model generation device comprising: a processor; and memory, wherein using the memory, the processor: ([0035], “Another aspect of the invention is also related to a computer program product which is adapted to be loaded into the internal memory of a digital computer and comprises software code sections by means of which the method according to the invention is performed when said product is running on said computer.”) The rest of claim 16 recites limitations similar to those of claim 1 and is thus rejected accordingly.
Regarding claim 17, Benhimante teaches:
A three-dimensional model generation device comprising: memory; and a processor coupled to the memory, wherein the processor: ([0035], “Another aspect of the invention is also related to a computer program product which is adapted to be loaded into the internal memory of a digital computer and comprises software code sections by means of which the method according to the invention is performed when said product is running on said computer.”)
obtains a first camera image generated by shooting a subject in a three-dimensional space from a first viewpoint and a second camera image generated by shooting the subject from a second viewpoint; (Abstract: “providing a set of reference two-dimensional imaged points captured by the camera at a first camera pose and reference depth samples; providing a set of current two-dimensional imaged points captured by the camera at a second camera pose and current depth samples associated to the set of current two-dimensional imaged points”)
searches for a second point in a search range on an epipolar line identified by projecting, on the second camera image, a straight line that passes through the first viewpoint and a first point in the first camera image, the second point being similar to the first point; ([0082], “For example, the finding of correspondences based on point features can be implemented as follows: To match a 2D feature from one image to another, the patch around the 2D feature of specified size is searched in the other image. For instance, the sum-of-square-differences (SSD) or the normalized cross-correlation (NCC) can be used as distance or similarity measure, respectively. To reduce the number of comparisons needed to search the corresponding patch, it is only searched along the epipolar line of the feature point in the other image. To simplify the search along the epipolar line to a 1D-search, the images are first rectified. The two patches with the highest similarity are set into relation. If the one with the highest similarity is significantly more similar than the second highest similarity, the former one will be considered as matching correspondence.”) and
generates a three-dimensional model of the subject based on a result of the search, (FIG. 2b, [0066], “The updating of the three-dimensional model of the environment (S8a) may comprise concatenating two three-dimensional point clouds which may be achieved by transforming the three-dimensional points with the inverse of the camera pose associated to the current intensity image.”)
the search range is provided based on a position of a first three-dimensional point, in the three-dimensional space, that corresponds to the first point, ([0059], “According to an embodiment of the invention, a threshold for the distance of the three-dimensional point to the camera may be defined and only three-dimensional points having a distance below this threshold are selected for the meshing. In other words, at least one of the current depth samples may be discarded if it is determined that the depth value is above a defined threshold. That way, only points that are not farther than a certain distance from the depth sensor are retained in the current three-dimensional model. This helps to improve the quality of the current three-dimensional model in case the uncertainty of the depth measurement increases significantly with the depth.” According to the published specification, the map is defined as 3D points: “the map information including three-dimensional points each indicating a position on the subject in the three-dimensional space”) and
the position is calculated based on a reflected wave of an electromagnetic wave emitted toward the subject. ([0097]-[0098], “Particularly, a method to retrieve depth information is using special sensors, specialized on retrieving depth information or range data. That can for example be a time of flight mechanism, like a laser scanner or a time of flight camera”)
The above citations are from different embodiments of Benhimante. However, it would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to combine these embodiments in order to provide a more accurate three-dimensional model generation method.
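Paragraph [0066], relied upon above for generating the model from the search result, concatenates two point clouds by transforming the current points with the inverse of the current camera pose. A minimal sketch of that transform follows, assuming the world-to-camera convention x_cam = R·X + t; the convention and all names are assumptions for illustration.

```python
import numpy as np

def to_world(points_cam, R, t):
    """Undo the camera pose [R | t]: for each camera-frame point p,
    compute R.T @ (p - t), written here as a single row-wise product."""
    return (points_cam - t) @ R

def concatenate_clouds(reference_cloud, current_cloud_cam, R_cur, t_cur):
    """Merge the current cloud into the reference model after mapping
    it into the common frame, per Benhimante [0066]."""
    return np.vstack([reference_cloud,
                      to_world(current_cloud_cam, R_cur, t_cur)])
```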
Regarding claim 18, Benhimante teaches:
The three-dimensional model generation device according to claim 17, wherein the position is calculated based on a distance image generated by a sensor that receives the reflected wave. ([0097]-[0098], “Particularly, a method to retrieve depth information is using special sensors, specialized on retrieving depth information or range data. That can for example be a time of flight mechanism, like a laser scanner or a time of flight camera”)
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Benhimante in view of Park et al. (US 2015/0281727 A1).
Regarding claim 10, Benhimante teaches:
The three-dimensional model generation method according to claim 6,
Benhimante does not teach the following limitation; however, Park teaches:
wherein a difference between a maximum value and a minimum value of the one or more items of distance information is less than a first value. ([0111], “In the “1620” step, it is determined whether a difference between a maximum value and a minimum value among four-corner values of a current CU from a depth value distribution information is less than 5 or not. In this instance, when the difference between the maximum and minimum values is less than 5, a step of “1630” is performed. When the difference between the maximum and minimum values is not less than 5, a step of “1680” is performed.”)
Benhimante teaches that the depth (distance) value matters when searching the range for similar points. Park teaches a specific method of choosing between operations based on the difference between depth values of points.
It would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to have combined the teachings of Benhimante with the specific teachings of Park to more accurately find similar points.
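Park's test from [0111], relied upon above, compares the spread of the four corner depth values of a coding unit against a threshold (5 in Park's example) to pick between processing branches. A minimal sketch, with the depth-block layout assumed:

```python
def corner_spread_below(depth_block, threshold=5):
    """Park [0111]: true when the difference between the maximum and
    minimum of the four corner depth values is less than the threshold."""
    top, bottom = depth_block[0], depth_block[-1]
    corners = [top[0], top[-1], bottom[0], bottom[-1]]
    return max(corners) - min(corners) < threshold

# Example: a nearly flat 4x4 block passes the test (spread 1 < 5).
flat = [[10, 10, 11, 11] for _ in range(4)]
print(corner_spread_below(flat))  # True
```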
Allowable Subject Matter
Claims 5 and 11-13 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The following is a statement of reasons for the indication of allowable subject matter: none of the references, alone or in combination, teaches the limitation “wherein in the determining, when a detection accuracy of first distance information included in a pixel that is included in a first distance image and corresponds to the first point is lower than a predetermined accuracy, the search range is determined using, as the one or more items of distance information, third distance information corresponding to the first point, the first distance image corresponding to the first camera image, the third distance information being calculated using two or more camera images other than the first camera image,” as recited in claim 5.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to YANNA WU whose telephone number is (571)270-0725. The examiner can normally be reached Monday-Thursday 8:00-5:30 ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Alicia Harrington, can be reached at 571-272-2330. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/YANNA WU/Primary Examiner, Art Unit 2615