Detailed Action
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
The rejections of the claims under 35 U.S.C. 112(b) set forth in the prior Office action are withdrawn in view of Applicant's amendments.
Applicant’s arguments with respect to claims 1-4, 6-12, and 14-20 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-5, 7-13, and 15-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chen (CN 112581545 B) in view of Kamat et al. (US 20150310669 A1) (hereinafter Kamat).
Regarding claim 1, Chen teaches an object recognition method performed by an electronic device (Chen, “The invention relates to the fields of heat source tracking, thermal imaging, visual perception, image recognition and image positioning, in particular to a multi-mode spectral information recognition and three-dimensional space positioning method.”, pg. 2, lines 5-7), the method comprising:
simultaneously acquiring an infrared image and a visible image for a target object (Chen, “According to the invention, thermal imaging data, visible light data and depth information are taken as a complete target object to be processed, so that fusion detection and analysis of different attributes of the target object, such as visible light, thermal imaging and depth information, can be completed synchronously”, pg. 6, lines 28-31);
determining reference pixel points from pixel points in the infrared image, and obtaining depth information of the reference pixel points in the infrared image relative to the target object (Chen, “Thermal imagery is calibrated with visible light by using a special calibration plate, which calibrates calibration features on the plate, and has thermal infrared (temperature difference) and color difference, so that a color camera and thermal imagery can simultaneously see a related set of features in physical space. When the image data and the depth information are needed to be calibrated, the calibration plate is required to be provided with recognizable concave-convex height information besides the difference of temperature or/and color. To align the depth with the other two data source calibrations.”, pgs. 6 and 7, lines 37-38 and 1-5, respectively, “The invention is composed of at least one visible light camera (including a common near infrared camera), an area array thermal infrared camera (capable of outputting the temperature information of each pixel point) and a depth sensor… The data of the sensors need to be calibrated mutually, any sensor can be used as a main sensor during calibration, and the other two sensors are calibrated and corrected by taking the coordinate systems of the sensors as references; therefore, the color image data and the temperature information can be mapped to a uniform three-dimensional space after being calibrated.”, pg. 7, lines 12-19, The system determines depth data corresponding to each pixel of an infrared image by using coordinate transformations derived from sensor calibration with a calibration plate.);
determining mapping point spatial position information corresponding to the reference pixel points in an imaging space of the infrared image according to the position information of the reference pixel points in the infrared image and the depth information; constructing a plane calibration equation corresponding to the target object based on the mapping point spatial position information (Chen, “Through calibration, external parameters and internal parameters of each sensor can be known, and the information is brought into calculation, so that relevant fusion data information taking the position of one data source as a reference can be obtained.”, pg. 7, lines 9-11, “Because the depth information is calibrated by the RGBD image sensor with the depth information, the position relation (external parameters of each sensor) of the thermal image, the visible light and the depth information in the unified space can be obtained through conversion. In this way, each pixel point collected on the thermal image and the visible light image can be mapped to the coordinates of the depth point cloud, and during data mapping, the optimal scheme is to map the depth information on the thermal image in a variable manner, so that corresponding temperature data are matched for each coordinate in the depth information”, pg. 9, lines 18-24, The initial calibration establishes a mapping between the infrared camera and depth sensor, allowing depth data to be projected into the infrared image at each pixel position with direct correspondence. This mapping is done using a plane calibration equation to convert 3D depth coordinates to the 2D infrared image plane; an illustrative expression is provided following the claim 1 mapping below.);
aligning the pixel points in the infrared image with pixel points in the visible image based on the depth information of the reference pixel points in the infrared image (Chen, “Through the fusion of the data, the effects of mutual mapping alignment and correction of the data can be achieved, the accurate alignment of the RGB images with different depths of field and the thermal image data is achieved”, pg. 8, lines 1-3, This depth data allows the system to determine aligned coordinates, which corresponds to the infrared and visible light image pixels, in a unified 3D coordinate system.); and
performing object recognition on the target object based on the aligned infrared image and visible image, to obtain an object recognition result of the target object (Chen, “The existing pure thermal imaging image recognition technology has the defects of incomplete information, small realizable service application scene, only temperature information, lack of visible light information and inconvenience for judging objects such as human faces, license plates and other information.”, pg. 2, lines 13-16, “The invention provides guarantee in the aspects of fire source identification and position positioning in moving, non-contact human body temperature detection in moving, alignment of thermal image data and image data in a larger distance interval, object temperature measurement accuracy in dynamic distance change and the like. The invention treats the thermal imaging data, the visible light data and the depth information as a complete target object, and can help to synchronously complete the fusion detection and analysis of different attributes of the target object, such as visible light, thermal imaging, depth information and the like.”, pg. 8, lines 21-27, The alignment process produces fused multi-modal image data that serves as a basis for object recognition tasks, such as human body temperature monitoring, fire source identification, and other thermal-based analyses.).
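For clarity of the record, the “plane calibration equation” limitation mapped above can be expressed in standard pinhole-camera notation; the symbols below are the examiner's own shorthand and do not appear in Chen. A reference pixel at position (u, v) in the infrared image with depth d back-projects through the infrared intrinsic matrix K_ir to the mapping point spatial position

X = d · K_ir^(-1) · [u, v, 1]^T,

and three or more non-collinear reference points X_i = (x_i, y_i, z_i) determine a plane calibration equation of the form

a·x + b·y + c·z = 1,

whose coefficients (a, b, c) can be recovered, for example, by least squares over the reference points.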
Chen does not teach obtaining depth information of other pixel points in the infrared image relative to the target object using the plane calibration equation and a position relationship between the other pixel points in the infrared image and the reference pixel points. Chen also does not teach aligning the pixel points in the infrared image with pixel points in the visible image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image.
However, Kamat teaches obtaining depth information of other pixel points in the infrared image relative to the target object using the plane calibration equation and a position relationship between the other pixel points in the infrared image and the reference pixel points (Kamat, “The depth sensing device 12 acquires the depth map image 16 of the real-world scene, and its result is written into the depth buffer 20. And the OpenGL camera projects virtual objects overtop of the real world scene, with its results being written into both the color buffer 22 and depth buffer 20. In some cases, in order to help ensure suitable alignment and occlusion, the three cameras should share similar projection parameters. Projection parameters may include principle points and focal lengths… image registration methodology may be needed in order to find correspondences between the depth map image 16 and RGB image 18. The projection parameters of the OpenGL camera may be adjustable and could therefore accommodate either the depth sensing device 12 or the RGB image-capturing device 14.”, pg. 3, paragraph 0023, “If the depth sensing device 12 (e.g., TOF camera) has a lower resolution (e.g., 200x200), interpolation may need to be carried out if a higher resolution augmented image is to be rendered. But due to the perspective projection involved, it may not be suitable to render a 200x200 augmented image and then leave it to the texture to (bi)linearly interpolate to higher resolution, as previously described. Here, in this embodiment, X_2D_TOF can be an intermediate point for interpolation between X1_2D_TOF and X2_2D_TOF, whose corresponding points are X1_2D_RGB and X2_2D_RGB on the RGB image... In this embodiment, the depth value is first interpolated for each intermediate pixel on the depth map image, then the RGB values for all original and intermediate depth map pixels are interpolated by projecting the points from the depth map image onto the RGB image using the extrinsic and intrinsic matrices of the devices 12, 14. The computation cost may be higher than that of projective texture mapping on a central processing unit (CPU), but may be negligible on the GPU using render to texture (RTT) techniques.”, pg. 5, paragraphs 0038-0039, see Eqs. (2) and (3), Image registration is performed to map each pixel from the depth map to a corresponding pixel in an RGB image. This mapping is a projection operation that uses the intrinsic and extrinsic matrices of the cameras, which mathematically define a plane calibration equation of the image sensor and the position relationship between the cameras. To generate a full depth representation for projection, interpolation is performed for the depth values of intermediate pixels.).
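By way of illustration only, and not as a reproduction of Kamat's Eqs. (2) and (3), the two-step interpolation described above can be written in the examiner's shorthand as follows. The depth of an intermediate depth-map point between X1_2D_TOF and X2_2D_TOF is first linearly interpolated,

d(X_2D_TOF) = (1 − λ) · d(X1_2D_TOF) + λ · d(X2_2D_TOF), 0 ≤ λ ≤ 1,

and the resulting 3D point X_3D is then projected onto the RGB image through the devices' matrices,

x_2D_RGB ∝ K_RGB · [R | T] · X_3D,

where K_RGB is the intrinsic matrix of the RGB image-capturing device 14 and [R | T] are the extrinsic parameters relating devices 12 and 14.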
Chen teaches establishing a plane calibration equation with respect to an infrared image to map pixel points to 3D coordinates (Chen, pg. 7, lines 9-11 and pg. 9, lines 18-24). Chen further teaches performing interpolation of pixel values (Chen, “…the depth coordinates are not in one-to-one correspondence with the coordinates on the thermal image, and corresponding data information can be obtained in a manner of interpolation, proximity and the like.”, pg. 9, lines 18-27), but does not teach interpolating missing depth information using the plane calibration equation. Kamat teaches interpolating depth information of points in an image by using a plane as a geometric constraint (see above). Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to have modified the plane calibration equation of Chen to include interpolation of depth information as taught by Kamat (Kamat, pg. 5, paragraphs 0038-0039, see Eqs. (2) and (3)). The motivation for doing so would have been to account for missing depth correspondence values, thereby improving the accuracy of the alignment and fusion of the image data. Further, one skilled in the art could have combined the elements as described above by known methods with no change in their respective functions, and the combination would have yielded nothing more than predictable results. Therefore, it would have been obvious to combine the teachings of Chen with Kamat to obtain the invention as specified in claim 1.
The combination of Chen in view of Kamat would include using the plane calibration equation established for the infrared image by Chen to perform interpolation of depth values for pixels on that plane. This interpolation is based on a position relationship between reference pixels with known depth values and pixels that lack depth values, in order to fill in missing data. As a result, a full representation of depth values would be used for the subsequent coordinate alignment (Chen, pg. 8, lines 1-3 and pg. 8, lines 21-27). Therefore, Chen in view of Kamat teaches obtaining depth information of other pixel points in the infrared image relative to the target object using the plane calibration equation and a position relationship between the other pixel points in the infrared image and the reference pixel points and aligning the pixel points in the infrared image with pixel points in the visible image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image.
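The following minimal sketch illustrates the combined operation as the examiner understands it. The language (Python), function names, intrinsic values, and reference pixels are the examiner's own assumptions for purposes of illustration only and are not code from either reference: reference pixels with measured depth are back-projected and fit to a plane calibration equation, and the depth of any other infrared pixel is then recovered from its position relationship to that plane.

    import numpy as np

    def backproject(K, u, v, d):
        # Back-project pixel (u, v) with depth d through intrinsics K to a 3D point.
        return d * (np.linalg.inv(K) @ np.array([u, v, 1.0]))

    def fit_plane(points):
        # Least-squares fit of a plane a*x + b*y + c*z = 1 to three or more
        # non-collinear 3D points (the "plane calibration equation").
        A = np.asarray(points)
        coeffs, *_ = np.linalg.lstsq(A, np.ones(len(A)), rcond=None)
        return coeffs  # (a, b, c)

    def depth_from_plane(K, coeffs, u, v):
        # Depth of any other pixel (u, v): intersect its viewing ray with the
        # plane, i.e., solve (a, b, c) . (d * K^-1 [u, v, 1]) = 1 for d.
        ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
        return 1.0 / float(coeffs @ ray)

    # Hypothetical intrinsics and reference pixels (u, v, measured depth in meters).
    K_ir = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
    refs = [(300, 200, 2.00), (340, 260, 2.10), (310, 300, 2.05)]
    plane = fit_plane([backproject(K_ir, u, v, d) for (u, v, d) in refs])
    d_other = depth_from_plane(K_ir, plane, 325, 250)  # depth for a pixel lacking data

The full depth representation so obtained would then feed Chen's unified three-dimensional coordinate alignment of the infrared and visible pixels.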
Regarding claim 2, Chen in view of Kamat teaches the method according to claim 1, wherein the aligning the pixel points in the infrared image with pixel points in the visible image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image comprises:
obtaining, within the visible image, mapping point position information corresponding to the pixel points in the infrared image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image; and aligning the pixel points in the infrared image with pixel points in the visible image according to the mapping point position information (Chen, “As shown in fig. 1, visible light, thermal imaging and depth information are fused, and the three are fused into unified data through a certain combination and calibration relationship, so that corresponding and relatively accurate color data, temperature data and spatial position data are obtained when the data are processed. Through the fusion of the data, the effects of mutual mapping alignment and correction of the data can be achieved, the accurate alignment of the RGB images with different depths of field and the thermal image data is achieved, and the problem of alignment dislocation at different distances after the traditional double-light fusion is solved, particularly the problem of large short-distance dislocation is solved.”, pgs. 7 and 8, lines 38-41 and 1-5, Depth data is determined for both the infrared and color images in order to map the data of each source into a unified 3D coordinate system for image alignment. The combination of Chen in view of Kamat would consider all depth values generated from interpolation in the alignment of the infrared and color images.).
Regarding claim 3, Chen in view of Kamat teaches the method according to claim 2, wherein the obtaining, within the visible image, mapping point position information corresponding to the pixel points in the infrared image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image comprises:
mapping the pixel points in the infrared image to an imaging space of the visible image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image, to obtain target mapping point spatial position information corresponding to the pixel points in the infrared image in the imaging space; and performing position transformation on the target mapping point spatial position information according to a position transformation relationship between a pixel plane of the visible image and the imaging space, to obtain the mapping point position information corresponding to the pixel points in the infrared image in the visible image (Chen, “the position relation of the visible light camera corresponding to the information collected by the depth sensor is obtained, and the position relation of the visible light characteristic image, the thermal image and the depth information in the same space is obtained; and mapping the pixel points acquired on the thermal image and the visible light image to the coordinates of the depth information through a perspective relation.”, pg. 3, lines 32-36, “Through calibration, external parameters and internal parameters of each sensor can be known, and the information is brought into calculation, so that relevant fusion data information taking the spatial position of one data source as a reference can be obtained.”, pg. 7, lines 35-37, The depth data is used to establish a mapping of the infrared image and color image pixels to a unified 3D coordinate system. This coordinate system can be defined relative to any of the three data sources (infrared image, color image, or depth sensor), including the imaging space of the color camera. Image fusion would then be performed by transforming infrared pixels into a position in the 3D coordinate system, and when the system is set relative to the color image plane, the infrared pixels are effectively aligned with the visible image.).
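Expressed illustratively in the examiner's notation (not Chen's), the second step corresponds to a standard perspective projection: a target mapping point with spatial position X_vis in the imaging space of the visible image is carried to the pixel plane by the visible camera's intrinsic matrix K_vis,

[u_vis, v_vis, 1]^T ∝ K_vis · X_vis,

which is one concrete form of the recited position transformation relationship between the pixel plane of the visible image and the imaging space.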
Regarding claim 4, Chen in view of Kamat teaches the method according to claim 3, wherein the mapping the pixel points in the infrared image to an imaging space of the visible image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image, to obtain target mapping point spatial position information corresponding to the pixel points in the infrared image in the imaging space comprises:
obtaining mapping point spatial position information corresponding to the pixel points in the infrared image in an imaging space of the infrared image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image; and performing position transformation on the mapping point spatial position information according to a position transformation relationship between the imaging space of the infrared image and the imaging space of the visible image, to obtain the target mapping point spatial position information corresponding to the pixel points in the infrared image in the imaging space of the visible image (Chen, “The invention is composed of at least one visible light camera (including a common near infrared camera), an area array thermal infrared camera (capable of outputting the temperature information of each pixel point) and a depth sensor… The data of the sensors need to be calibrated mutually, any sensor can be used as a main sensor during calibration, and the other two sensors are calibrated and corrected by taking the coordinate systems of the sensors as references; therefore, the color image data and the temperature information can be mapped to a uniform three-dimensional space after being calibrated.”, pg. 7, lines 12-19, During the calibration depth data is used to obtain mapping information for each of the infrared image and color image in their respective imaging space to coordinates of the 3D coordinate system. In a case the system is set relative to the color image plane, the infrared pixels are effectively transformed and aligned with the coordinate system defined by an imaging space of the visible image.).
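Illustratively, again in the examiner's own notation, the two-stage mapping of claim 4 can be written

X_ir = d · K_ir^(-1) · [u, v, 1]^T and X_vis = R · X_ir + T,

where X_ir is the mapping point spatial position in the imaging space of the infrared image, and R and T are the extrinsic rotation and translation relating the infrared and visible imaging spaces obtained from Chen's mutual calibration (Chen, pg. 7, lines 12-19).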
Regarding claim 7, Chen in view of Kamat teaches the method according to claim 1, wherein the reference pixel points comprise at least three pixel points that are not on the same line (Chen, “In this way, each pixel point collected on the thermal image and the visible light image can be mapped to the coordinates of the depth point cloud, and during data mapping, the optimal scheme is to map the depth information on the thermal image in a variable manner… ”, pg. 9, lines 21-23, Depth data is mapped to coordinates across the infrared image. Because a plane in three-dimensional space is determined by at least three non-collinear points, this mapping encompasses multiple reference pixels that are not on the same line.).
Regarding claim 8, Chen in view of Kamat teaches the method according to claim 1, wherein the aligning the pixel points in the infrared image with pixel points in the visible image based on the depth information of the reference pixel points and the depth information of the other pixel points in the infrared image comprises:
maintaining coordinates of the pixel points in the visible image unchanged in a pixel coordinate system, and moving a position of the infrared image in the pixel coordinate system in at least one movement mode of rotation and translation, wherein coordinates of pixel points of the moved infrared image are the same as those of the pixel points representing the same target object in the visible image (Chen, “Through calibration, external parameters and internal parameters of each sensor can be known, and the information is brought into calculation, so that relevant fusion data information taking the position of one data source as a reference can be obtained.”, pg. 7, lines 9-11, In the case where the system is calibrated relative to the color camera, the 3D coordinates obtained from the color image define the reference frame. Infrared image pixels are then mapped into the 3D coordinate system via the appropriate rotation and translation to align corresponding coordinates with the color image coordinates, while the color image pixels require no additional transformation.).
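Expressed illustratively (examiner's notation), with the visible-image pixel coordinates p_vis held fixed, the infrared image is moved by a rigid transform in the pixel coordinate system,

p_ir' = R · p_ir + t,

where R is a rotation and t a translation chosen so that p_ir' coincides with p_vis for pixel points representing the same target object.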
Claim 9 corresponds to claim 1, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (Chen, “According to the invention, thermal imaging data, visible light data and depth information are taken as a complete target object to be processed, so that fusion detection and analysis of different attributes of the target object… accurate color data, temperature data and distance or/and direction data can be obtained on each pixel point coordinate when a frame of image is processed, and the data can be mapped, aligned and corrected mutually.”, pg. 6, lines 28-36, A processor and memory are required to execute the functions of storing the acquired images and processing the images for mapping and alignment.) to perform the method according to claim 1. As indicated in the analysis of claim 1, Chen in view of Kamat teaches all the limitations of claim 1. Therefore, claim 9 is rejected for the same reasons as claim 1.
Claim 10 corresponds to claim 2, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (see analysis of claim 9) to perform the method according to claim 2. As indicated in the analysis of claim 2, Chen in view of Kamat teaches all the limitations of claim 2. Therefore, claim 10 is rejected for the same reasons as claim 2.
Claim 11 corresponds to claim 3, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (see analysis of claim 9) to perform the method according to claim 3. As indicated in the analysis of claim 3, Chen in view of Kamat teaches all the limitations of claim 3. Therefore, claim 11 is rejected for the same reasons as claim 3.
Claim 12 corresponds to claim 4, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (see analysis of claim 9) to perform the method according to claim 4. As indicated in the analysis of claim 4, Chen in view of Kamat teaches all the limitations of claim 4. Therefore, claim 12 is rejected for the same reasons as claim 4.
Claim 13 corresponds to claim 5, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (see analysis of claim 9) to perform the method according to claim 5. As indicated in the analysis of claim 5, Chen in view of Kamat teaches all the limitations of claim 5. Therefore, claim 13 is rejected for the same reasons as claim 5.
Claim 15 corresponds to claim 7, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (see analysis of claim 9) to perform the method according to claim 7. As indicated in the analysis of claim 7, Chen in view of Kamat teaches all the limitations of claim 7. Therefore, claim 15 is rejected for the same reasons as claim 7.
Claim 16 corresponds to claim 8, additionally reciting an electronic device, comprising a memory and a processor. Chen in view of Kamat teaches an electronic device, comprising a memory and a processor (see analysis of claim 9) to perform the method according to claim 8. As indicated in the analysis of claim 8, Chen in view of Kamat teaches all the limitations of claim 8. Therefore, claim 16 is rejected for the same reasons as claim 8.
Claim 17 corresponds to claim 1, additionally reciting a non-transitory computer-readable storage medium. Chen in view of Kamat teaches a non-transitory computer-readable storage medium (Chen, “According to the invention, thermal imaging data, visible light data and depth information are taken as a complete target object to be processed, so that fusion detection and analysis of different attributes of the target object… accurate color data, temperature data and distance or/and direction data can be obtained on each pixel point coordinate when a frame of image is processed, and the data can be mapped, aligned and corrected mutually.”, pg. 6, lines 28-36, A non-transitory computer-readable storage medium is required to store the acquired images for processing.) to perform the method according to claim 1. As indicated in the analysis of claim 1, Chen in view of Kamat teaches all the limitations of claim 1. Therefore, claim 17 is rejected for the same reasons as claim 1.
Claim 18 corresponds to claim 2, additionally reciting a non-transitory computer-readable storage medium. Chen in view of Kamat teaches a non-transitory computer-readable storage medium (see analysis of claim 17) to perform the method according to claim 2. As indicated in the analysis of claim 2, Chen in view of Kamat teaches all the limitations of claim 2. Therefore, claim 18 is rejected for the same reasons as claim 2.
Claim 19 corresponds to claim 7, additionally reciting a non-transitory computer-readable storage medium. Chen in view of Kamat teaches a non-transitory computer-readable storage medium (see analysis of claim 17) to perform the method according to claim 7. As indicated in the analysis of claim 7, Chen in view of Kamat teaches all the limitations of claim 7. Therefore, claim 19 is rejected for the same reasons as claim 7.
Claim 20 corresponds to claim 8, additionally reciting a non-transitory computer-readable storage medium. Chen in view of Kamat teaches a non-transitory computer-readable storage medium (see analysis of claim 17) to perform the method according to claim 8. As indicated in the analysis of claim 8, Chen in view of Kamat teaches all the limitations of claim 8. Therefore, claim 20 is rejected for the same reasons as claim 8.
Allowable Subject Matter
Claims 6 and 14 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CONNOR LEVI HANSEN whose telephone number is (703)756-5533. The examiner can normally be reached Monday-Friday 9:00-5:00 (ET).
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz can be reached at (571) 272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/CONNOR L HANSEN/Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672