DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The following is a quotation of 37 CFR 1.84(p)(4)-(5):
(4) The same part of an invention appearing in more than one view of the drawing must always be designated by the same reference character, and the same reference character must never be used to designate different parts.
(5) Reference characters not mentioned in the description shall not appear in the drawings. Reference characters mentioned in the description must appear in the drawings.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(4) because FIG. 8 uses the reference character “CA1” to designate two different parts, “CA1” and “CA2”, as described in paragraph 116 of the specification.
The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because FIG. 10 uses the reference character “S3”, which is not found in the specification.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) to replace one of the instances of “CA1” in FIG. 8 with “CA2” and an amendment to the specification in compliance with 37 CFR 1.121(b) to add “S3” to the description of FIG. 10 are required in reply to the Office action to avoid abandonment of the application.
Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.
Specification
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
Claims 6 and 7 are rejected under 35 U.S.C. 112(b) as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor regards as the invention.
Claim 6 recites, in part, “the third two-dimensional data” (emphasis added). However, there is no other “third two-dimensional data” recited in the claims. Therefore, claim 6 is indefinite because “the third two-dimensional data” lacks antecedent basis. For purposes of applying prior art, “the third two-dimensional data” is interpreted to be “the three-dimensional model”.
Claim 7 recites, in part, “the fourth two-dimensional data” (emphasis added). However, there is no other “fourth two-dimensional data” recited in the claims. Therefore, claim 7 is indefinite because “the fourth two-dimensional data” lacks antecedent basis. For purposes of applying prior art, “the fourth two-dimensional data” is interpreted to be “the three-dimensional model”.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-9 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Hodan et al., “Detection and Fine 3D Pose Estimation of Texture-less Objects in RGB-D Images” (hereinafter “Hodan”).
Regarding claim 1, Hodan teaches an information processing apparatus comprising at least one processor (section IV.B, “Computation and storage ... is delegated to the GPU.”), the at least one processor performing:
a depth information acquiring process of acquiring depth information (Acquiring an input depth image) obtained via a depth sensor (Hodan, section I, “Kinect. Such sensors provide aligned color and depth images that concurrently capture both the appearance and geometry of a scene.”) having a sensing range within which a target object lies (In a depth map, each pixel location corresponding to the object is assigned a depth value. The entire collection of depth values establishes a sensing range of depth values, i.e., object distances. More generally, any part of an object that is detected by a depth sensor is within the sensing range of the sensor.);
a captured image acquiring process of acquiring a captured image (At runtime, the Kinect sensor provides RGB images and corresponding depth images. See section I, “The input to the method consists of RGB-D images provided by a consumer grade depth sensor such as Kinect. Such sensors provide aligned color and depth images that concurrently capture both the appearance and geometry of a scene.”) obtained via an imaging sensor (Hodan, section I, “Kinect”) having an angle of view within which the target object lies (If a captured image includes an object, then the object is within the camera’s field of view.);
a generating process of generating (Generating pose hypotheses and verifying them. See Fig. 2, “Hypothesis Generation” and “Verification”), with reference to first two-dimensional data (section IV.A, “To suppress sensor noise, the acquired depth image is median filtered with a 5×5 kernel and the result is retained as image D. ... Binary image E is computed from D, by thresholding the output of the Sobel operator applied to D.” The binary image E is a 2D edge map.) and a three-dimensional model regarding the target object (section IV, “Objects are represented with arbitrary 3D mesh models, which can originate from CAD drawings or from digital scans.”), one or more candidate solutions (section IV.B, “candidate poses”) regarding at least one selected from the group consisting of a position and an attitude of the target object (position + orientation (attitude) = pose) in a three-dimensional space (section IV.B, “A rendering process simulates depth images of the target object at a hypothesized pose against a blank background. Pose rendering is formulated as follows. Transform {R0, t0} brings the model in an approximate location and orientation, in the depth sensor’s reference frame. Candidate poses are parametrized relative to this initial pose”), the first two-dimensional data (The edge image E. See section IV.A) being obtained through a first feature extracting process in which the depth information is referred to (section IV.A, “To suppress sensor noise, the acquired depth image is median filtered with a 5×5 kernel and the result is retained as image D. ... Binary image E is computed from D, by thresholding the output of the Sobel operator applied to D.”); and
a calculating process of calculating, with reference to second two-dimensional data (The RGB color image contributes 2D data in the verification stage. See sections IV.C-D.) and the three-dimensional model and with use of the one or more candidate solutions (section IV.C, “A candidate pose is evaluated with respect to the extent to which it explains the input depth image.”), at least one selected from the group consisting of the position and the attitude (pose) of the target object in the three-dimensional space, the second two-dimensional data being obtained through a second feature extracting process in which the captured image is referred to (During training, feature points of templates are extracted based on intensity gradients. See section III.C; see also Fig. 4. These locations inform the system during test time of where to sample the captured RGB frame.).
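For illustration only (not part of the rejection), the first feature extracting process mapped above (Hodan, section IV.A: 5×5 median filtering of the depth image, then thresholding a Sobel gradient magnitude to obtain binary edge image E) can be sketched as follows. This is a reconstruction under stated assumptions; the kernel size is from Hodan, but the threshold value and all function names are hypothetical.

```python
# Illustrative sketch of Hodan section IV.A's first feature extraction:
# median-filter the raw depth image, then threshold a Sobel gradient
# magnitude to obtain a binary edge image E. The threshold value and
# function names are assumptions for illustration only.

def median_filter(depth, k=5):
    """Apply a k x k median filter to a 2D depth image (list of lists)."""
    h, w, r = len(depth), len(depth[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clip the window at the image border and take the median.
            window = [depth[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out

def sobel_edges(d, threshold=10.0):
    """Binary edge image E: 1 where the Sobel gradient magnitude exceeds threshold."""
    h, w = len(d), len(d[0])
    e = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Standard 3x3 Sobel kernels for horizontal and vertical gradients.
            gx = (d[y-1][x+1] + 2*d[y][x+1] + d[y+1][x+1]
                  - d[y-1][x-1] - 2*d[y][x-1] - d[y+1][x-1])
            gy = (d[y+1][x-1] + 2*d[y+1][x] + d[y+1][x+1]
                  - d[y-1][x-1] - 2*d[y-1][x] - d[y-1][x+1])
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                e[y][x] = 1
    return e
```

On a depth image containing a vertical step (e.g., a foreground object against a farther background), the resulting E marks the depth discontinuity while flat regions stay zero.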
Regarding claim 2, Hodan teaches the information processing apparatus according to claim 1, wherein the first feature extracting process (section IV.A, “To suppress sensor noise, the acquired depth image is median filtered with a 5×5 kernel and the result is retained as image D. ... Binary image E is computed from D, by thresholding the output of the Sobel operator applied to D.”) and the second feature extracting process (During training, feature points of templates are extracted based on intensity gradients. See section III.C; see also Fig. 4. These locations inform the system during test time of where to sample the captured RGB frame. Under the broadest reasonable interpretation, calculating the intensity gradient is an edge extracting process because during training, feature points are chosen at “locations with large gradient magnitude (i.e., typically on the object contour).” Section III.C. A large gradient magnitude defines an edge. At test time, gradient orientations at those points are sampled in the captured RGB image and compared to the template. See section IV.C.) include an edge extracting process, and the three-dimensional model contains data on an edge of the target object (During pose initialization at test time, as explained in section IV.A, the 3D model is projected onto the 2D image sensor plane. That projection necessarily includes the contour edges of the object. When pose hypotheses are evaluated, as explained in section IV.C, using the objective function, “[depth] values are directly compared between D and Si for pairs of pixels.” The synthetic depth image Si is produced when the 3D model is rendered from the camera sensor’s viewpoint. The objective function compares the filtered depth image from the sensor with Si to determine if the rendered view explains the observed depth image, which includes computing edge maps as disclosed in section IV.C.).
Regarding claim 3, Hodan teaches the information processing apparatus according to claim 1, wherein
in the depth information acquiring process (acquiring an input depth image), the at least one processor acquires depth information in the sensing range (In a depth map, pixel locations corresponding to the object are each associated with a depth value. The collection of depth values of the entire input depth image establishes a sensing range of depth values.) even in a case where the target object is not present within the sensing range (The Kinect sensor will obtain a depth map within its sensitivity or operational range of measurable depth values regardless of whether the target object is present or not.), and
the first feature extracting process is (section IV.A, “To suppress sensor noise, the acquired depth image is median filtered with a 5×5 kernel and the result is retained as image D. ... Binary image E is computed from D, by thresholding the output of the Sobel operator applied to D.”)
a feature extracting process in which subtraction information is referred to, the subtraction information being a difference between the depth information for the target object lying within the sensing range and the depth information for the target object not present within the sensing range (section IV.C, “Edge differences are aggregated in an edge cost using E and Ei.” The edge cost (the subtraction information between the observed and synthetic edge images) is an aggregation of differences between observed depth edges E of the target object within the sensing range and synthetic edges Ei of the hypothesized target within that range. The hypothesized target object is not the actual target object in the input image data. Thus, the synthetic edge information is depth information for the target object not present in the sensing range, because that target object is not in the synthesized image data; the hypothesized target object is.).
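For illustration only, the edge-difference aggregation relied on above (Hodan, section IV.C: “Edge differences are aggregated in an edge cost using E and Ei.”) can be sketched as a comparison of two binary edge images. The particular cost function shown (fraction of disagreeing pixels) and the function name are assumptions; Hodan’s actual edge cost is defined differently.

```python
# Illustrative sketch of aggregating edge differences between an observed
# binary edge image E and a synthetic edge image Ei rendered from a pose
# hypothesis (cf. Hodan section IV.C). The cost shown here (fraction of
# mismatched pixels) is an assumption for illustration only.

def edge_cost(e_observed, e_synthetic):
    """Fraction of pixels where the two binary edge images disagree."""
    mismatches = total = 0
    for row_obs, row_syn in zip(e_observed, e_synthetic):
        for a, b in zip(row_obs, row_syn):
            total += 1
            if a != b:
                mismatches += 1
    return mismatches / total if total else 0.0
```

A cost of 0.0 indicates the hypothesized pose’s rendered edges exactly coincide with the observed edges; larger values indicate poorer hypotheses.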
Regarding claim 4, Hodan teaches the information processing apparatus according to claim 3, wherein
the first feature extracting process (section IV.A, “To suppress sensor noise, the acquired depth image is median filtered with a 5×5 kernel and the result is retained as image D. ... Binary image E is computed from D, by thresholding the output of the Sobel operator applied to D.”) is a feature extracting process in which binarized subtraction information (section IV.A, “E” and “Ei”) is referred to, the binarized subtraction information (section IV.A, “E” and “Ei”) being obtained by applying a binarizing process to the subtraction information (E is created from D. See section IV.A; see also section IV.C, “Edge differences are aggregated in an edge cost using E and Ei.”).
Regarding claim 5, Hodan teaches the information processing apparatus according to claim 1, wherein
in the calculating process, the at least one processor applies, to the captured image or the second two-dimensional data (The RGB color image contributes 2D data in the verification stage. See sections IV.C-D.), a data deleting process of deleting data (i.e., removing from the process) which indicates being at a predetermined distance or longer away (solutions outside the bounding box b are removed/deleted from consideration) from positions indicated by the one or more candidate solutions (section IV.C, “As no segmentation is employed, inaccurate pose hypotheses might cause the rendered object to be compared against pixels imaging background or occluding surfaces. To counter this, only pixels located within b are considered. Hypotheses that correspond to renderings partially outside b obtain a poor similarity score and the solution does not drift towards an irrelevant surface.”), and
calculates at least one selected from the group consisting of the position and the attitude (pose) of the target object in the three-dimensional space, with reference to the captured image or the second two-dimensional data having undergone the data deleting process (The method continues as shown in Fig. 2).
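For illustration only, the “data deleting” reading of Hodan section IV.C mapped above (only pixels located within bounding box b are considered; everything else is removed from scoring) can be sketched as a simple filter. The bounding-box representation and the function name are assumptions for illustration.

```python
# Illustrative sketch of the data deleting process mapped to Hodan
# section IV.C: keep only pixel samples inside a 2D bounding box b around
# the candidate pose; all other samples are removed from consideration.
# The box representation (x0, y0, x1, y1, inclusive) is an assumption.

def keep_pixels_in_box(pixels, box):
    """Return only the (x, y, value) samples lying inside box = (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return [(x, y, v) for (x, y, v) in pixels if x0 <= x <= x1 and y0 <= y <= y1]
```

Samples imaging background or occluding surfaces outside b are thereby deleted before the similarity score is computed.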
Regarding claim 6, Hodan teaches the information processing apparatus according to claim 1, wherein
in the generating process (Generating pose hypotheses and verifying them. See Fig. 2, “Hypothesis Generation” and “Verification”; see sections III.B-C.), the at least one processor
generates the one or more candidate solutions through a template matching process (section III.C, “The verification stage corresponds to the traditional template matching.”) in which [the three-dimensional model - per the rejection under 35 U.S.C. 112(b) above] and the first two-dimensional data are referred to (section IV.C, “A candidate pose is evaluated with respect to the extent to which it explains the input depth image.”).
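For illustration only, the template matching process mapped above (Hodan, section III.C) can be sketched as a sliding-window comparison of a binary template against the edge image, with high-scoring offsets returned as candidate solutions. The similarity measure (fraction of agreeing pixels) and the threshold are assumptions for illustration.

```python
# Illustrative sketch of template matching over a binary edge image to
# generate candidate solutions (cf. Hodan section III.C). The similarity
# measure and threshold are assumptions for illustration only.

def match_template(edge_image, template, threshold=0.9):
    """Return (x, y) offsets where the template matches the edge image."""
    ih, iw = len(edge_image), len(edge_image[0])
    th, tw = len(template), len(template[0])
    candidates = []
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            # Count template pixels that agree with the edge image patch.
            hits = sum(1 for j in range(th) for i in range(tw)
                       if edge_image[y + j][x + i] == template[j][i])
            if hits / (th * tw) >= threshold:
                candidates.append((x, y))
    return candidates
```

Each returned offset corresponds to a candidate location of the target object in the 2D data, from which candidate poses can be parametrized.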
Regarding claim 7, Hodan teaches the information processing apparatus according to claim 1, wherein
in the calculating process (The verification stage refines the list of candidate poses. See section III.C), the at least one processor calculates at least one selected from the group consisting of the position and the attitude (pose) of the target object in the three-dimensional space through a template matching process (section III.C, “The verification stage corresponds to the traditional template matching.”) in which [the three-dimensional model - per the rejection under 35 U.S.C. 112(b) above] and the second two-dimensional data (The RGB color image contributes 2D data in the verification stage. See sections IV.C-D.) are referred to.
Claim 8 substantially corresponds to claim 1 by reciting an information processing apparatus comprising at least one processor (section IV.B, “Computation and storage ... is delegated to the GPU.”), the at least one processor performing the same functions as the processor of claim 1. Claim 8 differs from claim 1 by additionally reciting a calculating process of calculating at least one selected from the group consisting of a position and an attitude (Fig. 2, “Fine 3D Pose Estimation”; see section IV.D) of the target object in the three-dimensional space, with reference to a result of the first matching process (section IV.A, “To suppress sensor noise, the acquired depth image is median filtered with a 5×5 kernel and the result is retained as image D. ... Binary image E is computed from D, by thresholding the output of the Sobel operator applied to D.” The binary image E is a 2D edge map.) and a result of the second matching process (The RGB color image contributes 2D data in the verification stage. See sections IV.C-D.). Claim 8 is anticipated by Hodan for the same reasons as claim 1.
Claim 9 substantially corresponds to claim 1 by reciting an information processing method comprising steps that correspond to the functions of the processor of claim 1 and is anticipated by Hodan for the same reasons.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN P POTTS whose telephone number is (571)272-6351. The examiner can normally be reached M-F, 9am-5pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sumati Lefkowitz can be reached at 571-272-3638. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RYAN P POTTS/Examiner, Art Unit 2672
/SUMATI LEFKOWITZ/Supervisory Patent Examiner, Art Unit 2672