Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant’s arguments and claim amendments with respect to amended claims 1, 12, and 17, see pages 1-3, filed 02/18/2026, have been fully considered, but they are not persuasive. The rejection under 35 U.S.C. 103 mailed 08/18/2025 has NOT been withdrawn.
Regarding claims 1, 12, and 17, the examiner respectfully disagrees with the arguments and amendments. In particular, applicant asserts that the current claim language of “identifying, by the processor, voxels containing a number of data points less than a threshold value” identifies voxels based on a “population count” of data points. However, the examiner believes that the current claim language does not accurately capture “population count.” As written, “voxels containing a number of data points less than a threshold value” can be read as “voxels with fewer than a threshold number of data points,” “voxels with data points whose values are less than a threshold value,” or even “voxels whose values are less than a threshold value.” In other words, the threshold can be applied either to the voxels or to the data points. As such, the original voxel values of Yao could be used for identifying “voxels with data points whose values are less than a threshold value.”
In addition, under the broadest reasonable interpretation (BRI), “threshold value,” and more specifically “threshold,” encompasses a percentage. As such, the examiner believes that Yao’s use of thresholding is appropriate. If applicant intends the identification to be based on the “population count” of each voxel, as described in the specification and the remarks, further clarification of such language in the claims is needed.
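For clarity of explanation only, the following Python sketch illustrates the two readings discussed above: flagging voxels by the count of data points they contain (“population count”) versus flagging voxels whose stored values fall below a threshold, as in Yao’s value-based thresholding. The sketch is not taken from the claims or from any cited reference, and all identifiers, sizes, and threshold values are illustrative assumptions.

```python
import numpy as np

# Illustrative merged point cloud: an N x 3 array of (x, y, z) coordinates.
points = np.random.rand(1000, 3) * 10.0
voxel_size = 1.0

# Assign each data point to a voxel by quantizing its coordinates.
voxel_ids = np.floor(points / voxel_size).astype(int)

# Reading 1 ("population count"): count the data points in each voxel and
# flag voxels whose point count is below the threshold.
unique_voxels, counts = np.unique(voxel_ids, axis=0, return_counts=True)
count_threshold = 5
sparse_voxels = unique_voxels[counts < count_threshold]

# Reading 2 (value-based, closer to Yao's thresholding): each voxel stores a
# value (e.g., an occupancy or feature value); flag voxels whose value is
# below the threshold, regardless of how many points the voxel contains.
voxel_values = {tuple(v): float(np.random.rand()) for v in unique_voxels}  # illustrative values
value_threshold = 0.4
low_value_voxels = [v for v, val in voxel_values.items() if val < value_threshold]
```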
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 7-14, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Matei et al. (US 2018/0205963 A1, hereinafter Matei) in view of Döring et al. (US 2021/0374978 A1, hereinafter Döring) and Yao et al. (US 2022/0147791 A1, hereinafter Yao).
Regarding claim 1, Matei discloses
A method for performing three dimensional imaging, the method comprising: capturing, by an imaging system (Fig. 17 Left RGB Camera 31, Right RGB Camera 33), a first image of a target in a first field of view of the imaging system (Para [0063]: “A Holocam Orb in accord with the present invention is not limited to the above-listed examples of depth image capture devices/units/systems, and may use any one, or any combination, of different depth image capture devices/units/systems (i.e. 3D imaging/sensing devices/units/systems). For example, a Holocam Orb may include stereo RGB camera system (or other intensity image capture device) and a structured light 3D capture device. Alternatively, a Holocam Orb may include a stereo RGB camera system and a time-of-flight capturing unit”, Para [0071]: “Correspondence matching tries to determine which parts of a first image correspond to (i.e. are matched to) what parts of a second image, assuming that the second image was taken after the camera that took the first image had moved, time had elapsed, and/or the captured subjects/objects had moved. For example, the first image may be of a real-world scene taken from a first view angle, defining a first field-of-view (i.e. FOV), and the second image may be of the same real-world scene taken from a second view angle defining a second FOV”);
capturing, by the imaging system, a second image of the target in a second field of view of the imaging system, the second field of view being different than the first field of view (Para [0063], Para [0071] See rejection as applied in previous limitations);
generating, by a processor (Fig. 13 Local CPU 63), a first point cloud, corresponding to the target, from the first image (Para [0065]: “Consequently, for ease of implementation a preferred embodiment of the present invention makes use of a first point cloud generated from a stereo pair of 2D images, and further correlates points in this first point cloud to points in a second point cloud created with a time-of-flight capturing unit (or module)”);
generating, by the processor, a second point cloud, corresponding to the target, from the second image (Para [0065] See rejection as applied to previous limitations);
identifying, by the processor, one or more noisy data points in the merged point cloud (Para [0197]: “Problems arise when faulty depth data (non-visual data) is associated with visual data, or vice versa. That is, when erroneous depth data is assigned visual texture, the previously invisible depth data errors become visual errors (or image artifacts) that detract from the final 3D (model) immersion construct. The present invention identifies regions of bad depth data that result from (LED) shadow, as well as incorrectly assigned textured data caused by parallax between depth and RGB cameras”); and
removing, by the processor, at least one of the one or more noisy data points from the merged point cloud and generating an aggregated point cloud from the merged point cloud (Para [0197]: “Problems arise when faulty depth data (non-visual data) is associated with visual data, or vice versa. That is, when erroneous depth data is assigned visual texture, the previously invisible depth data errors become visual errors (or image artifacts) that detract from the final 3D (model) immersion construct. The present invention identifies regions of bad depth data that result from (LED) shadow, as well as incorrectly assigned textured data caused by parallax between depth and RGB cameras… The present embodiment removes these bad regions to avoid their incorporation into the final 3D immersion construct.”).
However, Matei does not explicitly disclose
identifying, by the processor, a position and orientation of a reference feature of the target in the first image; identifying, by the processor, a position and orientation of the reference feature in the second image; performing, by the processor, point cloud stitching to combine the first point cloud and the second point cloud to form a merged point cloud, the point cloud stitching performed according to the orientation and position of the reference feature in each of the first point cloud and second point cloud; wherein identifying one or more noisy data points comprises: determining, by the processor, voxels in the merged point cloud; determining, by the processor, a number of data points of the merged point cloud in each voxel; identifying, by the processor, voxels containing a number of data points less than a threshold value; and identifying, by the processor, the noisy data points as data points in voxels containing equal to or less than the threshold value of data points.
Döring teaches
identifying, by the processor, a position and orientation of a reference feature of the target in the first image (Para [0038]: “Anchor object is an object in a captured point cloud, where the anchor object is detectable from a range of viewing directions, distances, and orientations. An anchor object is stored in the data captured by the scanner 120 by storing one or more positions, orientations, degrees of freedom, and other measurements of the scanner 120 when capturing the anchor object”);
identifying, by the processor, a position and orientation of the reference feature in the second image (Para [0038]: “Different types of anchor objects provide different constraints in different degrees of freedom when matching two or more instances of the anchor objects in different point clouds”);
performing, by the processor, point cloud stitching to combine the first point cloud and the second point cloud to form a merged point cloud, the point cloud stitching performed according to the orientation and position of the reference feature in each of the first point cloud and second point cloud (Para [0086]: “Further, if the same anchor object is detected in multiple scans, the anchor object is used as a constraint for matching of the multiple scans, for example during stitching the scans”); and
wherein identifying one or more noisy data points comprises: determining, by the processor, voxels in the merged point cloud (Para [0088]: “Point cloud 130 can be a 2D or 3D representation of the environment seen through the different sensors. Point cloud 130 can be represented internally as a grid map. A grid map is a 2D or 3D arranged collection of cells (2D) or voxels (3D), representing an area of the environment. In one or more embodiments of the present disclosure, the grid map stores for every cell/voxel, a probability indicating if the cell area is occupied or not”).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Matei to include identifying the position and orientation of the same objects in multiple images and stitching the images based on those objects, together with the point cloud detection and other aspects of Döring, in order to improve the flexibility and efficiency of processing scan data.
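As a non-limiting illustration of the stitching rationale above, the following Python sketch shows how a rigid transform derived from the position and orientation of the same reference feature (anchor object) in two point clouds can bring the second cloud into the coordinate frame of the first before merging. The sketch is not code from Matei or Döring; the poses, helper function, and point data are assumptions made only for explanation.

```python
import numpy as np

def pose_to_matrix(position, rotation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector position."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = position
    return T

# Assumed pose of the same reference feature as observed in each point cloud.
feature_pose_in_cloud1 = pose_to_matrix(np.array([1.0, 0.0, 0.0]), np.eye(3))
feature_pose_in_cloud2 = pose_to_matrix(np.array([0.0, 2.0, 0.0]), np.eye(3))

# Transform mapping cloud-2 coordinates into cloud-1 coordinates so that the
# reference feature coincides in both clouds.
T_2_to_1 = feature_pose_in_cloud1 @ np.linalg.inv(feature_pose_in_cloud2)

cloud1 = np.random.rand(500, 3)
cloud2 = np.random.rand(500, 3)

# Apply the transform to cloud 2 (homogeneous coordinates) and merge the clouds.
cloud2_h = np.hstack([cloud2, np.ones((cloud2.shape[0], 1))])
cloud2_in_1 = (T_2_to_1 @ cloud2_h.T).T[:, :3]
merged_cloud = np.vstack([cloud1, cloud2_in_1])
```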
However, Matei in view of Döring does not explicitly teach
determining, by the processor, a number of data points of the merged point cloud in each voxel; identifying, by the processor, voxels containing a number of data points less than a threshold value; and identifying, by the processor, the noisy data points as data points in voxels containing equal to or less than the threshold value of data points.
Yao teaches
determining, by the processor, a number of data points of the merged point cloud in each voxel (Para [0261]: “the input feature map 1910 to the first layer (i.e., a 3D depth/point cloud) or the other layers (i.e., a 3D tensor for each feature channel) may not be sparse when a high quality data collection device is available and/or when dense object instances occupy the image content. In such cases, a channel-wise thresholding operation/module (not shown) can be added to create sparsity within each input channel by, for example, retaining a certain percentage (e.g., 40%) of the voxels and inactivating/invalidating (e.g., setting to zero values) all other voxels whose original values were less than a fixed threshold value (which is directly calculated based on the target sparse ratio and the number of voxels in the particular input channel)”);
identifying, by the processor, voxels containing a number of data points less than a threshold value (Para [0261] See rejection as applied to previous limitations); and
identifying, by the processor, the noisy data points as data points in voxels containing equal to or less than the threshold value of data points (Para [0261]: “Those skilled in the art will appreciate, this feature thresholding is also applicable to scenarios in which the input feature map 1910 is already sparse, in which case it not only further accelerates the computation, but also suppresses the negative influence of a noisy input feature map.”).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Matei in view of Döring to include determining the number of data points in each voxel and identifying voxels containing noisy data points based on a threshold value of data points, as taught by Yao, in order to reduce the computation and memory required by the 3D CNNs used in 3D shape analysis and 3D semantic scene completion.
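For further clarity regarding the examiner’s reading of Yao’s thresholding (see also the Response to Amendment above), the following Python sketch shows how a fixed threshold can be derived from a target sparse ratio so that only a chosen percentage of voxels is retained and the remainder are inactivated. The 40% retention ratio mirrors the example in Yao’s paragraph [0261]; the code itself is illustrative only and is not taken from Yao.

```python
import numpy as np

# Illustrative per-voxel values for one input channel (e.g., a 3D feature tensor).
voxel_values = np.random.rand(16, 16, 16)

# Target sparse ratio: retain roughly 40% of the voxels, as in Yao's example.
retain_ratio = 0.40

# The fixed threshold is calculated from the target ratio and the number of
# voxels: the value below which (1 - retain_ratio) of the voxels fall.
threshold = np.quantile(voxel_values, 1.0 - retain_ratio)

# Inactivate (set to zero) all voxels whose original values are less than the threshold.
sparse_values = np.where(voxel_values < threshold, 0.0, voxel_values)
```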
Regarding claim 2, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Döring further teaches
identifying, by the processor, a position and orientation of a reference feature of the target in the first image (Para [0038] See rejection as applied to claim 1);
identifying, by the processor, a position and orientation of the reference feature in the second image (Para [0038] See rejection as applied to claim 1); and
performing, by the processor, the point cloud stitching according to (i) the identified position and orientation of the reference feature of the target in the first image and (ii) the position and orientation of the reference feature of the target in the second image (Para [0086] See rejection as applied to claim 1).
Regarding claim 3, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Döring further teaches
the reference feature comprises one of a surface, a vertex, a corner, and one or more line edges (Para [0081]: “It should be noted that "landmarks," which are used in existing systems, can be used to define anchor objects, which are used in embodiments herein. Typically, a landmark is a natural feature (e.g., actual intersection of three edges) or artificial targets that are placed in the environment, and is not a virtual point in the point cloud like the anchor object”).
Regarding claim 4, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Döring further teaches
determining, by the processor, a first position of the imaging system from the position and orientation of the reference feature in the first point cloud (Para [0042]: “In one or more embodiments of the present invention, the anchor objects are used to perform a loop closure algorithm to determine and compensate for any error ("drift") that can accumulate as the scanner 120 travels along a trajectory/path. FIG. 2 schematically illustrates an example scenario in which an offset (referred to as "drift") is continuously introduced into the scan data. Consider that the scanner 120 is moving from a starting position 1510 (real pose). After some movements the scanner 120 is designated to return to an already mapped region, such as the starting position 1510, however the measured position due to sensor variation and the subsequent measurement error is a different position 1520 (estimated pose)”, Para [0048]: “Embodiments herein facilitate performing the loop closure using the anchor objects that are detected using locations in the environment as described herein. The relative observation of an anchor object from the scanner 120 delivers an accurate position information and can correct the position of the scanner 120 in the absolute world and remove absolute inaccuracies accumulated from the mapping process”; to correct the position of scanner 120, the position of scanner 120 must be known first);
determining, by the processor, a second position of the imaging system from the position and orientation of the reference feature in the second point cloud (Para [0042] See rejection as applied to previous limitations); and
performing, by the processor, the point cloud stitching further according to the determined first position of the imaging system and second position of the imaging system (Para [0042], [0048] See rejection as applied to previous limitations, Para [0086]: “Further, if the same anchor object is detected in multiple scans, the anchor object is used as a constraint for matching of the multiple scans, for example during stitching the scans”, Para [0087]: “Additionally, the anchor objects can be reused as indicator for loop closure in the case where the anchor objects can be identified globally e.g. line/wall segments through their length. If multiple such anchor objects are identified between two submaps the loop closure can be evaluated using the time stamp and the anchor objects for the alignment of the multiple submaps.”).
Regarding claim 7, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Döring further teaches
the threshold value is dependent on one or more of an image frame count, image resolution, and voxel size (Para [0059]: “the notification 2308 is provided when another anchor-recording criterion is met. For example, if a number of frames, or points captured by the scanner 120 since the previous instance of the anchor objects exceeds a predetermined threshold, the notification 2308 is provided”).
Regarding claim 8, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Matei further discloses
performing, by the processor, a three-dimensional construction of the target from the aggregated point cloud (Para [0184]: “Model Builder Module 81 uses the received visual and depth data from the Holocam Orbs to generate a 3D structural model of the scene (e.g. a polygon model, such as a (triangular) mesh construct/model)”); and
determining, by the processor and from the three-dimensional construction, a physical dimension of the target (Para [0073]: “Epipolar geometry is basically the geometry of stereo vision. For example in FIG. 3, two cameras 11 and 13 create two 2D images 11A and 13A, respectively, of a common, real-world scene 18 consisting of a larger sphere 19 and a smaller sphere 21. 2D images 11A and 13A are taken from two distinct view angles 11C and 13C. Multiple view angles result in parallax (i.e. horizontal parallax and/or vertical parallax), and as is explained below, parallax can be a source of error for a Holocam Orb, but the preferred embodiment corrects for errors resulting from either/both horizontal or/and vertical parallax. However, the multiple view angles also make possible the use of epipolar geometry, which describes the geometric relations between points in 3D scene 18 (for example spheres 19 and 21) and their relative projections in 2D images 11A and 13A.”; the geometric relations between points in the 3D scene include distances between points, which lead to the dimensions of the target).
Regarding claim 9, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Matei further discloses
the first field of view provides a first perspective of the target, and the second field of view provides a second perspective of the target, the second perspective of the target being different than the first perspective of the target (Para [0059]: “As is described more fully below, information from all Holocam Orbs 6, 7 and 8 is combined within a computing system, or base station, such as computer 2, to generate a 3D artificial reality scene of room 1 from multiple perspectives. Preferably, Holocam Orb 6 provides information from the back of room 1 toward the front (i.e. the foreground in FIG. 1), Holocam Orb 7 provides information from the front of room 1 toward the back, and Holocam Orb 8 providing information from above toward the floor”).
Regarding claim 10, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Yao further teaches
performing z-buffering on at least one of the first point cloud, second point cloud, or merged point cloud to exclude data points outside of the first field of view or second field of view of the imaging system (Para [0082]: “The ray tracing cores 245 may also include circuitry for performing depth testing and culling ( e.g., using a Z buffer or similar arrangement).”).
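As a simplified illustration of depth-based culling of the kind cited above, the following Python sketch projects points into a camera’s image plane, excludes points that fall outside the image bounds, and keeps only the nearest point per pixel using a z-buffer. It is written solely for explanation; it is not code from Yao, and the camera parameters and point data are assumed values.

```python
import numpy as np

# Assumed pinhole camera intrinsics and image size.
fx = fy = 500.0
cx, cy = 320.0, 240.0
width, height = 640, 480

# Illustrative points expressed in the camera's coordinate frame.
points = np.random.rand(2000, 3) * np.array([4.0, 3.0, 10.0])

# Project points into the image plane; points outside the image are outside the field of view.
z = points[:, 2]
u = np.round(fx * points[:, 0] / z + cx).astype(int)
v = np.round(fy * points[:, 1] / z + cy).astype(int)
in_view = (z > 0) & (u >= 0) & (u < width) & (v >= 0) & (v < height)

# Z-buffer: for each pixel keep only the nearest point; farther points are treated as occluded.
zbuffer = np.full((height, width), np.inf)
keep = np.zeros(len(points), dtype=bool)
for i in np.argsort(z):  # process nearest points first
    if in_view[i] and z[i] < zbuffer[v[i], u[i]]:
        zbuffer[v[i], u[i]] = z[i]
        keep[i] = True

filtered_points = points[keep]
```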
Regarding claim 11, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
Matei further discloses
the imaging system comprises an infrared camera, a color camera, two-dimensional camera, a three-dimensional camera, a handheld camera, or a plurality of cameras (Para [0062]: “Examples of sensor data gathering devices include intensity image capture devices/units/systems (which may include 2D imaging devices such as RGB sensors and/or infrared image sensors), depth image capture devices/units/systems (i.e. 3D imaging/sensing devices/units/systems), and audio sensing devices/units/systems. Examples of a depth image capture device may include stereo imaging equipment (e.g. a 3D, or stereo, camera), a structured light 3D capture device/unit/system, and a time-of-flight capturing device/unit/system. A structured light 3D capture device/unit/system, such as a structured light 3D scanner or MICROSOFT CORP. KINECT® 1 sensor… A time-of-flight capturing device/unit/system, such as a 3D laser scanner, a MICROSOFT CORP. KINECT® 2 sensor, and a range camera, may use infrared sensors or laser sensors to obtain 3D depth information.”).
Regarding claim 12, an imaging system for performing three dimensional imaging, the system comprising: one or more imaging devices (Fig. 17 Left RGB Camera 31, Right RGB Camera 33) configured to capture images; one or more processors (Fig. 13 Local CPU 63) configured to receive data from the one or more imaging devices; and one or more non-transitory memories (Para [0113]: “FPGA module 43 may have additional (preferably nonvolatile) memory, such as flash memory 54 and/or Electrically Erasable Programmable Read-Only Memory (EEPROM) memory 56”) storing computer-executable instructions, is rejected as applied to claim 1 above.
Regarding claim 13, dependent upon claim 12, Matei in view of Döring and Yao teaches all of the elements of claim 12 as stated above. Claim 13 is rejected as applied to claim 2 above.
Regarding claim 14, dependent upon claim 12, Matei in view of Döring and Yao teaches all of the elements of claim 12 as stated above. Claim 14 is rejected as applied to claim 4 above.
Regarding claim 16, dependent upon claim 12, Matei in view of Döring and Yao teaches all of the elements of claim 12 as stated above. Claim 16 is rejected as applied to claim 9 above.
Regarding claim 17, one or more non-transitory computer-readable media (Para [0113]: “FPGA module 43 may have additional (preferably nonvolatile) memory, such as flash memory 54 and/or Electrically Erasable Programmable Read-Only Memory (EEPROM) memory 56”) storing computer-executable instructions, is rejected as applied to claim 1 above.
Regarding claim 18, dependent upon claim 17, Matei in view of Döring and Yao teaches all of the elements of claim 17 as stated above. Claim 18 is rejected as applied to claim 2 above.
Regarding claim 19, dependent upon claim 17, Matei in view of Döring and Yao teaches all of the elements of claim 17 as stated above. Claim 19 is rejected as applied to claim 4 above.
Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Matei et al. (US 2018/0205963 A1, hereinafter Matei) in view of Döring et al. (US 2021/0374978 A1, hereinafter Döring), Yao et al. (US 2022/0147791 A1, hereinafter Yao), and Yu et al. (US 2019/0028693 A1, hereinafter Yu).
Regarding claim 5, dependent upon claim 1, Matei in view of Döring and Yao teaches all of the elements of claim 1 as stated above.
However, Matei in view of Döring and Yao does not teach
determining, by the processor, a transformation matrix from the position and orientation of the reference feature in the first point cloud and position and orientation of the reference feature in the second point cloud.
Yu teaches
determining, by the processor, a transformation matrix from the position and orientation of the reference feature in the first point cloud and position and orientation of the reference feature in the second point cloud (Para [0080-0083]: “Step 704: estimating the rotation of the first camera from the first position to the second position. A structure from motion method can be used to first identify a feature in the images taken by the first camera at the first position and the second position, and then estimate the rotation of the first camera from the first position to the second position based on the position of the identified feature in the pictures. Step 705: determining a rotation between the first camera and the reference point by minimizing a defined distance. Step 706: determining a transformation between the first camera and the reference point by minimizing a re-projection error. Step 707: determining a homogenous transformation matrix between the first camera and the second camera based on the images captured by the first camera and the second camera at the first position and the second position. Here, the relative rotation and translation between the first camera and the second camera can be calculated from the parameters determined in previous steps.”).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Matei in view of Döring and Yao to include determining a transformation matrix, as taught by Yu, in order to improve the efficiency of calibrating cameras with small or even no overlapping fields of view.
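For explanatory purposes only, the following Python sketch estimates a homogeneous transformation matrix from corresponding points of a reference feature observed in a first and a second point cloud, using the well-known Kabsch/SVD procedure as one way such a matrix could be computed. It is not code from Yu; the correspondences and values are assumptions made to keep the example self-contained.

```python
import numpy as np

def rigid_transform(src, dst):
    """Estimate the 4x4 homogeneous transform mapping src points onto dst points (Kabsch/SVD)."""
    src_centroid = src.mean(axis=0)
    dst_centroid = dst.mean(axis=0)
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:  # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = dst_centroid - R @ src_centroid
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Assumed corresponding reference-feature points in the second and first point clouds.
feature_in_cloud2 = np.array([[0.0, 0.0, 0.0],
                              [1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0],
                              [0.0, 0.0, 1.0]])
true_T = np.eye(4)
true_T[:3, 3] = [0.5, -0.2, 1.0]
feature_in_cloud1 = (true_T[:3, :3] @ feature_in_cloud2.T).T + true_T[:3, 3]

# Transformation matrix that maps second-cloud coordinates into first-cloud coordinates.
T_2_to_1 = rigid_transform(feature_in_cloud2, feature_in_cloud1)
```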
Relevant Prior Art Directed to State of Art
Lloyd et al. (US 11,017,548 B2, hereinafter Lloyd) is prior art not applied in the rejection(s) above. Lloyd discloses techniques for computing dimensions of an object using multiple range images, where the multiple range images are captured from selective locations and satisfy a pre-defined criterion.
Wolke et al. (US 2024/0054731 A1, hereinafter Wolke) is prior art not applied in the rejection(s) above. Wolke discloses a computer-implemented method that includes comparing at least one selected image to the 3D point cloud to align common locations in both the at least one selected image and the 3D point cloud.
Sha et al. (US 2022/0285009 A1, hereinafter Sha) is prior art not applied in the rejection(s) above. Sha discloses a method for aligning multiple depth cameras in an environment based on image data.
Castillo et al. (US 2021/0243362 A1, hereinafter Castillo) is prior art not applied in the rejection(s) above. Castillo discloses techniques for enhancing two-dimensional (2D) image capture of subjects (e.g., a physical structure, such as a residential building) to maximize the feature correspondences available for three-dimensional (3D) model reconstruction.
Zhang et al. (US 11,995,900 B2, hereinafter Zhang) is prior art not applied in the rejection(s) above. Zhang discloses a method and system for performing indicia recognition that includes obtaining, at an image sensor, an image of an object of interest and identifying at least one region of interest in the image.
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSHUA CHEN whose telephone number is (703)756-5394. The examiner can normally be reached M-Th: 9:30 am - 4:30 pm ET; F: 9:30 am - 2:30 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, STEPHEN R KOZIOL can be reached at (408)918-7630. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J. C./ Examiner, Art Unit 2665
/Stephen R Koziol/Supervisory Patent Examiner, Art Unit 2665