Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This action is responsive to the application filed June 24, 2025. Claims 21-40 are presented for examination. Claims 21, 30 and 36 are independent claims.
Information Disclosure Statement
The Applicant's Information Disclosure Statements filed June 24, 2025, July 14, 2025 and August 13, 2025 have been received, entered into the record, and considered.
Priority
Examiner acknowledges the claim for domestic priority under 35 U.S.C. 119(e) to continuation application PCT/US2020/017121, which was filed February 7, 2019.
Oath/Declaration
The Office acknowledges receipt of a properly signed Oath/Declaration submitted June 24, 2025.
Drawings
The drawings filed June 24, 2025 are accepted by the examiner.
Abstract
The abstract filed June 24, 2025 is accepted by the examiner.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory obviousness-type double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969). A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent either is shown to be commonly owned with this application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
Claims 21-40 are rejected on the ground of nonstatutory obviousness-type double patenting as being unpatentable over claims 1-20 of Application No. 18655518 (Patent 12373025 B2). Although the conflicting claims are not identical, they are not patentably distinct from each other because the claims recite: a wearable display system comprising: a first camera; a second camera, wherein the first camera and the second camera are positioned so as to provide overlapping views; and a processor operatively coupled to the first camera and the second camera and configured to track a pose of an object by: receiving from the first camera, an event data indicating a change in light intensity or a change in a depth as viewed by the first camera; selecting, from among multiple tracking methodologies, a first tracking methodology to track the object, wherein the first tracking methodology uses event data received from the first camera but does not require image data from the second camera; generating a first track of the object by using the first tracking methodology; determining whether the first track of the object satisfies a specific tracking quality threshold; in response to determining that the first track of the object does not satisfy the specific tracking quality threshold, selecting from the multiple tracking methodologies, a second track methodology that uses image data received from the second camera; and generate a second track of the object by using the second tracking methodology. The claims therefore recite the same limitations as claimed in Application No. 18655518 (Patent 12373025 B2).
This is an obviousness-type double patenting rejection.
US Application No. 19247893
21. A wearable display system comprising: a first camera; a second camera, wherein the first camera and the second camera are positioned so as to provide overlapping views; and a processor operatively coupled to the first camera and the second camera and configured to track a pose of an object by: receiving from the first camera, an event data indicating a change in light intensity or a change in a depth as viewed by the first camera; selecting, from among multiple tracking methodologies, a first tracking methodology to track the object, wherein the first tracking methodology uses event data received from the first camera but does not require image data from the second camera; generating a first track of the object by using the first tracking methodology; determining whether the first track of the object satisfies a specific tracking quality threshold; in response to determining that the first track of the object does not satisfy the specific tracking quality threshold, selecting from the multiple tracking methodologies, a second track methodology that uses image data received from the second camera; and generate a second track of the object by using the second tracking methodology.
Claims 30 and 36 are similar to claim 21.
Application No. 18655518 (Patent 12373025 B2)
1. A wearable display system, the wearable display system comprising: a headset including: a first camera configurable to output an image frame or image data satisfying an intensity change criterion; and a second camera, wherein the first camera and the second camera are positioned so as to provide overlapping views of a central view field; and a processor operatively coupled to the first camera and the second camera and configured to: create a world model using depth information stereoscopically determined from image data output by the first camera and image data output by the second camera; perform a tracking routine using the world model and the image data output by the first camera; determine a tracking parameter for the tracking routine using image data output by at least one of the first camera or the second camera; and based on the tracking parameter, provide instructions to at least one of the first camera or the second camera to adjust image data acquisition, the instructions configured to increase or decrease an amount of data output by at least one of the first camera or the second camera.
Claim 17 is similar to claim 1.
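For illustration only, the fallback sequence recited in claim 21 of the examined application (select a first methodology that uses event data alone, test the resulting track against a quality threshold, and select a second methodology that uses image data from the second camera only if the threshold is not met) may be sketched as follows. The sketch is a reading aid and is not taken from the application, from Patent 12373025 B2, or from any cited reference. The names Track, track_with_fallback, event_tracker, image_tracker and read_second_camera, and the use of a single numeric quality score, are hypothetical assumptions.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class Track:
    pose: Tuple[float, float, float]  # hypothetical pose: x, y, z position only
    quality: float                    # hypothetical score, higher is better
    methodology: str                  # which methodology produced the track


def track_with_fallback(
    event_data: Any,
    read_second_camera: Callable[[], Any],
    event_tracker: Callable[[Any], Track],
    image_tracker: Callable[[Any], Track],
    quality_threshold: float,
) -> Track:
    """Try the event-only methodology first; fall back if quality is too low."""
    # First methodology: uses event data from the first camera only.
    first_track = event_tracker(event_data)
    if first_track.quality >= quality_threshold:
        return first_track
    # Second methodology: image data from the second camera is read only now.
    image_data = read_second_camera()
    return image_tracker(image_data)
```

In this sketch the second camera is not read at all unless the first track fails the threshold, which mirrors the claim language that the first methodology "does not require image data from the second camera."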
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 21, 28, 29, 30 and 36 are rejected under 35 U.S.C. 103 as being unpatentable over Venkataraman et al. (IDS submitted prior art US 20160309134 A1) in view of Peri et al. (IDS submitted prior art US 20180275242 A1).
As to Claim 21:
Venkataraman et al. discloses a wearable display system (Venkataraman, see an augmented/mixed reality headset in figure 1B, Abstract and paragraph [0049], where Venkataraman discloses that the invention provides a camera array imaging architecture that computes depth maps for objects within a scene captured by the cameras, and uses a near-field sub-array of cameras to compute depth to near-field objects and a far-field sub-array of cameras to compute depth to far-field objects. In particular, a baseline distance between cameras in the near-field subarray is less than a baseline distance between cameras in the far-field sub-array in order to increase the accuracy of the depth map. Some embodiments provide an illumination near-IR light source for use in computing depth maps) comprising: a first camera (Venkataraman, see second camera 1107 in figure 11 and paragraph [0128], where Venkataraman discloses that the middle subarray of cameras 1130 includes two cameras, 1106 and 1107, positioned along a diagonal axis. In some embodiments, the middle camera 1106 or 1107 may be used as the reference camera); a second camera (Venkataraman, see camera 1108 in figure 11 and paragraph [0129], where Venkataraman discloses that cameras 1108 and 1107 (shown in dotted lines along with their connections) may not be populated but can be placed for possible experimentations and/or for increased depth estimation precision), wherein the first camera and the second camera are positioned so as to provide overlapping views (Venkataraman, see paragraph [0132], where Venkataraman discloses a camera array similar to the camera array shown in FIGS. 11 and 12 can be constructed with fewer cameras. Use of six cameras in a multi-baseline camera array that implements an image processing pipeline on a Qualcomm™ 820 platform in accordance with an embodiment of the invention is illustrated in FIG. 13. The combined use of six cameras in image processing teaches or suggests cameras positioned to provide image processing for a central field of view); and a processor operatively coupled to the first camera and the second camera (Venkataraman, see paragraph [0132], where Venkataraman discloses that the image data from multiple cameras is combined into a single stream of data that is provided to one of the two image processors provided in the QCT 8096 AP) and configured to track a pose of an object (Venkataraman, see paragraphs [0072], [0133] and [0129], where Venkataraman discloses that once depth information is determined, the depth information can be utilized in extracting the pose of the user with respect to the environment, the creation of virtual objects, and/or subsequent rendering of a virtual environment displayed to the user. As per paragraph [0133], the benefits of using multiple cameras when performing depth estimation relative to a stereo pair of cameras can be appreciated by reviewing figures 14 through 19. Figure 14 shows a monochrome image captured of a scene on which a Near-IR pattern is projected by a camera that acts as a reference camera within an array camera similar to the array camera shown in figure 12. As indicated in paragraph [0129], cameras 1108 and 1107 (shown in dotted lines along with their connections) may not be populated but can be placed for possible experimentations and/or for increased depth estimation precision) by: receiving from the first camera, an event data indicating a change in light intensity or a change in a depth as viewed by the first camera (Venkataraman, see paragraph [0101], where Venkataraman discloses a reference camera may have an IR-cut filter and can be used to determine which points may have increased intensity due to the presence of the projected near-IR pattern. For example, if the system uses an IR strobe, then it may end up with an image with a pattern laid over it and all the IR dots will be a depth.
Because the system may already have a depth at that point, it will know how the depth has to be warped to the image of the main reference camera, for example, by looking at the green channel and estimating whether the point does or does not have increased intensity due to the presence of the projected near-IR pattern); selecting, from among multiple tracking methodologies, a first tracking methodology to track the object (Venkataraman, see paragraph [0101], where Venkataraman discloses that the identification of IR fiducial marks as distinct from non-IR fiducial marks may be beneficial to the correct computation of camera pose; in paragraph [0102], in addition to the use of Near-IR-structured illumination in the manner described above, systems and methods in accordance with several embodiments of the invention can utilize homogeneous (Near-IR) illumination (e.g. a Near-IR flash) in order to improve edge visibility in cases of naturally poorly illuminated scenes. It is noted that a processor is configured to selectively enable the IR emitter, to reduce the power consumption, based on routine experimentation as stated in paragraph [0111]. The subsequent sections below outline the operational characteristics along with exploiting architectural efficiencies that enable a reduction in component and computational/power consumption costs. Paragraph [0133] describes that the use of the projected Near-IR pattern yields a large number of high confidence depth estimates distributed throughout the field of view of the reference camera and across the complete range of depths visible within the scene.
Thus as per paragraph [0073] the cameras are capable of capturing image data within the near-IR spectrum are included within a camera array to provide increased sensitivity in low lighting conditions), wherein the first tracking methodology uses event data received from the first camera but does not require image data from the second camera; generating a first track of the object by using the first tracking methodology; determining whether the first track of the object satisfies a specific tracking quality threshold (Venkataraman, see paragraph [0091], where Venkataraman discloses that the camera array may use low-CTE and high stiffness (e.g., high Young's modulus) substrates within the structure of the system. Young's modulus, which is also known as the elastic modulus, is generally defined as a mechanical property of linear elastic solid materials and defines the relationship between stress (force per unit area) and strain (proportional deformation) in a material. This enables robust pose estimation including rotation and translation (vector) using images captured by this subset of the cameras in the array. Using accurate depth information generated by the subset, estimates concerning the baselines and orientation of other cameras that are not rigidly fixed within the array can be performed. In several embodiments, a feature tracking process is used to track multiple features across any one of these cameras from frame to frame. Tracking a minimum number of feature points robustly enables the recovery of the essential camera matrix for the cameras in the array, from which rotation of the camera system as a whole is accurately derived. The recovery of translation, however, is accurate only up to an unknown scale factor. 
By considering the group of 3 cameras as a rigid whole and tracking the same feature points across all the cameras in the rigid sub-array and also across other cameras in the overall array, the system can recover the translation and scale to complete the robust recovery of pose for all of the cameras in the array); in response to determining that the first track of the object does not satisfy the specific tracking threshold, selecting from the multiple tracking methodologies, a second track methodology that uses image data received from the second camera; and generate a second track of the object by using the second tracking methodology (Venkataraman, see paragraph [0091], where Venkataraman discloses that the camera array may use low-CTE and high stiffness (e.g., high Young's modulus) substrates within the structure of the system. Young's modulus, which is also known as the elastic modulus, is generally defined as a mechanical property of linear elastic solid materials and defines the relationship between stress (force per unit area) and strain (proportional deformation) in a material. This enables robust pose estimation including rotation and translation (vector) using images captured by this subset of the cameras in the array. Using accurate depth information generated by the subset, estimates concerning the baselines and orientation of other cameras that are not rigidly fixed within the array can be performed. In several embodiments, a feature tracking process is used to track multiple features across any one of these cameras from frame to frame. Tracking a minimum number of feature points robustly enables the recovery of the essential camera matrix for the cameras in the array, from which rotation of the camera system as a whole is accurately derived. The recovery of translation, however, is accurate only up to an unknown scale factor. 
By considering the group of 3 cameras as a rigid whole and tracking the same feature points across all the cameras in the rigid sub-array and also across other cameras in the overall array, the system can recover the translation and scale to complete the robust recovery of pose for all of the cameras in the array).
Venkataraman differs from the claimed subject matter in that Venkataraman does not explicitly disclose a tracking quality threshold.
However, in an analogous art, Peri discloses a tracking quality threshold (Peri, see 1120, 1130 and 1125 in figure 11 and paragraphs [0042], [0107] and [0112], where Peri discloses that cameras 405, 410 can comprise frame-based cameras, event-based cameras, or a combination as shown. Also, other camera types can be utilized. The HMD can include optional inertial measurement unit (IMU) sensors (e.g., a gyroscope, an accelerometer, etc.). Event-based cameras 410 are vision sensors that output pixel-level brightness changes and can have the characteristic of capturing images at high speeds. If enough tracking points are determined to be consistent, head pose (localization) can be calculated from frame-based camera data and event-based camera data independently. The head pose from each camera should be congruent. Based on the congruency, each output 'frame' may be used to calculate head pose until the next frame is generated. The motion speed threshold 2 in 1120 of figure 11 teaches or suggests an intensity change criterion with regard to a DVS event-based camera that captures head pose).
It would have been obvious to one of ordinary skill in the art to modify the invention of Venkataraman with the teachings of Peri. One would have been motivated to modify Venkataraman to include a tracking quality threshold as taught by Peri, thereby providing systems and methods that allow improved tracking points (Peri, see paragraph [0003]).
As to Claim 28:
Venkataraman in view of Peri discloses the system of claim 21, wherein the first camera includes a dynamic vision sensor (DVS), and the second camera is configured to provide colored images (Peri, see 1120, 1130 and 1125 in figure 11 and paragraphs [0042], [0107] and [0112], where Peri discloses that cameras 405, 410 can comprise frame-based cameras, event-based cameras, or a combination as shown. Also, other camera types can be utilized. The HMD can include optional inertial measurement unit (IMU) sensors (e.g., a gyroscope, an accelerometer, etc.). Event-based cameras 410 are vision sensors that output pixel-level brightness changes and can have the characteristic of capturing images at high speeds. If enough tracking points are determined to be consistent, head pose (localization) can be calculated from frame-based camera data and event-based camera data independently. The head pose from each camera should be congruent. Based on the congruency, each output 'frame' may be used to calculate head pose until the next frame is generated. The motion speed threshold 2 in 1120 of figure 11 teaches or suggests an intensity change criterion with regard to a DVS event-based camera that captures head pose).
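For illustration only, the pixel-level brightness-change output attributed above to an event-based camera may be sketched as follows. The sketch assumes a simple model in which an event is emitted wherever the log-intensity difference between two frames meets a threshold. The function name brightness_change_events, the threshold value and the eps offset are hypothetical and are not drawn from Peri, Venkataraman, or the application.

```python
import math
from typing import List, Sequence, Tuple

Event = Tuple[int, int, int]  # (row, column, polarity), polarity is +1 or -1


def brightness_change_events(
    prev: Sequence[Sequence[float]],
    curr: Sequence[Sequence[float]],
    threshold: float = 0.2,  # hypothetical contrast threshold
    eps: float = 1e-3,       # avoids log(0) for dark pixels
) -> List[Event]:
    """Emit an event for each pixel whose log intensity changed enough."""
    events: List[Event] = []
    for r, (prev_row, curr_row) in enumerate(zip(prev, curr)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            delta = math.log(q + eps) - math.log(p + eps)
            if abs(delta) >= threshold:
                # Polarity records whether the pixel became brighter or darker.
                events.append((r, c, 1 if delta > 0 else -1))
    return events
```

Under this model, pixels whose brightness is unchanged between the two frames produce no output at all, which is the sense in which such a sensor reports changes rather than full image frames.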
As to Claim 29:
Venkataraman in view of Peri discloses the system of claim 21, wherein the first camera consumes less power in producing respective image data than the second camera does in producing the image data (Venkataraman, see paragraph [0101], where Venkataraman discloses that the identification of IR fiducial marks as distinct from non-IR fiducial marks may be beneficial to the correct computation of camera pose; in paragraph [0102], in addition to the use of Near-IR-structured illumination in the manner described above, systems and methods in accordance with several embodiments of the invention can utilize homogeneous (Near-IR) illumination (e.g. a Near-IR flash) in order to improve edge visibility in cases of naturally poorly illuminated scenes. It is noted that a processor is configured to selectively enable the IR emitter, to reduce the power consumption, based on routine experimentation as stated in paragraph [0111]. The subsequent sections below outline the operational characteristics along with exploiting architectural efficiencies that enable a reduction in component and computational/power consumption costs. Paragraph [0133] describes that the use of the projected Near-IR pattern yields a large number of high confidence depth estimates distributed throughout the field of view of the reference camera and across the complete range of depths visible within the scene. Thus, as per paragraph [0073], cameras that are capable of capturing image data within the near-IR spectrum are included within a camera array to provide increased sensitivity in low lighting conditions).
As to Claim 30:
Venkataraman et al. discloses a method of operating a wearable display system to track an object within a field of view of the display system (Venkataraman, see an augmented/mixed reality headset in figure 1B, Abstract and paragraph [0049], where Venkataraman discloses that the invention provides a camera array imaging architecture that computes depth maps for objects within a scene captured by the cameras, and uses a near-field sub-array of cameras to compute depth to near-field objects and a far-field sub-array of cameras to compute depth to far-field objects. In particular, a baseline distance between cameras in the near-field subarray is less than a baseline distance between cameras in the far-field sub-array in order to increase the accuracy of the depth map. Some embodiments provide an illumination near-IR light source for use in computing depth maps), the method comprising: receiving from a first camera of the display system, an event data indicating a change in light intensity or a change in a depth as viewed by the first camera (Venkataraman, see paragraph [0101], where Venkataraman discloses a reference camera may have an IR-cut filter and can be used to determine which points may have increased intensity due to the presence of the projected near-IR pattern. For example, if the system uses an IR strobe, then it may end up with an image with a pattern laid over it and all the IR dots will be a depth.
Because the system may already have a depth at that point, it will know how the depth has to be warped to the image of the main reference camera, for example, by looking at the green channel and estimating whether the point does or does not have increased intensity due to the presence of the projected near-IR pattern); selecting, from among multiple tracking methodologies, a first tracking methodology to track the object (Venkataraman, see paragraph [0101], where Venkataraman discloses that the identification of IR fiducial marks as distinct from non-IR fiducial marks may be beneficial to the correct computation of camera pose; in paragraph [0102], in addition to the use of Near-IR-structured illumination in the manner described above, systems and methods in accordance with several embodiments of the invention can utilize homogeneous (Near-IR) illumination (e.g. a Near-IR flash) in order to improve edge visibility in cases of naturally poorly illuminated scenes. It is noted that a processor is configured to selectively enable the IR emitter, to reduce the power consumption, based on routine experimentation as stated in paragraph [0111]. The subsequent sections below outline the operational characteristics along with exploiting architectural efficiencies that enable a reduction in component and computational/power consumption costs. Paragraph [0133] describes that the use of the projected Near-IR pattern yields a large number of high confidence depth estimates distributed throughout the field of view of the reference camera and across the complete range of depths visible within the scene.
Thus as per paragraph [0073] the cameras are capable of capturing image data within the near-IR spectrum are included within a camera array to provide increased sensitivity in low lighting conditions), wherein the first tracking methodology uses event data received from the first camera but does not require image data from a second camera of the display system; generating a first track of the object by using the first tracking methodology; determining whether the first track of the object satisfies a specific tracking quality threshold (Venkataraman, see paragraph [0091], where Venkataraman discloses that the camera array may use low-CTE and high stiffness (e.g., high Young's modulus) substrates within the structure of the system. Young's modulus, which is also known as the elastic modulus, is generally defined as a mechanical property of linear elastic solid materials and defines the relationship between stress (force per unit area) and strain (proportional deformation) in a material. This enables robust pose estimation including rotation and translation (vector) using images captured by this subset of the cameras in the array. Using accurate depth information generated by the subset, estimates concerning the baselines and orientation of other cameras that are not rigidly fixed within the array can be performed. In several embodiments, a feature tracking process is used to track multiple features across any one of these cameras from frame to frame. Tracking a minimum number of feature points robustly enables the recovery of the essential camera matrix for the cameras in the array, from which rotation of the camera system as a whole is accurately derived. The recovery of translation, however, is accurate only up to an unknown scale factor. 
By considering the group of 3 cameras as a rigid whole and tracking the same feature points across all the cameras in the rigid sub-array and also across other cameras in the overall array, the system can recover the translation and scale to complete the robust recovery of pose for all of the cameras in the array); in response to determining that the first track of the object does not satisfy the specific tracking threshold, selecting from the multiple tracking methodologies, a second track methodology that uses image data received from the second camera; and generate a second track of the object by using the second tracking methodology (Venkataraman, see paragraph [0091], where Venkataraman discloses that the camera array may use low-CTE and high stiffness (e.g., high Young's modulus) substrates within the structure of the system. Young's modulus, which is also known as the elastic modulus, is generally defined as a mechanical property of linear elastic solid materials and defines the relationship between stress (force per unit area) and strain (proportional deformation) in a material. This enables robust pose estimation including rotation and translation (vector) using images captured by this subset of the cameras in the array. Using accurate depth information generated by the subset, estimates concerning the baselines and orientation of other cameras that are not rigidly fixed within the array can be performed. In several embodiments, a feature tracking process is used to track multiple features across any one of these cameras from frame to frame. Tracking a minimum number of feature points robustly enables the recovery of the essential camera matrix for the cameras in the array, from which rotation of the camera system as a whole is accurately derived. The recovery of translation, however, is accurate only up to an unknown scale factor. 
By considering the group of 3 cameras as a rigid whole and tracking the same feature points across all the cameras in the rigid sub-array and also across other cameras in the overall array, the system can recover the translation and scale to complete the robust recovery of pose for all of the cameras in the array).
Venkataraman differs from the claimed subject matter in that Venkataraman does not explicitly disclose a tracking quality threshold.
However, in an analogous art, Peri discloses a tracking quality threshold (Peri, see 1120, 1130 and 1125 in figure 11 and paragraphs [0042], [0107] and [0112], where Peri discloses that cameras 405, 410 can comprise frame-based cameras, event-based cameras, or a combination as shown. Also, other camera types can be utilized. The HMD can include optional inertial measurement unit (IMU) sensors (e.g., a gyroscope, an accelerometer, etc.). Event-based cameras 410 are vision sensors that output pixel-level brightness changes and can have the characteristic of capturing images at high speeds. If enough tracking points are determined to be consistent, head pose (localization) can be calculated from frame-based camera data and event-based camera data independently. The head pose from each camera should be congruent. Based on the congruency, each output 'frame' may be used to calculate head pose until the next frame is generated. The motion speed threshold 2 in 1120 of figure 11 teaches or suggests an intensity change criterion with regard to a DVS event-based camera that captures head pose).
It would have been obvious to one of ordinary skill in the art to modify the invention of Venkataraman with the teachings of Peri. One would have been motivated to modify Venkataraman to include a tracking quality threshold as taught by Peri, thereby providing systems and methods that allow improved tracking points (Peri, see paragraph [0003]).
As to Claim 36:
Venkataraman et al. discloses a non-transitory, computer-readable medium storing one or more instructions that when executed by a wearable display system, cause the wearable display system to perform operations to track an object within a field of view of the display system (Venkataraman, see an augmented/mixed reality headset in figure 1B, Abstract and paragraph [0049], where Venkataraman discloses that the invention provides a camera array imaging architecture that computes depth maps for objects within a scene captured by the cameras, and uses a near-field sub-array of cameras to compute depth to near-field objects and a far-field sub-array of cameras to compute depth to far-field objects. In particular, a baseline distance between cameras in the near-field subarray is less than a baseline distance between cameras in the far-field sub-array in order to increase the accuracy of the depth map. Some embodiments provide an illumination near-IR light source for use in computing depth maps), the operations comprising: receiving from a first camera of the wearable display system, an event data indicating a change in light intensity or a change in a depth as viewed by the first camera (Venkataraman, see paragraph [0101], where Venkataraman discloses a reference camera may have an IR-cut filter and can be used to determine which points may have increased intensity due to the presence of the projected near-IR pattern. For example, if the system uses an IR strobe, then it may end up with an image with a pattern laid over it and all the IR dots will be a depth.
Because the system may already have a depth at that point, it will know how the depth has to be warped to the image of the main reference camera, for example, by looking at the green channel and estimating whether the point does or does not have increased intensity due to the presence of the projected near-IR pattern); selecting, from among multiple tracking methodologies, a first tracking methodology to track the object (Venkataraman, see paragraph [0101], where Venkataraman discloses that the identification of IR fiducial marks as distinct from non-IR fiducial marks may be beneficial to the correct computation of camera pose; in paragraph [0102], in addition to the use of Near-IR-structured illumination in the manner described above, systems and methods in accordance with several embodiments of the invention can utilize homogeneous (Near-IR) illumination (e.g. a Near-IR flash) in order to improve edge visibility in cases of naturally poorly illuminated scenes. It is noted that a processor is configured to selectively enable the IR emitter, to reduce the power consumption, based on routine experimentation as stated in paragraph [0111]. The subsequent sections below outline the operational characteristics along with exploiting architectural efficiencies that enable a reduction in component and computational/power consumption costs. Paragraph [0133] describes that the use of the projected Near-IR pattern yields a large number of high confidence depth estimates distributed throughout the field of view of the reference camera and across the complete range of depths visible within the scene.
Thus, as per paragraph [0073], cameras that are capable of capturing image data within the near-IR spectrum are included within a camera array to provide increased sensitivity in low lighting conditions), wherein the first tracking methodology uses event data received from the first camera but does not require image data from a second camera of the wearable display system; generating a first track of the object by using the first tracking methodology; determining whether the first track of the object satisfies a specific tracking quality threshold (Venkataraman, see paragraph [0091], where Venkataraman discloses that the camera array may use low-CTE and high stiffness (e.g., high Young's modulus) substrates within the structure of the system. Young's modulus, which is also known as the elastic modulus, is generally defined as a mechanical property of linear elastic solid materials and defines the relationship between stress (force per unit area) and strain (proportional deformation) in a material. This enables robust pose estimation including rotation and translation (vector) using images captured by this subset of the cameras in the array. Using accurate depth information generated by the subset, estimates concerning the baselines and orientation of other cameras that are not rigidly fixed within the array can be performed. In several embodiments, a feature tracking process is used to track multiple features across any one of these cameras from frame to frame. Tracking a minimum number of feature points robustly enables the recovery of the essential camera matrix for the cameras in the array, from which rotation of the camera system as a whole is accurately derived. The recovery of translation, however, is accurate only up to an unknown scale factor. 
By considering the group of 3 cameras as a rigid whole and tracking the same feature points across all the cameras in the rigid sub-array and also across other cameras in the overall array, the system can recover the translation and scale to complete the robust recovery of pose for all of the cameras in the array); in response to determining that the first track of the object does not satisfy the specific tracking quality threshold, selecting from the multiple tracking methodologies, a second tracking methodology that uses image data received from the second camera; and generating a second track of the object by using the second tracking methodology (Venkataraman, see paragraph [0091], where Venkataraman discloses that the camera array may use low-CTE and high stiffness (e.g., high Young's modulus) substrates within the structure of the system. Young's modulus, which is also known as the elastic modulus, is generally defined as a mechanical property of linear elastic solid materials and defines the relationship between stress (force per unit area) and strain (proportional deformation) in a material. This enables robust pose estimation including rotation and translation (vector) using images captured by this subset of the cameras in the array. Using accurate depth information generated by the subset, estimates concerning the baselines and orientation of other cameras that are not rigidly fixed within the array can be performed. In several embodiments, a feature tracking process is used to track multiple features across any one of these cameras from frame to frame. Tracking a minimum number of feature points robustly enables the recovery of the essential camera matrix for the cameras in the array, from which rotation of the camera system as a whole is accurately derived. The recovery of translation, however, is accurate only up to an unknown scale factor. 
By considering the group of 3 cameras as a rigid whole and tracking the same feature points across all the cameras in the rigid sub-array and also across other cameras in the overall array, the system can recover the translation and scale to complete the robust recovery of pose for all of the cameras in the array).
Venkataraman differs from the claimed subject matter in that Venkataraman does not explicitly disclose a tracking quality threshold.
However, in an analogous art, Peri discloses a tracking quality threshold (Peri, see 1120, 1130 and 1125 in figure 11 and paragraphs [0042], [0107] and [0112], where Peri discloses that cameras 405, 410 can comprise frame-based cameras, event-based cameras, or a combination as shown. Also, other camera types can be utilized. The HMD can include optional inertial measurement unit (IMU) sensors (e.g., a gyroscope, an accelerometer, etc.). Event-based cameras 410 are vision sensors that output pixel-level brightness changes and are characterized by capturing images at high speeds. If enough tracking points are determined to be consistent, head pose (localization) can be calculated from frame-based camera data and event-based camera data independently. The head pose from each camera should be congruent. Based on the congruency, each output 'frame' may be used to calculate head pose until the next frame is generated. The motion speed threshold 2 at 1120 of figure 11 teaches or suggests an intensity change criterion with regard to a DVS event-based camera that captures head pose).
It would have been obvious to one of ordinary skill in the art to modify the invention of Venkataraman with Peri. One would be motivated to modify Venkataraman to include a tracking quality threshold as taught by Peri, thereby providing systems and methods that allow improved tracking points (Peri, see paragraph [0003]).
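For illustration only, the sequence of operations recited in the claim and mapped above (track with event data from the first camera, test the track against a quality threshold, and fall back to image data from the second camera) may be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Venkataraman, Peri, or the application: the names `Track`, `track_from_events`, `track_from_frame` and `track_object`, the centroid-based position estimates, the event-count quality score, and the default threshold of 0.5 are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Event = Tuple[float, float]        # (x, y) location of one intensity-change event
Frame = Sequence[Sequence[float]]  # rows of pixel intensities from a frame-based camera


@dataclass
class Track:
    position: Tuple[float, float]  # estimated (x, y) of the object
    quality: float                 # hypothetical score in [0, 1]
    method: str                    # "event" or "frame"


def track_from_events(events: Sequence[Event], min_events: int = 10) -> Track:
    """First methodology (assumed): event data from the first camera only."""
    if not events:
        return Track((0.0, 0.0), 0.0, "event")
    x = sum(e[0] for e in events) / len(events)
    y = sum(e[1] for e in events) / len(events)
    # Assumed quality measure: more supporting events means a better track.
    quality = min(1.0, len(events) / min_events)
    return Track((x, y), quality, "event")


def track_from_frame(frame: Frame) -> Track:
    """Second methodology (assumed): image data from the second camera."""
    total = sum(sum(row) for row in frame)
    if total <= 0:
        return Track((0.0, 0.0), 0.0, "frame")
    # Intensity-weighted centroid over (column, row) pixel coordinates.
    x = sum(c * v for row in frame for c, v in enumerate(row)) / total
    y = sum(r * v for r, row in enumerate(frame) for v in row) / total
    return Track((x, y), 1.0, "frame")


def track_object(
    events: Sequence[Event],
    read_second_camera: Callable[[], Frame],
    quality_threshold: float = 0.5,
) -> Track:
    """Generate the first track; use the second camera only if it falls short."""
    first = track_from_events(events)
    if first.quality >= quality_threshold:
        return first
    # The second camera is read only when the first track fails the threshold.
    return track_from_frame(read_second_camera())
```

Passing the second camera as a callable reflects the claim language that the first tracking methodology does not require image data from the second camera: in this sketch the frame is requested only after the first track fails the threshold.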
Allowable Subject Matter
Claims 22-27, 31-35 and 37-40 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Referring to claim 22, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein the processor is further configured to update a world model that includes a representation of the object, wherein the updating comprises using the first track of the object to update a position of the object in the world model in response to determining that the first track of the object satisfies the specific tracking quality threshold; and using the second track of the object to update the position of the object in the world model in response to determining that the first track of the object does not satisfy the specific tracking quality threshold”.
Referring to claim 23 and dependent claim 24, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein generating the second track of the object by using the second tracking methodology comprises using the event data received from the first camera in addition to the image data received from the second camera”.
Referring to claim 25 and dependent claims 26 and 27, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein generating the second track of the object by using the second tracking methodology comprises using image data received from the first camera in addition to the image data received from the second camera”.
Referring to claim 31, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “comprising updating the world model by using the first track of the object to update a position of the object in the world model in response to determining that the first track of the object satisfies the specific tracking quality threshold; and using the second track of the object to update the position of the object in the world model in response to determining that the first track of the object does not satisfy the specific tracking quality threshold”.
Referring to claim 32 and dependent claim 33, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein generating the second track of the object by using the second tracking methodology comprises using the event data received from the first camera in addition to the image data received from the second camera”.
Referring to claim 34 and dependent claim 35, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein generating the second track of the object by using the second tracking methodology comprises using image data received from the first camera in addition to the image data received from the second camera”.
Referring to claim 37 and dependent claim 38, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein generating the second track of the object by using the second tracking methodology comprises using the event data received from the first camera in addition to the image data received from the second camera”.
Referring to claim 39 and dependent claim 40, the following is a statement of reasons for the indication of allowable subject matter: the prior art fails to suggest the limitations “wherein generating the second track of the object by using the second tracking methodology comprises using image data received from the first camera in addition to the image data received from the second camera”.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Petrovskaya (US 10043319 B2) discloses that, while augmented reality systems provide “see-through” transparent or translucent displays upon which to project virtual objects, many virtual reality systems instead employ opaque, enclosed screens. Indeed, eliminating the user's perception of the real-world may be integral to some successful virtual reality experiences. Thus, head mounted displays designed exclusively for virtual reality experiences may not be easily repurposed to capture significant portions of the augmented reality market.
Contact Information
Any inquiry concerning this communication or earlier communications from the examiner should be directed to NELSON ROSARIO whose telephone number is (571)270-1866. The examiner can normally be reached on Monday through Friday, 7:30am-5:00pm EST. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Matthew Eason, can be reached on (571) 270-7230. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/NELSON M ROSARIO/Primary Examiner, Art Unit 2624