Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-5, 12-21, and 28-30 are rejected under 35 U.S.C. 102(a)(2) as being clearly anticipated by Wang et al (US 20230135088 A1).
Regarding claim 1, Wang et al teaches an apparatus for determining road profiles (Para 4, embodiments of the present disclosure relate to 3D surface estimation. In some embodiments, a 3D surface structure such as the 3D surface structure of a road (3D road surface) may be observed and estimated to generate a 3D point cloud or other representation of the 3D surface structure. i.e. see Fig 1 and 11 regarding road 'profiles'), the apparatus comprising: at least one memory; and at least one processor coupled to the at least one memory and configured to (Para 173, processes carried out by processor and memory): extract image features from one or more images of an environment (Fig 6, image data 102),
wherein the environment includes a road (Fig 6, deep learning model road surface estimator with multiple input heads. i.e. environment being captured includes a road);
generate a segmentation mask based on the image features; determine a subset of the image features based on the segmentation mask (Para 55, in some embodiments, image de-warping and/or distortion correction may be applied to the image data 102 prior to estimating 3D structure. A segmentation mask or other classification data may be used (e.g., by overlaying the classification data on the image data 102) to select points from the estimated 3D structure that are on a desired surface, such as a road surface. As such, the 3D structure estimator 105 may generate a representation of the 3D structure of a desired surface (e.g., sparse detection data 110), which may include a 3D point cloud (e.g., in 3D world coordinates). i.e. before generating an image-based 3D feature based on the segmented subset of image features, the segmentation mask is used to select points in the desired plane);
generate image-based three-dimensional features based on the subset of the image features (Para 82 and 83, the result may be a representation of the 3D surface structure of the road, such as a 2D height map, which may be transformed into a 3D point cloud (e.g., in 3D world coordinates). In operation, the deep learning model surface estimator 320 may repetitively operate on successive instances of the sparse detection data 110 (e.g., derived from sensor data captured during successive time slices separated by some designated interval) to predict successive instances of the dense detection data 120 (e.g., successive representations of corresponding portions of the 3D surface structure of the road), for example, as the vehicle 1700 of FIGS. 17A-17D moves through the 3D environment. i.e. generate image-based 3D features based on the subset of the image features);
obtain point-cloud-based three-dimensional features derived from a point cloud representative of the environment (Fig 6 and Para 51, a 3D structure estimator 105 may process the image data 102 to generate a representation of a 3D surface structure of interest (e.g., sparse detection data 110), which may comprise a 3D point cloud. i.e. the sparse detection data 110 input in Fig 6 is a point cloud);
combine the image-based three-dimensional features and the point-cloud-based three-dimensional features to generate combined three-dimensional features (Fig 6 and Para 81, as such, in some embodiments such as the one illustrated in FIG. 6, the deep learning model(s) 535 may learn from two different views of an observed surface structure (e.g., top-down and perspective, 3D point cloud space and 2D image space, etc.). i.e. combining image-based 3D features and point-cloud based 3D features to create 3D road surface representation using the multi-headed encoder in Fig 6);
and generate a road profile based on the combined three-dimensional features (Para 82, the deep learning model surface estimator 320 of FIG. 3 may be implemented using a variety of architectures for a constituent deep learning model (e.g., the deep learning model(s) 535 of FIG. 5 or 6) and/or some other machine learning model(s) to predict the dense detection data 120 from the sparse detection data 110. The result may be a representation of the 3D surface structure of the road, such as a 2D height map, which may be transformed into a 3D point cloud (e.g., in 3D world coordinates). i.e. generate road profile based on the combined three-dimensional features as taught in the multi-headed encoder in Fig 6).
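For illustration only, the following examiner-supplied sketch outlines the type of pipeline recited in claim 1 and described in Wang's Para 55 and 82-83: selecting road-only image features with a segmentation mask, combining them with point-cloud-based 3D data, and fitting a simple road profile. The function names, array shapes, and the least-squares plane fit are assumptions for exposition, not Wang's implementation.

```python
# Examiner's illustrative sketch only; not Wang's implementation.
# Names, shapes, and the plane-fit "road profile" are assumptions.
import numpy as np

def select_road_features(image_features, segmentation_mask):
    """Keep only the image features whose pixels the mask labels as road."""
    # image_features: (H, W, C); segmentation_mask: (H, W) with 1 = road
    return image_features[segmentation_mask == 1]             # (N_road, C)

def fuse_and_fit(image_points_3d, pointcloud_points_3d):
    """Combine the two 3D point sets and fit a simple road profile."""
    combined = np.concatenate([image_points_3d, pointcloud_points_3d], axis=0)
    # Least-squares plane z = a*x + b*y + c as a trivial stand-in for a profile.
    A = np.column_stack([combined[:, 0], combined[:, 1], np.ones(len(combined))])
    coeffs, *_ = np.linalg.lstsq(A, combined[:, 2], rcond=None)
    return coeffs                                              # (a, b, c)
```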
Regarding claim 2, Wang et al teaches the apparatus of claim 1, wherein the determined subset of image features comprise features related to at least one of the road or lane boundaries of the road (Para 83, vehicle 1700 may use this information (e.g., instances of obstacles) to navigate, plan, or otherwise perform one or more operations (e.g., obstacle or protuberance avoidance, lane keeping, lane changing, merging, splitting, adapting a suspension system of the ego-object or ego-actor to match the current road surface, applying an early acceleration or deceleration based on an approaching surface slope, mapping, etc.) within the environment. i.e. comprises road or lanes on road).
Regarding claim 3, Wang et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to generate image-based queries based on the subset of the image features (Para 81, generally, the image encoder 610 (and/or any other input head) may include any number of layers (e.g., convolutions, pooling, and/or other types of operations) to extract features into some latent space, and the extracted features may be combined (e.g., concatenated) with extracted features from the encoder portion of the encoder/decoder 540 (and/or extracted features from other input heads). As such, in some embodiments such as the one illustrated in FIG. 6, the deep learning model(s) 535 may learn from two different views of an observed surface structure (e.g., top-down and perspective, 3D point cloud space and 2D image space, etc.). i.e. image-based queries (extracted features into some latent space) are used to concatenate the features).
Regarding claim 4, Wang et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to generate an uncertainty map related to the subset of the image features, wherein the road profile is based at least in part on the uncertainty map (Para 73 and Fig 5 and Fig 1, the pre-processor 510 includes an encoder 515 that encodes the sparse detection data 110 into a representation that the deep learning model(s) 535 support. By way of non-limiting example, in some embodiments where the sparse detection data 110 includes a sparse 3D point cloud, the encoder 515 may project the sparse 3D point cloud to form a sparse projection image (e.g., a top-down height map). In some cases (e.g., without the normalizer 520), the resulting sparse projection image may be used as the input data 530 and fed into the deep learning model(s) 535 to predict the regression data 570 (e.g., a dense projection image such as a top-down height map) and/or the confidence data 580. In some cases, the regression data 570 and/or the confidence data 580 predicted by the deep learning model(s) 535 may be used as the dense detection data 120. i.e. the uncertainty map is the confidence data predicted by the deep learning model of the road surface).
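For illustration only, the regression-plus-confidence output discussed above (Wang's regression data 570 and confidence data 580) can be pictured with the following examiner-supplied sketch; the layer sizes, module names, and single-convolution heads are assumptions, not Wang's architecture.

```python
# Examiner's illustrative sketch only; not Wang's architecture.
import torch
import torch.nn as nn

class HeightAndConfidenceHead(nn.Module):
    """Predicts a dense height map (regression) and a per-cell confidence map."""
    def __init__(self, in_channels: int = 64):
        super().__init__()
        self.height = nn.Conv2d(in_channels, 1, kernel_size=1)      # regression head
        self.confidence = nn.Conv2d(in_channels, 1, kernel_size=1)  # confidence head

    def forward(self, features: torch.Tensor):
        height_map = self.height(features)
        conf_map = torch.sigmoid(self.confidence(features))         # values in [0, 1]
        return height_map, conf_map

# Usage: h, c = HeightAndConfidenceHead()(torch.randn(1, 64, 200, 200))
```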
Regarding claim 5, Wang et al teaches the apparatus of claim 1, wherein, to generate the image-based three-dimensional features, the at least one processor is further configured to unproject the subset of the image features (Fig 3 and 6 and Para 82, the result may be a representation of the 3D surface structure of the road, such as a 2D height map, which may be transformed into a 3D point cloud (e.g., in 3D world coordinates). i.e. also see Para 55, unproject the subset of image features to generate the image-based 3D feature).
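For illustration only, one conventional way to unproject masked 2D image points into 3D world coordinates is shown in the examiner-supplied sketch below; the pinhole-camera model, variable names, and pose convention are assumptions, not quotations from Wang.

```python
# Examiner's illustrative sketch only; pinhole-camera assumption.
import numpy as np

def unproject_pixels(u, v, depth, K, cam_to_world):
    """u, v, depth: (N,) arrays; K: 3x3 intrinsics; cam_to_world: 4x4 pose."""
    rays = np.linalg.inv(K) @ np.stack([u, v, np.ones_like(u)])  # (3, N) camera rays
    pts_cam = rays * depth                                       # scale rays by depth
    pts_h = np.vstack([pts_cam, np.ones((1, pts_cam.shape[1]))]) # homogeneous coords
    return (cam_to_world @ pts_h)[:3].T                          # (N, 3) world points
```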
Regarding claim 12, Wang et al teaches the apparatus of claim 1, wherein the road profile comprises coefficients of a polynomial representation of a surface of the road (Fig 11 and Para 121, FIG. 11 is an illustration of an example parametric mathematical model of a desired surface, in accordance with some embodiments of the present disclosure. In the example illustrated in FIG. 11, a 3D surface is modeled with longitudinal curve l and lateral curves q.sub.j. In an example embodiment in which the 3D surface being modeled is a 3D road surface, parameters of parametric equations that define the longitudinal curve l and the lateral curves q.sub.j may be varied to simulate different types of 3D road surfaces. i.e. the coefficients are part of the parameters of the parametric equations that define the longitudinal curve of the road surface).
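For illustration only, a polynomial surface whose least-squares coefficients constitute a road profile can be fit as in the examiner-supplied sketch below; the second-order basis is an assumption for exposition and is not the parametric curves l and q.sub.j of Wang's FIG. 11.

```python
# Examiner's illustrative sketch only; the polynomial basis is an assumption.
import numpy as np

def fit_polynomial_surface(points_3d):
    """Fit z = c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2 by least squares."""
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    basis = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    coeffs, *_ = np.linalg.lstsq(basis, z, rcond=None)
    return coeffs   # six coefficients describing the road surface
```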
Regarding claim 13, Wang et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to: obtain the point cloud; and generate the point-cloud-based three-dimensional features based on the point cloud (Para 61 and Para 62, by way of non-limiting example, one or more LiDAR sensors or RADAR sensors may be used to capture sparse detection data 110 (e.g., a LiDAR or RADAR point cloud). i.e. obtain point cloud and generate point-cloud 3D features based on the point cloud).
Regarding claim 14, Wang et al teaches the apparatus of claim 13, wherein the point cloud comprises a light detection and ranging (LIDAR) point cloud (Para 61, by way of non-limiting example, one or more LiDAR sensors or RADAR sensors may be used to capture sparse detection data 110 (e.g., a LiDAR or RADAR point cloud)).
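For illustration only, point-cloud-based features of the kind discussed for claims 13-14 may be derived by rasterizing a LiDAR point cloud into a top-down height grid, as in the examiner-supplied sketch below; the grid extents and resolution are assumptions, not values from Wang.

```python
# Examiner's illustrative sketch only; grid parameters are assumptions.
import numpy as np

def lidar_to_bev_height(points_3d, x_range=(0.0, 50.0), y_range=(-25.0, 25.0), cell=0.5):
    """Return a sparse top-down height map: max z observed per grid cell."""
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    height = np.full((nx, ny), np.nan)
    ix = ((points_3d[:, 0] - x_range[0]) / cell).astype(int)
    iy = ((points_3d[:, 1] - y_range[0]) / cell).astype(int)
    valid = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    for i, j, z in zip(ix[valid], iy[valid], points_3d[valid, 2]):
        height[i, j] = z if np.isnan(height[i, j]) else max(height[i, j], z)
    return height
```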
Regarding claim 15, Wang et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to perform an operation, wherein the operation is: determining a location of a vehicle relative to the road based on the road profile;
transmitting the road profile to a server, wherein the server is configured to generate or update a point map of the road;
planning a path of a vehicle based on the road profile; or detecting objects in the environment based on the road profile (Para 83, vehicle 1700 may use this information (e.g., instances of obstacles) to navigate, plan, or otherwise perform one or more operations (e.g., obstacle or protuberance avoidance, lane keeping, lane changing, merging, splitting, adapting a suspension system of the ego-object or ego-actor to match the current road surface, applying an early acceleration or deceleration based on an approaching surface slope, mapping, etc.) within the environment. i.e. the vehicle uses the 3D road surface information to determine a location of the vehicle, update a point map of the road, and plan the vehicle's path).
Regarding claim 16, Wang et al teaches the apparatus of claim 1, wherein the apparatus is included in an autonomous or semi-autonomous vehicle (Para 83 and Fig 1, once the 3D structure of the detected surface has been determined, positional values that are not already in 3D world coordinates may be converted to 3D world coordinates, associated with a corresponding class label identifying the detected surface (e.g., a road), and/or may be provided for use by the vehicle 1700 of FIGS. 17A-17D in performing one or more operations. For example, the dense detection data 120 (e.g., a 3D point cloud, a projection image, corresponding labels) may be used by control component(s) of the vehicle 1700, such as an autonomous driving software stack 122 executing on one or more components of the vehicle 1700 of FIGS. 17A-17D (e.g., the SoC(s) 1704, the CPU(s) 1718, the GPU(s) 1720, etc.). i.e. an autonomous vehicle uses the apparatus).
Regarding claims 17-21, claims 17-21 are rejected for the same reasons as claims 1-5, respectively.
Regarding claims 28-29, claims 28-29 are rejected for the same reasons as claims 12-13, respectively.
Regarding claim 30, claim 30 is rejected for the same reasons as claim 15.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 6 and 22 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al (US 20230135088 A1) in view of Wang (2) et al (US 20230154170 A1).
Regarding claim 6, Wang et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to: obtain image-based queries based on the one or more images;
and obtain point-cloud-based queries based on the point cloud;
wherein the image-based three-dimensional features and the point-cloud-based three-dimensional features are combined based on the image-based queries and the point-cloud-based queries (Para 81, generally, the image encoder 610 (and/or any other input head) may include any number of layers (e.g., convolutions, pooling, and/or other types of operations) to extract features into some latent space, and the extracted features may be combined (e.g., concatenated) with extracted features from the encoder portion of the encoder/decoder 540 (and/or extracted features from other input heads). As such, in some embodiments such as the one illustrated in FIG. 6, the deep learning model(s) 535 may learn from two different views of an observed surface structure (e.g., top-down and perspective, 3D point cloud space and 2D image space, etc.). i.e. image-based queries (extracted features into some latent space) are used to concatenate the features. See Para 51 and 82-83 regarding determining the point-cloud and image-based 3D features which are then used for the combination).
Wang et al does not teach that the combination is performed using a self-attention transformer specifically, though it does disclose a machine learning model (Fig 6, deep learning model(s) 535).
In a similar field of endeavor, Wang (2) et al teaches a self-attention transformer for fusing features (Fig 3C and 3D, Fig 5 discusses fusing image-based 3D features and point cloud 3D features based on a self-attention transformer).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang et al (US 20230135088 A1) with the teachings of Wang (2) et al (US 20230154170 A1) so that the apparatus includes a self-attention transformer for feature fusion. Doing so would provide improved performance, through various feature extractors or extraction operations, over previous approaches (Wang (2) et al., Para 69).
Regarding claim 22, claim 22 is rejected for the same reasons as claim 6 in the combination above.
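For illustration only, the fusion contemplated by the combination above (image-derived and point-cloud-derived feature tokens mixed by a self-attention transformer) can be pictured with the examiner-supplied sketch below; the token shapes, dimensions, and single-layer design are assumptions, not the implementation of Wang (2).

```python
# Examiner's illustrative sketch only; not Wang (2)'s implementation.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuses two modalities by self-attention over their concatenated tokens."""
    def __init__(self, dim: int = 128, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, image_tokens: torch.Tensor, pc_tokens: torch.Tensor):
        # Every token (from either modality) attends to every other token.
        tokens = torch.cat([image_tokens, pc_tokens], dim=1)   # (B, N_img + N_pc, dim)
        fused, _ = self.attn(tokens, tokens, tokens)
        return self.norm(tokens + fused)                        # residual + layer norm

# Usage: AttentionFusion()(torch.randn(1, 100, 128), torch.randn(1, 50, 128))
```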
Claims 7-9 and 23-25 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al (US 20230135088 A1) in view of Xu et al. (US 20200082614 A1).
Regarding claim 7, Wang et al does not teach the apparatus of claim 1, wherein the at least one processor is further configured to obtain map-based three-dimensional features derived from a point map of the environment, wherein the map-based three-dimensional features are also combined into the combined three-dimensional features.
In a similar field of endeavor, Xu et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to obtain map-based three-dimensional features derived from a point map of the environment, wherein the map-based three-dimensional features are also combined into the combined three-dimensional features (Para 202, i.e. the point map corresponds to the navigation data stream 202, which can include geographical data measured by the navigation sensors (e.g., GPS, IMU) of the 3D sensing devices, point cloud scans, and 2D color image data, including semantic segmentation; this is combined, via encoding, into the updated map dataset 216).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang et al (US 20230135088 A1) with the teachings of Xu et al. (US 20200082614 A1) so that the apparatus obtains map-based 3D features derived from a point map of the environment. Doing so would efficiently and accurately generate a high-fidelity three-dimensional representation of a physical environment that can include dynamic objects and scenarios (Para 2, Xu et al).
Regarding claim 8, Wang et al does not teach the apparatus of claim 7, wherein the at least one processor is further configured to: obtain a point map; and generate the map-based three-dimensional features based on the point map.
In a similar field of endeavor, Xu et al teaches the apparatus of claim 7, wherein the at least one processor is further configured to: obtain a point map; and generate the map-based three-dimensional features based on the point map (Fig 2, 202 -> 208, point cloud aggregation from navigation data stream, i.e. generate the map-based 3D features based on the point map).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang et al (US 20230135088 A1) with the teachings of Xu et al. (US 20200082614 A1) so that the apparatus generates the map-based 3D features based on the point map. Doing so would efficiently and accurately generate a high-fidelity three-dimensional representation of a physical environment that can include dynamic objects and scenarios (Para 2, Xu et al).
Regarding claim 9, Wang et al does not teach the apparatus of claim 8, wherein the point map comprises a high-definition (HD) map.
In a similar field of endeavor, Xu et al teaches the apparatus of claim 8, wherein the point map comprises a high-definition (HD) map (Para 70, the map data used for alignment can be "high-fidelity map data," which is used in an iterative process of combining point cloud data and generating 3D features).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang et al (US 20230135088 A1) with the teachings of Xu et al. (US 20200082614 A1) so that the point map includes a high-definition map. Doing so would efficiently and accurately generate a high-fidelity three-dimensional representation of a physical environment that can include dynamic objects and scenarios (Para 2, Xu et al).
Regarding claims 23-25, claims 23-25 are rejected for the same reasons as claims 7-9 in the combination above, respectively.
Claims 10-11 and 26-27 are rejected under 35 U.S.C. 103 as being unpatentable over Wang et al (US 20230135088 A1) in view of Kim et al (US 20240059296 A1).
Regarding claim 10, Wang et al does not teach the apparatus of claim 1, wherein the at least one processor is further configured to generate a perturbation map based on the combined three-dimensional features.
In a similar field of endeavor, Kim et al teaches the apparatus of claim 1, wherein the at least one processor is further configured to generate a perturbation map based on the combined three-dimensional features (Fig 4A and 4B, Para 153-158, i.e. generating a 2D perturbation map based on the combined 3D features. See Fig 3 regarding combination of point cloud and 3D local map features).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang et al (US 20230135088 A1) with the teachings of Kim et al (US 20240059296 A1) so that the apparatus generates a perturbation map based on the combined 3D features. Doing so would allow checking a road surface condition while the vehicle is driven and adjusting a suspension so that an impact is not transmitted to the driver when the vehicle passes a bump (Para 3, Kim et al).
Regarding claim 11, Wang et al does not teach the apparatus of claim 10, wherein the perturbation map comprises a representation of deviations from the road profile.
In a similar field of endeavor, Kim et al teaches the apparatus of claim 10, wherein the perturbation map comprises a representation of deviations from the road profile (Fig 4A and 4B, Para 153-158, i.e. the 2D perturbation map generated from the combined 3D features represents deviations from the road profile. See Fig 3 regarding the combination of point cloud and 3D local map features).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Wang et al (US 20230135088 A1) with the teachings of Kim et al (US 20240059296 A1) so that the deviations from the road profile are represented in the perturbation map. Doing so would allow checking a road surface condition while the vehicle is driven and adjusting a suspension so that an impact is not transmitted to the driver when the vehicle passes a bump (Para 3, Kim et al).
Regarding claims 26-27, claims 26-27 are rejected for the same reasons as claims 10-11 in the combination above, respectively.
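For illustration only, a perturbation map representing deviations from a fitted road profile may be computed as in the examiner-supplied sketch below; it reuses the hypothetical second-order polynomial basis from the sketch following the claim 12 discussion and is not Kim's implementation.

```python
# Examiner's illustrative sketch only; not Kim's implementation.
import numpy as np

def perturbation_map(points_3d, coeffs):
    """Residual between observed heights and the polynomial road profile."""
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    basis = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
    z_hat = basis @ coeffs        # predicted road-profile height at each (x, y)
    return z - z_hat              # positive = bump above profile; negative = dip
```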
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
WO 2020148183 A1
US 20220024485 A1
US 20190163990 A1
US 20190152487 A1
US 20170270372 A1
US 11645775 B1
US 11461915 B2
US 10339394 B2
US 10115024 B2
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACK PETER KRAYNAK whose telephone number is (703)756-1713. The examiner can normally be reached Monday - Friday 7:30 AM - 5 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/JACK PETER KRAYNAK/Examiner, Art Unit 2668
/UTPAL D SHAH/Primary Examiner, Art Unit 2668