Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1, 4-5, 7-10, 13-14, and 16-20 are rejected under 35 U.S.C. 103 as being unpatentable over Chang et al. (Argoverse: 3D Tracking and Forecasting with Rich Maps, hereinafter Chang) in view of Sen et al. (US 2021/0253131 A1, hereinafter Sen).
Regarding Claims 1, 10, and 19, Chang discloses
Claim 1: A method, comprising:
Claim 10: A system, comprising: a data store storing computer-executable instructions; and at least one processor configured to:
Claim 19: A non-transitory computer-readable media comprising computer-executable instructions that, when executed by a computing system, causes the computing system to:
receive a 3D image of an environment of an autonomous vehicle, the 3D image including a plurality of 3D data points, wherein at least one 3D data point of the plurality of 3D data points indicates a location of the at least one 3D data point in three dimensions (P. 5 Section 3.2 3D Track Annotations: “We only annotated objects within 5 m of the drivable area as defined by our map. For objects that are not visible for the entire segment duration, tracks are instantiated as soon as the object becomes visible in the LiDAR point cloud and tracks are terminated when the object ceases to be visible.”, P. 7 Section 4.1 Evaluation: “We have employed relatively simple baselines to track objects in 3D. We believe that our data enables new approaches to map-based and multimodal tracking research.”);
generate a filtered 3D image from the received 3D image, wherein to generate the filtered 3D image, execution of the computer-executable instructions cause the computing system to: remove at least one 3D data point from the 3D image that satisfies a threshold distance from the autonomous vehicle (P. 7 Section 4.1 Evaluation: “We apply a threshold (30, 50, 100 m) to the distance between vehicles and our ego-vehicle and only evaluate annotations and tracker output within that range.”, P. 8 Table 3: “Tracking accuracy comparison at different ranges while using different map attributes. From top to bottom, accuracy for vehicles within 30 m, 50 m, and 100 m.”),
remove at least one 3D data point from the 3D image that is located outside of a drivable area (P. 6 Subsection Drivable area: “Since our baseline is focused on vehicle tracking, we constrain our tracker to the driveable area as specified by the map. This driveable area covers any region where it is possible for the vehicle to drive (see Section 3.1). This constraint reduces the opportunities for false positives.”, P. 17 Figure 13: “remove_non_driveable_area_points: Uses rasterized driveable area ROI to decimate LiDAR point cloud to only ROI points.”),
wherein the drivable area is identified based on map data associated with a map of the environment of the autonomous vehicle (P. 5 Subsection Rasterized Drivable Area Map: “Our maps include binary driveable area labels at 1 meter grid resolution. A driveable area is an area where it is possible for a vehicle to drive (though not necessarily legal). Driveable areas can encompass a road’s shoulder in addition to the normal driveable area that is represented by a lane segment. We annotate 3D objects with track labels if they are within 5 meters of the driveable area (Section 3.2). We call this larger area our region of interest (ROI).”),
remove at least one 3D data point from the 3D image that satisfies a height threshold (P. 5 Rasterized Ground Height Map: “Finally, our maps include real-valued ground height at 1 meter grid resolution. Knowledge of ground height can be used to remove LiDAR returns on static ground surfaces and thus makes the 3D detection of dynamic objects easier.”, P. 8 Subsection Ground Height: “We use map information to remove LiDAR returns on the ground. In contrast to local ground-plane estimation methods, the map-based approach is effective in sloping and uneven environments.”),
identify at least one group of 3D data points in the filtered 3D image (P. 6 Subsection Baseline Tracker: “Our baseline tracking pipeline clusters LiDAR returns in driveable region (labeled by the map) to detect potential objects, uses Mask R-CNN [18] to prune non-vehicle LiDAR returns, associates clusters over time using nearest neighbor and the Hungarian algorithm, estimates transformations between clusters with iterative closest point (ICP), and estimates vehicle pose with a classical Kalman Filter using constant velocity motion model.”); and
generate an image annotation based on the at least one group of 3D data points (P. 6 Subsection Baseline Tracker: “If a cluster is not associated with current tracked objects, we initialize a new object ID for it.”).
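For illustration only, the map-based pruning steps quoted above from Chang (a distance threshold on range from the ego-vehicle, a drivable-area constraint, and removal of returns near the ground surface) can be sketched as follows. This is a minimal sketch, not code from Chang: the `is_drivable` and `ground_z` map-lookup callbacks are hypothetical placeholders for the rasterized drivable-area and ground-height maps, and the 0.3 m margin mirrors the 30 cm figure from Chang's Figure 13.

```python
import math

def filter_points(points, is_drivable, ground_z, max_range=100.0, ground_margin=0.3):
    """Filter a LiDAR point cloud given as a list of (x, y, z) tuples.

    Mirrors the three pruning steps cited above: a range threshold on the
    distance from the ego-vehicle (assumed at the origin), a drivable-area
    test, and removal of returns on or near the map's ground surface.
    """
    kept = []
    for x, y, z in points:
        if math.hypot(x, y) > max_range:         # beyond the distance threshold
            continue
        if not is_drivable(x, y):                # outside the drivable area
            continue
        if z <= ground_z(x, y) + ground_margin:  # on or near the ground surface
            continue
        kept.append((x, y, z))
    return kept
```

The surviving points would then be passed to a clustering step to identify candidate objects.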
However, Chang does not disclose
remove at least one 3D data point from the 3D image that corresponds to at least one identified object within the environment,
wherein the at least one identified object is identified using a machine learning model configured to identify objects in images.
Sen teaches
remove at least one 3D data point from the 3D image that corresponds to at least one identified object within the environment (Fig. 5 Step 502, Para [0024]: “The secondary perception system can analyze some or all of the sensor data to determine respective velocities and/or paths for the such unclassifiable and/or unclassified objects (e.g., which can be identified as clusters of sensor data points). For example, the secondary perception system can analyze all of the sensor data and/or filter the sensor data to remove classified sensor data and/or nonviable sensor data. Examples of sensor data that can be excluded from processing by the secondary perception sensor data include sensor data that is spurious, already classified by the primary perception system, outside of one or more region of interest (RoI) of an autonomous vehicle, and/or any other suitable classified and/or nonviable sensor data, or combination thereof.”, Para [0063]: “In some implementations, the means can be configured to obtain LIDAR data (e.g., a three-dimensional point cloud) obtained from a LIDAR system.”),
wherein the at least one identified object is identified using a machine learning model configured to identify objects in images (Para [0151]: “The method 600 can include, at 606, classifying each of the plurality of classifiable objects as a predefined class of a plurality of predefined classes. For example, in some implementations, one or more classification models (e.g., machine-learned models) can be configured to receive some or all of the sensor data and output classifications associated with objects in the sensor data (e.g., the plurality of classifiable objects).”).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chang with the removal of already-classified LiDAR data points, as taught by Sen, to effectively increase robustness when identifying objects for autonomous driving purposes. Both Chang and Sen are arts relating to gathering information for autonomous driving.
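A minimal sketch of the Sen-style pruning of already-classified returns is given below for illustration. The `project_to_image` camera projection and the 2D detector boxes are hypothetical placeholders (neither name comes from Sen); the idea is simply that 3D points whose image projections fall inside a box output by an image-based machine-learning detector are treated as already classified and removed.

```python
def remove_classified_points(points, project_to_image, detections):
    """Drop 3D points whose image projection falls inside a 2D box
    produced by an image-based object detector.

    `project_to_image` maps an (x, y, z) point to pixel coordinates
    (u, v), or None when the point is outside the camera frustum;
    `detections` is a list of (u_min, v_min, u_max, v_max) boxes.
    """
    kept = []
    for p in points:
        uv = project_to_image(p)
        if uv is not None and any(
            u0 <= uv[0] <= u1 and v0 <= uv[1] <= v1
            for (u0, v0, u1, v1) in detections
        ):
            continue  # point belongs to an already-classified object
        kept.append(p)
    return kept
```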
Regarding Claims 4, 13, and 20, dependent upon claims 1, 10, and 19 respectively, Chang in view of Sen teaches everything regarding claims 1, 10, and 19.
Chang further discloses
wherein removing at least one 3D data point from the 3D image that is located outside of a drivable area comprises: obtaining the map data associated with the map, identifying the drivable area within the map based on the map data (P. 5 Subsection Rasterized Drivable Area Map: “Our maps include binary driveable area labels at 1 meter grid resolution. A driveable area is an area where it is possible for a vehicle to drive (though not necessarily legal). Driveable areas can encompass a road’s shoulder in addition to the normal driveable area that is represented by a lane segment. We annotate 3D objects with track labels if they are within 5 meters of the driveable area (Section 3.2). We call this larger area our region of interest (ROI).”), and
removing the at least one 3D data point from the 3D image that is located outside of the identified drivable area (P. 6 Subsection Drivable area: “Since our baseline is focused on vehicle tracking, we constrain our tracker to the driveable area as specified by the map. This driveable area covers any region where it is possible for the vehicle to drive (see Section 3.1). This constraint reduces the opportunities for false positives.”, P. 17 Figure 13: “remove_non_driveable_area_points: Uses rasterized driveable area ROI to decimate LiDAR point cloud to only ROI points.”).
Regarding Claims 5 and 14, dependent upon claims 1 and 10 respectively, Chang in view of Sen teaches everything regarding claims 1 and 10.
Chang further discloses
wherein the height threshold is a first height threshold, the method further comprising removing at least one 3D data point that satisfies a second height threshold (P. 17 Figure 13: “remove_ground_surface: Removes all 3D points within 30 cm of the ground surface.”, P. 17 Figure 13: “get_ground_height_at_xy: Gets ground height at provided (x,y) coordinates.”; This suggests that there are multiple ground heights at different coordinates; as such, multiple height thresholds for the ground surface exist.).
Regarding Claims 7 and 16, dependent upon claims 1 and 10 respectively, Chang in view of Sen teaches everything regarding claims 1 and 10.
Chang further discloses
wherein the threshold distance is a first threshold distance (P. 7 Section 4.1 Evaluation: “We apply a threshold (30, 50, 100 m) to the distance between vehicles and our ego-vehicle and only evaluate annotations and tracker output within that range.”, P. 8 Table 3: “Tracking accuracy comparison at different ranges while using different map attributes. From top to bottom, accuracy for vehicles within 30 m, 50 m, and 100 m.”), and
wherein generating the filtered 3D image further comprises: removing at least one 3D data point from the 3D image that does not have a corresponding 3D data point in a subsequent 3D image that is within a second threshold distance (P. 6 Subsection Baseline Tracker: “Our baseline tracking pipeline clusters LiDAR returns in driveable region (labeled by the map) to detect potential objects, uses Mask R-CNN [18] to prune non-vehicle LiDAR returns, associates clusters over time using nearest neighbor and the Hungarian algorithm, estimates transformations between clusters with iterative closest point (ICP), and estimates vehicle pose with a classical Kalman Filter using constant velocity motion model. The same predefined bounding box size is used for all vehicles. When no match can be found by Hungarian method for an object, the object pose is maintained using only motion model up to 5 frames before being removed or associated to a new cluster. This enables our tracker to maintain same object ID even if the object is occluded for a short period of time and reappears.”; Using a Kalman filter and motion model to estimate a point means that the distance between the points and the current car is changing. Unless the two objects are moving at the same time, with the same velocity, and in the same direction, the distance between the current vehicle and the tracked object will change. As such, it is implied that there exists a different threshold distance created by the Kalman filter and the velocity model.).
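The cross-frame association step reasoned about above can be illustrated with a greedy nearest-neighbor matcher under a gating distance. This is a simplification for illustration only: Chang's baseline uses the Hungarian algorithm rather than a greedy pass, and the `gate` parameter is a hypothetical stand-in for the implied threshold distance.

```python
import math

def associate(predicted, observed, gate=2.0):
    """Greedy nearest-neighbor association between predicted track
    positions and observed cluster centroids, both given as (x, y)
    tuples. Pairs farther apart than `gate` are left unmatched, so a
    track with no nearby observation in the subsequent frame drops out.
    Returns a list of (track_index, observation_index) matches.
    """
    matches, used = [], set()
    for i, (px, py) in enumerate(predicted):
        best, best_d = None, gate
        for j, (ox, oy) in enumerate(observed):
            if j in used:
                continue
            d = math.hypot(px - ox, py - oy)
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            used.add(best)
            matches.append((i, best))
    return matches
```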
Regarding Claims 8 and 17, dependent upon claims 1 and 10 respectively, Chang in view of Sen teaches everything regarding claims 1 and 10.
Chang further discloses
wherein removing at least one 3D data point from the 3D image that corresponds to at least one identified object within the environment comprises: receiving an image of the environment from a camera (P. 1 Abstract: “The Argoverse 3D Tracking dataset includes 360º images from 7 cameras with overlapping fields of view, 3D point clouds from long range LiDAR, 6-DOF pose, and 3D track annotations.”, P. 5 Section 3.2 3D Track Annotations: “We only annotated objects within 5 m of the driveable area as defined by our map. For objects that are not visible for the entire segment duration, tracks are instantiated as soon as the object becomes visible in the LiDAR point cloud and tracks are terminated when the object ceases to be visible.”, P. 7 Section 4.1 Evaluation: “We have employed relatively simple baselines to track objects in 3D. We believe that our data enables new approaches to map-based and multimodal tracking research.”).
Sen further teaches
identifying, using the machine learning model, at least one object within the image (Para [0151]: “The method 600 can include, at 606, classifying each of the plurality of classifiable objects as a predefined class of a plurality of predefined classes. For example, in some implementations, one or more classification models (e.g., machine-learned models) can be configured to receive some or all of the sensor data and output classifications associated with objects in the sensor data (e.g., the plurality of classifiable objects).”), and
removing the at least one 3D data point from the 3D image that corresponds to the at least one identified object (Fig. 5 Step 502, Para [0024]: “The secondary perception system can analyze some or all of the sensor data to determine respective velocities and/or paths for the such unclassifiable and/or unclassified objects (e.g., which can be identified as clusters of sensor data points). For example, the secondary perception system can analyze all of the sensor data and/or filter the sensor data to remove classified sensor data and/or nonviable sensor data. Examples of sensor data that can be excluded from processing by the secondary perception sensor data include sensor data that is spurious, already classified by the primary perception system, outside of one or more region of interest (RoI) of an autonomous vehicle, and/or any other suitable classified and/or nonviable sensor data, or combination thereof.”, Para [0063]: “In some implementations, the means can be configured to obtain LIDAR data (e.g., a three-dimensional point cloud) obtained from a LIDAR system.”).
Regarding Claims 9 and 18, dependent upon claims 1 and 10 respectively, Chang in view of Sen teaches everything regarding claims 1 and 10.
Chang further discloses
wherein generating an image annotation comprises generating a 3D bounding box that surrounds the at least one group of 3D data points (P. 6 Subsection Baseline Tracker: “The same predefined bounding box size is used for all vehicles.”).
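For illustration, an axis-aligned 3D bounding box surrounding a cluster of points can be computed as below. Chang's baseline uses the same predefined (fixed-size) box for all vehicles, so this per-cluster extent computation is an illustrative simplification rather than Chang's method.

```python
def bounding_box_3d(cluster):
    """Return the axis-aligned 3D bounding box of a non-empty cluster
    of (x, y, z) points as a (min_corner, max_corner) pair of tuples.
    """
    xs, ys, zs = zip(*cluster)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))
```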
Claim(s) 2-3, 6, 11-12, and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Chang et al. (Argoverse: 3D Tracking and Forecasting with Rich Maps, hereinafter Chang) in view of Sen et al. (US 2021/0253131 A1, hereinafter Sen) and Wang et al. (A New Method of 3D Point Cloud Data Processing in Low-speed Self-driving Car, hereinafter Wang).
Regarding Claims 2 and 11, dependent upon claims 1 and 10 respectively, Chang in view of Sen teaches everything regarding claims 1 and 10.
However, Chang in view of Sen does not teach
the threshold distance is a first threshold distance, the method further comprising removing at least one 3D data point that satisfies a second threshold distance.
Wang teaches
the threshold distance is a first threshold distance, the method further comprising removing at least one 3D data point that satisfies a second threshold distance (P. 1 A. Point Cloud Filtering: “A valid data field is selected and the points which distances are out of the range are thrown away. It reduces the number of data greatly, and the calculation is sped up. The thresholds in three directions are selected as -30 m ≤ X ≤ 30 m, -30 m ≤ Y ≤ 30 m, -1.2 m ≤ Z ≤ 1.2 m.”).
It would have been prima facie obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Chang in view of Sen with the distance and height thresholds of Wang to effectively increase efficiency when processing point cloud data.
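Wang's valid-data-field crop follows directly from the quoted thresholds and can be sketched as below; the function name and default limits are illustrative, taken from the quoted -30 m/+30 m and -1.2 m/+1.2 m bounds.

```python
def crop_to_field(points, x_lim=30.0, y_lim=30.0, z_lim=1.2):
    """Keep only (x, y, z) points inside Wang's valid data field:
    -30 m <= X <= 30 m, -30 m <= Y <= 30 m, -1.2 m <= Z <= 1.2 m.
    Points outside the range are thrown away, reducing the data volume.
    """
    return [
        (x, y, z) for (x, y, z) in points
        if -x_lim <= x <= x_lim and -y_lim <= y <= y_lim and -z_lim <= z <= z_lim
    ]
```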
Regarding Claims 3 and 12, dependent upon claims 2 and 11 respectively, Chang in view of Sen and Wang teaches everything regarding claims 2 and 11.
Wang further teaches
wherein the at least one 3D data point that satisfies the first threshold distance is closer to the autonomous vehicle than the first threshold distance and the at least one 3D data point that satisfies the second threshold distance is farther away from the autonomous vehicle than the second threshold distance (P. 1 A. Point Cloud Filtering: “A valid data field is selected and the points which distances are out of the range are thrown away. It reduces the number of data greatly, and the calculation is sped up. The thresholds in three directions are selected as -30 m ≤ X ≤ 30 m, -30 m ≤ Y ≤ 30 m, -1.2 m ≤ Z ≤ 1.2 m.”).
Regarding Claims 6 and 15, dependent upon claims 5 and 14 respectively, Chang in view of Sen teaches everything regarding claims 5 and 14.
However, Chang in view of Sen does not teach
wherein the at least one 3D data point that satisfies the first height threshold is lower than the first height threshold and the at least one 3D data point that satisfies the second height threshold is higher than the second height threshold.
Wang teaches
wherein the at least one 3D data point that satisfies the first height threshold is lower than the first height threshold and the at least one 3D data point that satisfies the second height threshold is higher than the second height threshold (P. 1 A. Point Cloud Filtering: “A valid data field is selected and the points which distances are out of the range are thrown away. It reduces the number of data greatly, and the calculation is sped up. The thresholds in three directions are selected as -30 m ≤ X ≤ 30 m, -30 m ≤ Y ≤ 30 m, -1.2 m ≤ Z ≤ 1.2 m.”).
Relevant Prior Art Directed to State of Art
Liang et al. (US 11,500,099 B2, hereinafter Liang) is prior art not applied in the rejection(s) above. Liang discloses systems and methods that implement improved detection of objects in three-dimensional (3D) space. More particularly, an improved 3D object detection system can exploit continuous fusion of multiple sensors and/or integrated geographic prior map data to enhance effectiveness and robustness of object detection in applications such as autonomous driving. In some implementations, geographic prior data (e.g., geometric ground and/or semantic road features) can be exploited to enhance three-dimensional object detection for autonomous vehicle applications. In some implementations, object detection systems and methods can be improved based on dynamic utilization of multiple sensor modalities. More particularly, an improved 3D object detection system can exploit both LIDAR systems and cameras to perform very accurate localization of objects within three-dimensional space relative to an autonomous vehicle. For example, multi-sensor fusion can be implemented via continuous convolutions to fuse image data samples and LIDAR feature maps at different levels of resolution.
Jespersen et al. (US 11,158,120 B1, hereinafter Jespersen) is prior art not applied in the rejection(s) above. Jespersen discloses techniques for obtaining a range image related to a depth sensor of a vehicle operating in an environment. A first data point is identified in the range image with an intensity at or below a first intensity threshold. A first number of data points are determined in the range image that have an intensity at or above a second intensity threshold in a first region of the range image. Then, it is determined whether the first number of data points is at or above a region number threshold. The first data point is removed from the range image if the first number of data points is at or above the region number threshold. Operation of the vehicle is then facilitated in the environment based at least in part on the range image. Other embodiments may be described or claimed.
Khadem et al. (US 12,360,213 B2, hereinafter Khadem) is prior art not applied in the rejection(s) above. Khadem discloses techniques for segmenting and classifying a representation of aggregated sensor data from a scene. Sensor data may be collected during multiple traversals of a same scene, and the sensor data may be filtered to remove portions of the sensor data not relevant to road network maps. In some examples, the filtered data may be aggregated and represented in voxels of a three-dimensional voxel space, from which an image representing a top-down view of the scene may be generated, though other views are also contemplated. Operations may include segmenting and/or classifying the image, e.g., by a trained machine-learned model, to associate class labels indicative of map elements (e.g., driving lane, stop line, turn lane, and the like) with segments identified in the image. Additionally, techniques may create or update road network maps based on segmented and semantically labeled image(s) of various portions of an environment.
Saranin et al. (US 12,050,273 B2, hereinafter Saranin) is prior art not applied in the rejection(s) above. Saranin discloses systems/methods for object detection. The methods comprise: obtaining, by a computing device, a LiDAR dataset generated by a LiDAR system of the autonomous vehicle; and using, by a computing device, the LiDAR dataset and at least one image to detect an object that is in proximity to the autonomous vehicle. The object is detected by: generating a pruned LiDAR dataset by reducing a total number of points contained in the LiDAR dataset; and detecting the object in a point cloud defined by the pruned LiDAR dataset. The object detection may be used by the computing device to facilitate at least one autonomous driving operation.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JOSHUA CHEN whose telephone number is (703)756-5394. The examiner can normally be reached M-Th 9:30 am - 4:30 pm ET and F 9:30 am - 2:30 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, STEPHEN R KOZIOL can be reached at (408)918-7630. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J. C./ Examiner, Art Unit 2665
/WASSIM MAHROUKA/ Primary Examiner, Art Unit 2665