Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Applicant’s arguments, see the Response After Final filed on 01/21/2026, with respect to the Final Rejection, have been fully considered and are persuasive. The Final Rejection of 10/21/2025 has been withdrawn.
This action is responsive to the After Final filed on 01/21/2026. No claims have been amended.
Claims 1-20 are pending in this case. Claims 1, 8 and 15 are independent claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 8 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan et al. (US Pub No.: 20250095173 A1), hereinafter referred to as Srinivasan, in view of Vu et al. (US Pub No.: 20220289237 A1), hereinafter referred to as Vu, and further in view of NABATCHIAN et al. (US Pub No.: 20210347378 A1), hereinafter referred to as NABATCHIAN.
With respect to claim 1, Srinivasan discloses:
A machine-learning based (ML-based) method for determining traversability with uncertainty object estimation for one or more robot devices to navigate through one or more terrains, the ML-based method comprising: obtaining, by one or more hardware processors, optimum-fidelity scan data in a form of point cloud from one or more scanner devices, wherein the optimum-fidelity scan data generated by the one or more scanner devices comprise at least one of: one or more three-dimensional point clouds, information associated with optimum-resolution surfaces, depth and distance measurements, color and intensity attributes, spatial coordinates, associated with one or more environments (In Fig. 1 and paragraphs [0025-0026], Srinivasan discloses that LiDAR unit 112 gives LiDAR data, like point cloud data, to the autonomous driving controller 120 for vehicle 100. It can create a point cloud for a 3D area while camera 110 takes a picture of the same area. The point cloud includes points that show surfaces or objects in the area. These points are found by a light (like a laser) sent out by LiDAR unit 112 and bouncing back to it. By measuring the angle of the light and how long it takes to return, LiDAR unit 112 can figure out the 3D location of each point. The autonomous driving controller 120 gets image frames from camera 110 very quickly, at rates like 30, 60, 90, or 120 frames per second. It also receives point cloud data from LiDAR unit 112 at the same speed, so each point cloud matches an image frame (or frames from several cameras).)
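For illustration of the time-of-flight geometry summarized in the citation above, the following is a minimal Python sketch (all identifiers and values are hypothetical assumptions for illustration, not taken from Srinivasan): the round-trip time of a return gives the range, and the beam angles place the point in 3D.

```python
# Illustrative sketch only: converting a LiDAR return (range via time of
# flight, plus beam angles) into a 3D point. Hypothetical names/values.
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def lidar_return_to_xyz(round_trip_time_s: float,
                        azimuth_rad: float,
                        elevation_rad: float) -> tuple[float, float, float]:
    """Range = c * t / 2 (light travels out and back), then spherical to Cartesian."""
    r = SPEED_OF_LIGHT * round_trip_time_s / 2.0
    x = r * math.cos(elevation_rad) * math.cos(azimuth_rad)
    y = r * math.cos(elevation_rad) * math.sin(azimuth_rad)
    z = r * math.sin(elevation_rad)
    return (x, y, z)

# Example: a return after ~66.7 ns corresponds to a point ~10 m away.
print(lidar_return_to_xyz(66.7e-9, math.radians(30), math.radians(-5)))
```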
generating, by the one or more hardware processors, a dense point cloud based upon the point cloud from one or more scanner devices and including one or more ground-truth map features from the elevation map of the one or more environments (In Fig. 1 and paragraph [0025], Srinivasan discloses that LiDAR unit 112 gives LiDAR data, like point cloud data, to the autonomous driving controller 120 for vehicle 100. It can create a point cloud for a 3D area while camera 110 takes a picture of the same area. The point cloud includes points that show surfaces or objects in the area. In paragraph [0027], Srinivasan discloses that a ground truth dense depth map may be used to compare the generated depth maps to the ground truth dense depth map.)
With respect to claim 1, Srinivasan does not explicitly disclose:
generating, by the one or more hardware processors, an elevation map including elevation cells depicting changes in elevation associated with features of the one or more terrains and free space of the one or more environments by applying at least one of: an elevation mapping model and free-space detection model on the point cloud
generating, by the one or more hardware processors, a synthetic point cloud based on projecting the dense point cloud of the one or more environments into at least one frame defined by a respective at least one pose representing a virtual viewpoint of the one or more environments
predicting, by the one or more hardware processors, one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model
determining, by the one or more hardware processors, the traversability with the uncertainty object estimation, which adapts the one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein determining the traversability with the uncertainty object estimation includes calculating, by the one or more hardware processors, the traversability as a probability the predicted one or more traversability features are below respective critical threshold values
However, Vu discloses:
Generating, by the one or more hardware processors, an elevation map including elevation cells depicting changes in elevation associated with features of the one or more terrains and free space of the one or more environments by applying at least one of: an elevation mapping model and free-space detection model on the point cloud (In FIG. 4 and paragraph [0023], Vu discloses an occupancy grid generator system 208 that generates a two-dimensional (2D) occupancy grid 214 based on the remaining data points, where the occupancy grid 214 comprises a plurality of grid cells. The grid cells include respective likelihoods that an obstacle is present in regions of the environment of AV 100 represented by the grid cells, and these likelihoods can be updated over time to account for factors such as noise, dust, and rain.)
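As a purely illustrative sketch of per-cell obstacle likelihoods of the kind described in the citation above (hypothetical names and parameters; not Vu's implementation):

```python
# Illustrative sketch only (hypothetical names): a 2D occupancy grid whose
# cells hold a likelihood that the region contains an obstacle.
import numpy as np

def build_occupancy_grid(points_xy: np.ndarray,
                         grid_size: int = 100,
                         cell_m: float = 0.5,
                         hit_weight: float = 0.2) -> np.ndarray:
    """points_xy: (N, 2) obstacle points in the vehicle frame; grid centered on the vehicle."""
    grid = np.zeros((grid_size, grid_size), dtype=float)
    half = grid_size * cell_m / 2.0
    for x, y in points_xy:
        i = int((x + half) / cell_m)
        j = int((y + half) / cell_m)
        if 0 <= i < grid_size and 0 <= j < grid_size:
            # Each hit nudges the cell's obstacle likelihood toward 1, which
            # tolerates spurious returns from noise, dust, or rain.
            grid[i, j] += hit_weight * (1.0 - grid[i, j])
    return grid

rng = np.random.default_rng(0)
print(build_occupancy_grid(rng.normal(0.0, 5.0, size=(200, 2))).max())
```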
Generating, by the one or more hardware processors, a synthetic point cloud based on projecting the dense point cloud of the one or more environments into at least one frame defined by a respective at least one pose representing a virtual viewpoint of the one or more environments (In paragraph [0050], Vu discloses a point cloud 302 made up of data points from a LIDAR system used in the AV 300. These points have three-dimensional coordinates that show where laser light bounced off surfaces outside. The point cloud 302 shows measurements of the ground and objects above it, like buildings, trees, and cones. In Fig. 4 and paragraph [0051], Vu discloses a ground surface mesh 400 representing the area around the AV 300. This mesh is created by the computing system 104 using the point cloud 302 from FIG. 3. The mesh has many nodes 402, each showing how high the ground is at that spot compared to the LIDAR system in the AV 300. Each node 402 comes from one or more data points in the point cloud 302. In Fig. 4 and paragraph [0052], Vu further discloses data points representing objects in the environment of the AV 300. As discussed above, the data points can be filtered based on their heights relative to the ground surface mesh 400. The remaining data points represent potential obstacles to the AV 300 in the environment of the AV 300.)
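As a purely illustrative sketch of projecting a point cloud into a frame defined by a pose representing a virtual viewpoint (hypothetical names; the rigid-transform arithmetic is standard and not taken from Vu):

```python
# Illustrative sketch only (hypothetical names): express a point cloud in the
# frame of a pose, i.e. apply p' = R^T (p - t) to every point.
import numpy as np

def project_into_pose_frame(points: np.ndarray,
                            rotation_wv: np.ndarray,
                            translation_wv: np.ndarray) -> np.ndarray:
    """points: (N, 3) in the world frame; pose (R, t) maps viewpoint to world.
    Returns the same points expressed in the viewpoint frame."""
    return (points - translation_wv) @ rotation_wv  # row-wise R^T (p - t)

# Example: a virtual viewpoint 2 m above the origin, yawed 90 degrees.
yaw = np.pi / 2
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([0.0, 0.0, 2.0])
cloud = np.array([[1.0, 0.0, 2.0]])
print(project_into_pose_frame(cloud, R, t))  # -> [[0., -1., 0.]]
```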
Srinivasan and Vu are analogous art because both references concern methods of determining how best to operate a vehicle according to applicable traffic laws, safety guidelines, external objects, roads, and the like. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan, which teaches extracting point cloud features from a point cloud representation of the area, the point cloud features representing the objects in the area, with generating a two-dimensional occupancy grid comprising a plurality of grid cells, wherein each grid cell of the plurality of grid cells includes an occupancy probability indicating a likelihood that a region represented by the corresponding grid cell is occupied by the object, as taught by Vu. The motivation for doing so would have been to reduce the computational cost and improve the efficiency of the model (See [0023] of Srinivasan).
With respect to claim 1, Srinivasan and Vu do not explicitly disclose:
predicting, by the one or more hardware processors, one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model
determining, by the one or more hardware processors, the traversability with the uncertainty object estimation, which adapts the one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein determining the traversability with the uncertainty object estimation includes calculating, by the one or more hardware processors, the traversability as a probability the predicted one or more traversability features are below respective critical threshold values
However, NABATCHIAN discloses:
Predicting, by the one or more hardware processors, one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model (In paragraph [0085], NABATCHIAN discloses that predicted features 315 are a group of vectors that show information about one or more areas of interest from the map feature vectors 310 and the vehicle path from the path feature vectors. In paragraph [0122], NABATCHIAN discloses that the map generation module 350 generates a set of predicted features 315 based on the concatenated feature vectors 313, where the predicted features 315 are indicative of one or more areas of interest and the vehicle's trajectory.)
Determining, by the one or more hardware processors, the traversability with the uncertainty object estimation, which adapts the one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein determining the traversability with the uncertainty object estimation includes calculating, by the one or more hardware processors, the traversability as a probability the predicted one or more traversability features are below respective critical threshold values (In paragraph [0087], NABATCHIAN discloses that the minimum Euclidean distance is the shortest straight-line distance between two points. For the system to consider a moving object important, this distance needs to be below a certain limit. This limit is learned by the system during training and can depend on factors such as how fast the object and the vehicle are moving, the vehicle's planned path, and the types of vehicles involved. An administrator can also set this limit if needed.)
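As a purely illustrative sketch of computing traversability as the probability that predicted features fall below critical thresholds (hypothetical names, thresholds, and an assumed per-feature Gaussian with independence; not NABATCHIAN's implementation):

```python
# Illustrative sketch only (hypothetical names and thresholds): traversability
# as the probability that each predicted feature is below its critical
# threshold, given a Gaussian (mean, sigma) per feature.
from statistics import NormalDist

def traversability_probability(means: dict, sigmas: dict, thresholds: dict) -> float:
    """Treats features as independent: multiply per-feature P(feature < threshold)."""
    p = 1.0
    for name, mu in means.items():
        p *= NormalDist(mu=mu, sigma=sigmas[name]).cdf(thresholds[name])
    return p

# Example: predicted step/slope/roughness with uncertainty vs. critical limits.
print(traversability_probability(
    means={"step_m": 0.10, "slope_rad": 0.20, "roughness": 0.05},
    sigmas={"step_m": 0.03, "slope_rad": 0.05, "roughness": 0.02},
    thresholds={"step_m": 0.15, "slope_rad": 0.35, "roughness": 0.10},
))
```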
Srinivasan in view of Vu and NABATCHIAN are analogous art because all of the references concern methods of determining how best to operate a vehicle according to applicable traffic laws, safety guidelines, external objects, roads, and the like. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu with receiving information representative of a planned path for the vehicle, as taught by NABATCHIAN. The motivation for doing so would have been to filter the object detection region and reduce the computation of the object detection, segmentation, and tracking components (See [0008] of NABATCHIAN).
With respect to claim 8, Srinivasan discloses:
A machine-learning based (ML-based) system for determining traversability with uncertainty object estimation for one or more robot devices to navigate through one or more terrains, the ML-based system comprising: one or more hardware processors (In paragraph [0081], Srinivasan discloses a memory configured to store a neural network model for the neural network; and a processing system comprising one or more processors implemented in circuitry, the processing system being configured to: extract image features from an image of an area, the image features representing objects in the area; extract point cloud features from a point cloud representation of the area.)
A memory coupled to the one or more hardware processors, wherein the memory comprises a plurality of subsystems in form of programmable instructions executable by the one or more hardware processors, and wherein the plurality of subsystems comprises: a data obtaining subsystem configured to obtain optimum-fidelity scan data in a form of point cloud from one or more scanner devices, wherein the optimum-fidelity scan data generated by the one or more scanner devices comprise at least one of: one or more three-dimensional point clouds, information associated with optimum-resolution surfaces, depth and distance measurements, color and intensity attributes, spatial coordinates, associated with one or more environments (In Fig. 1 and paragraphs [0025-0026], Srinivasan discloses that LiDAR unit 112 gives LiDAR data, like point cloud data, to the autonomous driving controller 120 for vehicle 100. It can create a point cloud for a 3D area while camera 110 takes a picture of the same area. The point cloud includes points that show surfaces or objects in the area. These points are found by a light (like a laser) sent out by LiDAR unit 112 and bouncing back to it. By measuring the angle of the light and how long it takes to return, LiDAR unit 112 can figure out the 3D location of each point. The autonomous driving controller 120 gets image frames from camera 110 very quickly, at rates like 30, 60, 90, or 120 frames per second. It also receives point cloud data from LiDAR unit 112 at the same speed, so each point cloud matches an image frame (or frames from several cameras).)
A point cloud generating subsystem configured to: generate a dense point cloud based upon the point cloud from one or more scanner devices and including one or more ground-truth map features from the elevation map of the one or more environments (In Fig. 1 and paragraph [0025], Srinivasan discloses that LiDAR unit 112 gives LiDAR data, like point cloud data, to the autonomous driving controller 120 for vehicle 100. It can create a point cloud for a 3D area while camera 110 takes a picture of the same area. The point cloud includes points that show surfaces or objects in the area. In paragraph [0027], Srinivasan discloses that a ground truth dense depth map may be used to compare the generated depth maps to the ground truth dense depth map.)
With respect to claim 8, Srinivasan does not explicitly disclose:
an elevation map generating subsystem configured to generate an elevation map including elevation cells depicting changes in elevation associated features of the one or more terrains and free space of the one or more environments by applying at least one of: an elevation mapping model and free-space detection model on the point cloud
generate a synthetic point cloud based on projecting the dense point cloud of the one or more environments into at least one frame defined by a respective at least one pose representing a virtual viewpoint of the one or more environments
a traversability predicting subsystem configured to: predict one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model
determine the traversability with the uncertainty object estimation, which adapts the one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein to determine the traversability with the uncertainty object estimation includes calculating the traversability as a probability the predicted one or more traversability features are below respective critical threshold values
However, Vu discloses:
An elevation map generating subsystem configured to generate an elevation map including elevation cells depicting changes in elevation associated features of the one or more terrains and free space of the one or more environments by applying at least one of: an elevation mapping model and free-space detection model on the point cloud (In FIG. 4 and paragraph [0023], Vu discloses an occupancy grid generator system 208 that generates a two-dimensional (2D) occupancy grid 214 based on the remaining data points, where the occupancy grid 214 comprises a plurality of grid cells. The grid cells include respective likelihoods that an obstacle is present in regions of the environment of AV 100 represented by the grid cells, and these likelihoods can be updated over time to account for factors such as noise, dust, and rain.)
Generate a synthetic point cloud based on projecting the dense point cloud of the one or more environments into at least one frame defined by a respective at least one pose representing a virtual viewpoint of the one or more environments (In paragraph [0050], Vu discloses a point cloud 302 made up of data points from a LIDAR system used in the AV 300. These points have three-dimensional coordinates that show where laser light bounced off surfaces outside. The point cloud 302 shows measurements of the ground and objects above it, like buildings, trees, and cones. In Fig. 4 and paragraph [0051], Vu discloses a ground surface mesh 400 representing the area around the AV 300. This mesh is created by the computing system 104 using the point cloud 302 from FIG. 3. The mesh has many nodes 402, each showing how high the ground is at that spot compared to the LIDAR system in the AV 300. Each node 402 comes from one or more data points in the point cloud 302. In Fig. 4 and paragraph [0052], Vu further discloses data points representing objects in the environment of the AV 300. As discussed above, the data points can be filtered based on their heights relative to the ground surface mesh 400. The remaining data points represent potential obstacles to the AV 300 in the environment of the AV 300.)
Srinivasan and Vu are analogous art because both references concern methods of determining how best to operate a vehicle according to applicable traffic laws, safety guidelines, external objects, roads, and the like. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan, which teaches extracting point cloud features from a point cloud representation of the area, the point cloud features representing the objects in the area, with generating a two-dimensional occupancy grid comprising a plurality of grid cells, wherein each grid cell of the plurality of grid cells includes an occupancy probability indicating a likelihood that a region represented by the corresponding grid cell is occupied by the object, as taught by Vu. The motivation for doing so would have been to reduce the computational cost and improve the efficiency of the model (See [0023] of Srinivasan).
With respect to claim 8, Srinivasan and Vu do not explicitly disclose:
a traversability predicting subsystem configured to: predict one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model
determine the traversability with the uncertainty object estimation, which adapts the one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein to determine the traversability with the uncertainty object estimation includes calculating the traversability
However, NABATCHIAN discloses:
A traversability predicting subsystem configured to: predict one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model (In paragraph [0085], NABATCHIAN discloses that predicted features 315 are a group of vectors that show information about one or more areas of interest from the map feature vectors 310 and the vehicle path from the path feature vectors. In paragraph [0122], NABATCHIAN discloses that the map generation module 350 generates a set of predicted features 315 based on the concatenated feature vectors 313, where the predicted features 315 are indicative of one or more areas of interest and the vehicle's trajectory.)
Determine the traversability with the uncertainty object estimation, which adapts the one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein to determine the traversability with the uncertainty object estimation includes calculating the traversability (In paragraph [0087], NABATCHIAN discloses that the minimum Euclidean distance is the shortest straight-line distance between two points. For the system to consider a moving object important, this distance needs to be below a certain limit. This limit is learned by the system during training and can depend on factors such as how fast the object and the vehicle are moving, the vehicle's planned path, and the types of vehicles involved. An administrator can also set this limit if needed.)
Srinivasan in view of Vu and NABATCHIAN are analogous art because all of the references concern methods of determining how best to operate a vehicle according to applicable traffic laws, safety guidelines, external objects, roads, and the like. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu with receiving information representative of a planned path for the vehicle, as taught by NABATCHIAN. The motivation for doing so would have been to filter the object detection region and reduce the computation of the object detection, segmentation, and tracking components (See [0008] of NABATCHIAN).
With respect to claim 15, Srinivasan discloses:
A non-transitory computer-readable storage medium having instructions stored therein that when executed by one or more hardware processors, cause the one or more hardware processors to execute operations of: obtaining optimum-fidelity scan data in a form of point cloud from one or more scanner devices, wherein the optimum-fidelity scan data generated by the one or more scanner devices comprise at least one of: one or more three-dimensional point clouds, information associated with optimum-resolution surfaces, depth and distance measurements, color and intensity attributes, spatial coordinates, associated with one or more environments including one or more terrains (In Fig. 1 and paragraphs [0025-0026], Srinivasan discloses that LiDAR unit 112 gives LiDAR data, like point cloud data, to the autonomous driving controller 120 for vehicle 100. It can create a point cloud for a 3D area while camera 110 takes a picture of the same area. The point cloud includes points that show surfaces or objects in the area. These points are found by a light (like a laser) sent out by LiDAR unit 112 and bouncing back to it. By measuring the angle of the light and how long it takes to return, LiDAR unit 112 can figure out the 3D location of each point. The autonomous driving controller 120 gets image frames from camera 110 very quickly, at rates like 30, 60, 90, or 120 frames per second. It also receives point cloud data from LiDAR unit 112 at the same speed, so each point cloud matches an image frame (or frames from several cameras).)
generating a dense point cloud based upon the point cloud from one or more scanner devices and including one or more ground-truth map features from the elevation map of the one or more environments (In Fig. 1 and paragraph [0025], Srinivasan discloses that LiDAR unit 112 gives LiDAR data, like point cloud data, to the autonomous driving controller 120 for vehicle 100. It can create a point cloud for a 3D area while camera 110 takes a picture of the same area. The point cloud includes points that show surfaces or objects in the area. In paragraph [0027], Srinivasan discloses that a ground truth dense depth map may be used to compare the generated depth maps to the ground truth dense depth map.)
With respect to claim 15, Srinivasan does not explicitly disclose:
generating an elevation map including elevation cells depicting changes in elevation associated features of the one or more terrains and free space of the one or more environments by applying at least one of: an elevation mapping model and free-space detection model on the point cloud
generating a synthetic point cloud based on projecting the dense point cloud of the one or more environments into at least one frame defined by a respective at least one pose representing a virtual viewpoint of the one or more environments
predicting one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model
determining the traversability with uncertainty object estimation, which adapts one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein determining the traversability with the uncertainty object estimation includes calculating, by the one or more hardware processors, the traversability as a probability the predicted one or more traversability features are below respective critical threshold values
However, Vu discloses:
Generating an elevation map including elevation cells depicting changes in elevation associated features of the one or more terrains and free space of the one or more environments by applying at least one of: an elevation mapping model and free-space detection model on the point cloud (In FIG. 4 and paragraph [0023], Vu discloses an occupancy grid generator system 208 that generates a two-dimensional (2D) occupancy grid 214 based on the remaining data points, where the occupancy grid 214 comprises a plurality of grid cells. The grid cells include respective likelihoods that an obstacle is present in regions of the environment of AV 100 represented by the grid cells, and these likelihoods can be updated over time to account for factors such as noise, dust, and rain.)
generating a synthetic point cloud based on projecting the dense point cloud of the one or more environments into at least one frame defined by a respective at least one pose representing a virtual viewpoint of the one or more environments (In paragraph [0050], Vu discloses a point cloud 302 made up of data points from a LIDAR system used in the AV 300. These points have three-dimensional coordinates that show where laser light bounced off surfaces outside. The point cloud 302 shows measurements of the ground and objects above it, like buildings, trees, and cones. In Fig. 4 and paragraph [0051], Vu discloses a ground surface mesh 400 representing the area around the AV 300. This mesh is created by the computing system 104 using the point cloud 302 from FIG. 3. The mesh has many nodes 402, each showing how high the ground is at that spot compared to the LIDAR system in the AV 300. Each node 402 comes from one or more data points in the point cloud 302. In Fig. 4 and paragraph [0052], Vu further discloses data points representing objects in the environment of the AV 300. As discussed above, the data points can be filtered based on their heights relative to the ground surface mesh 400. The remaining data points represent potential obstacles to the AV 300 in the environment of the AV 300.)
Srinivasan and Vu are analogous art because both references concern methods of determining how best to operate a vehicle according to applicable traffic laws, safety guidelines, external objects, roads, and the like. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan, which teaches extracting point cloud features from a point cloud representation of the area, the point cloud features representing the objects in the area, with generating a two-dimensional occupancy grid comprising a plurality of grid cells, wherein each grid cell of the plurality of grid cells includes an occupancy probability indicating a likelihood that a region represented by the corresponding grid cell is occupied by the object, as taught by Vu. The motivation for doing so would have been to reduce the computational cost and improve the efficiency of the model (See [0023] of Srinivasan).
With respect to claim 15, Srinivasan and Vu do not explicitly disclose:
predicting one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model
determining the traversability with uncertainty object estimation, which adapts one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein determining the traversability with the uncertainty object estimation includes calculating, by the one or more hardware processors, the traversability as a probability the predicted one or more traversability features are below respective critical threshold values
However, NABATCHIAN discloses:
Predicting one or more traversability features from the synthetic point cloud associated with the one or more environments using a machine learning (ML) model (In paragraph [0085], NABATCHIAN discloses that predicted features 315 are a group of vectors that show information about one or more areas of interest from the map feature vectors 310 and the vehicle path from the path feature vectors. In paragraph [0122], NABATCHIAN discloses that the map generation module 350 generates a set of predicted features 315 based on the concatenated feature vectors 313, where the predicted features 315 are indicative of one or more areas of interest and the vehicle's trajectory.)
Determining the traversability with uncertainty object estimation, which adapts one or more robot devices to navigate on the one or more terrains, based on the one or more traversability features predicted from the ML model, wherein determining the traversability with the uncertainty object estimation includes calculating, by the one or more hardware processors, the traversability as a probability the predicted one or more traversability features are below respective critical threshold values (In paragraph [0087], NABATCHIAN discloses that the minimum Euclidean distance is the shortest straight-line distance between two points. For the system to consider a moving object important, this distance needs to be below a certain limit. This limit is learned by the system during training and can depend on factors such as how fast the object and the vehicle are moving, the vehicle's planned path, and the types of vehicles involved. An administrator can also set this limit if needed.)
Srinivasan in view of Vu and NABATCHIAN are analogous art because all of the references concern methods of determining how best to operate a vehicle according to applicable traffic laws, safety guidelines, external objects, roads, and the like. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu with receiving information representative of a planned path for the vehicle, as taught by NABATCHIAN. The motivation for doing so would have been to filter the object detection region and reduce the computation of the object detection, segmentation, and tracking components (See [0008] of NABATCHIAN).
Claims 2 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan in view of Vu and NABATCHIAN, and further in view of Theverapperuma et al. (US Pub No.: 20220024485 A1), hereinafter referred to as Theverapperuma.
Regarding claim 2, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 1. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based method of claim 1, further comprising extracting, by the one or more hardware processors, the one or more traversability features comprising at least one of: step, slope, and roughness, of the one or more terrains, from one or more neighborhoods of one or more elevation cells, based on an analysis of the elevation map of the one or more environments
However, Theverapperuma discloses the limitation (In paragraph [0104], Theverapperuma discloses that the DSES 430 can receive real-time information about different features from three modules: the depth estimation module 424, the surface segmentation module 426, and the object detection module 428. For instance, the depth estimation module 424 can provide data on the slope of a road surface.)
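As a purely illustrative sketch of extracting step, slope, and roughness features from the neighborhood of an elevation cell, as the claim language above recites (hypothetical names, window size, and feature definitions; not Theverapperuma's implementation):

```python
# Illustrative sketch only (hypothetical names): step, slope, and roughness
# computed from the 3x3 neighborhood of an elevation-map cell.
import numpy as np

def terrain_features(elev: np.ndarray, i: int, j: int, cell_m: float = 0.1) -> dict:
    """elev: 2D elevation map (meters); features from the 3x3 neighborhood of (i, j)."""
    window = elev[i - 1:i + 2, j - 1:j + 2]
    step = float(window.max() - window.min())           # largest height jump
    gy, gx = np.gradient(window, cell_m)                # finite-difference slopes
    slope = float(np.arctan(np.hypot(gx, gy)).max())    # steepest incline (rad)
    roughness = float(window.std())                     # deviation from flat
    return {"step_m": step, "slope_rad": slope, "roughness_m": roughness}

elev = np.array([[0.00, 0.02, 0.05],
                 [0.01, 0.03, 0.08],
                 [0.02, 0.06, 0.12]])
print(terrain_features(elev, 1, 1))
```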
Srinivasan in view of Vu, NABATCHIAN and Theverapperuma are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with generating information usable for making a decision as to whether a particular surface is drivable and for estimating the attributes of the particular surface, as taught by Theverapperuma. The motivation for doing so would have been to improve the accuracy with which the boundaries of objects and surfaces are identified in 3D space (See [0107] of Theverapperuma).
Regarding claim 9, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 8. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based system of claim 8, wherein the traversability predicting subsystem is configured to extract the one or more traversability features comprising at least one of: step, slope, and roughness, of the one or more terrains, from one or more neighborhoods of one or more elevation cells, based on an analysis of the elevation map of the one or more environment
However, Theverapperuma discloses the limitation (In paragraph [0104], Theverapperuma discloses that the DSES 430 can receive real-time information about different features from three modules: the depth estimation module 424, the surface segmentation module 426, and the object detection module 428. For instance, the depth estimation module 424 can provide data on the slope of a road surface.)
Srinivasan in view of Vu, NABATCHIAN and Theverapperuma are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with generating information usable for making a decision as to whether a particular surface is drivable and for estimating the attributes of the particular surface, as taught by Theverapperuma. The motivation for doing so would have been to improve the accuracy with which the boundaries of objects and surfaces are identified in 3D space (See [0107] of Theverapperuma).
Claims 3, 10 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan in view of Vu and NABATCHIAN, and further in view of STENNETH et al. (US Pub No.: 20200050973 A1), hereinafter referred to as STENNETH.
Regarding claim 3, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 1. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based method of claim 1, further comprising training, by the one or more hardware processors, the ML model, by: obtaining, by the one or more hardware processors, one or more training datasets from the generated dense point cloud with the one or more ground-truth map features
training, by the one or more hardware processors, the ML model with the one or more training datasets obtained from the generated dense point cloud with the one or more ground-truth map features
predicting, by the one or more hardware processors, the one or more traversability features using the trained ML model
However, STENNETH discloses the limitations:
The ML-based method of claim 1, further comprising training, by the one or more hardware processors, the ML model, by: obtaining, by the one or more hardware processors, one or more training datasets from the generated dense point cloud with the one or more ground-truth map features (In Fig. 5 and paragraph [0101], STENNETH discloses extracting a plurality of features for the pre-processed road observations, wherein the plurality of features include sensor-based features and map-based features with ground truth data.)
training, by the one or more hardware processors, the ML model with the one or more training datasets obtained from the generated dense point cloud with the one or more ground-truth map features (In Fig. 5 and paragraph [0102], STENNETH discloses training an ML model based on a plurality of sensors based on features and the plurality of map-based features with ground-truth data.)
predicting, by the one or more hardware processors, the one or more traversability features using the trained ML model (In Fig. 5 and paragraph [0105], STENNETH discloses predicting the location of the road sign based on the trained machine learning model.)
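As a purely illustrative sketch of the obtain-train-predict loop recited above (hypothetical, randomly generated data and a stand-in scikit-learn model; not STENNETH's implementation):

```python
# Illustrative sketch only (hypothetical data and names): build a dataset
# pairing per-cell inputs with ground-truth features, fit an ML model, then
# predict traversability features for cells from a new synthetic cloud.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Stand-in training set: per-cell statistics derived from a dense point
# cloud, paired with ground-truth features (e.g. step, slope, roughness).
X_train = rng.normal(size=(500, 8))   # 8 per-cell input statistics
y_train = rng.normal(size=(500, 3))   # 3 ground-truth features per cell

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

X_new = rng.normal(size=(4, 8))       # cells from a new synthetic cloud
print(model.predict(X_new))           # predicted traversability features
```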
Srinivasan in view of Vu, NABATCHIAN and STENNETH are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with training a machine learning model based on the association of the ground truth data with the set of sensor-based features and the set of map-based features, as taught by STENNETH. The motivation for doing so would have been to improve the cost and performance aspects of the navigation (See [0031] of STENNETH).
Regarding claim 10, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 8. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based system of claim 8, further comprising a training subsystem configured to train the ML model, by: obtaining one or more training datasets from the generated dense point cloud with the one or more ground-truth map features
training the ML model with the one or more training datasets obtained from the generated dense point cloud with the one or more ground-truth map features
predicting the one or more traversability features using the trained ML model
However, STENNETH discloses the limitations:
The ML-based system of claim 8, further comprising a training subsystem configured to train the ML model, by: obtaining one or more training datasets from the generated dense point cloud with the one or more ground-truth map features (In Fig. 5 and paragraph [0101], STENNETH discloses extracting a plurality of features for the pre-processed road observations, wherein the plurality of features include sensor-based features and map based features with ground truth data.)
training the ML model with the one or more training datasets obtained from the generated dense point cloud with the one or more ground-truth map features (In Fig. 5 and paragraph [0102], STENNETH discloses training an ML model based on a plurality of sensors based on features and the plurality of map-based features with ground-truth data.)
predicting the one or more traversability features using the trained ML model (In Fig. 5 and paragraph [0105], STENNETH discloses predicting the location of the road sign based on the trained machine learning model.)
Srinivasan in view of Vu, NABATCHIAN and STENNETH are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with training a machine learning model based on the association of the ground truth data with the set of sensor-based features and the set of map-based features, as taught by STENNETH. The motivation for doing so would have been to improve the cost and performance aspects of the navigation (See [0031] of STENNETH).
Regarding claim 16, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 15. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The non-transitory computer-readable storage medium of claim 15, further comprising training the ML model, by: obtaining one or more training datasets from the generated dense point cloud with the one or more ground-truth map features
training the ML model with the one or more training datasets obtained from the generated dense point cloud with the one or more ground-truth map features
predicting the one or more traversability features using the trained ML model
However, STENNETH discloses the limitations:
The non-transitory computer-readable storage medium of claim 15, further comprising training the ML model, by: obtaining one or more training datasets from the generated dense point cloud with the one or more ground-truth map features (In Fig. 5 and paragraph [0101], STENNETH discloses extracting a plurality of features for the preprocessed road observations, wherein the plurality of features include sensor-based features and map-based features with ground truth data.)
training the ML model with the one or more training datasets obtained from the generated dense point cloud with the one or more ground-truth map features (In Fig. 5 and paragraph [0102], STENNETH discloses training an ML model based on a plurality of sensors based on features and the plurality of map-based features with ground-truth data.)
predicting the one or more traversability features using the trained ML model (In Fig. 5 and paragraph [0105], STENNETH discloses predicting the location of the road sign based on the trained machine learning model.)
Srinivasan in view of Vu, NABATCHIAN and STENNETH are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with training a machine learning model based on the association of the ground truth data with the set of sensor-based features and the set of map-based features, as taught by STENNETH. The motivation for doing so would have been to improve the cost and performance aspects of the navigation (See [0031] of STENNETH).
Claims 4, 11 and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan in view of Vu, NABATCHIAN and STENNETH, and further in view of Botonjic et al. (US Patent No.: 11,308,639 B2), hereinafter referred to as Botonjic.
Regarding claim 4, Srinivasan in view of Vu, NABATCHIAN and STENNETH discloses the elements of claim 3. Srinivasan in view of Vu, NABATCHIAN and STENNETH does not explicitly disclose:
The ML-based method of claim 1, wherein generating the synthetic point cloud based on the dense point cloud of the one or more environments, comprises: collecting, by the one or more hardware processors, one or more poses indicating one or more virtual viewpoints, in one or more free spaces in the one or more environments
projecting, by the one or more hardware processors, the generated dense point cloud into one or more frames defined by the collected one or more poses
cropping, by the one or more hardware processors, the one or more ground-truth map features upon a transformation process on the one or more ground-truth map features, based on the collected one or more poses
applying, by the one or more hardware processors, noising data to the synthetic point cloud associated with the one or more training datasets to make the synthetic point cloud having outputs similar to outputs of one or more low-resolution sensors associated with the one or more robot devices
However, Botonjic discloses the limitations:
The ML-based method of claim 1, wherein generating the synthetic point cloud based on the dense point cloud of the one or more environments, comprises: collecting, by the one or more hardware processors, one or more poses indicating one or more virtual viewpoints, in one or more free spaces in the one or more environments (In Col. 12, lines 50–60, Botonjic discloses receiving a point cloud from a LiDAR sensor, the point cloud including a plurality of points representing positions of objects relative to the LiDAR sensor; processing the point cloud to produce a voxelized frame including a plurality of voxels; processing the voxelized frame using a deep neural network to determine one or more persons relative to the LiDAR sensor and a pose for each of the one or more persons)
projecting, by the one or more hardware processors, the generated dense point cloud into one or more frames defined by the collected one or more poses (In Col. 12, lines 50–60, Botonjic discloses processing the voxelized frame using a deep neural network to determine one or more persons relative to the LiDAR sensor and a pose for each of the one or more persons; and outputting the location of the determined one or more persons and the pose for each of the determined one or more persons)
cropping, by the one or more hardware processors, the one or more ground-truth map features upon a transformation process on the one or more ground-truth map features, based on the collected one or more poses (In Col. 17, line 50 through Col. 18, line 10, Botonjic discloses that the annotation tool can automatically find important areas in the point cloud frame and crop it to show only those areas. The tool may send the point cloud frame to a deep neural network that can identify people and their poses.)
applying, by the one or more hardware processors, noising data to the synthetic point cloud associated with the one or more training datasets to make the synthetic point cloud having outputs similar to outputs of one or more low-resolution sensors associated with the one or more robot devices (In Col. 11, lines 36–46, Botonjic discloses that it processes the voxel frame to find one or more people near the LiDAR sensor and figure out their poses. Then, Microprocessor 22 can give the location of these people and their poses.)
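As a purely illustrative sketch of the noising step recited above, degrading a dense synthetic cloud so it resembles the output of a low-resolution, noisy sensor (hypothetical names and parameters; not Botonjic's implementation):

```python
# Illustrative sketch only (hypothetical parameters): random subsampling plus
# Gaussian jitter applied to a dense synthetic point cloud.
import numpy as np

def degrade_cloud(points: np.ndarray,
                  keep_fraction: float = 0.1,
                  noise_std_m: float = 0.02,
                  seed: int = 0) -> np.ndarray:
    """points: (N, 3). Keep a fraction of points and perturb each coordinate."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(len(points) * keep_fraction))
    idx = rng.choice(len(points), size=n_keep, replace=False)
    return points[idx] + rng.normal(0.0, noise_std_m, size=(n_keep, 3))

dense = np.random.default_rng(1).uniform(-10, 10, size=(10_000, 3))
print(degrade_cloud(dense).shape)  # -> (1000, 3)
```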
Srinivasan in view of Vu, NABATCHIAN, STENNETH and Botonjic are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu, NABATCHIAN and STENNETH with training a neural network to more accurately identify and label poses in point cloud data in real time, as taught by Botonjic. The motivation for doing so would have been to provide a more accurate determination of a person and pose estimation, while 2D convolutional layers generally provide for a quicker determination of a person and pose estimation (See Col. 2, lines 6-11 of Botonjic).
Regarding claim 11, Srinivasan in view of Vu, NABATCHIAN and STENNETH discloses the elements of claim 8. Srinivasan in view of Vu, NABATCHIAN and STENNETH does not explicitly disclose:
The ML-based system of claim 8, wherein in generating the synthetic point cloud based on the dense point cloud of the one or more environments, the point cloud generating subsystem is configured to: collect one or more poses indicating one or more virtual viewpoints, in one or more free spaces in the one or more environment
project the generated dense point cloud into one or more frames defined by the collected one or more poses
crop the one or more ground-truth map features upon a transformation process on the one or more ground-truth map features, based on the collected one or more poses
apply noising data to the synthetic point cloud associated with the one or more training datasets to make the synthetic point cloud having outputs similar to outputs of one or more low-resolution sensors associated with the one or more robot devices
However, Botonjic discloses the limitations:
The ML-based system of claim 8, wherein in generating the synthetic point cloud based on the dense point cloud of the one or more environments, the point cloud generating subsystem is configured to: collect one or more poses indicating one or more virtual viewpoints, in one or more free spaces in the one or more environment (In Col. 12, lines 50–60, Botonjic discloses receiving a point cloud from a LiDAR sensor, the point cloud including a plurality of points representing positions of objects relative to the LiDAR sensor; processing the point cloud to produce a voxelized frame including a plurality of voxels; processing the voxelized frame using a deep neural network to determine one or more persons relative to the LiDAR sensor and a pose for each of the one or more persons)
project the generated dense point cloud into one or more frames defined by the collected one or more poses (In Col. 12, lines 50–60, Botonjic discloses processing the voxelized frame using a deep neural network to determine one or more persons relative to the LiDAR sensor and a pose for each of the one or more persons; and outputting the location of the determined one or more persons and the pose for each of the determined one or more persons).
crop the one or more ground-truth map features upon a transformation process on the one or more ground-truth map features, based on the collected one or more poses (In Col. 17, line 50 through Col. 18, line 10, Botonjic discloses that the annotation tool can automatically find important areas in the point cloud frame and crop it to show only those areas. The tool may send the point cloud frame to a deep neural network that can identify people and their poses.)
apply noising data to the synthetic point cloud associated with the one or more training datasets to make the synthetic point cloud having outputs similar to outputs of one or more low-resolution sensors associated with the one or more robot devices (In Col. 11, lines 36–46, Botonjic discloses that it processes the voxel frame to find one or more people near the LiDAR sensor and figure out their poses. Then, Microprocessor 22 can give the location of these people and their poses.)
Srinivasan in view of Vu, NABATCHIAN, STENNETH and Botonjic are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu, NABATCHIAN and STENNETH with training a neural network to more accurately identify and label poses in point cloud data in real time, as taught by Botonjic. The motivation for doing so would have been to provide a more accurate determination of a person and pose estimation, while 2D convolutional layers generally provide for a quicker determination of a person and pose estimation (See Col. 2, lines 6-11 of Botonjic).
Regarding claim 17, Srinivasan in view of Vu, NABATCHIAN and STENNETH discloses the elements of claim 15. Srinivasan in view of Vu, NABATCHIAN and STENNETH does not explicitly disclose:
The non-transitory computer-readable storage medium of claim 15, wherein generating the synthetic point cloud based on the dense point cloud of the one or more environments, comprises: collecting one or more poses indicating one or more virtual viewpoints, in one or more free spaces in the one or more environments
projecting the generated dense point cloud into one or more frames defined by the collected one or more poses
cropping the one or more ground-truth map features upon a transformation process on the one or more ground-truth map features, based on the collected one or more poses
applying noising data to the synthetic point cloud associated with the one or more training datasets to make the synthetic point cloud having outputs similar to outputs of one or more low-resolution sensors associated with the one or more robot devices
However, Botonjic discloses the limitations:
The non-transitory computer-readable storage medium of claim 15, wherein generating the synthetic point cloud based on the dense point cloud of the one or more environments, comprises: collecting one or more poses indicating one or more virtual viewpoints, in one or more free spaces in the one or more environments (In Col. 12, lines 50–60, Botonjic discloses receiving a point cloud from a LiDAR sensor, the point cloud including a plurality of points representing positions of objects relative to the LiDAR sensor; processing the point cloud to produce a voxelized frame including a plurality of voxels; processing the voxelized frame using a deep neural network to determine one or more persons relative to the LiDAR sensor and a pose for each of the one or more persons)
projecting the generated dense point cloud into one or more frames defined by the collected one or more poses (In Col. 12, lines 50–60, Botonjic discloses processing the voxelized frame using a deep neural network to determine one or more persons relative to the LiDAR sensor and a pose for each of the one or more persons; and outputting the location of the determined one or more persons and the pose for each of the determined one or more persons)
cropping the one or more ground-truth map features upon a transformation process on the one or more ground-truth map features, based on the collected one or more poses (In Col. 17, line 50 through Col. 18, line 10, Botonjic discloses that the annotation tool can automatically find important areas in the point cloud frame and crop it to show only those areas. The tool may send the point cloud frame to a deep neural network that can identify people and their poses.)
applying noising data to the synthetic point cloud associated with the one or more training datasets to make the synthetic point cloud having outputs similar to outputs of one or more low-resolution sensors associated with the one or more robot devices (In Col. 11, lines 36–46, Botonjic discloses that it processes the voxel frame to find one or more people near the LiDAR sensor and figure out their poses. Then, Microprocessor 22 can give the location of these people and their poses.)
Srinivasan in view of Vu, NABATCHIAN, STENNETH and Botonjic are analogous art because all of the references concern methods of robot navigation. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu, NABATCHIAN and STENNETH with training a neural network to more accurately identify and label poses in point cloud data in real time, as taught by Botonjic. The motivation for doing so would have been to provide a more accurate determination of a person and pose estimation, while 2D convolutional layers generally provide for a quicker determination of a person and pose estimation (See Col. 2, lines 6-11 of Botonjic).
Claims 5, 12 and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan in view of Vu and NABATCHIAN, and further in view of Goforth et al. (US Pub No.: 20230289999 A1), hereinafter referred to as Goforth.
Regarding claim 5, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 1. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based method of claim 1, wherein predicting the one or more traversability features from the synthetic point cloud associated with the one or more environments using the machine learning (ML) model, comprises: defining, by the one or more hardware processors, one or more metric regions around an ego-pose comprising at least one of: a resolution, width and height, in the synthetic point cloud
passing, by the one or more hardware processors, one or more points in the one or more metric regions through a point pillars network comprising at least one of: a point net and a cell-wise max-pooling
and generating, by the one or more hardware processors, a cell-wise and factorized gaussian distribution for the one or more traversability features based on the point pillars network with the one or more points in the one or more metric regions, using the ML model
However, Goforth discloses the limitations:
The ML-based method of claim 1, wherein predicting the one or more traversability features from the synthetic point cloud associated with the one or more environments using the machine learning (ML) model, comprises: defining, by the one or more hardware processors, one or more metric regions around an ego-pose comprising at least one of: a resolution, width and height, in the synthetic point cloud (In paragraph [0050], Goforth discloses both partial and complete views of detailed 3D models of different vehicles. For partial views, points from the surfaces of several models are picked using the simulated sensor at a certain height.)
passing, by the one or more hardware processors, one or more points in the one or more metric regions through a point pillars network comprising at least one of: a point net and a cell-wise max-pooling (In paragraph [0042], Goforth discloses the geometric information in the input point cloud, and uses max pooling on F to get a global feature g, which captures the most important information.)
generating, by the one or more hardware processors, a cell-wise and factorized gaussian distribution for the one or more traversability features based on the point pillars network with the one or more points in the one or more metric regions, using the ML model (In paragraph [0042], Goforth discloses that the deep neural network model creates a simple distribution for the traversability features using the point pillars network and points in specific areas.)
Srinivasan in view of Vu, NABATCHIAN and Goforth are analogous pieces of art because both references concern the method of navigation of robots. . Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Goforth, with generating an estimated pose and shape of the object based on the code as taught by Goforth. The motivation for doing so would have been to improve both the accuracy and efficiency of shape estimation and/or pose estimation based on sensor data (See [0068] of Goforth.)
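For illustration of record only, a minimal sketch of a pillar-style encoder that max-pools per-point features cell-wise and emits a cell-wise, factorized Gaussian (one mean and one variance per feature per cell) follows. The grid extent and resolution, the layer widths, and the function name pillar_gaussians are illustrative assumptions and are not drawn from Goforth; the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Metric region around the ego-pose: a 20 m x 20 m window at 0.5 m resolution.
EXTENT, RES = 10.0, 0.5
GRID = int(2 * EXTENT / RES)                              # 40 x 40 cells

def pillar_gaussians(points, hidden=32, n_features=2):
    """PointNet-style pillar encoder: a shared per-point MLP, cell-wise
    max-pooling, then linear heads emitting a mean and a log-variance per
    cell, i.e. a cell-wise factorized Gaussian over traversability features."""
    W1 = rng.normal(scale=0.1, size=(3, hidden))          # shared point MLP
    Wm = rng.normal(scale=0.1, size=(hidden, n_features)) # mean head
    Wv = rng.normal(scale=0.1, size=(hidden, n_features)) # log-variance head

    feats = np.maximum(points @ W1, 0.0)                  # per-point ReLU features
    cells = np.floor((points[:, :2] + EXTENT) / RES).astype(int)
    cells = np.clip(cells, 0, GRID - 1)

    pooled = np.full((GRID, GRID, hidden), -np.inf)
    for (cx, cy), f in zip(cells, feats):                 # cell-wise max-pool
        pooled[cx, cy] = np.maximum(pooled[cx, cy], f)
    pooled[np.isinf(pooled)] = 0.0                        # empty cells -> zeros

    mean = pooled @ Wm                                    # (GRID, GRID, n_features)
    var = np.exp(pooled @ Wv)                             # factorized variances
    return mean, var

points = rng.uniform(-EXTENT, EXTENT, size=(2000, 3))
mu, var = pillar_gaussians(points)
```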
Regarding claim 12, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 8. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based system of claim 8, wherein in predicting the one or more traversability features from the synthetic point cloud associated with the one or more environments using the machine learning (ML) model, the traversability predicting subsystem is configured to: define one or more metric regions around an ego-pose comprising at least one of: a resolution, width and height, in the synthetic point cloud
pass one or more points in the one or more metric regions through a point pillars network comprising at least one of: a point net and a cell-wise max-pooling
generate a cell-wise and factorized gaussian distribution for the one or more traversability features based on the point pillars network with the one or more points in the one or more metric regions, using the ML model
However, Goforth discloses the limitations:
The ML-based system of claim 8, wherein in predicting the one or more traversability features from the synthetic point cloud associated with the one or more environments using the machine learning (ML) model, the traversability predicting subsystem is configured to: define one or more metric regions around an ego-pose comprising at least one of: a resolution, width and height, in the synthetic point cloud (In paragraph [0050], Goforth discloses both partial and complete views of detailed 3D models of different vehicles; for the partial views, points are sampled from the surfaces of several models using a simulated sensor placed at a certain height.)
pass one or more points in the one or more metric regions through a point pillars network comprising at least one of: a point net and a cell-wise max-pooling (In paragraph [0042], Goforth discloses encoding the geometric information in the input point cloud and applying max pooling to the feature map F to obtain a global feature g that captures the most important information.)
generate a cell-wise and factorized gaussian distribution for the one or more traversability features based on the point pillars network with the one or more points in the one or more metric regions, using the ML model (In paragraph [0042], Goforth discloses that the deep neural network model produces a simple distribution for the traversability features using the point pillars network and the points in the specified regions.)
Srinivasan, Vu, NABATCHIAN and Goforth are analogous pieces of art because the references concern methods of navigation for robots. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with generating an estimated pose and shape of the object based on the code, as taught by Goforth. The motivation for doing so would have been to improve both the accuracy and efficiency of shape estimation and/or pose estimation based on sensor data (See [0068] of Goforth).
Regarding claim 18, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 15. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The non-transitory computer-readable storage medium of claim 15, wherein predicting the one or more traversability features from the synthetic point cloud associated with the one or more environments using the machine learning (ML) model, comprises: defining one or more metric regions around an ego-pose comprising at least one of: a resolution, width and height, in the synthetic point cloud
passing one or more points in the one or more metric regions through a point pillars network comprising at least one of: a point net and a cell-wise max-pooling
generating a cell-wise and factorized gaussian distribution for the one or more traversability features based on the point pillars network with the one or more points in the one or more metric regions, using the ML model
However, Goforth discloses the limitations:
The non-transitory computer-readable storage medium of claim 15, wherein predicting the one or more traversability features from the synthetic point cloud associated with the one or more environments using the machine learning (ML) model, comprises: defining one or more metric regions around an ego-pose comprising at least one of: a resolution, width and height, in the synthetic point cloud (In paragraph [0050], Goforth discloses both partial and complete views of detailed 3D models of different vehicles; for the partial views, points are sampled from the surfaces of several models using a simulated sensor placed at a certain height.)
passing one or more points in the one or more metric regions through a point pillars network comprising at least one of: a point net and a cell-wise max-pooling (In paragraph [0042], Goforth discloses encoding the geometric information in the input point cloud and applying max pooling to the feature map F to obtain a global feature g that captures the most important information.)
generating a cell-wise and factorized gaussian distribution for the one or more traversability features based on the point pillars network with the one or more points in the one or more metric regions, using the ML model (In paragraph [0042], Goforth discloses that the deep neural network model produces a simple distribution for the traversability features using the point pillars network and the points in the specified regions.)
Srinivasan, Vu, NABATCHIAN and Goforth are analogous pieces of art because the references concern methods of navigation for robots. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with generating an estimated pose and shape of the object based on the code, as taught by Goforth. The motivation for doing so would have been to improve both the accuracy and efficiency of shape estimation and/or pose estimation based on sensor data (See [0068] of Goforth).
Claims 7, 14 and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan in view of Vu, NABATCHIAN and further in view of AMBRUS et al. (US Pub No.: 20230177850 A1), hereinafter referred to as AMBRUS.
Regarding claim 7, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 1. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based method of claim 1, further comprising re-training, by the one or more hardware processors, the ML model for the one or more robot devices based on at least one of: static traversability estimation at an execution time, changing of cost function, and one or more user requirements
However, AMBRUS discloses the limitation (In paragraph [0077], AMBRUS discloses that during the training phase of the monocular depth prediction network, a loss function helps improve the depth predictions based on a true depth map. This loss function is adjusted to include depth uncertainty, allowing the network to better estimate how uncertain each depth prediction is. The predicted depth map and the depth uncertainty are then passed as outputs from the network to a 3D object detection system.)
Srinivasan, Vu, NABATCHIAN and AMBRUS are analogous pieces of art because the references concern methods of navigating autonomous agents (e.g., vehicles, robots) through one or more uncertainty regions. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with detecting objects and accurately locating them in three-dimensional space, as taught by AMBRUS. The motivation for doing so would have been to improve a 3D object detection network (See [0081] of AMBRUS). An illustrative sketch of the re-training step recited above appears below.
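For illustration of record only, a re-training step triggered by a changed cost function, refit on static traversability estimates collected at execution time, could look like the following sketch. The linear model, the finite-difference gradient, and the cost functions are illustrative assumptions and are not drawn from AMBRUS.

```python
import numpy as np

rng = np.random.default_rng(1)

def retrain(weights, execution_data, cost_fn, lr=1e-2, steps=100):
    """Refit a linear traversability model on estimates logged at execution
    time under a caller-supplied cost function, so that changing the cost
    function (or the user requirements behind it) simply triggers a refit."""
    X, y = execution_data
    eps = 1e-5
    for _ in range(steps):
        pred = X @ weights
        grad = np.zeros_like(weights)
        for i in range(len(weights)):        # finite-difference gradient,
            w = weights.copy()               # so any cost_fn can be plugged in
            w[i] += eps
            grad[i] = (cost_fn(X @ w, y) - cost_fn(pred, y)) / eps
        weights = weights - lr * grad
    return weights

# Static traversability estimates logged at execution time (toy data).
X = rng.normal(size=(200, 4))
y = X @ np.array([0.5, -1.0, 0.2, 0.0]) + 0.1 * rng.normal(size=200)

mse = lambda p, t: float(np.mean((p - t) ** 2))
huber = lambda p, t: float(np.mean(np.where(np.abs(p - t) < 1.0,
                                            0.5 * (p - t) ** 2,
                                            np.abs(p - t) - 0.5)))
w0 = retrain(np.zeros(4), (X, y), mse)   # initial fit
w1 = retrain(w0, (X, y), huber)          # re-train after a cost-function change
```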
Regarding claim 14, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 8. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based system of claim 8, further comprising a re-training subsystem configured to re-train the ML model for the one or more robot devices based on at least one of: static traversability estimation at an execution time, changing of cost function, and one or more user requirements
However, AMBRUS discloses the limitation (In paragraph [0077], AMBRUS discloses that during the training phase of the monocular depth prediction network, a loss function helps improve the depth predictions based on a true depth map. This loss function is adjusted to include depth uncertainty, allowing the network to better estimate how uncertain each depth prediction is. The predicted depth map and the depth uncertainty are then passed as outputs from the network to a 3D object detection system.)
Srinivasan, Vu, NABATCHIAN and AMBRUS are analogous pieces of art because the references concern methods of navigating autonomous agents (e.g., vehicles, robots) through one or more uncertainty regions. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with detecting objects and accurately locating them in three-dimensional space, as taught by AMBRUS. The motivation for doing so would have been to improve a 3D object detection network (See [0081] of AMBRUS).
Regarding claim 20, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 15. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The non-transitory computer-readable storage medium of claim 15, further comprising retraining the ML model for the one or more robot devices based on at least one of: static traversability estimation at an execution time, changing of cost function, and one or more user requirements
However, AMBRUS discloses the limitation (In paragraph [0077], AMBRUS discloses that during the training phase of the monocular depth prediction network, a loss function helps improve the depth predictions based on a true depth map. This loss function is adjusted to include depth uncertainty, allowing the network to better estimate how uncertain each depth prediction is. The predicted depth map and the depth uncertainty are then passed as outputs from the network to a 3D object detection system.)
Srinivasan, Vu, NABATCHIAN and AMBRUS are analogous pieces of art because the references concern methods of navigating autonomous agents (e.g., vehicles, robots) through one or more uncertainty regions. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with detecting objects and accurately locating them in three-dimensional space, as taught by AMBRUS. The motivation for doing so would have been to improve a 3D object detection network (See [0081] of AMBRUS).
Claims 13 and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Srinivasan in view of Vu, NABATCHIAN and further in view of TOKMAKOV et al. (US Pub No.: 20240269844 A1), hereinafter referred to as TOKMAKOV.
Regarding claim 13, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 8. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The ML-based system of claim 8, wherein the traversability predicting subsystem is further configured to: analyze the traversability as probability that the predicted one or more traversability features are below to critical threshold values
analyze the traversability with the uncertainty object estimation when the predicted one or more traversability features are exceeded to the critical threshold values
However, TOKMAKOV discloses the limitations:
The ML-based system of claim 8, wherein the traversability predicting subsystem is further configured to: analyze the traversability as probability that the predicted one or more traversability features are below to critical threshold values (In paragraph [0076], TOKMAKOV discloses that the model can predict a trajectory from a previous application of the motion prediction model using a mathematical formulation in which the estimated point must be less than the threshold "y".)
analyze the traversability with the uncertainty object estimation when the predicted one or more traversability features are exceeded to the critical threshold values (In paragraph [0089], TOKMAKOV discloses the case in which the model's predicted trajectory estimate is greater than or equal to the threshold.)
Srinivasan, Vu, NABATCHIAN and TOKMAKOV are analogous pieces of art because the references concern methods of navigating autonomous agents (e.g., vehicles, robots) through one or more uncertainty regions. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with identifying one or more uncertainty regions in an environment based on an estimated trajectory of an object in the environment, as taught by TOKMAKOV. The motivation for doing so would have been to improve the accuracy of object motion predictions and the precision and efficiency of agent-object interactions, as taught by TOKMAKOV (See [0025] of TOKMAKOV). An illustrative sketch of the threshold-based traversability analysis recited above appears below.
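For illustration of record only, when the predicted traversability feature is Gaussian (as in the cell-wise factorized distribution discussed for claim 5), the probability that the feature stays below a critical value has a closed form, and the uncertainty-aware path is taken when the critical value is likely exceeded. The function names and the 0.95 cutoff are illustrative assumptions and are not drawn from TOKMAKOV.

```python
import math

def below_threshold_probability(mean, std, critical):
    """P(feature < critical) under the cell's predicted Gaussian, via the
    standard normal cumulative distribution function."""
    z = (critical - mean) / (std * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))

def analyze_cell(mean, std, critical, p_min=0.95):
    """Treat the cell as traversable only if the below-threshold probability
    is high; otherwise defer to uncertainty-aware object estimation."""
    p = below_threshold_probability(mean, std, critical)
    return ("traversable", p) if p >= p_min else ("uncertain: may exceed threshold", p)

print(analyze_cell(mean=0.20, std=0.05, critical=0.5))  # confidently below
print(analyze_cell(mean=0.45, std=0.20, critical=0.5))  # too likely to exceed
```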
Regarding claim 19, Srinivasan in view of Vu and NABATCHIAN discloses the elements of claim 15. Srinivasan in view of Vu and NABATCHIAN does not explicitly disclose:
The non-transitory computer-readable storage medium of claim 15, further comprising at least one of: analyzing the traversability as probability that the predicted one or more traversability features are below to critical threshold values
analyzing the traversability with the uncertainty object estimation when the predicted one or more traversability features are exceeded to the critical threshold values
However, TOKMAKOV discloses the limitations:
The non-transitory computer-readable storage medium of claim 15, further comprising at least one of: analyzing the traversability as probability that the predicted one or more traversability features are below to critical threshold values (In paragraph [0076], TOKMAKOV discloses that the model can predict a trajectory from a previous application of the motion prediction model using a mathematical formulation in which the estimated point must be less than the threshold "y".)
analyzing the traversability with the uncertainty object estimation when the predicted one or more traversability features are exceeded to the critical threshold values (In paragraph [0089], TOKMAKOV discloses the case in which the model's predicted trajectory estimate is greater than or equal to the threshold.)
Srinivasan, Vu, NABATCHIAN and TOKMAKOV are analogous pieces of art because the references concern methods of navigating autonomous agents (e.g., vehicles, robots) through one or more uncertainty regions. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Srinivasan in view of Vu and NABATCHIAN with identifying one or more uncertainty regions in an environment based on an estimated trajectory of an object in the environment, as taught by TOKMAKOV. The motivation for doing so would have been to improve the accuracy of object motion predictions and the precision and efficiency of agent-object interactions, as taught by TOKMAKOV (See [0025] of TOKMAKOV).
Response to Arguments
Applicant's arguments filed on 01/21/2026 have been fully considered and are persuasive in part, in view of Applicant's amendments and arguments.
Pertaining to Rejection under 101
Applicant’s argument in regard to 101 is persuasive and rejection is withdrawn.
Pertaining to Rejection under 103
Applicant’s arguments in regard to the examiner’s rejections under 35 U.S.C. 103 are moot in view of the new grounds of rejection.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to EVEL HONORE whose telephone number is (703) 756-1179. The examiner can normally be reached Monday-Friday, 8 a.m.-5:30 p.m.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Mariela D Reyes can be reached at (571) 270-1006. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
EVEL HONORE
Examiner
Art Unit 2142
/HAIMEI JIANG/Primary Examiner, Art Unit 2142