DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of the Claims
Claims 1-20, as originally filed, are currently pending and have been considered below.
Drawings
Color photographs and color drawings are not accepted in utility applications unless a petition filed under 37 CFR 1.84(a)(2) is granted. Any such petition must be accompanied by the appropriate fee set forth in 37 CFR 1.17(h); one set of color drawings or color photographs, as appropriate, if submitted via the USPTO patent electronic filing system, or three sets of color drawings or color photographs, as appropriate, if not submitted via the USPTO patent electronic filing system; and, unless already present, an amendment to include the following language as the first paragraph of the brief description of the drawings section of the specification:
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Color photographs will be accepted if the conditions for accepting color drawings and black and white photographs have been satisfied. See 37 CFR 1.84(b)(2).
Note that the requirement for three sets of color drawings under 37 CFR 1.84(a)(2)(ii) is not applicable to color drawings submitted via the USPTO patent electronic filing system. Therefore, only one set of such color drawings is necessary when filing via the USPTO patent electronic filing system.
The drawings are objected to because Figures 4 and 5 include color, but no petition under 37 CFR 1.84(a)(2) has been filed.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Kabzan, Juraj, et al., "AMZ Driverless: The Full Autonomous Racing System," arXiv preprint arXiv:1905.05150 (2019), hereinafter "Kabzan", in view of Sauerbeck, Florian, et al., "A combined LiDAR-camera localization for autonomous race cars," SAE (2021), hereinafter "Sauerbeck".
As per claim 1, Kabzan discloses a method comprising:
generating an estimated track boundary segment based on a digital image associated with a moving object (Kabzan, page 6, 2.2 Hardware Concept – The Racecar, In order to make the vehicle race fully autonomously, sensors allowing for environment perception need to be added … The second sensor modality is image-based. Three CMOS cameras with global shutter are used in a stereo and mono setup, see Figure 3a. The stereo setup is intended to deal with cones in close proximity whereas the mono camera is used for detecting cones further away … Considering the vantage point, the cameras were placed on the main roll hoop, above the driver's seat in the car, see Figure 4; Kabzan, page 8, 2.3 Software Concept, Motion Estimation and Mapping: Its goal is to map the track, detect the boundaries and estimate the physical state of the car. The inputs are the observed cone positions from Perception (camera and LiDAR pipeline) ... three elements are needed. First, a velocity estimation ... Second, a SLAM algorithm, that uses cone position estimates from Perception as landmarks, and the integrated velocity estimate as a motion model. Finally, and only for the SLAM mode, a boundary estimation algorithm is used to classify cones as right or left boundary [a segment corresponds to the area between two cones, see page 17, Figure 20b]);
for each position of a plurality of positions in an actual track boundary segment pertaining to a track for one or more moving objects: based on said each position, making an alignment of the estimated track boundary segment with the actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m);
based on the alignment, generating a difference measurement between the estimated track boundary segment and a portion of the actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m [Figure 20b depicts the difference between the estimated and the actual position for each segment around the entire track]);
selecting a particular alignment, of a plurality of alignments, that is associated with the lowest difference measurement among the plurality of positions (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a. As expected, the error increases with distance to the cone ... An experiment was conducted to compare the 3D positions of cones estimated via PnP and those estimated via triangulation. For this experiment setup, cones were placed in front of the car at various distances and then images were actively acquired. The estimated variance of each cone was observed over time and plotted as a box plot in Figure 19b; Kabzan, page 25, 4.4.3 Boundary Estimation Validation & Results, The system is robust when driving through rule compliant tracks. However, in case of distances larger than 5 m between two consecutive cones, we observed an increased error in predicting the track. In Figure 26b we show the classification success rate from the FSG trackdrive competition. For this track, only 4.2% of all iterations yield a path which is not inside the real track. Furthermore, the miss predicted paths only leave the real track at more than 7 m away from the car);
wherein the method is performed by one or more computing devices (Kabzan, page 7, 2.2 Hardware Concept – The Racecar, To execute the required computations, two computing units were installed (see Figure 3b). Both units have a CPU and a GPU. One is a rugged industrial computer ... and the other is an efficient embedded unit).
Kabzan further discloses (Kabzan, page 8, 2.3 Software Concept, To reach the full potential of the car, it is estimated that the track must be known for at least 2s ahead of the vehicle’s current position ... Thus, a map of the track is built, such that once the track is known, its shape ahead can also be anticipated by estimating the car’s pose in the map) but does not explicitly disclose the following limitations as further recited. However, Sauerbeck discloses:
based on the particular alignment, determining a longitudinal value of the moving object (Sauerbeck, page 1, Introduction, One of the crucial tasks of the autonomous driving software stack is the localization of the ego vehicle … First, the pre-known track is represented in a trackbound 2D coordinate frame. Second, the localization is implemented in two steps running in parallel. The pose of the ego is estimated in reference to the SAFER (Steel and Foam Energy Reduction) Barrier on the outside of the track … Third, the absolute position of the ego along the centerline is estimated via a visual Simultaneous Localization and Mapping (SLAM) algorithm using data from the front center camera ... this information is put together and transformed to 2D Cartesian coordinates ... To track the longitudinal progress along the track, we need far-off features that are used in a visual SLAM; Sauerbeck, page 3, Concept, The longitudinal position along the track is estimated via a camera-based visual SLAM approach based on the open-source OpenVSLAM ... To decouple the longitudinal and lateral position of the vehicle along the track ... We use a coordinate system that defines the longitudinal coordinate along the track centerline and the lateral coordinate orthogonally as shown in Figure 3; Sauerbeck, page 4, Longitudinal Localization, the longitudinal tracking of the vehicle position is based on camera images … we only rely on one front-facing monocular camera ... For map generation, a slow run around the track is carried out at a constant low speed. The map generated in this way includes found and matched landmarks and corresponding keyframes. An exemplary map created by the enhanced OpenVSLAM of the IMS is shown in Figure 5 ... 
The dimensions (x, y position of the inner and outer bounds) of the IMS track are given and thus the length of the centerline can be easily calculated … This allows mapping of each point from the driven path to a value from zero to one (s coordinate) with a defined starting point as shown in Figure 6 ... To calculate the s coordinate of a keyframe KFn, the distance along the track from the starting point (s = 0) to the keyframe KFn is divided by the total track length (Equation 2) … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track [the longitudinal location of the vehicle is calculated for every track segment as shown in Figure 6]).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Sauerbeck with Kabzan because both references are in the same field of endeavor, namely localization of autonomous race vehicles. One of ordinary skill in the art would have been motivated to substitute the longitudinal tracking with respect to the track boundary as taught by Sauerbeck for the pose estimation as taught by Kabzan as an alternate means to localize the vehicle on the track (Sauerbeck, page 4, Longitudinal Localization).
As per claim 2, Kabzan and Sauerbeck disclose the method of claim 1, wherein determining the longitudinal value comprises:
based on the particular alignment, determining a point on the actual track boundary segment that is associated with the particular alignment of the estimated track boundary segment with the actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP); Kabzan, page 17, Figure 20B; Kabzan, page 24, Figure 25);
identifying the longitudinal value that is associated with the point (Sauerbeck, page 4, Longitudinal Localization, An exemplary map created by the enhanced OpenVSLAM of the IMS is shown in Figure 5 ... The dimensions (x, y position of the inner and outer bounds) of the IMS track are given and thus the length of the centerline can be easily calculated … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4; Sauerbeck, page 8, Table 1: Results of longitudinal localization; Sauerbeck, page 8, Table 2: Results of boundary localization; Sauerbeck, page 8, Figure 12: Difference between the longitudinal position estimate and the ground truth over a lap with 300 km/h).
As per claim 3, Kabzan and Sauerbeck disclose the method of claim 1, wherein the estimated track boundary segment is a first estimated track boundary segment that is associated with a first boundary of the track, wherein the particular alignment is a first particular alignment, wherein the longitudinal value is a first longitudinal value, the method further comprising:
generating a second estimated track boundary segment based on the digital image (Kabzan, page 6, 2.2 Hardware Concept – The Racecar, In order to make the vehicle race fully autonomously, sensors allowing for environment perception need to be added … The second sensor modality is image-based. Three CMOS cameras with global shutter are used in a stereo and mono setup, see Figure 3a. The stereo setup is intended to deal with cones in close proximity whereas the mono camera is used for detecting cones further away … Considering the vantage point, the cameras were placed on the main roll hoop, above the driver's seat in the car, see Figure 4; Kabzan, page 8, 2.3 Software Concept, Motion Estimation and Mapping: Its goal is to map the track, detect the boundaries and estimate the physical state of the car. The inputs are the observed cone positions from Perception (camera and LiDAR pipeline) ... three elements are needed. First, a velocity estimation ... Second, a SLAM algorithm, that uses cone position estimates from Perception as landmarks, and the integrated velocity estimate as a motion model. Finally, and only for the SLAM mode, a boundary estimation algorithm is used to classify cones as right or left boundary [a segment corresponds to the area between two cones, see page 17, Figure 20b]);
for each position of a second plurality of positions in a second actual track boundary segment pertaining to the track: based on each position, making a second alignment of the second estimated track boundary segment with the second actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m);
based on the second alignment, generating a second difference measurement between the second estimated track boundary segment and a portion of the second actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m [Figure 20b depicts the difference between the estimated and the actual position for each segment around the entire track]);
selecting a second particular alignment, of a second plurality of alignments, that is associated with the lowest difference measurement among the second plurality of positions (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a. As expected, the error increases with distance to the cone ... An experiment was conducted to compare the 3D positions of cones estimated via PnP and those estimated via triangulation. For this experiment setup, cones were placed in front of the car at various distances and then images were actively acquired. The estimated variance of each cone was observed over time and plotted as a box plot in Figure 19b; Kabzan, page 25, 4.4.3 Boundary Estimation Validation & Results, The system is robust when driving through rule compliant tracks. However, in case of distances larger than 5 m between two consecutive cones, we observed an increased error in predicting the track. In Figure 26b we show the classification success rate from the FSG trackdrive competition. For this track, only 4.2% of all iterations yield a path which is not inside the real track. Furthermore, the miss predicted paths only leave the real track at more than 7 m away from the car);
based on the second particular alignment, determining a second longitudinal value of the moving object (Sauerbeck, page 1, Introduction, One of the crucial tasks of the autonomous driving software stack is the localization of the ego vehicle … First, the pre-known track is represented in a trackbound 2D coordinate frame. Second, the localization is implemented in two steps running in parallel. The pose of the ego is estimated in reference to the SAFER (Steel and Foam Energy Reduction) Barrier on the outside of the track … Third, the absolute position of the ego along the centerline is estimated via a visual Simultaneous Localization and Mapping (SLAM) algorithm using data from the front center camera ... this information is put together and transformed to 2D Cartesian coordinates ... To track the longitudinal progress along the track, we need far-off features that are used in a visual SLAM; Sauerbeck, page 3, Concept, The longitudinal position along the track is estimated via a camera-based visual SLAM approach based on the open-source OpenVSLAM ... To decouple the longitudinal and lateral position of the vehicle along the track ... We use a coordinate system that defines the longitudinal coordinate along the track centerline and the lateral coordinate orthogonally as shown in Figure 3; Sauerbeck, page 4, Longitudinal Localization, the longitudinal tracking of the vehicle position is based on camera images … we only rely on one front-facing monocular camera ... For map generation, a slow run around the track is carried out at a constant low speed. The map generated in this way includes found and matched landmarks and corresponding keyframes. An exemplary map created by the enhanced OpenVSLAM of the IMS is shown in Figure 5 ... 
The dimensions (x, y position of the inner and outer bounds) of the IMS track are given and thus the length of the centerline can be easily calculated … This allows mapping of each point from the driven path to a value from zero to one (s coordinate) with a defined starting point as shown in Figure 6 ... To calculate the s coordinate of a keyframe KFn, the distance along the track from the starting point (s = 0) to the keyframe KFn is divided by the total track length (Equation 2) … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track [the longitudinal location of the vehicle is calculated for every track segment as shown in Figure 6]). The motivation would be the same as above in claim 1.
As per claim 4, Kabzan and Sauerbeck disclose the method of claim 3, further comprising: generating a final longitudinal value of the moving object based on the first longitudinal value and the second longitudinal value (Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track).
As per claim 5, Kabzan and Sauerbeck disclose the method of claim 4, further comprising:
determining whether the moving object is closer to the first estimated track boundary segment or to the second estimated track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP); Kabzan, page 17, Figure 20B; Kabzan, page 24, Figure 25);
wherein generating the final longitudinal value comprises applying more weight to the first longitudinal value than to the second longitudinal value in response to determining that the moving object is closer to the first estimated track boundary segment than to the second estimated track boundary segment (Kabzan, page 21, Localization, After finishing the first lap, a lap closure is detected and the operational mode switches from SLAM to Localization mode. The track boundaries are then identified and the map is no longer updated. The map of the most likely particle (highest weight) is taken as the map for subsequent laps. In order to localize the car in the created map, Monte Carlo Localization is used. A smooth pose update is given through the computation of the mean over all particles; Kabzan, page 22, 4.3.1 Path Planning Algorithm. Starting from the position of the car, the paths always connect to the next center points of the surrounding edges. The resulting paths describe all possible center lines given the observed cones and the current car positions ... The final step of the algorithm computes a cost for each path generated in the second step and selects the one with the lowest cost as the most likely track center line ... To get the cost of a path, the five costs are first computed, then normalized, squared, weighted and finally added together. The best path is then selected as the estimate of the center line as well as the corresponding boundaries; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track).
As per claim 6, Kabzan and Sauerbeck disclose the method of claim 4, further comprising:
generating a first uncertainty value that is associated with the first longitudinal value (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
generating a second uncertainty value that is associated with the second longitudinal value (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
wherein generating the final longitudinal value is also based on the first uncertainty value and the second uncertainty value (Kabzan, page 8, 2.3 Software Concept, To reach the full potential of the car, it is estimated that the track must be known for at least 2s ahead of the vehicle’s current position ... Thus, a map of the track is built, such that once the track is known, its shape ahead can also be anticipated by estimating the car’s pose in the map; Sauerbeck, page 3, Concept, The longitudinal position along the track is estimated via a camera-based visual SLAM approach … We use a coordinate system that defines the longitudinal coordinate along the track centerline and the lateral coordinate orthogonally as shown in Figure 3; Sauerbeck, page 4, Longitudinal Localization, First validations with different speeds showed a longitudinal deviation of up to 40 meters. The reason for this was found within the generated map which did not depict the reality with enough precision. To lower the error a map correction was implemented. With the information of the mapping run at constant speed, the calculation of keyframes to the scaled longitudinal coordinate was corrected).
As per claim 7, Kabzan and Sauerbeck disclose the method of claim 4, further comprising: determining an uncertainty of the first particular alignment of the first estimated track boundary segment and an uncertainty of the second particular alignment of the second estimated track boundary segment (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
wherein generating the final longitudinal value comprises applying more weight to the first longitudinal value than to the second longitudinal value in response to determining which alignment is more uncertain than the other (Kabzan, page 21, Localization, After finishing the first lap, a lap closure is detected and the operational mode switches from SLAM to Localization mode. The track boundaries are then identified and the map is no longer updated. The map of the most likely particle (highest weight) is taken as the map for subsequent laps. In order to localize the car in the created map, Monte Carlo Localization is used. A smooth pose update is given through the computation of the mean over all particles; Kabzan, page 22, 4.3.1 Path Planning Algorithm. Starting from the position of the car, the paths always connect to the next center points of the surrounding edges. The resulting paths describe all possible center lines given the observed cones and the current car positions ... The final step of the algorithm computes a cost for each path generated in the second step and selects the one with the lowest cost as the most likely track center line ... To get the cost of a path, the five costs are first computed, then normalized, squared, weighted and finally added together. The best path is then selected as the estimate of the center line as well as the corresponding boundaries; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track).
As per claim 8, Kabzan and Sauerbeck disclose the method of claim 1, further comprising:
for each alignment of the plurality of alignments: determining a difference between the difference measurement of said each alignment and the lowest difference measurement (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP); Kabzan, page 17, Figure 20B; Kabzan, page 24, Figure 25: Mapping and localization comparison against ground-truth measurements);
adding the difference to a set of differences (Kabzan, page 24, 4.4.2 SLAM Validation & Results, The 230 m long track shown in Section 4.4.2 is mapped with both LiDAR and vision pipeline. In addition, the car's position is measured in order to verify the vehicle’s particle filter-based location. Therefore, a Totalstation measurement prism was attached to the car and measured for positional ground-truth. The estimated position differs only by a RMSE of 0.2 m from the tracked ground-truth; Sauerbeck, page 8, Table 1: Results of longitudinal localization; Sauerbeck, page 8, Table 2: Results of boundary localization; Sauerbeck, page 8, Figure 12: Difference between the longitudinal position estimate and the ground truth over a lap with 300 km/h);
generating an uncertainty value for the particular alignment based on the set of differences (Kabzan, page 24, Figure 25: Mapping and localization comparison against ground-truth measurements; Kabzan, page 24, 4.4.2 SLAM Validation & Results, The 230 m long track shown in Section 4.4.2 is mapped with both LiDAR and vision pipeline. In addition, the car's position is measured in order to verify the vehicle’s particle filter-based location. Therefore, a Totalstation measurement prism was attached to the car and measured for positional ground-truth. The estimated position differs only by a RMSE of 0.2 m from the tracked ground-truth).
As per claim 9, Kabzan and Sauerbeck disclose the method of claim 8, further comprising:
for each alignment of the plurality of alignments: determining a normalized difference between the difference measurement of said each alignment and the lowest difference measurement (Sauerbeck, page 4, Longitudinal Localization, To make the localization independent of the scale, a fixed scale factor is used from the generated map to the real race track. This allows mapping of each point from the driven path to a value from zero to one (s coordinate) with a defined starting point as shown in Figure 6);
wherein generating the uncertainty value for the particular alignment is also based on the normalized difference (Sauerbeck, page 4, Longitudinal Localization, To make the localization independent of the scale, a fixed scale factor is used from the generated map to the real race track. This allows mapping of each point from the driven path to a value from zero to one (s coordinate) with a defined starting point as shown in Figure 6).
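For clarity of the mapped limitations, the difference-set and uncertainty steps recited in claims 8 and 9 can be summarized algorithmically. The following sketch is a hypothetical illustration only: the function name, the choice of RMS as the uncertainty measure, and the normalization by the lowest measurement are assumptions of this illustration, not disclosures of Kabzan or Sauerbeck.

```python
import math

def alignment_uncertainty(difference_measurements):
    """For each alignment, take the gap between its difference measurement
    and the lowest measurement (claim 8), collect the gaps into a set of
    differences, and derive an uncertainty value from that set; a
    normalized gap (claim 9) divides by the lowest measurement."""
    lowest = min(difference_measurements)
    diffs = [d - lowest for d in difference_measurements]
    normalized = [d / lowest for d in diffs] if lowest else list(diffs)
    # RMS of the gaps is one plausible uncertainty; the claims fix no formula.
    uncertainty = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return uncertainty, normalized
```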
As per claim 10, Kabzan and Sauerbeck disclose the method of claim 1, further comprising: storing ground truth data about an actual track boundary, of which the actual track boundary segment is a part; wherein the actual track boundary comprises a set of positions, each of which is associated with a longitudinal value; wherein the plurality of positions is a strict subset of the set of positions (Kabzan, page 17, Figure 20b, Comparison of stereo 3D estimate of stereo system and the ground truth; Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned; Kabzan, page 24, 4.4.2 SLAM Validation & Results, The 230 m long track shown in Section 4.4.2 is mapped with both LiDAR and vision pipeline. In addition, the car's position is measured in order to verify the vehicle's particle filter-based localization. Therefore, a Totalstation's measurement prism was attached to the car and measured for positional ground-truth; Kabzan, page 24, Figure 25: Mapping and localization comparison against ground-truth measurements; Sauerbeck, page 8, Figure 12: Difference between the longitudinal position estimate and the ground truth over a lap with 300 km/h).
As per claim 11, Kabzan and Sauerbeck disclose the method of claim 10, further comprising:
generating an uncertainty value for the particular alignment based on the set of differences (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
for a second digital image that is subsequent to the digital image, determining a search area size of the actual track boundary based on the uncertainty value; wherein a number of positions of the actual track boundary to consider for aligning a second actual track boundary segment with a second estimated track boundary segment, that is generated based on the second digital image, is directly proportional to the search area size (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones).
As per claim 12, Kabzan and Sauerbeck disclose the method of claim 10, further comprising: for a second digital image that is subsequent to the digital image, determining a search area size of the actual track boundary based on an approximate position of the moving object and a known error of the approximate position (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m; Kabzan, page 24, 4.4.2 SLAM Validation & Results, Multiple results were already presented in Section 3.1 as well as Section 3.2 which show the estimated map with comparison to ground-truth measurements. By purely vision-based mapping, a RMSE of 0.25 m between estimated map and GT … The 230 m long track shown in Section 4.4.2 is mapped with both LiDAR and vision pipeline. In addition, the car's position is measured in order to verify the vehicle's particle filter-based localization. Therefore, a Totalstation's measurement prism was attached to the car and measured for positional ground-truth. The estimated position differs only by a RMSE of 0.2 m from the tracked ground-truth).
As per claim 13, Kabzan and Sauerbeck disclose the method of claim 10, further comprising: for a second digital image that is subsequent to the digital image, determining a search area size of the actual track boundary based on an amount of time that has passed since the moving object began moving in a controlled environment (Sauerbeck, page 4, Longitudinal Localization, the longitudinal tracking of the vehicle position is based on camera images … a slow run around the track is carried out at a constant low speed. The map generated in this way includes found and matched landmarks and corresponding keyframes … To calculate the s coordinate of a keyframe KFn, the distance along the track from the starting point (s = 0) to the keyframe KFn is divided by the total track length (Equation 2) … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4 [time between acquisition of keyframes corresponds to amount of time that has passed]).
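The s-coordinate computation quoted from Sauerbeck (Equations 2 through 4) can be paraphrased as the sketch below. This is an illustration of the quoted description only; the function names and the clamping of the projection onto the keyframe segment are assumptions of this sketch, not Sauerbeck's implementation.

```python
def keyframe_s(dist_from_start, track_length):
    # Per the quoted Equation 2: the s coordinate of a keyframe is its
    # distance along the track from the start (s = 0) divided by the
    # total track length, yielding a value from zero to one.
    return dist_from_start / track_length

def current_s(pos_xy, kf_a, kf_b, s_a, s_b):
    # Per the quoted Equations 3-4: project the current xy position
    # estimate onto the straight between consecutive keyframes, then
    # interpolate the s coordinate along that segment.
    ax, ay = kf_a
    bx, by = kf_b
    px, py = pos_xy
    seg = (bx - ax, by - ay)
    seg_len2 = seg[0] ** 2 + seg[1] ** 2
    t = ((px - ax) * seg[0] + (py - ay) * seg[1]) / seg_len2
    t = max(0.0, min(1.0, t))  # clamp to the segment (an assumption)
    return s_a + t * (s_b - s_a)
```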
As per claim 14, Kabzan discloses one or more non-transitory storage media storing instructions which, when executed by one or more computing devices (Kabzan, page 7, 2.2 Hardware Concept – The Racecar, To execute the required computations, two computing units were installed (see Figure 3b). Both units have a CPU and a GPU. One is a rugged industrial computer ... and the other is an efficient embedded unit), cause:
generating an estimated track boundary segment based on a digital image associated with a moving object (Kabzan, page 6, 2.2 Hardware Concept – The Racecar, In order to make the vehicle race fully autonomously, sensors allowing for environment perception need to be added … The second sensor modality is image-based. Three CMOS cameras with global shutter are used in a stereo and mono setup, see Figure 3a. The stereo setup is intended to deal with cones in close proximity whereas the mono camera is used for detecting cones further away … Considering the vantage point, the cameras were placed on the main roll hoop, above the driver's seat in the car, see Figure 4; Kabzan, page 8, 2.3 Software Concept, Motion Estimation and Mapping: Its goal is to map the track, detect the boundaries and estimate the physical state of the car. The inputs are the observed cone positions from Perception (camera and LiDAR pipeline) ... three elements are needed. First, a velocity estimation ... Second, a SLAM algorithm, that uses cone position estimates from Perception as landmarks, and the integrated velocity estimate as a motion model. Finally, and only for the SLAM mode, a boundary estimation algorithm is used to classify cones as right or left boundary [a segment corresponds to the area between two cones, see page 17, Figure 20b]);
for each position of a plurality of positions in an actual track boundary segment pertaining to a track for one or more moving objects: based on said each position, making an alignment of the estimated track boundary segment with the actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m);
based on the alignment, generating a difference measurement between the estimated track boundary segment and a portion of the actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m [Figure 20b depicts the difference between the estimated and the actual position for each segment around the entire track]);
selecting a particular alignment, of a plurality of alignments, that is associated with the lowest difference measurement among the plurality of positions (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a. As expected, the error increases with distance to the cone ... An experiment was conducted to compare the 3D positions of cones estimated via PnP and those estimated via triangulation. For this experiment setup, cones were placed in front of the car at various distances and then images were actively acquired. The estimated variance of each cone was observed over time and plotted as a box plot in Figure 19b; Kabzan, page 25, 4.4.3 Boundary Estimation Validation & Results, The system is robust when driving through rule compliant tracks. However, in case of distances larger than 5 m between two consecutive cones, we observed an increased error in predicting the track. In Figure 26b we show the classification success rate from the FSG trackdrive competition. For this track, only 4.2% of all iterations yield a path which is not inside the real track. Furthermore, the mispredicted paths only leave the real track at more than 7 m away from the car).
Kabzan further discloses (Kabzan, page 8, 2.3 Software Concept, To reach the full potential of the car, it is estimated that the track must be known for at least 2 s ahead of the vehicle's current position ... Thus, a map of the track is built, such that once the track is known, its shape ahead can also be anticipated by estimating the car's pose in the map), but does not explicitly disclose the following limitations as further recited; however, Sauerbeck discloses:
based on the particular alignment, determining a longitudinal value of the moving object (Sauerbeck, page 1, Introduction, One of the crucial tasks of the autonomous driving software stack is the localization of the ego vehicle … First, the pre-known track is represented in a trackbound 2D coordinate frame. Second, the localization is implemented in two steps running in parallel. The pose of the ego is estimated in reference to the SAFER (Steel and Foam Energy Reduction) Barrier on the outside of the track … Third, the absolute position of the ego along the centerline is estimated via a visual Simultaneous Localization and Mapping (SLAM) algorithm using data from the front center camera ... this information is put together and transformed to 2D Cartesian coordinates ... To track the longitudinal progress along the track, we need far-off features that are used in a visual SLAM; Sauerbeck, page 3, Concept, The longitudinal position along the track is estimated via a camera-based visual SLAM approach based on the open-source OpenVSLAM ... To decouple the longitudinal and lateral position of the vehicle along the track ... We use a coordinate system that defines the longitudinal coordinate along the track centerline and the lateral coordinate orthogonally as shown in Figure 3; Sauerbeck, page 4, Longitudinal Localization, the longitudinal tracking of the vehicle position is based on camera images … we only rely on one front-facing monocular camera ... For map generation, a slow run around the track is carried out at a constant low speed. The map generated in this way includes found and matched landmarks and corresponding keyframes. An exemplary map created by the enhanced OpenVSLAM of the IMS is shown in Figure 5 ... 
The dimensions (x, y position of the inner and outer bounds) of the IMS track are given and thus the length of the centerline can be easily calculated … This allows mapping of each point from the driven path to a value from zero to one (s coordinate) with a defined starting point as shown in Figure 6 ... To calculate the s coordinate of a keyframe KFn, the distance along the track from the starting point (s = 0) to the keyframe KFn is divided by the total track length (Equation 2) … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track [the longitudinal location of the vehicle is calculated for every track segment as shown in Figure 6]).
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to combine the teachings of Sauerbeck with Kabzan because they are in the same field of endeavor. One skilled in the art would have been motivated to substitute the longitudinal tracking with respect to the track boundary as taught by Sauerbeck for the pose estimation as taught by Kabzan as an alternate means to localize the vehicle on the track (Sauerbeck, page 4, Longitudinal Localization).
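The align-and-select steps mapped above for claim 14 (aligning the estimated segment at each candidate position on the actual boundary and selecting the alignment with the lowest difference measurement) can be sketched generically. RMSE is used as the difference measurement, consistent with the quoted passages; the sliding-window formulation, the point-list representation, and the names are assumptions of this illustration, not disclosures of the cited references.

```python
def best_alignment(estimated_seg, actual_boundary):
    """Slide the estimated segment along the actual boundary: for each
    candidate start position, score the alignment by RMSE between
    corresponding points, and keep the lowest-scoring alignment."""
    n = len(estimated_seg)
    best = None
    for start in range(len(actual_boundary) - n + 1):
        window = actual_boundary[start:start + n]
        rmse = (sum((ex - ax) ** 2 + (ey - ay) ** 2
                    for (ex, ey), (ax, ay) in zip(estimated_seg, window))
                / n) ** 0.5
        if best is None or rmse < best[1]:
            best = (start, rmse)
    return best  # (index of the selected alignment, its difference measurement)
```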
As per claim 15, Kabzan and Sauerbeck disclose the one or more storage media of claim 14, wherein determining the longitudinal value comprises:
based on the particular alignment, determining a point on the actual track boundary segment that is associated with the particular alignment of the estimated track boundary segment with the actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP); Kabzan, page 17, Figure 20b; Kabzan, page 24, Figure 25);
identifying the longitudinal value that is associated with the point (Sauerbeck, page 4, Longitudinal Localization, An exemplary map created by the enhanced OpenVSLAM of the IMS is shown in Figure 5 ... The dimensions (x, y position of the inner and outer bounds) of the IMS track are given and thus the length of the centerline can be easily calculated … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4; Sauerbeck, page 8, Table 1: Results of longitudinal localization; Sauerbeck, page 8, Table 2: Results of boundary localization; Sauerbeck, page 8, Figure 12: Difference between the longitudinal position estimate and the ground truth over a lap with 300 km/h). The motivation would be the same as above in claim 14.
As per claim 16, Kabzan and Sauerbeck disclose the one or more storage media of claim 14, wherein the estimated track boundary segment is a first estimated track boundary segment that is associated with a first boundary of the track, wherein the particular alignment is a first particular alignment, wherein the longitudinal value is a first longitudinal value, wherein the instructions, when executed by the one or more computing devices, further cause:
generating a second estimated track boundary segment based on the digital image (Kabzan, page 6, 2.2 Hardware Concept – The Racecar, In order to make the vehicle race fully autonomously, sensors allowing for environment perception need to be added … The second sensor modality is image-based. Three CMOS cameras with global shutter are used in a stereo and mono setup, see Figure 3a. The stereo setup is intended to deal with cones in close proximity whereas the mono camera is used for detecting cones further away … Considering the vantage point, the cameras were placed on the main roll hoop, above the driver's seat in the car, see Figure 4; Kabzan, page 8, 2.3 Software Concept, Motion Estimation and Mapping: Its goal is to map the track, detect the boundaries and estimate the physical state of the car. The inputs are the observed cone positions from Perception (camera and LiDAR pipeline) ... three elements are needed. First, a velocity estimation ... Second, a SLAM algorithm, that uses cone position estimates from Perception as landmarks, and the integrated velocity estimate as a motion model. Finally, and only for the SLAM mode, a boundary estimation algorithm is used to classify cones as right or left boundary [a segment corresponds to the area between two cones, see page 17, Figure 20b]);
for each position of a second plurality of positions in a second actual track boundary segment pertaining to the track: based on each position, making a second alignment of the second estimated track boundary segment with the second actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m);
based on the second alignment, generating a second difference measurement between the second estimated track boundary segment and a portion of the second actual track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a … we illustrate the quality of a map generated using cones from the vision pipeline. We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP). The Root Mean Square Error (RMSE) between cone estimates and its corresponding ground-truth is estimated to be 0.25 m [Figure 20b depicts the difference between the estimated and the actual position for each segment around the entire track]);
selecting a second particular alignment, of a second plurality of alignments, that is associated with the lowest difference measurement among the second plurality of positions (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We compare the 3D positions estimates from the vision pipeline with those from an accurate range sensor such as the LiDAR, as can be seen in Figure 19a. As expected, the error increases with distance to the cone ... An experiment was conducted to compare the 3D positions of cones estimated via PnP and those estimated via triangulation. For this experiment setup, cones were placed in front of the car at various distances and then images were actively acquired. The estimated variance of each cone was observed over time and plotted as a box plot in Figure 19b; Kabzan, page 25, 4.4.3 Boundary Estimation Validation & Results, The system is robust when driving through rule compliant tracks. However, in case of distances larger than 5 m between two consecutive cones, we observed an increased error in predicting the track. In Figure 26b we show the classification success rate from the FSG trackdrive competition. For this track, only 4.2% of all iterations yield a path which is not inside the real track. Furthermore, the miss predicted paths only leave the real track at more than 7 m away from the car);
based on the second particular alignment, determining a second longitudinal value of the moving object (Sauerbeck, page 1, Introduction, One of the crucial tasks of the autonomous driving software stack is the localization of the ego vehicle … First, the pre-known track is represented in a trackbound 2D coordinate frame. Second, the localization is implemented in two steps running in parallel. The pose of the ego is estimated in reference to the SAFER (Steel and Foam Energy Reduction) Barrier on the outside of the track … Third, the absolute position of the ego along the centerline is estimated via a visual Simultaneous Localization and Mapping (SLAM) algorithm using data from the front center camera ... this information is put together and transformed to 2D Cartesian coordinates ... To track the longitudinal progress along the track, we need far-off features that are used in a visual SLAM; Sauerbeck, page 3, Concept, The longitudinal position along the track is estimated via a camera-based visual SLAM approach based on the open-source OpenVSLAM ... To decouple the longitudinal and lateral position of the vehicle along the track ... We use a coordinate system that defines the longitudinal coordinate along the track centerline and the lateral coordinate orthogonally as shown in Figure 3; Sauerbeck, page 4, Longitudinal Localization, the longitudinal tracking of the vehicle position is based on camera images … we only rely on one front-facing monocular camera ... For map generation, a slow run around the track is carried out at a constant low speed. The map generated in this way includes found and matched landmarks and corresponding keyframes. An exemplary map created by the enhanced OpenVSLAM of the IMS is shown in Figure 5 ... 
The dimensions (x, y position of the inner and outer bounds) of the IMS track are given and thus the length of the centerline can be easily calculated … This allows mapping of each point from the driven path to a value from zero to one (s coordinate) with a defined starting point as shown in Figure 6 ... To calculate the s coordinate of a keyframe KFn, the distance along the track from the starting point (s = 0) to the keyframe KFn is divided by the total track length (Equation 2) … The current position s is calculated by projecting the current position estimate in xy coordinates posxy on the straight between two keyframes KFn and KFn+1. The method according to Equation 3 is illustrated in Figure 7. With this projected vector, the s coordinate can be calculated using Equation 4; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track [the longitudinal location of the vehicle is calculated for every track segment as shown in Figure 6]).
As per claim 17, Kabzan and Sauerbeck disclose the one or more storage media of claim 16, wherein the instructions, when executed by the one or more computing devices, further cause: generating a final longitudinal value of the moving object based on the first longitudinal value and the second longitudinal value (Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track).
As per claim 18, Kabzan and Sauerbeck disclose the one or more storage media of claim 17, wherein the instructions, when executed by the one or more computing devices, further cause:
determining whether the moving object is closer to the first estimated track boundary segment or to the second estimated track boundary segment (Kabzan, pages 16-17, 3.3.2 Camera-based Results, We measured a ground truth of cone positions with the Leica Totalstation, shown in Figure 20b, as well as created a map with our SLAM algorithm (see Section 4.2) using only vision-pipeline cone estimates. These two maps were then aligned using the Iterative Closest Point (ICP); Kabzan, page 17, Figure 20b; Kabzan, page 24, Figure 25);
wherein generating the final longitudinal value comprises applying more weight to the first longitudinal value than to the second longitudinal value in response to determining that the moving object is closer to the first estimated track boundary segment than to the second estimated track boundary segment (Kabzan, page 21, Localization, After finishing the first lap, a lap closure is detected and the operational mode switches from SLAM to Localization mode. The track boundaries are then identified and the map is no longer updated. The map of the most likely particle (highest weight) is taken as the map for subsequent laps. In order to localize the car in the created map, Monte Carlo Localization is used. A smooth pose update is given through the computation of the mean over all particles; Kabzan, page 22, 4.3.1 Path Planning Algorithm. Starting from the position of the car, the paths always connect to the next center points of the surrounding edges. The resulting paths describe all possible center lines given the observed cones and the current car positions ... The final step of the algorithm computes a cost for each path generated in the second step and selects the one with the lowest cost as the most likely track center line ... To get the cost of a path, the five costs are first computed, then normalized, squared, weighted and finally added together. The best path is then selected as the estimate of the center line as well as the corresponding boundaries; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track).
As per claim 19, Kabzan and Sauerbeck disclose the one or more storage media of claim 17, wherein the instructions, when executed by the one or more computing devices, further cause:
generating a first uncertainty value that is associated with the first longitudinal value (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
generating a second uncertainty value that is associated with the second longitudinal value (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
wherein generating the final longitudinal value is also based on the first uncertainty value and the second uncertainty value (Kabzan, page 8, 2.3 Software Concept, To reach the full potential of the car, it is estimated that the track must be known for at least 2 s ahead of the vehicle's current position ... Thus, a map of the track is built, such that once the track is known, its shape ahead can also be anticipated by estimating the car's pose in the map; Sauerbeck, page 3, Concept, The longitudinal position along the track is estimated via a camera-based visual SLAM approach … We use a coordinate system that defines the longitudinal coordinate along the track centerline and the lateral coordinate orthogonally as shown in Figure 3; Sauerbeck, page 4, Longitudinal Localization, First validations with different speeds showed a longitudinal deviation of up to 40 meters. The reason for this was found within the generated map which did not depict the reality with enough precision. To lower the error a map correction was implemented. With the information of the mapping run at constant speed, the calculation of keyframes to the scaled longitudinal coordinate was corrected).
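The uncertainty-based combination recited in claims 17 and 19 can be sketched as inverse-uncertainty weighting of the two longitudinal values. This particular weighting rule is an assumption chosen for illustration: the claims recite weighting by uncertainty but no specific formula, and neither reference is quoted as disclosing this exact rule.

```python
def fuse_longitudinal(s1, u1, s2, u2):
    """Combine two longitudinal values into a final value, giving more
    weight to the value with the lower uncertainty (inverse-uncertainty
    weighting; one plausible realization, not the claimed formula)."""
    w1, w2 = 1.0 / u1, 1.0 / u2
    return (w1 * s1 + w2 * s2) / (w1 + w2)
```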
As per claim 20, Kabzan and Sauerbeck disclose the one or more storage media of claim 17, wherein the instructions, when executed by the one or more computing devices, further cause:
determining an uncertainty of the first particular alignment of the first estimated track boundary segment and an uncertainty of the second particular alignment of the second estimated track boundary segment (Kabzan, page 13, 3.2.2 Multiple Object Detection, To estimate the 3D positions of multiple cones from a single image, it is necessary to detect these cones and classify them as blue, yellow or orange. A real-time and powerful object detector in the form of YOLOv2 was trained on these three different types of cones. On being fed with the acquired images, this YOLOv2 returns the bounding box positions around the detected cones along with the confidence scores for each detection. This object detector network was chosen ... due to the existence of pre-trained weights);
wherein generating the final longitudinal value comprises applying more weight to the first longitudinal value than to the second longitudinal value in response to determining which alignment is more uncertain than the other (Kabzan, page 21, Localization, After finishing the first lap, a lap closure is detected and the operational mode switches from SLAM to Localization mode. The track boundaries are then identified and the map is no longer updated. The map of the most likely particle (highest weight) is taken as the map for subsequent laps. In order to localize the car in the created map, Monte Carlo Localization is used. A smooth pose update is given through the computation of the mean over all particles; Kabzan, page 22, 4.3.1 Path Planning Algorithm. Starting from the position of the car, the paths always connect to the next center points of the surrounding edges. The resulting paths describe all possible center lines given the observed cones and the current car positions ... The final step of the algorithm computes a cost for each path generated in the second step and selects the one with the lowest cost as the most likely track center line ... To get the cost of a path, the five costs are first computed, then normalized, squared, weighted and finally added together. The best path is then selected as the estimate of the center line as well as the corresponding boundaries; Sauerbeck, page 4, Figure 6: Illustration of the scaled map to extract the longitudinal position along the track).
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRACY MANGIALASCHI whose telephone number is (571) 270-5189. The examiner can normally be reached M-F, 9:30 AM to 6:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vu Le can be reached at (571) 272-7332. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TRACY MANGIALASCHI/Examiner, Art Unit 2668