DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 29 September 2025 has been entered.
Status of Claims
This Office Action is in response to the applicant’s response/amendment of 29 September 2025.
Claims 79-80, 82, 88, 90-91, 95-96, 99, 111-112, 114, 116, 119, 121, 123, 135, 137, and 139-148 are currently pending and addressed below.
Response to Arguments
Applicant’s arguments/amendments with respect to the objection to claim 123 have been fully considered and are persuasive. Therefore, the objection to claim 123 has been withdrawn.
Applicant’s arguments/amendments with respect to the rejection of claims under 35 U.S.C. 112(b) have been fully considered and are persuasive. Therefore, the rejection of claims under 35 U.S.C. 112(b) has been withdrawn.
Applicant's arguments/amendments with respect to the rejection of claims under 35 U.S.C. 101 have been fully considered but they are not persuasive.
Specifically, applicant argued:
The Office oversimplifies the analysis under Step 2A by asserting, without explanation, …Applicant’s amended claims are not directed to “an abstract idea” under Step 2A, but rather are rooted in a technical solution for collecting data for aligning drive information along a road segment using forward- and rearward-facing cameras onboard a host vehicle…Amended claim 79 recites steps for applying a trained machine learning model to identify and correlate feature points within images to improve vehicle navigation maps. A driver or another human being cannot reasonably access captured image data in a vehicle navigation system, nor can they analyze this electronic data using machine learning models-in their mind-to identify semantic features. “Claims do not recite a mental process when they do not contain limitations that can practically be performed in the human mind, for instance when the human mind is not equipped to perform the claim limitations…The human mind is not equipped to process the electronic data in the manner recited in amended claim 79.
As explained in the specification, "[two-dimensional] points may not be sufficient for aligning multiple drives in different directions because the same road segment may look completely different when viewed from the same direction." See, e.g., as-filed specification, [0472]. "For example, the same road sign when viewed from one direction may look completely different from the opposite direction" and thus "it may be difficult for a system to correlate points representing the road sign from one direction, with points representing the sign collected from the other direction." Id. The system of claim 79 addresses this issue by using a host vehicle to "capture images of a front side of an object using a forward facing camera and later capture images of the back side of an object." Id., [0473]. Accordingly, "[f]eature points associated with the front side of the object may then be correlated with feature points associated with the rear side of the object, based on analysis of the images." Id. Drive information subsequently captured by vehicles traveling in opposite directions can then be aligned based on these features points because "both sets of feature points were captured along the same trajectory." Id., [0489]. The system of claim 79 thus overcomes technical difficulties by enabling more accurate alignment of drive information captured by vehicles traveling in opposite directions. Accordingly, claim 79 recites specific steps performed by a navigation system of a vehicle to collect a particular form of drive information used by a remotely located entity (e.g., a server). Claim 79 "covers a particular solution to a problem or a particular way to achieve a desired outcome" and does not "merely claim[] the idea of a solution or outcome," as cautioned against in the August 2025 Memo. August 2025 Memo, 4. Such practical applications as claimed are patent-eligible.
The Examiner’s response:
Applicant asserts “The Office oversimplifies the analysis under Step 2A by asserting, without explanation, …Amended claim 79 recites steps for applying a trained machine learning model to identify and correlate feature points within images to improve vehicle navigation maps. A driver or another human being cannot reasonably access captured image data in a vehicle navigation system, nor can they analyze this electronic data using machine learning models-in their mind-to identify semantic features. “Claims do not recite a mental process when they do not contain limitations that can practically be performed in the human mind, for instance when the human mind is not equipped to perform the claim limitations…The human mind is not equipped to process the electronic data in the manner recited in amended claim 79.” However, the Examiner respectfully disagrees. The Examiner submits that the limitations “detect a first semantic feature represented in the first image…”, “identify, using at least one trained machine learning model, at least one position descriptor associated with the first semantic feature represented in the first image…”, “detect a second semantic feature represented in the second image…”, “identify, using at least one trained machine learning model, at least one position descriptor associated with the second semantic feature…”, and “determine, based on the at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature, and the position information, a position of the first semantic feature relative to the second semantic feature”, under their broadest reasonable interpretation, can reasonably be performed by a human mentally or with the aid of pen and paper. The recited use of a “trained machine learning model” appears to be an “apply it” limitation recited at a high level of generality, since the limitation invokes computers or other machinery merely as a tool to perform an existing process – simply adding a general-purpose computer or computer components after the fact to an abstract idea. The Examiner further notes that applicant’s arguments are not commensurate with the scope of the claim because the “electronic [image] data” is not apparently claimed.
Applicant asserts “The system of claim 79 addresses this issue by using a host vehicle to "capture images of a front side of an object using a forward facing camera and later capture images of the back side of an object." Id., [0473]. Accordingly, "[f]eature points associated with the front side of the object may then be correlated with feature points associated with the rear side of the object, based on analysis of the images." Id. Drive information subsequently captured by vehicles traveling in opposite directions can then be aligned based on these features points because "both sets of feature points were captured along the same trajectory." Id., [0489]. The system of claim 79 thus overcomes technical difficulties by enabling more accurate alignment of drive information captured by vehicles traveling in opposite directions. Accordingly, claim 79 recites specific steps performed by a navigation system of a vehicle to collect a particular form of drive information used by a remotely located entity (e.g., a server). Claim 79 "covers a particular solution to a problem or a particular way to achieve a desired outcome" and does not "merely claim[] the idea of a solution or outcome," as cautioned against in the August 2025 Memo. August 2025 Memo, 4. Such practical applications as claimed are patent-eligible.” However, the Examiner respectfully disagrees. Applicant’s arguments are not commensurate with the scope of the claim because “Drive information subsequently captured by vehicles traveling in opposite directions can then be aligned based on these features points because 'both sets of feature points were captured along the same trajectory'” is not apparently claimed. In addition, the Examiner submits that the limitations “receiving a first image captured by a forward-facing camera…”, “receiving a second image captured by a rearward-facing camera…”, and “align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles…” are recited at a high level of generality and amount to mere pre- or post-solution actions, which is a form of insignificant extra-solution activity. Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The Examiner notes that there is no transformation of a particular article to a different state or thing recited in the claim. Further, aligning “driving information” is mere manipulation of information and not a transformation of a particular article. Furthermore, the cameras (e.g., the forward-facing camera and the rearward-facing camera) are used to capture/gather data (e.g., “a first image”, “a second image”); therefore, these additional limitations are directed to insignificant extra-solution activity.
Applicant’s arguments with respect to the rejection of claims under 35 U.S.C. 103 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 79-80, 82, 88, 90-91, 95-96, 99, 111-112, 114, 116, 119, 121, 123, 135, 137, and 139-148 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 79:
Step 1: Statutory Category - Yes
The claim is directed toward a system, which falls within one of the four statutory categories. MPEP 2106.03.
Step 2A Prong 1: Judicial Exception – Yes
Independent claim 79 includes limitations that recite an abstract idea. The claim recites “detect a first semantic feature represented in the first image…”, “identify, …, at least one position descriptor associated with the first semantic feature represented in the first image…”, “detect a second semantic feature represented in the second image…”, “identify, …, at least one position descriptor associated with the second semantic feature…”, and “determine, based on the at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature, and the position information, a position of the first semantic feature relative to the second semantic feature”, which, given their broadest reasonable interpretation, cover performance of the limitations in the human mind. For example, the “detect a first semantic feature…” and “detect a second semantic feature…” steps, in the context of this claim, encompass a human analyzing one or more images and interpreting elements such as objects, actions/events, and the image’s context to understand what the image represents.
Step 2A Prong 2: Practical Application – No
Claim 79 is evaluated as to whether, as a whole, it integrates the recited judicial exception into a practical application. As noted in the 2019 PEG, it must be determined whether any additional elements in the claim beyond the abstract idea integrate the exception into a practical application in a manner that imposes a meaningful limit on the judicial exception. The courts have indicated that additional elements that merely use a computer to implement an abstract idea, add insignificant extra-solution activity, or generally link use of a judicial exception to a particular technological environment or field of use do not integrate a judicial exception into a “practical application”.
The claim does not include additional elements that are sufficient to integrate the judicial exception into a practical application. For example, the claimed elements “receive a first image captured by a forward-facing camera onboard the host vehicle…”, “receive a second image captured by a rearward-facing camera onboard the host vehicle…”, “receive position information indicative of a position of the forward-facing camera when the first image was captured and indicative of a position of the rearward-facing camera when the second image was captured;”, “cause transmission of drive information for the road segment to an entity remotely-located relative to the host vehicle…”, “receive, in addition to the drive information transmitted by the host vehicle, drive information from a first plurality of other vehicles…”, and “align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles…” are recited at a high level of generality and amount to mere pre- or post-solution actions, which is a form of insignificant extra-solution activity. The additional elements of a “forward-facing camera” and a “rearward-facing camera” are recited at a high level of generality and merely link the abstract idea to a particular technological environment. Additionally, the “at least one processor”, “the entity remotely-located”, and “trained machine learning model” are recited at a high level of generality and amount to no more than mere instructions to apply the exception using a generic computer. The claim is recited at a high level of generality and merely automates the aforementioned steps.
Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B:
Claim 79 is evaluated as to whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim.
The claim does not include additional elements that are sufficient to provide an inventive concept in Step 2B. For example, the claimed elements “receive a first image captured by a forward-facing camera onboard the host vehicle…”, “receive a second image captured by a rearward-facing camera onboard the host vehicle…”, “receive position information indicative of a position of the forward-facing camera when the first image was captured and indicative of a position of the rearward-facing camera when the second image was captured;”, “cause transmission of drive information for the road segment to an entity remotely-located relative to the host vehicle…”, “receive, in addition to the drive information transmitted by the host vehicle, drive information from a first plurality of other vehicles…”, and “align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles…” are well-understood, routine, and conventional activity in the art. See MPEP 2106.05(d), II: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information);”.
As discussed with respect to Step 2A Prong 2, the additional elements of a “forward-facing camera” and a “rearward-facing camera” are recited at a high level of generality and merely link the abstract idea to a particular technological environment. Additionally, the “at least one processor”, “the entity remotely-located”, and “trained machine learning model” are recited at a high level of generality and amount to no more than mere instructions to apply the exception using a generic computer. The claim is recited at a high level of generality and merely automates the aforementioned steps. Accordingly, the claim is not patent eligible.
Regarding claims 99 and 135, claim 99 recites a method and claim 135 recites a non-transitory computer-readable medium, which fall within at least one of the four statutory categories. Claims 99 and 135 recite similar limitations as indicated above with respect to claim 79. Hence, these claims are not eligible for the same reasons as discussed above with respect to claim 79. All other limitations not discussed are the same as those discussed above with respect to claim 79. Discussion is omitted for brevity.
Claims 80, 82, 88, 90-91, 95-96, 139-140, 143-144, and 147-148 are also rejected under 35 U.S.C. 101 by virtue of their dependency from the independent claims.
Claims 80, 82, 88, 90-91, 95-96, 139-140, 143-144, and 147-148 do not recite additional elements that integrate the judicial exception into a practical application, because the additional elements are directed toward additional aspects of the judicial exception and/or well-understood, routine, and conventional additional elements that do not integrate the judicial exception into a practical application. For example, claim 82 recites “wherein the X-Y-Z position is determined based on an ego motion of the host vehicle and based on at least one of:…”, which furthers the abstract idea.
The dependent claims are rejected under 35 U.S.C. 101 under similar rationale as their independent claims.
Regarding claim 111:
Step 1: Statutory Category - Yes
The claim is directed toward a system, which falls within one of the four statutory categories. MPEP 2106.03.
Step 2A Prong 1: Judicial Exception – Yes
Independent claim 111 includes limitations that recite an abstract idea. The claim recites “detect at least one object represented in the first image”, “identify, …, at least one front side two-dimensional feature point,…”, “detect a representation of the at least one object in the second image”, “identify, …, at least one rear side two-dimensional feature point,…”, and “determine, based on at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature, and the position information that the first semantic feature and the second semantic feature are associated with a common object”, which, given their broadest reasonable interpretation, cover performance of the limitations in the human mind. For example, the “detect at least one object represented in the first image” and “detect a representation of the at least one object in the second image” steps, in the context of this claim, encompass a human analyzing one or more images and interpreting elements such as objects, actions/events, and the image’s context to understand what the image represents.
Step 2A Prong 2: Practical Application – No
Claim 111 is evaluated as to whether, as a whole, it integrates the recited judicial exception into a practical application. As noted in the 2019 PEG, it must be determined whether any additional elements in the claim beyond the abstract idea integrate the exception into a practical application in a manner that imposes a meaningful limit on the judicial exception. The courts have indicated that additional elements that merely use a computer to implement an abstract idea, add insignificant extra-solution activity, or generally link use of a judicial exception to a particular technological environment or field of use do not integrate a judicial exception into a “practical application”.
The claim does not include additional elements that are sufficient to integrate the judicial exception into a practical application. For example, the claimed elements “receive a first image captured by a forward-facing camera onboard the host vehicle…”, “receive a second image captured by a rearward-facing camera onboard the host vehicle…”, “receive position information indicative of a position of the forward-facing camera when the first image was captured and indicative of a position of the rearward-facing camera when the second image was captured;”, “cause transmission of drive information for the road segment to an entity remotely-located relative to the host vehicle…”, “receive, in addition to the drive information transmitted by the host vehicle, drive information from a first plurality of other vehicles that travel along the road segment in the first direction along with drive information from a second plurality of other vehicles…”, and “align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles…” are recited at a high level of generality and amount to mere pre- or post-solution actions, which is a form of insignificant extra-solution activity. The additional elements of a “forward-facing camera” and a “rearward-facing camera” are recited at a high level of generality and merely link the abstract idea to a particular technological environment. Additionally, the “at least one processor”, “the entity remotely-located”, and “trained machine learning model” are recited at a high level of generality and amount to no more than mere instructions to apply the exception using a generic computer. The claim is recited at a high level of generality and merely automates the aforementioned steps.
Accordingly, even in combination, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea.
Step 2B:
Claim 111 is evaluated as to whether the claim as a whole amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim.
The claim does not include additional elements that are sufficient to provide an inventive concept in Step 2B. For example, the claimed elements “receive a first image captured by a forward-facing camera onboard the host vehicle…”, “receive a second image captured by a rearward-facing camera onboard the host vehicle…”, “receive position information indicative of a position of the forward-facing camera when the first image was captured and indicative of a position of the rearward-facing camera when the second image was captured;”, “cause transmission of drive information for the road segment to an entity remotely-located relative to the host vehicle…”, “receive, in addition to the drive information transmitted by the host vehicle, drive information from a first plurality of other vehicles that travel along the road segment in the first direction along with drive information from a second plurality of other vehicles…”, and “align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles…” are well-understood, routine, and conventional activity in the art. See MPEP 2106.05(d), II: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. Receiving or transmitting data over a network, e.g., using the Internet to gather data, Symantec, 838 F.3d at 1321, 120 USPQ2d at 1362 (utilizing an intermediary computer to forward information);”.
As discussed with respect to Step 2A Prong 2, the additional elements of a “forward-facing camera” and a “rearward-facing camera” are recited at a high level of generality and merely link the abstract idea to a particular technological environment. Additionally, the “at least one processor”, “the entity remotely-located”, and “trained machine learning model” are recited at a high level of generality and amount to no more than mere instructions to apply the exception using a generic computer. The claim is recited at a high level of generality and merely automates the aforementioned steps. Accordingly, the claim is not patent eligible.
Regarding claims 123 and 137, claim 123 recites a method and claim 137 recites a non-transitory computer-readable medium, which fall within at least one of the four statutory categories. Claims 123 and 137 recite similar limitations as indicated above with respect to claim 111. Hence, these claims are not eligible for the same reasons as discussed above with respect to claim 111. All other limitations not discussed are the same as those discussed above with respect to claim 111. Discussion is omitted for brevity.
Claims 112, 114, 116, 119, 121, 141-142, and 145-146 are also rejected under 35 U.S.C. 101 by virtue of their dependency from the independent claims.
Claims 112, 114, 116, 119, 121, 141-142, and 145-146 do not recite additional elements that integrate the judicial exception into a practical application, because the additional elements are directed toward additional aspects of the judicial exception and/or well-understood, routine, and conventional additional elements that do not integrate the judicial exception into a practical application. For example, claim 114 recites “wherein the correlation is further based on at least one of the position information…”, which furthers the abstract idea.
The dependent claims are rejected under 35 U.S.C. 101 under similar rationale as their independent claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 79-80, 82, 88, 90-91, 95-96, 99, 111-112, 114, 116, 119, 121, 123, 135, and 139-148 are rejected under 35 U.S.C. 103 as being unpatentable over Ma et al., “Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization”, 2019 (cited in IDS filed on 05/25/2022), in view of Zou et al. (US 20170300763 A1), in view of Singh et al. (US 20220182498 A1), and further in view of Fridman (US 20180025235 A1, cited previously in the Office Action of 11/18/2024).
a. Regarding claim 79, and similarly with respect to claims 99 and 135, Ma et al. discloses A host vehicle-based sparse map feature harvester system, comprising: (Title, “Exploiting Sparse Semantic HD Maps for Self-Driving Vehicle Localization”) at least one processor comprising circuitry and a memory, wherein the memory includes instructions that when executed by the circuitry cause the at least one processor to: (Section IV(B)(b) “FFT-conv is used to accelerate the computation speed by a factor of 20 over the state-of-the-art GEMM-based spatial GPU correlation implementations”)
receive a first image captured by a forward-facing camera onboard the host vehicle, as the host vehicle travels along a road segment in a first direction, (Section IV(A) “Our localization system exploits a wide variety of sensors: GPS, IMU, wheel encoders, LiDAR, and cameras. These sensors are available in most self-driving vehicles. The GPS provides a coarse location with several meters accuracy; an IMU captures vehicle dynamic measurements; the wheel encoders measure the total travel distance; the LiDAR accurately perceives the geometry of the surrounding area through a sparse point cloud; images capture dense and rich appearance information.”) wherein the first image is representative of an environment in a forward direction relative to the first direction; (Fig. 3 “Our system detects signs in the camera images”)
detect a first semantic feature represented in the first image, wherein the first semantic feature is associated with a predetermined object type classification; (Fig. 3 “Our system detects signs in the camera images”, and section IV(A)(b) “we run an image-based semantic segmentation algorithm that performs dense semantic labeling of traffic signs.”)
identify, using at least one trained machine learning model, at least one position descriptor associated with the first semantic feature represented in the first image captured by the forward-facing camera; (Fig. 2 “We first detect signs in 2D using semantic segmentation in the camera frame, and then use the LiDAR points to localize the signs in 3D.” and Fig. 3 “an illustration of the neural network’s input and output”)
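By way of a rough conceptual sketch only (hypothetical Python structures and a pinhole-camera assumption; not Ma et al.'s actual implementation), the two-step detect-in-2D-then-localize-in-3D approach described in Ma's Figures 2-3 can be pictured as keeping the LiDAR points that project onto sign-labeled pixels:

import numpy as np

def localize_sign_3d(sign_mask, lidar_points, intrinsics):
    """sign_mask: HxW boolean mask of pixels labeled as 'sign' by semantic segmentation.
    lidar_points: Nx3 LiDAR points expressed in the camera frame.
    intrinsics: 3x3 pinhole camera matrix. Returns the 3-D centroid of sign points."""
    pix = (intrinsics @ lidar_points.T).T          # project to homogeneous pixel coordinates
    in_front = pix[:, 2] > 0                       # keep points in front of the camera
    uv = (pix[in_front, :2] / pix[in_front, 2:3]).astype(int)
    pts = lidar_points[in_front]
    h, w = sign_mask.shape
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    uv, pts = uv[inside], pts[inside]
    on_sign = sign_mask[uv[:, 1], uv[:, 0]]        # points landing on sign-labeled pixels
    return pts[on_sign].mean(axis=0) if on_sign.any() else None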
Ma et al. fails to explicitly disclose receive a second image captured by a rearward-facing camera onboard the host vehicle, as the host vehicle travels along the road segment in the first direction, wherein the second image is representative of an environment in a backward direction relative to the first direction; detect a second semantic feature represented in the second image, wherein the second semantic feature is associated with a predetermined object type classification; identify, using the at least one trained machine learning model, at least one position descriptor associated with the second semantic feature represented in the second image captured by the rearward-facing camera; cause transmission of drive information for the road segment to an entity remotely-located relative to the host vehicle, wherein the drive information includes the at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature,
Zou et al. teaches receive a second image captured by a rearward-facing camera onboard the host vehicle, (Fig. 1, 130d) as the host vehicle travels along the road segment in the first direction, wherein the second image is representative of an environment in a backward direction relative to the first direction; ( [0047] “ In particular, the method 600 provides for coordinating multi-camera fusion of images from a plurality of cameras (e.g., the cameras 130a, 130b, 130c, 130d).”, [0048] “the processing system 110 receives an image from each of the cameras 130. At block 604, for each of the cameras 130, following occurs: the top view generation engine 212 generates a top view of the road based on the image; the lane boundaries detection engine 214 detects lane boundaries of a lane of the road based on the top view; and the road feature detection engine 216 detects a road feature within the lane boundaries of the lane of the road using machine learning. The road feature detection engine can detect multiple road features.”)
detect a second semantic feature represented in the second image, wherein the second semantic feature is associated with a predetermined object type classification; ([0040] “The feature extraction 402 uses a neural network, as described herein, to extract road features, for example, using feature maps. The feature extraction 402 outputs the road features to the classification 404 to classify the road features, such as based on road features stored in the road feature database 218. The classification 404 outputs the road features 406a, 406b, 406c, 406d, etc., which can be a speed limit indicator, a bicycle lane indicator, a railroad indicator, a school zone indicator, a direction indicator, or other road feature.” and [0041] “It should be appreciated that the road feature detection engine 216, using the feature detection 402 and classification 404, can detect multiple road features (e.g., road features 406a, 406b, 406e, 406d, etc.) in parallel as one step and in real-time.”)
identify, using the at least one trained machine learning model, at least one position descriptor associated with the second semantic feature represented in the second image captured by the rearward-facing camera; ([0034] “The road feature detection engine 216 searches within the top view, as defined by the lane boundaries, to detect road features. The road feature detection engine 216 can determine a type of road feature (e.g., a straight arrow, a left-turn arrow, etc.) as well as a location of the road feature (e.g., arrow ahead, bicycle lane to the left, etc.).” and [0041] “It should be appreciated that the road feature detection engine 216, using the feature detection 402 and classification 404, can detect multiple road features (e.g., road features 406a, 406b, 406e, 406d, etc.) in parallel as one step and in real-time.”)
cause transmission of drive information for the road segment to an entity remotely-located relative to the host vehicle, (Fig. 2, [0034] “the road feature detection engine 216 uses the lane boundaries to detect road features within the lane boundaries of the lane of the road using machine learning and/or computer vision techniques. The road feature detection engine 216 searches within the top view, as defined by the lane boundaries, to detect road features. The road feature detection engine 216 can determine a type of road feature (e.g., a straight arrow, a left-turn arrow, etc.) as well as a location of the road feature (e.g., arrow ahead, bicycle lane to the left, etc.).” and [0035] “The road feature database 218 can be updated when road features are detected, and the road feature database 218 can be accessible by other vehicles, such as from a cloud computing environment over a network or from the vehicle 100 directly (e.g., using direct short-range communications (DSCR)). This enables crowd-sourcing of road features.”) wherein the drive information includes the at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature, ([0034] “The road feature detection engine 216 can determine a type of road feature (e.g., a straight arrow, a left-turn arrow, etc.) as well as a location of the road feature (e.g., arrow ahead, bicycle lane to the left, etc.).”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, with reasonable expectations of success, to modify the vehicle detection system of Ma et al. to incorporate multiple cameras, including a rearward camera, and a cloud/crowdsource system as taught by Zou et al. for the purpose of allowing the vehicle to detect features of the environment on all sides as well as inform other vehicles, increasing coverage of features.
Ma et al. in combination with Zou et al. fails to explicitly disclose receive position information indicative of a position of the forward- facing camera when the first image was captured and indicative of a position of the rearward-facing camera when the second image was captured; and wherein the drive information includes … the position information.
Singh et al. teaches receive position information indicative of a position of the forward-facing camera when the first image was captured and indicative of a position of the rearward-facing camera when the second image was captured; and wherein the drive information includes … the position information. ([0075] “The sensors may also include an image sensor (imaging means) that captures an image of surroundings of the vehicle 200. The example of FIG. 2 includes the first front camera 203, the side camera 204, the rear camera 208, and the second front camera 205, as image sensors.” and [0078] “the system 700 may receive data from each of the sensors described above. These data may be configured to allow determination or estimation of, for example, a position of each of objects around the vehicle 200 with respect to the vehicle 200 (or with respect to the corresponding one of the sensors), a distance from the vehicle 200 (or from each sensor) to the corresponding one of the objects, a type of each of the objects, and a behavior (e.g., a movement direction and speed of an object) of each of the objects.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, with reasonable expectations of success, to modify the feature location information/detection of Ma et al. in combination with Zou et al. to use the position information of the respective sensor/camera when detecting features, as taught by Singh et al., for the purpose of providing accurate and precise location information of the feature.
However, Ma et al. in combination with Zou et al. and Singh et al. fails to explicitly disclose the entity remotely-located relative to the host vehicle being configured to: determine, based on the at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature, and the position information, a position of the first semantic feature relative to the second semantic feature; receive, in addition to the drive information transmitted by the host vehicle, drive information from a first plurality of other vehicles that travel along the road segment and drive information from a second plurality of other vehicles that travel along the road segment, the drive information from the first plurality of other vehicles being representative of the environment in the backward direction relative to the first direction; and align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles based, at least in part, on the relationship between the first semantic feature and the second semantic feature.
Fridman teaches the entity remotely-located relative to the host vehicle being configured to: (Figure 12, 1230) determine, based on the at least one position descriptor associated with the first semantic feature, the at least one position descriptor associated with the second semantic feature, and the position information, a position of the first semantic feature relative to the second semantic feature; ([0017] “determining a line representation of a road surface feature extending along a road segment, where the line representation of the road surface feature is configured for use in autonomous vehicle navigation, may comprise receiving, by a server, a first set of drive data including position information associated with the road surface feature, and receiving, by a server, a second set of drive data including position information associated with the road surface feature. The position information may be determined based on analysis of images of the road segment. The method may further comprise segmenting the first set of drive data into first drive patches and segmenting the second set of drive data into second drive patches; longitudinally aligning the first set of drive data with the second set of drive data within corresponding patches; and determining the line representation of the road surface feature based on the longitudinally aligned first and second drive data in the first and second draft patches.”)
receive, in addition to the drive information transmitted by the host vehicle, drive information from a first plurality of other vehicles that travel along the road segment and drive information from a second plurality of other vehicles that travel along the road segment, (Figure 12, [0053] “a system that uses crowd sourcing data received from a plurality of vehicles for autonomous vehicle navigation”, and see at least paragraph [0277]) the drive information from the first plurality of other vehicles being representative of the environment in the backward direction relative to the first direction; and ([0278] “Each vehicle may be similar to vehicles disclosed in other embodiments (e.g., vehicle 200), and may include components or devices included in or associated with vehicles disclosed in other embodiments. Each vehicle may be equipped with an image capture device or camera (e.g., image capture device 122 or camera 122). Each vehicle may communicate with a remote server 1230 via one or more networks (e.g., over a cellular network and/or the Internet, etc.) through wireless communication paths 1235, as indicated by the dashed lines. Each vehicle may transmit data to server 1230 and receive data from server 1230. For example, server 1230 may collect data from multiple vehicles travelling on the road segment 1200 at different times, and may process the collected data to generate an autonomous vehicle road navigation model, or an update to the model. Server 1230 may transmit the autonomous vehicle road navigation model or the update to the model to the vehicles that transmitted data to server 1230.”, and see at least [0279])
align the drive information received from the first plurality of other vehicles with the drive information received from the second plurality of other vehicles based, at least in part, on the relationship between the first semantic feature and the second semantic feature. (Figure 15, [0292] “the data (e.g. ego-motion data, road markings data, and the like) may be shown as a function of position S (or S.sub.1 or S.sub.2) along the drive. Server 1230 may identify landmarks for the sparse map by identifying unique matches between landmarks 1501, 1503, and 1505 of drive 1510 and landmarks 1507 and 1509 of drive 1520. Such a matching algorithm may result in identification of landmarks 1511, 1513, and 1515. One skilled in the art would recognize, however, that other matching algorithms may be used. For example, probability optimization may be used in lieu of or in combination with unique matching. As described in further detail below with respect to FIG. 29, server 1230 may longitudinally align the drives to align the matched landmarks. For example, server 1230 may select one drive (e.g., drive 1520) as a reference drive and then shift and/or elastically stretch the other drive(s) (e.g., drive 1510) for alignment.”, and [0293] “aligned landmark data for use in a sparse map. In the example of FIG. 16, landmark 1610 comprises a road sign. The example of FIG. 16 further depicts data from a plurality of drives 1601, 1603, 1605, 1607, 1609, 1611, and 1613. In the example of FIG. 16, the data from drive 1613 consists of a “ghost” landmark, and the server 1230 may identify it as such because none of drives 1601, 1603, 1605, 1607, 1609, and 1611 include an identification of a landmark in the vicinity of the identified landmark in drive 1613.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, with reasonable expectations of success, to modify the feature location information/detection of Ma et al. in combination with Zou et al. and Singh et al. to incorporate crowdsourcing data and aligning data received from other vehicles as taught by Fridman for the purpose of allowing the vehicle to navigate one or more roads.
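For illustration only (hypothetical Python code and data; not Fridman's implementation), the longitudinal alignment described in Fridman at [0292], in which one drive is selected as a reference and another drive is shifted and elastically stretched so that matched landmarks line up, can be sketched as a simple least-squares fit:

import numpy as np

def fit_stretch_and_shift(s_reference, s_other):
    """Least-squares stretch a and shift b such that s_reference ≈ a * s_other + b,
    using longitudinal positions of landmarks matched between the two drives."""
    A = np.column_stack([s_other, np.ones_like(s_other)])
    (a, b), *_ = np.linalg.lstsq(A, s_reference, rcond=None)
    return a, b

# Longitudinal positions (meters along each drive) of the same matched landmarks.
s_reference = np.array([12.0, 55.0, 130.0, 240.0])
s_incoming = np.array([10.5, 53.0, 127.0, 236.0])
a, b = fit_stretch_and_shift(s_reference, s_incoming)
aligned = a * s_incoming + b  # the incoming drive expressed in the reference drive's frame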
b. Regarding claim 80, and similarly with respect to claim 143, Ma et al. in view of Zou et al., Singh et al., and Fridman discloses The system of claim 79,
Ma et al. discloses the at least one position descriptor associated with the first semantic feature includes an X-Y image position relative to the first image and the at least one position descriptor associated with the second semantic feature includes an X-Y image position relative to the second image, and wherein the at least one position descriptor associated with the first semantic feature and the at least one position descriptor associated with the second semantic feature each include an X-Y-Z position relative to a predetermined origin. (Fig. 2 “We first detect signs in 2D using semantic segmentation in the camera frame”). Examiner Notes: The plurality of images captures the X-Y position of the traffic signs.
Zou et al. teaches the at least one position descriptor associated with … the at least one position descriptor associated with the second semantic feature includes ([0034] “The road feature detection engine 216 can determine a type of road feature (e.g., a straight arrow, a left-turn arrow, etc.) as well as a location of the road feature (e.g., arrow ahead, bicycle lane to the left, etc.).”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, with reasonable expectations of success, to modify the vehicle detection system of Ma et al. in combination with Zou et al., Singh et al., and Fridman to incorporate multiple cameras, including a rearward camera, as taught by Zou et al. for the purpose of allowing the vehicle to detect features of the environment on all sides, increasing coverage of features. Further, it would have been obvious to one of ordinary skill in the art, when in the combination, to perform detecting features in 2D using semantic segmentation as taught by Ma et al. with the second image captured by the rearward-facing camera of Zou et al.
c. Regarding claim 82, and similarly with respect to claim 144, Ma et al. in view of Zou et al., Singh et al., and Fridman discloses The system of claim 80
Ma et al. discloses wherein the X-Y-Z position is determined based on an ego motion of the host vehicle and based on at least one of: (Fig. 2, “then use the LiDAR points to localize the signs in 3D. Mapping can aggregate information from multiple passes through the same area using the ground truth pose information”) tracking a change in image position of the first semantic feature between the first image and at least one additional image or tracking a change in image position of the second semantic feature between the second image and the at least one additional image, the ego motion being determined based on at least one of a plurality of images or an output of at least one ego motion sensor, the at least one ego motion sensor including at least one of a speedometer, an accelerometer, or a GPS receiver. (Section IV “We formulate the localization problem as a histogram filter taking as input the structured outputs of our sign and lane detection neural networks, as well as GPS, IMU, and wheel odometry information, and outputting a probability histogram over the vehicle’s pose, expressed in world coordinates.” and Section IV(A) “Let Gt be the GPS readings at time t and let L and T represent the lane graph and traffic sign maps respectively. We compute an estimate of the vehicle dynamics Xt from both IMU and the wheel encoders smoothed through an extended Kalman filter, which is updated at 100Hz. The localization task is formulated as a histogram filter aiming to maximize the agreement between the observed and mapped lane graphs and traffic signs while respecting vehicle dynamics”)
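As an illustration only of the kind of computation recited in claim 82 (hypothetical Python code; not the claimed system's or any reference's implementation), a feature's X-Y-Z position can in principle be recovered from ego motion plus the tracked change in the feature's image position, assuming a purely forward camera translation and normalized image coordinates:

import numpy as np

def xyz_from_tracked_feature(x1, y1, x2, y2, forward_m):
    """(x1, y1), (x2, y2): the feature's normalized image coordinates in the first and
    second frames. forward_m: forward ego motion between the frames, e.g., derived from
    a speedometer, accelerometer, or GPS. Returns (X, Y, Z) in the first camera frame."""
    if abs(x2 - x1) < 1e-9:
        raise ValueError("feature lies on the motion axis; depth is unobservable")
    z = forward_m * x2 / (x2 - x1)   # depth from the parallax induced by the ego motion
    return np.array([x1 * z, y1 * z, z])

# A sign 20 m ahead, 2 m to the right, 0.5 m off the optical axis, seen again after 5 m of travel:
p1 = (2 / 20, 0.5 / 20)
p2 = (2 / 15, 0.5 / 15)
print(xyz_from_tracked_feature(*p1, *p2, forward_m=5.0))  # approx. [2.0, 0.5, 20.0]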
d. Regarding claim 88, Ma et al. in view of Zou et al., Singh et al., and Fridman discloses The system of claim 79,
Zou et al. teaches wherein at least one of the first semantic feature or the second semantic feature includes one of: a front side of a speed limit sign, a yield sign, a pole, a painted directional arrow, a traffic light, a billboard, or a building. ([0034] “The road feature detection engine 216 can determine a type of road feature (e.g., a straight arrow, a left-turn arrow, etc.) as well as a location of the road feature (e.g., arrow ahead, bicycle lane to the left, etc.).”) and [0035] “The road features can be predefined in a database of road features (e.g., road feature database 218). Examples of road features include a speed limit indicator, a bicycle lane indicator, a railroad indicator, a school zone indicator, and a direction indicator (e.g., left-turn arrow, straight arrow, right-turn arrow, straight and left-turn arrow, straight and right-turn arrow, etc.), and the like. “)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, with reasonable expectations of success, to modify the semantic labeling system of features of Ma et al. in combination with Zou et al., Singh et al., and Fridman to incorporate specific labels of features, including speed limit signs, etc., as taught by Zou et al. for the purpose of precisely labeling feature types for accurate feature detection.
e. Regarding claim 90, and similarly with respect to claim 139, Ma et al. in view of Zou et al., Singh et al., and Fridman discloses The system of claim 79,
Zou et al. teaches wherein the drive information further includes one or more descriptors associated with each of the first and second semantic features. ([0034] “The road feature detection engine 216 can determine a type of road feature (e.g., a straight arrow, a left-turn arrow, etc.) as well as a location of the road feature (e.g., arrow ahead, bicycle lane to the left, etc.).”), and [0035] “The road features can be predefined in a database of road features (e.g., road feature database 218). Examples of road features include a speed limit indicator, a bicycle lane indicator, a railroad indicator, a school zone indicator, and a direction indicator (e.g., left-turn arrow, straight arrow, right-turn arrow, straight and left-turn arrow, straight and right-turn arrow, etc.), and the like.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date