Last updated: May 29, 2026
Application No. 18/508,818
VEHICLE CONTROL SYSTEMS FOR CAMERA-BASED VEHICLE NAVIGATION

Non-Final OA §102 §103
Filed
Nov 14, 2023
Examiner
PEDERSEN, DAVID RUBEN
Art Unit
3658
Tech Center
3600 — Transportation & Electronic Commerce
Assignee
GM Global Technology Operations LLC
OA Round
2 (Non-Final)
Interview Optional

Interview already conducted in this application's prosecution history. This examiner has a 56% grant rate with a +50.0% interview lift. Since an interview has already been tried, a written response with narrowed claims, based on precedent claim evolution patterns, is recommended.
Based on 105 resolved cases, 2023–2026
Examiner Intelligence

PEDERSEN, DAVID RUBEN
Career Allowance Rate: 56% of resolved cases granted (59 granted / 105 resolved), +4.2% vs TC avg
Interview Lift: +50.0% (resolved cases with interview)
Avg Prosecution (typical timeline): 3y 0m; 23 currently pending
Total Applications (career history): 136, across all art units
Statute-Specific Performance

§101: 3.2% (-36.8% vs TC avg)
§103: 87.6% (+47.6% vs TC avg)
§102: 6.1% (-33.9% vs TC avg)
§112: 3.2% (-36.8% vs TC avg)
Tech Center averages are estimates. Based on career data from 105 resolved cases.
Office Action

§102 §103
DETAILED ACTION
Claims 1-11, 13-20 are currently pending and have been examined in this application. Claim 12 is Canceled.
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is made FINAL in response to the “amendment” and “remarks” filed 09/15/2025.

Claim Objections
Claim 7 is objected to because of the following informalities:

Claim 7: Amend claim to correct apparent typographical error. 
“…process the image [[go]] to generate a saliency map…”

Appropriate correction is required.

Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. 

The following is a quotation of pre-AIA  35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art.  The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is invoked. 
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph:
(A)	the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; 
(B)	the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and 
(C)	the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. 
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. 
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. 
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier.  Such claim limitation(s) is/are: “vehicle control module” in claim 1 and repeated throughout.
Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. As such, “vehicle control module” will be interpreted as “an Application Specific Integrated Circuit (ASIC); a digital, analog, or mixed analog/digital discrete circuit; a digital, analog, or mixed analog/digital integrated circuit; a combinational logic circuit; a field programmable gate array (FPGA); a processor circuit (shared, dedicated, or group) that executes code; a memory circuit (shared, dedicated, or group) that stores code executed by the processor circuit; other suitable hardware components that provide the described functionality; or a combination of some or all of the above, such as in a system-on-chip.” (Spec Para 0104) 
If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph, applicant may:  (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA  35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


Claim(s) 1-5, 9, 11, 13-17 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hori (US20210247201) in view of Hoppenot (US20220316906).

Claim 1:
Hori explicitly teaches:
A vehicle control system for camera-based vehicle navigation, the vehicle control system comprising: at least one vehicle camera configured to capture an image of a front view from a vehicle;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)

a global positioning system (GPS) receiver configured to obtain a current location of the vehicle;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)

a vehicle user interface including a display; and
(Hori) – “The conveying of route guidance information may include highlighting the salient objects using a bounding rectangle, or other graphical elements, on a display such as, for example, an LCD display in the instrument cluster or central console. Alternately, the method of conveying may consist of generating a sentence, using, for example, a rule-based method or a machine learning-based method, which includes a set of descriptive attributes of the salient object. The generated sentence may be conveyed to the driver on the display.” (Para 0012)
“A driver control interface 310 interfaces the computer 305 to one or more driver controls 311 that may include, for example, buttons on the vehicle's steering wheel, and enable the driver to provide one form of input to the route guidance system 300. A display interface 350 interfaces the computer 305 to one or more display devices 355 that may include, for example, an instrument cluster mounted display or a center console mounted display, and enable the route guidance system to display visual output to the driver.” (Para 0051)

a vehicle control module configured to: obtain the current location of the vehicle via the GPS receiver;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)

identify a sequence of vehicle navigation steps from the current location of the vehicle to a target destination;
(Hori) – “FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
“Further, another embodiment of the present invention is based on recognition that a method for providing route guidance for a driver in a vehicle can be realized by steps of acquiring multimodal information, analyzing the acquired multimodal information, identifying one or more salient objects based on the route, and generating a sentence that provides the route guidance based on the one or more salient objects. The method may include a step of outputting the generated sentence using one or more of a speech synthesis module or a display. In this case, a route is determined based on a current location and a destination, the sentence is generated based on the acquired multimodal information and the salient objects, and the multimodal information includes information from one or more imaging devices.” (Para 0125)

capture the image of the front view of the vehicle via the at least one vehicle camera;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)

process the image with a machine learning model to detect multiple objects in the image;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“The route guidance system processes the dynamic map using a number of methods such as rule-based methods, or machine learning-based methods, in order to identify a salient object from the set of salient objects based on the route, to use as a selected salient object in order to provide route guidance.” (Para 0011)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
“The analyzing may be achieved by including one or combination of steps of detecting and classifying a plurality of objects, associating a plurality of attributes to the detected objects, detecting locations of intersections in a heading direction of the vehicle based on the route, estimating a motion trajectory of a subset of the objects, and determining spatial relationships among a subset of the detected objects where the spatial relationship indicates relative positions and orientations between the objects. In some cases, the steps of detecting and classifying of the plurality of objects can be performed by use of a machine learning-based system, and further the attributes may include one of or combination of a dominant color, a depth relative to the current location of the vehicle, the object classes of the classifying may include one or more of pedestrians, vehicles, bicycles, buildings, traffic signs.” (Para 0125)

for each of the multiple objects, generate a landmark score according to a weighted combination of a visibility score for the object, [an intuitiveness score for the object], and a uniqueness score for the object, wherein the visibility score is determined at least in part based on whether an object is partially or totally obscured from a view of a driver, [the intuitiveness score is at least based on recognition of at least one of text or a logo in the object], and the uniqueness score is determined at least in part based on a count of how many times a type of the object occurs in other locations in a scene;
(Hori) – “In FIG. 1C, the driving instruction 105 is based on a description “the silver car turning right” of a salient object 125 in the dynamic map. In FIG. 1D, the driving instruction 105 is a warning based on a description “pedestrians in crosswalk” of a set of salient objects in the dynamic map: the pedestrians 106 and the crosswalk. These objects are important from the driver perspective because they are visible to the driver of the vehicle and they are on the next portion of the route 110 of the vehicle.” (Para 0046)
“Types of attributes that salient objects may possess in the dynamic map include: class, color, dynamics (i.e., motion), shape, size, location, appearance, and depth…and size of a visible portion of the salient object, in cases where the processor determines that only part of the object is visible from the driver perspective.” (Para 0104)
“It should be noted that in some embodiments according to the present disclosure, a salient object does not need to be currently visible or perceivable to a driver in order to be relevant from the driver perspective. For example, an ambulance that is approaching a current or future location of the vehicle may be relevant for inclusion in a driving instruction, such as “Warning: ambulance approaching from behind” or “Warning: ambulance approaching from behind the blue building on the left,” even if the ambulance is not currently visible or audible to a driver of the vehicle.” (Para 0105)
“Fo…is a number that is a measure of how common a salient object of a given class is among all salient objects, where N(O) is the number of salient objects with the same class as O and H is the total number of salient objects. It is larger if there are fewer objects of the same class as O…Fc…is a number that is a measure of how common a salient object of a given color is among all salient objects, where N(c) is the number of salient objects with the same color as o and H is the total number of salient objects. It is larger if there are fewer salient objects with the same color as salient object O.” (Para 0122)
“The attentional multimodal fusion with scene understanding technology and context-based natural language generation realizes a powerful scene-aware interaction system to more intuitively interact with users based on objects and events in the scene.” (Para 0019)
“The sentence generation module 212 performs the operation of generating the driving instruction sentence 213 given the driving route 202, the viewing volume 211 and the dynamic map 206. The sentence generation module 212 uses a parametric function to select among the set of static salient objects 207 and dynamic salient objects 208 in the dynamic map 206 a small subset of objects that are most salient for generating the driving instruction sentence 213. Broadly speaking, the most salient object will tend to be larger, and more unique in color or location so as to enable the driver to quickly observe it.” (Para 0049)
“At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“The specific embodiment of sentence generation module 1243 illustrated in FIG. 12 employs a rule-based object ranker 1245 that uses a set of hand generated rules to rank the salient objects in order to output a selected salient object 1250. The rules may use to compare the set of salient objects based on their data and attributes to rank the salient objects to identify a selected salient object 1250. For example, the rules may favor dynamic objects moving the in the same direction as the vehicle. The rules may favor larger objects over smaller ones, or favor bright colors, such as red or green, over darker colors such as brown or black.” (Para 0118)
“Then the object ranker 1245 computes S for all of the salient objects and orders them from greatest to least. The selected salient object 1250 may then be determined as that salient object with the largest score S.” (Para 0123)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)
Examiner Note: Bracketed text is not explicitly taught by the primary reference, but is taught by a non-primary reference later in the rejection.
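For illustration only, the weighted combination recited in this limitation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's or Hori's implementation: the weights, the 0-to-1 score ranges, the individual scoring formulas, and all names (DetectedObject, occluded_fraction, has_text_or_logo, other_occurrences) are hypothetical and are not specified in the claim or in the cited paragraphs.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    """Hypothetical record for one object detected in the camera image."""
    label: str
    occluded_fraction: float   # assumed: 0.0 = fully visible, 1.0 = totally obscured
    has_text_or_logo: bool     # assumed: output of a text/logo recognizer
    other_occurrences: int     # count of the same object type elsewhere in the scene


def visibility_score(obj: DetectedObject) -> float:
    # Assumed formula: less occlusion from the driver's view gives a higher score.
    return 1.0 - obj.occluded_fraction


def intuitiveness_score(obj: DetectedObject) -> float:
    # Assumed formula: recognized text or a logo makes the object easier to name.
    return 1.0 if obj.has_text_or_logo else 0.0


def uniqueness_score(obj: DetectedObject) -> float:
    # Assumed formula: the score falls as the same type appears more often.
    return 1.0 / (1 + obj.other_occurrences)


def landmark_score(obj: DetectedObject,
                   w_vis: float = 0.4,
                   w_int: float = 0.3,
                   w_uniq: float = 0.3) -> float:
    """Weighted combination of the three component scores (weights are assumptions)."""
    return (w_vis * visibility_score(obj)
            + w_int * intuitiveness_score(obj)
            + w_uniq * uniqueness_score(obj))
```

Under these assumed formulas, a fully visible, one-of-a-kind sign bearing a logo scores higher than a half-obscured tree that appears three more times in the scene.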

rank the multiple objects according to landmark score for each of the multiple objects, wherein the landmark score is indicative of an object recognition likelihood by the driver of the vehicle; and
(Hori) – “At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“The specific embodiment of sentence generation module 1243 illustrated in FIG. 12 employs a rule-based object ranker 1245 that uses a set of hand generated rules to rank the salient objects in order to output a selected salient object 1250. The rules may use to compare the set of salient objects based on their data and attributes to rank the salient objects to identify a selected salient object 1250. For example, the rules may favor dynamic objects moving the in the same direction as the vehicle. The rules may favor larger objects over smaller ones, or favor bright colors, such as red or green, over darker colors such as brown or black.” (Para 0118)
“Then the object ranker 1245 computes S for all of the salient objects and orders them from greatest to least. The selected salient object 1250 may then be determined as that salient object with the largest score S.” (Para 0123)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)
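
Illustrative sketch (not part of the cited references): the ranking and feedback behavior quoted from Hori Paras 0123-0124 can be pictured roughly as follows. The class `SalientObject`, the functions `rank_objects` and `penalize_unseen`, the example scores, and the 0.5 penalty factor are all hypothetical assumptions chosen for illustration; Hori does not disclose this code or these values.

```python
from dataclasses import dataclass


@dataclass
class SalientObject:
    name: str
    score: float  # hypothetical stand-in for the score S computed by the ranker


def rank_objects(objects):
    """Order objects from greatest to least score (cf. Hori Para 0123)."""
    return sorted(objects, key=lambda o: o.score, reverse=True)


def penalize_unseen(objects, unseen_name, penalty=0.5):
    """Return a new list in which the unseen object's score is scaled down.

    The multiplicative penalty is an assumed rule; Para 0124 only says the
    score should be reduced.
    """
    return [
        SalientObject(o.name, o.score * penalty if o.name == unseen_name else o.score)
        for o in objects
    ]


objects = [
    SalientObject("red truck", 0.9),
    SalientObject("gas station", 0.7),
    SalientObject("brown fence", 0.2),
]

# The highest-scoring object is selected first.
first_choice = rank_objects(objects)[0]

# The driver reports not seeing it, so its score is reduced and the list re-ranked.
second_choice = rank_objects(penalize_unseen(objects, first_choice.name))[0]
```

In this sketch the first selection is the highest-scoring object, and the reduced score lets an alternate object take its place on the next ranking pass.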

display, on the vehicle user interface, a highest ranked one of the multiple objects in association with a next vehicle navigation step.
(Hori) – “At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“The specific embodiment of sentence generation module 1243 illustrated in FIG. 12 employs a rule-based object ranker 1245 that uses a set of hand generated rules to rank the salient objects in order to output a selected salient object 1250. The rules may use to compare the set of salient objects based on their data and attributes to rank the salient objects to identify a selected salient object 1250. For example, the rules may favor dynamic objects moving the in the same direction as the vehicle. The rules may favor larger objects over smaller ones, or favor bright colors, such as red or green, over darker colors such as brown or black.” (Para 0118)
“The conveying of route guidance information may include highlighting the salient objects using a bounding rectangle, or other graphical elements, on a display such as, for example, an LCD display in the instrument cluster or central console. Alternately, the method of conveying may consist of generating a sentence, using, for example, a rule-based method or a machine learning-based method, which includes a set of descriptive attributes of the salient object. The generated sentence may be conveyed to the driver on the display.” (Para 0012)

Hori does not explicitly teach:
an intuitiveness score for the object…the intuitiveness score is at least based on recognition of at least one of text or a logo in the object

Hoppenot, in the same field of endeavor of landmark scoring, teaches:
an intuitiveness score for the object…the intuitiveness score is at least based on recognition of at least one of text or a logo in the object
(Hoppenot) – “There is provided an approach for using street view images, captured from a selected geographical area, to obtain one or more of a landmark saliency score and a street crossing simplicity score, with each score reflecting a degree to which a computer-implemented circuit (including image recognition engines and a visual element matching module) can recognize and identify a landmark in at least one of the street view images. In turn, a navigational plan for a selected geographical area, including travel directions, is generated with the one or more of the landmark saliency score and the street crossing simplicity score.” (Abstract)
“In one example of the first embodiment, the visual element in the at least one electronic image may be a text portion or a logo. When the visual element is a logo, the computer-based image recognition system recognizes the logo with a logo recognition engine. When the visual element is a text portion, the computer-based image recognition system recognizes the text portion with a text recognition engine.” (Para 0011)
“The landmark_score (also referred to herein as “saliency score”) represents, among other things, the capability of a POI to be easily recognized and identified as a landmark by a human. As will appear form the following, that capability can be assessed from the capacity of the system 100 to recognize, from street view images, visual elements (e.g., text and/or logos) associated with POIs.” (Para 0068)
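
Illustrative sketch (not part of the cited references): one simple way to picture a score that depends on text and logo recognition is a weighted average of two recognition confidences. The function name `intuitiveness_score`, the equal default weights, and the assumption that each recognizer reports a confidence in [0, 1] are hypothetical; neither Hoppenot nor the claims specify this formula.

```python
def intuitiveness_score(text_confidence, logo_confidence,
                        text_weight=0.5, logo_weight=0.5):
    """Combine assumed text- and logo-recognition confidences into one score.

    Both confidences are expected in [0, 1]; the result is their weighted
    average, so it also lies in [0, 1].
    """
    for value in (text_confidence, logo_confidence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
    total_weight = text_weight + logo_weight
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return (text_weight * text_confidence + logo_weight * logo_confidence) / total_weight


# Example: legible sign text but a weakly matched logo.
example_score = intuitiveness_score(0.8, 0.4)
```

Under these assumptions, an object whose text and logo are both recognized with high confidence scores near 1, and an object with neither scores near 0.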

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to modify the navigation system of Hori with Hoppenot's technique for recognizing and identifying a landmark. One of ordinary skill in the art would have been motivated to make this modification, with a reasonable expectation of success, because “There is therefore a need for a machine-based approach for identifying landmarks or points of interest, in accordance with how such landmarks or points of interest might actually be perceived by human beings, for purposes of generating navigational plans.” (Hoppenot Para 0009)


Claim 2:
Hori, in combination with the references relied upon in Claim 1, teaches those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to: record a turn action of the vehicle;
(Hori) – “FIG. 6A is a flow diagram illustrating training of a parametric function 635 of a navigation system 600A configured to generate driving instructions 640 based on a state of the vehicle 610 and a dynamic map 611, according to embodiments of the present disclosure. For example, the parametric function 635 can be implemented as a neural network with parameters included in a set of parameters 650, or as a rule-based system also involving parameters included in a set of parameters 650. Training can be performed by considering a training set 601 of training data examples including combinations of observed vehicle states 610, observed dynamic maps 611, and corresponding driving instructions 602. The training data examples may be collected by driving a vehicle in various conditions, recording observed vehicle states 610 and observed dynamic maps 611, and collecting corresponding driving instructions 602 as labels by asking humans to give examples of driving instructions that they consider relevant for guiding a driver in a situation corresponding to the current vehicle state and dynamic map.” (Para 0097)
“The training data examples may be collected by driving a vehicle in various conditions, recording observed measurements 603 of a scene, and collecting corresponding driving instructions 602” (Para 0099)

compare a location of the turn action to the next vehicle navigation step to determine a turn compliance score indicative of whether the highest ranked one of the multiple objects was an accurate guidance landmark; and
(Hori) – “FIG. 6A is a flow diagram illustrating training of a parametric function 635 of a navigation system 600A configured to generate driving instructions 640 based on a state of the vehicle 610 and a dynamic map 611, according to embodiments of the present disclosure. For example, the parametric function 635 can be implemented as a neural network with parameters included in a set of parameters 650, or as a rule-based system also involving parameters included in a set of parameters 650. Training can be performed by considering a training set 601 of training data examples including combinations of observed vehicle states 610, observed dynamic maps 611, and corresponding driving instructions 602. The training data examples may be collected by driving a vehicle in various conditions, recording observed vehicle states 610 and observed dynamic maps 611, and collecting corresponding driving instructions 602 as labels by asking humans to give examples of driving instructions that they consider relevant for guiding a driver in a situation corresponding to the current vehicle state and dynamic map.” (Para 0097)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)

update the machine learning model via supervised learning, according to the turn compliance score.
(Hori) – “FIG. 6A is a flow diagram illustrating training of a parametric function 635 of a navigation system 600A configured to generate driving instructions 640 based on a state of the vehicle 610 and a dynamic map 611, according to embodiments of the present disclosure. For example, the parametric function 635 can be implemented as a neural network with parameters included in a set of parameters 650, or as a rule-based system also involving parameters included in a set of parameters 650. Training can be performed by considering a training set 601 of training data examples including combinations of observed vehicle states 610, observed dynamic maps 611, and corresponding driving instructions 602. The training data examples may be collected by driving a vehicle in various conditions, recording observed vehicle states 610 and observed dynamic maps 611, and collecting corresponding driving instructions 602 as labels by asking humans to give examples of driving instructions that they consider relevant for guiding a driver in a situation corresponding to the current vehicle state and dynamic map.” (Para 0097)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)
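
Illustrative sketch (not part of the cited references or the claims): comparing the location of a recorded turn with the location of the next navigation step could be pictured as a distance check. The function names, the 30 m tolerance, and the linear decay are hypothetical assumptions; the claim language quoted above does not define how the turn compliance score is computed.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def turn_compliance_score(turn_location, planned_location, tolerance_m=30.0):
    """Return 1.0 for a turn at the planned step, decaying linearly to 0.0.

    Locations are (latitude, longitude) tuples in degrees. The tolerance and
    the linear decay are assumed values for illustration only.
    """
    distance = haversine_m(*turn_location, *planned_location)
    return max(0.0, 1.0 - distance / tolerance_m)


planned = (42.0, -83.0)
on_target = turn_compliance_score((42.0, -83.0), planned)
missed = turn_compliance_score((42.01, -83.0), planned)
```

A score near 1 would suggest the displayed landmark guided the driver to the intended turn; a score of 0 would suggest the turn was taken elsewhere. Such a value could serve as a training label, though how it feeds a supervised update is left open here.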

Claim 3:
Hori teaches the respective limitations of Claim 1. Hori further teaches:
wherein the vehicle control module is configured to: obtain a distance between the current location of the vehicle and the next vehicle navigation step; and
(Hori) – “At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)


process the image with a depth estimation model to generate a region of interest within the image, wherein the distance between the current location of the vehicle and the next vehicle navigation step lies within the region of interest of the image.
(Hori) – “A sentence generator with a multimodal fusion model may be constructed based on a multimodal attention method.” (Para 0053)
“When time-sequential depth images obtained by a range sensor are used for Modal data, the system 300 uses the feature extractors 411, 421 and 431 (set K=3) in the figure. The real-time multimodal information, which can include images (frames) from at least one camera, signals from a measurement system, communication data from at least one neighboring vehicle, or sound signals via at least one microphone arranged in the vehicle, is provided to the feature extractors 411, 421 and 431 in the system 300 via the camera interface 360, the range sensor interface 370, or the microphone interface 380. The feature extractors 411, 421 and 431 can extract image data, audio data and depth data, respectively” (Para 0063)
“The spatial relationship among salient objects in the dynamic map is also used in the generation of driving instructions. Spatial relationships may indicate a relative 3D position of one or more objects to another object or set of objects. The relative positions are expressed as being positioned to the left, to the right, in front of, behind, over, under, etc. Depth or range information as estimated from cameras or acquired directly from range sensors (i.e., depth sensors) such as Lidar or radar sensors are used in the determination of relative 3D positions” (Para 0106)
“At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110) 
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
Examiner Note: The multimodal fusion model corresponds to the claimed depth estimation model. Under the broadest reasonable interpretation (BRI), the region of interest may correspond to any region considered by the depth estimation model.

Claim 4:
Hori, in combination with the references relied upon in Claim 3, teaches those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to detect the multiple objects in the image only within the region of interest.
(Hori) – “A sentence generator with a multimodal fusion model may be constructed based on a multimodal attention method.” (Para 0053)
“When time-sequential depth images obtained by a range sensor are used for Modal data, the system 300 uses the feature extractors 411, 421 and 431 (set K=3) in the figure. The real-time multimodal information, which can include images (frames) from at least one camera, signals from a measurement system, communication data from at least one neighboring vehicle, or sound signals via at least one microphone arranged in the vehicle, is provided to the feature extractors 411, 421 and 431 in the system 300 via the camera interface 360, the range sensor interface 370, or the microphone interface 380. The feature extractors 411, 421 and 431 can extract image data, audio data and depth data, respectively” (Para 0063)
“The spatial relationship among salient objects in the dynamic map is also used in the generation of driving instructions. Spatial relationships may indicate a relative 3D position of one or more objects to another object or set of objects. The relative positions are expressed as being positioned to the left, to the right, in front of, behind, over, under, etc. Depth or range information as estimated from cameras or acquired directly from range sensors (i.e., depth sensors) such as Lidar or radar sensors are used in the determination of relative 3D positions” (Para 0106)
“At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110) 
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
Examiner Note: Per BRI, region of interest may correspond with any region being considered by the depth estimation model. Therefore, any object considered by the system would be within the region of interest.
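
Illustrative sketch (not part of the cited references or the claims): restricting detection to a region of interest can be pictured as keeping only the detections whose bounding boxes fall inside a rectangular region. The box format `(x1, y1, x2, y2)`, the dictionary layout, and the function names are hypothetical assumptions.

```python
def box_inside(box, roi):
    """Return True when box (x1, y1, x2, y2) lies entirely within roi."""
    x1, y1, x2, y2 = box
    rx1, ry1, rx2, ry2 = roi
    return rx1 <= x1 and ry1 <= y1 and x2 <= rx2 and y2 <= ry2


def detections_in_roi(detections, roi):
    """Keep only detections whose bounding box is fully inside the region of interest."""
    return [d for d in detections if box_inside(d["box"], roi)]


detections = [
    {"label": "gas station", "box": (120, 80, 200, 160)},
    {"label": "billboard", "box": (10, 10, 60, 40)},
]
roi = (100, 50, 400, 300)  # assumed pixel region around the upcoming turn
kept = detections_in_roi(detections, roi)
```

If the region of interest is taken to be the whole image, every detection passes this filter, which matches the broad reading set out in the Examiner Note above.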

Claim 5:
Hori, in combination with the references relied upon in Claim 3, teaches those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to: access a database to obtain multiple points of interest corresponding to the current location of the vehicle; and detect objects corresponding to the points of interest, within the region of interest of the image.
(Hori) – “The analyzing can also include localizing the vehicle in the map. In this case, the map information may include a plurality of points of interest and one or more of the salient objects are selected from the points of interest based on a result of the analyzing.” (Para 0126)

Claim 9:
Hori, in combination with the references relied upon in Claim 1, teaches those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to: divide the image into multiple segments via the machine learning model;
(Hori) – “The route guidance system processes the dynamic map using a number of methods such as rule-based methods, or machine learning-based methods, in order to identify a salient object from the set of salient objects based on the route, to use as a selected salient object in order to provide route guidance.” (Para 0011)
“The conveying of route guidance information may include highlighting the salient objects using a bounding rectangle, or other graphical elements, on a display such as, for example, an LCD display in the instrument cluster or central console. Alternately, the method of conveying may consist of generating a sentence, using, for example, a rule-based method or a machine learning-based method, which includes a set of descriptive attributes of the salient object. The generated sentence may be conveyed to the driver on the display.” (Para 0012)
“The object detection and classification module 311 may detect multiple salient objects from each image, wherein a bounding box and object class are predicted for each object.” (Para 0083)
Examiner Note: Segments are recited broadly and may correspond with any subdivision of the image, including the bounding boxes. See Fig. 1D as an example.

[Image: media_image1.png (greyscale)]


crop a bounding box for each of the multiple segments; and
(Hori) – “The conveying of route guidance information may include highlighting the salient objects using a bounding rectangle, or other graphical elements, on a display such as, for example, an LCD display in the instrument cluster or central console. Alternately, the method of conveying may consist of generating a sentence, using, for example, a rule-based method or a machine learning-based method, which includes a set of descriptive attributes of the salient object. The generated sentence may be conveyed to the driver on the display.” (Para 0012)
“The object detection and classification module 311 may detect multiple salient objects from each image, wherein a bounding box and object class are predicted for each object.” (Para 0083)
Examiner Note: See Fig. 1D.
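
Illustrative sketch (not part of the cited references or the claims): cropping a bounding box out of an image can be pictured with plain row and column slicing. The image is modeled as a list of pixel rows, and the box format `(x1, y1, x2, y2)` with exclusive upper bounds is a hypothetical assumption.

```python
def crop(image, box):
    """Return the sub-image covered by box (x1, y1, x2, y2), upper bounds exclusive."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]


# A 4x4 toy image whose pixel values are just running indices.
image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
boxes = [(1, 1, 3, 3), (0, 0, 2, 1)]
crops = [crop(image, b) for b in boxes]
```

Each crop could then be passed to a downstream model, for example one that turns the cropped region into text.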

transform each of the multiple segments into a text output via a large language machine learning model.
(Hori) – “The conveying of route guidance information may include highlighting the salient objects using a bounding rectangle, or other graphical elements, on a display such as, for example, an LCD display in the instrument cluster or central console. Alternately, the method of conveying may consist of generating a sentence, using, for example, a rule-based method or a machine learning-based method, which includes a set of descriptive attributes of the salient object. The generated sentence may be conveyed to the driver on the display.” (Para 0012)
“The object detection and classification module 311 may detect multiple salient objects from each image, wherein a bounding box and object class are predicted for each object.” (Para 0083)
Examiner Note: See Fig. 1D.

Claim 11:
Hori, in combination with the references relied upon in Claim 1, teaches those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to, for each object of the multiple objects:
[perform optical character recognition to identify text associated with the object; compare the object to a database of stored logo data to determine whether the object has a matching logo]; and obtain a confidence score from the machine learning model indicative of a detection confidence for the object.
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“The route guidance system processes the dynamic map using a number of methods such as rule-based methods, or machine learning-based methods, in order to identify a salient object from the set of salient objects based on the route, to use as a selected salient object in order to provide route guidance.” (Para 0011)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
“The analyzing may be achieved by including one or combination of steps of detecting and classifying a plurality of objects, associating a plurality of attributes to the detected objects, detecting locations of intersections in a heading direction of the vehicle based on the route, estimating a motion trajectory of a subset of the objects, and determining spatial relationships among a subset of the detected objects where the spatial relationship indicates relative positions and orientations between the objects. In some cases, the steps of detecting and classifying of the plurality of objects can be performed by use of a machine learning-based system, and further the attributes may include one of or combination of a dominant color, a depth relative to the current location of the vehicle, the object classes of the classifying may include one or more of pedestrians, vehicles, bicycles, buildings, traffic signs.” (Para 0125)
“Then the object ranker 1245 computes S for all of the salient objects and orders them from greatest to least. The selected salient object 1250 may then be determined as that salient object with the largest score S.” (Para 0123)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)
Examiner Note: Bracketed text not explicitly taught by primary reference, but is taught by non-primary reference later in the rejection.

Hori does not explicitly teach:
perform optical character recognition to identify text associated with the object; compare the object to a database of stored logo data to determine whether the object has a matching logo

Hoppenot, in the same field of endeavor of landmark scoring, teaches:
perform optical character recognition to identify text associated with the object; compare the object to a database of stored logo data to determine whether the object has a matching logo
(Hoppenot) – “There is provided an approach for using street view images, captured from a selected geographical area, to obtain one or more of a landmark saliency score and a street crossing simplicity score, with each score reflecting a degree to which a computer-implemented circuit (including image recognition engines and a visual element matching module) can recognize and identify a landmark in at least one of the street view images. In turn, a navigational plan for a selected geographical area, including travel directions, is generated with the one or more of the landmark saliency score and the street crossing simplicity score.” (Abstract)
“In one example of the first embodiment, the visual element in the at least one electronic image may be a text portion or a logo. When the visual element is a logo, the computer-based image recognition system recognizes the logo with a logo recognition engine. When the visual element is a text portion, the computer-based image recognition system recognizes the text portion with a text recognition engine.” (Para 0011)
“The landmark_score (also referred to herein as “saliency score”) represents, among other things, the capability of a POI to be easily recognized and identified as a landmark by a human. As will appear form the following, that capability can be assessed from the capacity of the system 100 to recognize, from street view images, visual elements (e.g., text and/or logos) associated with POIs.” (Para 0068)
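For illustration only, the text-or-logo recognition limitation mapped above can be sketched as a small scoring function. This is a minimal sketch under stated assumptions, not Hoppenot's method or the claimed implementation: the function name `intuitiveness_score`, the equal weights, and the lowercase logo set are all hypothetical, and the recognized text and logo label are assumed to come from upstream recognition engines that are not shown.

```python
from typing import Optional, Set


def intuitiveness_score(recognized_text: Optional[str],
                        logo_label: Optional[str],
                        known_logos: Set[str],
                        text_weight: float = 0.5,
                        logo_weight: float = 0.5) -> float:
    """Hypothetical score in [0, 1].

    Gives credit when readable text was recognized on the object and when
    the recognized logo label appears in a stored set of known logos.
    `known_logos` is assumed to hold lowercase labels.
    """
    has_text = bool(recognized_text and recognized_text.strip())
    has_logo = logo_label is not None and logo_label.lower() in known_logos
    # Booleans act as 0 or 1 in the weighted sum.
    return text_weight * has_text + logo_weight * has_logo
```

An object with both legible text and a matching logo scores highest under these assumed weights; an object with neither scores zero.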

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the navigation system of Hori with the technique for recognizing and identifying a landmark of Hoppenot. One of ordinary skill in the art would have been motivated to make these modifications, with a reasonable expectation of success, because “There is therefore a need for a machine-based approach for identifying landmarks or points of interest, in accordance with how such landmarks or points of interest might actually be perceived by human beings, for purposes of generating navigational plans.” (Hoppenot Para 0009)

Claim 13:
Hori explicitly teaches:
A method of camera-based vehicle navigation, the method comprising: obtaining the current location of the vehicle via the GPS receiver;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)

identifying a sequence of vehicle navigation steps from the current location of the vehicle to a target destination;
(Hori) – “FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
“Further, another embodiment of the present invention is based on recognition that a method for providing route guidance for a driver in a vehicle can be realized by steps of acquiring multimodal information, analyzing the acquired multimodal information, identifying one or more salient objects based on the route, and generating a sentence that provides the route guidance based on the one or more salient objects. The method may include a step of outputting the generated sentence using one or more of a speech synthesis module or a display. In this case, a route is determined based on a current location and a destination, the sentence is generated based on the acquired multimodal information and the salient objects, and the multimodal information includes information from one or more imaging devices.” (Para 0125)

capturing the image of the front view of the vehicle via at least one vehicle camera;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)

processing the image with a machine learning model to detect multiple objects in the image;
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“The route guidance system processes the dynamic map using a number of methods such as rule-based methods, or machine learning-based methods, in order to identify a salient object from the set of salient objects based on the route, to use as a selected salient object in order to provide route guidance.” (Para 0011)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
“The analyzing may be achieved by including one or combination of steps of detecting and classifying a plurality of objects, associating a plurality of attributes to the detected objects, detecting locations of intersections in a heading direction of the vehicle based on the route, estimating a motion trajectory of a subset of the objects, and determining spatial relationships among a subset of the detected objects where the spatial relationship indicates relative positions and orientations between the objects. In some cases, the steps of detecting and classifying of the plurality of objects can be performed by use of a machine learning-based system, and further the attributes may include one of or combination of a dominant color, a depth relative to the current location of the vehicle, the object classes of the classifying may include one or more of pedestrians, vehicles, bicycles, buildings, traffic signs.” (Para 0125)
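As a reading aid, the detected objects and attributes described in the passages above (class, dominant color, depth relative to the vehicle, visible portion) can be pictured as a simple data structure. The sketch below is an assumption-laden illustration, not Hori's implementation: `SalientObject`, `build_dynamic_map`, and the detection dictionary keys are hypothetical names, and the detections are assumed to come from an upstream machine learning detector that is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SalientObject:
    """One entry of a hypothetical dynamic map."""
    object_class: str              # e.g. "building", "pedestrian"
    color: str                     # dominant color
    depth_m: float                 # depth relative to the vehicle, in meters
    visible_fraction: float = 1.0  # 1.0 means fully visible to the driver


def build_dynamic_map(detections: List[Dict]) -> List[SalientObject]:
    """Convert raw detector output (assumed dict format) into map entries."""
    return [
        SalientObject(
            object_class=d["class"],
            color=d["color"],
            depth_m=d["depth_m"],
            visible_fraction=d.get("visible_fraction", 1.0),
        )
        for d in detections
    ]
```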

for each of the multiple objects, generating a landmark score according to a weighted combination of a visibility score for the object, [an intuitiveness score for the object], and a uniqueness score for the object, wherein the visibility score is determined at least in part based on whether an object is partially or totally obscured from a view of a driver, [the intuitiveness score is at least based on recognition of at least one of text or a logo in the object], and the uniqueness score is determined at least in part based on a count of how many times a type of the object occurs in other locations in a scene;
(Hori) – “In FIG. 1C, the driving instruction 105 is based on a description “the silver car turning right” of a salient object 125 in the dynamic map. In FIG. 1D, the driving instruction 105 is a warning based on a description “pedestrians in crosswalk” of a set of salient objects in the dynamic map: the pedestrians 106 and the crosswalk. These objects are important from the driver perspective because they are visible to the driver of the vehicle and they are on the next portion of the route 110 of the vehicle.” (Para 0046)
“Types of attributes that salient objects may possess in the dynamic map include: class, color, dynamics (i.e., motion), shape, size, location, appearance, and depth…and size of a visible portion of the salient object, in cases where the processor determines that only part of the object is visible from the driver perspective.” (Para 0104)
“It should be noted that in some embodiments according to the present disclosure, a salient object does not need to be currently visible or perceivable to a driver in order to be relevant from the driver perspective. For example, an ambulance that is approaching a current or future location of the vehicle may be relevant for inclusion in a driving instruction, such as “Warning: ambulance approaching from behind” or “Warning: ambulance approaching from behind the blue building on the left,” even if the ambulance is not currently visible or audible to a driver of the vehicle.” (Para 0105)
“Fo…is a number that is a measure of how common a salient object of a given class is among all salient objects, where N(O) is the number of salient objects with the same class as O and H is the total number of salient objects. It is larger if there are fewer objects of the same class as O…Fc…is a number that is a measure of how common a salient object of a given color is among all salient objects, where N(c) is the number of salient objects with the same color as o and H is the total number of salient objects. It is larger if there are fewer salient objects with the same color as salient object O.” (Para 0122)
“The attentional multimodal fusion with scene understanding technology and context-based natural language generation realizes a powerful scene-aware interaction system to more intuitively interact with users based on objects and events in the scene.” (Para 0019)
“The sentence generation module 212 performs the operation of generating the driving instruction sentence 213 given the driving route 202, the viewing volume 211 and the dynamic map 206. The sentence generation module 212 uses a parametric function to select among the set of static salient objects 207 and dynamic salient objects 208 in the dynamic map 206 a small subset of objects that are most salient for generating the driving instruction sentence 213. Broadly speaking, the most salient object will tend to be larger, and more unique in color or location so as to enable the driver to quickly observe it.” (Para 0049)
“At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“The specific embodiment of sentence generation module 1243 illustrated in FIG. 12 employs a rule-based object ranker 1245 that uses a set of hand generated rules to rank the salient objects in order to output a selected salient object 1250. The rules may use to compare the set of salient objects based on their data and attributes to rank the salient objects to identify a selected salient object 1250. For example, the rules may favor dynamic objects moving the in the same direction as the vehicle. The rules may favor larger objects over smaller ones, or favor bright colors, such as red or green, over darker colors such as brown or black.” (Para 0118)
“Then the object ranker 1245 computes S for all of the salient objects and orders them from greatest to least. The selected salient object 1250 may then be determined as that salient object with the largest score S.” (Para 0123)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)
Examiner Note: Bracketed text not explicitly taught by primary reference, but is taught by non-primary reference later in the rejection.
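To make the "weighted combination" limitation concrete, the following is a minimal sketch of how three component scores might be combined. It is illustrative only and does not reproduce the formulas of Hori or of the application: the weights, the function names, and the choice of 1 divided by the type count as a uniqueness measure are all assumptions introduced here.

```python
from collections import Counter
from typing import List, Tuple


def uniqueness_scores(object_types: List[str]) -> List[float]:
    """Hypothetical uniqueness: 1 / (number of objects of the same type in the scene)."""
    counts = Counter(object_types)
    return [1.0 / counts[t] for t in object_types]


def landmark_score(visibility: float,
                   intuitiveness: float,
                   uniqueness: float,
                   weights: Tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """Weighted sum of the three component scores (weights are assumed values)."""
    w_v, w_i, w_u = weights
    return w_v * visibility + w_i * intuitiveness + w_u * uniqueness
```

Under this sketch, a fully visible object whose type occurs only once in the scene scores higher than an equally visible object whose type is repeated, all else being equal.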

ranking the multiple objects according to landmark score for each of the multiple objects, wherein the landmark score is indicative of an object recognition likelihood by a driver of the vehicle; and
(Hori) – “At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“The specific embodiment of sentence generation module 1243 illustrated in FIG. 12 employs a rule-based object ranker 1245 that uses a set of hand generated rules to rank the salient objects in order to output a selected salient object 1250. The rules may use to compare the set of salient objects based on their data and attributes to rank the salient objects to identify a selected salient object 1250. For example, the rules may favor dynamic objects moving the in the same direction as the vehicle. The rules may favor larger objects over smaller ones, or favor bright colors, such as red or green, over darker colors such as brown or black.” (Para 0118)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)

displaying, on a vehicle user interface, a highest ranked one of the multiple objects in association with a next vehicle navigation step.
(Hori) – “At a particular instance of time, the navigation system can compare attributes of the salient objects perceived from the driver perspective to estimate a relevance score for each salient object indicating relevance of the salient object for inclusion in the generated driving instruction. The navigation system then selects a salient object for inclusion in the generated driving instruction from the set of salient objects based on a value of its relevance score. The navigation system estimates the relevance score of each salient object based on one or combination of a function of a distance of the salient object to the vehicle, a function of a distance of the salient object to a next turn on the route, and a function of a distance of the vehicle to the next turn on the route.” (Para 0110)
“The specific embodiment of sentence generation module 1243 illustrated in FIG. 12 employs a rule-based object ranker 1245 that uses a set of hand generated rules to rank the salient objects in order to output a selected salient object 1250. The rules may use to compare the set of salient objects based on their data and attributes to rank the salient objects to identify a selected salient object 1250. For example, the rules may favor dynamic objects moving the in the same direction as the vehicle. The rules may favor larger objects over smaller ones, or favor bright colors, such as red or green, over darker colors such as brown or black.” (Para 0118)
“The conveying of route guidance information may include highlighting the salient objects using a bounding rectangle, or other graphical elements, on a display such as, for example, an LCD display in the instrument cluster or central console. Alternately, the method of conveying may consist of generating a sentence, using, for example, a rule-based method or a machine learning-based method, which includes a set of descriptive attributes of the salient object. The generated sentence may be conveyed to the driver on the display.” (Para 0012)
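The ranking and selection steps mapped above, together with the driver-feedback adjustment quoted from Para 0124, can be sketched as follows. This is a hedged illustration rather than Hori's object ranker: `rank_landmarks`, the multiplicative `penalty`, and the example names are hypothetical, and the input scores are assumed to have been computed elsewhere.

```python
from typing import Dict, Iterable, List


def rank_landmarks(scores: Dict[str, float],
                   not_seen: Iterable[str] = (),
                   penalty: float = 0.5) -> List[str]:
    """Order object names from highest to lowest score.

    Objects the driver reported not seeing have their score reduced by an
    assumed multiplicative penalty before ranking.
    """
    missed = set(not_seen)
    adjusted = {
        name: (score * penalty if name in missed else score)
        for name, score in scores.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)
```

The first element of the returned list would be the object shown with the next navigation step; passing a previously referenced object in `not_seen` lets an alternate object take its place.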

Hori does not explicitly teach:
an intuitiveness score for the object…the intuitiveness score is at least based on recognition of at least one of text or a logo in the object

Hoppenot, in the same field of endeavor of landmark scoring, teaches:
an intuitiveness score for the object…the intuitiveness score is at least based on recognition of at least one of text or a logo in the object
(Hoppenot) – “There is provided an approach for using street view images, captured from a selected geographical area, to obtain one or more of a landmark saliency score and a street crossing simplicity score, with each score reflecting a degree to which a computer-implemented circuit (including image recognition engines and a visual element matching module) can recognize and identify a landmark in at least one of the street view images. In turn, a navigational plan for a selected geographical area, including travel directions, is generated with the one or more of the landmark saliency score and the street crossing simplicity score.” (Abstract)
“In one example of the first embodiment, the visual element in the at least one electronic image may be a text portion or a logo. When the visual element is a logo, the computer-based image recognition system recognizes the logo with a logo recognition engine. When the visual element is a text portion, the computer-based image recognition system recognizes the text portion with a text recognition engine.” (Para 0011)
“The landmark_score (also referred to herein as “saliency score”) represents, among other things, the capability of a POI to be easily recognized and identified as a landmark by a human. As will appear form the following, that capability can be assessed from the capacity of the system 100 to recognize, from street view images, visual elements (e.g., text and/or logos) associated with POls.” (Para 0068)

Therefore, it would have been obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the navigation system of Hori with the technique for recognizing and identifying a landmark of Hoppenot. One of ordinary skill in the art would have been motivated to make these modifications, with a reasonable expectation of success, because “There is therefore a need for a machine-based approach for identifying landmarks or points of interest, in accordance with how such landmarks or points of interest might actually be perceived by human beings, for purposes of generating navigational plans.” (Hoppenot Para 0009)

Claim 14:
Rejected for the same reasons as Claim 2

Claim 15:
Rejected for the same reasons as Claim 3

Claim 16:
Rejected for the same reasons as Claim 4

Claim 17:
Rejected for the same reasons as Claim 5



Claim(s) 6, 18 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hori (US20210247201) in view of Hoppenot (US20220316906) further in view of Mizuno (US9464914).

Claim 6:
Hori, in combination with the references relied upon in Claim 5, teaches those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to: [for each of the multiple points of interest, obtain an associated popularity score from the database, wherein the associated popularity score is indicative of a point of interest recognition level; and]
(Hori) – “FIG. 6A is a flow diagram illustrating training of a parametric function 635 of a navigation system 600A configured to generate driving instructions 640 based on a state of the vehicle 610 and a dynamic map 611, according to embodiments of the present disclosure. For example, the parametric function 635 can be implemented as a neural network with parameters included in a set of parameters 650, or as a rule-based system also involving parameters included in a set of parameters 650. Training can be performed by considering a training set 601 of training data examples including combinations of observed vehicle states 610, observed dynamic maps 611, and corresponding driving instructions 602. The training data examples may be collected by driving a vehicle in various conditions, recording observed vehicle states 610 and observed dynamic maps 611, and collecting corresponding driving instructions 602 as labels by asking humans to give examples of driving instructions that they consider relevant for guiding a driver in a situation corresponding to the current vehicle state and dynamic map.” (Para 0097)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)
Examiner Note: Bracketed text not explicitly taught by primary reference, but is taught by non-primary reference later in the rejection.

supply [each associated popularity score] to the machine learning model to facilitate ranking of the objects corresponding to the points of interest.
(Hori) – “FIG. 6A is a flow diagram illustrating training of a parametric function 635 of a navigation system 600A configured to generate driving instructions 640 based on a state of the vehicle 610 and a dynamic map 611, according to embodiments of the present disclosure. For example, the parametric function 635 can be implemented as a neural network with parameters included in a set of parameters 650, or as a rule-based system also involving parameters included in a set of parameters 650. Training can be performed by considering a training set 601 of training data examples including combinations of observed vehicle states 610, observed dynamic maps 611, and corresponding driving instructions 602. The training data examples may be collected by driving a vehicle in various conditions, recording observed vehicle states 610 and observed dynamic maps 611, and collecting corresponding driving instructions 602 as labels by asking humans to give examples of driving instructions that they consider relevant for guiding a driver in a situation corresponding to the current vehicle state and dynamic map.” (Para 0097)
“The dialog system 1262 provides an output used to adjust the object ranker's 1245 function. For example, a previous driving instruction using a first salient object was rendered to the driver, but the driver did not see the referenced salient object. The result is that the driver indicates by speech that they did not see the salient object. Therefore, the object ranker should reduce the score of the previous salient object in order to select an alternate salient object as the selected salient object 1250.” (Para 0124)

Hori does not explicitly teach:
for each of the multiple points of interest, obtain an associated popularity score from the database, wherein the associated popularity score is indicative of a point of interest recognition level; and… each associated popularity score

Mizuno, in the same field of endeavor of landmark navigation, teaches:
for each of the multiple points of interest, obtain an associated popularity score from the database, wherein the associated popularity score is indicative of a point of interest recognition level; and… each associated popularity score
(Mizuno) – “Referring now to FIG. 1B, a landmark navigation program 108 may include a receiving module 118A, extraction module 118B, scoring module 118C, and modification module 118D. Receiving module 118A may receive a map route indicating departure and destination points. Receiving module 118A may receive the map route from an external device, user, or even an internal navigational unit. Extraction module 118B may extract information regarding different landmark candidates near the map route. Extraction module 118B may extract these information from an external data bank or an internal data table compiled for the extraction purposes. Scoring module 118C may assign a recognizability score to the landmark candidates based on a variety of factors (non-limiting examples of the factors includes user's preferences, distance to the map route, logo characteristic . . . ). Modification module 118D may select a landmark among the landmark candidates based on their respective recognizability scores and modify the map route based on the selected landmark. In an embodiment, detailed in FIG. 3, receiving module receives map route 310. Map route 310 is a route between departure 300 and destination 312 which includes landmarks 302, 304, 306, 314, and 308.” (Col 2 Ln 62 – Col 3 Ln 16)
Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the navigation system of Hori with the landmark navigation of Mizuno. One of ordinary skill in the art would have been motivated to make these modifications, with a reasonable expectation of success, for the purpose of “assigning a recognizability score to the plurality of landmark candidates.” (Mizuno Col 1 Ln 15-53)
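Illustrative note: the limitation mapped above (obtain a popularity score per point of interest from a database, then use the scores to pick a landmark) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of Hori, Mizuno, or the application. `POPULARITY_DB`, `rank_points_of_interest`, and `penalize` are hypothetical names, the score values are made up, and the multiplicative penalty is only one possible reading of Hori's "reduce the score" feedback step (Para 0124).

```python
# Hypothetical in-memory stand-in for the claimed database of popularity scores.
# The score values are invented for illustration only.
POPULARITY_DB = {"gas station": 0.9, "bakery": 0.4, "obelisk": 0.7}


def rank_points_of_interest(names, db, default_score=0.0):
    """Return point-of-interest names sorted from highest to lowest score.

    Names missing from the database fall back to `default_score`.
    """
    return sorted(names, key=lambda name: db.get(name, default_score), reverse=True)


def penalize(db, name, factor=0.5):
    """Lower a score after the driver reports not seeing the landmark.

    Scaling by `factor` is an assumption; the cited references do not
    specify how the score is reduced.
    """
    db[name] = db.get(name, 0.0) * factor


if __name__ == "__main__":
    db = dict(POPULARITY_DB)  # work on a copy so the defaults stay untouched
    candidates = ["bakery", "gas station", "obelisk"]
    print(rank_points_of_interest(candidates, db))
    penalize(db, "gas station")  # driver said they did not see it
    print(rank_points_of_interest(candidates, db))
```

After the penalty, the previously top-ranked landmark drops below the next candidate, which is the behavior the quoted Hori passage describes for selecting an alternate salient object.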

Claim 18:
Rejected for the same reasons as Claim 6

Claim(s) 7-8, 19-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hori (US20210247201) in view of Hoppenot (US20220316906) further in view of Yu (“Texture-suppressed Visual Attention Model for Grain Insects Detection”).

Claim 7:
Hori in combination with the references relied upon in Claim 1 teach those respective limitations. Hori further teaches:
wherein the vehicle control module is configured to: process the image go generate a saliency map, the saliency map indicating a saliency level [for each pixel of the image]; and
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
Examiner Note: Bracketed text not explicitly taught by primary reference, but is taught by non-primary reference later in the rejection.

supply the saliency map to the machine learning model to facilitate ranking of the multiple detected objects.
(Hori) – “The route guidance system of the present invention may receive information from multiple sources, including a static map, the planned route, the vehicle's present position as determined by GPS or other methods, and real-time sensor information from a range of sensors including, but not limited to, one or more cameras, one or more microphones, and one or more range detectors including radars and LIDAR. The real-time sensor information is processed by a processor that is able to detect, from the real-time sensor information, a set of salient static and dynamic objects in the vicinity of the vehicle as well as a set of object attributes that may include, for example: each object's class, such as car, truck, building; and the object's color, size, and location. For dynamic objects, the processor may also determine the dynamic object's trajectory…The set of salient objects as well as their set of attributes is hereafter referred as the dynamic map.” (Para 0010)
“The route guidance system processes the dynamic map using a number of methods such as rule-based methods, or machine learning-based methods, in order to identify a salient object from the set of salient objects based on the route, to use as a selected salient object in order to provide route guidance.” (Para 0011)
“FIG. 1A is a block diagram of a navigation system illustrating features of some embodiments. A set of salient objects in a dynamic map may be identified and described based on sensing information perceived by a measurement system 160 of the vehicle, which includes information from one or multiple modalities such as audio information from a microphone 161, visual information from a camera 162, depth information from a range sensor (i.e., depth sensor) such as LiDAR 163, and localization information from a global positioning system (GPS) 164. The system outputs a driving instruction 105 based on a description of one or multiple salient objects from the set of salient objects. In some embodiments, the processor generates the driving instruction 105 by submitting the measurements from the measurement system 160 to a parametric function 170 that has been trained to generate the driving instruction from the measurements. In other embodiments, the multimodal sensing information obtained by the measurement system is used in the determination of a state of the vehicle (which we also refer to in this document as the vehicle state) and the dynamic map. The processor is configured to submit the state of the vehicle and the dynamic map to a parametric function 170 that is configured to generate the driving instruction 105 based on a description of a salient object in the dynamic map derived from a driver perspective specified by the state of the vehicle.” (Para 0043)
“The analyzing may be achieved by including one or combination of steps of detecting and classifying a plurality of objects, associating a plurality of attributes to the detected objects, detecting locations of intersections in a heading direction of the vehicle based on the route, estimating a motion trajectory of a subset of the objects, and determining spatial relationships among a subset of the detected objects where the spatial relationship indicates relative positions and orientations between the objects. In some cases, the steps of detecting and classifying of the plurality of objects can be performed by use of a machine learning-based system, and further the attributes may include one of or combination of a dominant color, a depth relative to the current location of the vehicle, the object classes of the classifying may include one or more of pedestrians, vehicles, bicycles, buildings, traffic signs.” (Para 0125)

Hori does not explicitly teach:
for each pixel of the image

Yu in the same field of endeavor of salience mapping, teaches:
for each pixel of the image
(Yu) – “For each pixel in the image, we get 3 color features according to (1)-(3) and 3 Gabor magnitude features using (5). The saliency map can be computed for each feature in the feature set.” (Sec III.B)

Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the navigation system of Hori with the saliency map model of Yu. One of ordinary skill in the art would have been motivated to make these modifications, with a reasonable expectation of success, because “the proposed model can achieve higher precision-recall rate against other models for salient object detection.” (Yu Abstract)
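Illustrative note: a saliency map that assigns "a saliency level for each pixel of the image" can be sketched as below. This is a minimal sketch assuming a simple global-contrast measure (absolute deviation from the mean feature value); it is not the texture-suppressed model described in Yu, and `saliency_map` is a hypothetical name.

```python
def saliency_map(feature):
    """Compute a per-pixel saliency level for one feature channel.

    `feature` is a 2D list with one numeric value per pixel. Saliency is
    taken as the absolute deviation from the global mean, scaled to [0, 1].
    The global-contrast measure is an assumption made for illustration.
    """
    pixels = [value for row in feature for value in row]
    mean = sum(pixels) / len(pixels)
    raw = [[abs(value - mean) for value in row] for row in feature]
    peak = max(max(row) for row in raw)
    if peak == 0:
        # A perfectly uniform image has no salient pixels.
        return [[0.0 for _ in row] for row in raw]
    return [[value / peak for value in row] for row in raw]


if __name__ == "__main__":
    # One bright pixel on a dark background stands out as most salient.
    channel = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    for row in saliency_map(channel):
        print(row)
```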

Claim 8:
Hori in combination with the references relied upon in claim 7 teach those respective limitations. Hori does not explicitly teach the following limitations in full. However, Yu further teaches:
wherein the vehicle control module is configured to generate the saliency map by: generating a red color saliency map which highlights pixels in the image corresponding to a red color;
(Yu) – “Therefore, we should highlight the salient object while suppressing the texture of the grains. Except the color feature, we exploit the orientation and texture of Gabor to produce the saliency map…For each pixel in the image, we get 3 color features according to (1)-(3) and 3 Gabor magnitude features using (5). The saliency map can be computed for each feature in the feature set.” (Sec III.B)
“Some color spaces (like CIE Lab, YCbCr) are better to mimic the color difference of the eye. The original RGB color image can transform into unrelated components of luminance and colors. Li [12] utilizes I, red-green, blue-yellow feature maps to form a hyper-complex matrix. Inspired by [12], we convert the RGB color space into some new and simpler uncorrelated channels.” (Sec III.A)

generating a blue color saliency map which highlights pixels in the image corresponding to a blue color;
(Yu) – “Therefore, we should highlight the salient object while suppressing the texture of the grains. Except the color feature, we exploit the orientation and texture of Gabor to produce the saliency map…For each pixel in the image, we get 3 color features according to (1)-(3) and 3 Gabor magnitude features using (5). The saliency map can be computed for each feature in the feature set.” (Sec III.B)
“Some color spaces (like CIE Lab, YCbCr) are better to mimic the color difference of the eye. The original RGB color image can transform into unrelated components of luminance and colors. Li [12] utilizes I, red-green, blue-yellow feature maps to form a hyper-complex matrix. Inspired by [12], we convert the RGB color space into some new and simpler uncorrelated channels.” (Sec III.A)

generating an intensity saliency map indicating an intensity level for each pixel in the image;
(Yu) – “Principal Component Analysis can be used to transform the multi dimension Gabor features to a 1D magnitude intensity value for each pixel. Then, reshape the intensity value to 2D image…For each pixel in the image, we get 3 color features according to (1)-(3) and 3 Gabor magnitude features using (5). The saliency map can be computed for each feature in the feature set.” (Sec III.B)

generating a gabor saliency map corresponding to detection of straight lines in the image; and
(Yu) – “For each pixel in the image, we get 3 color features according to (1)-(3) and 3 Gabor magnitude features using (5). The saliency map can be computed for each feature in the feature set.” (Sec III.B)

combining the red color saliency map, the blue color saliency map, the intensity saliency map and the gabor saliency map.
(Yu) – “Principal Component Analysis can be used to transform the multi dimension Gabor features to a 1D magnitude intensity value for each pixel. Then, reshape the intensity value to 2D image…For each pixel in the image, we get 3 color features according to (1)-(3) and 3 Gabor magnitude features using (5). The saliency map can be computed for each feature in the feature set.” (Sec III.B)


Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the navigation system of Hori with the saliency map model of Yu. One of ordinary skill in the art would have been motivated to make these modifications, with a reasonable expectation of success, because “the proposed model can achieve higher precision-recall rate against other models for salient object detection.” (Yu Abstract)
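Illustrative note: the Claim 8 steps (red map, blue map, intensity map, Gabor map, then a combination) can be sketched as follows. This is a sketch under stated assumptions, not the method of Yu or of the application. The color-opponency formulas and the weighted average are assumptions, the Gabor map is taken as a precomputed input rather than computed here, and all function names are hypothetical.

```python
def red_map(img):
    """Highlight pixels that are more red than green/blue (assumed formula)."""
    return [[max(0.0, r - (g + b) / 2) for (r, g, b) in row] for row in img]


def blue_map(img):
    """Highlight pixels that are more blue than red/green (assumed formula)."""
    return [[max(0.0, b - (r + g) / 2) for (r, g, b) in row] for row in img]


def intensity_map(img):
    """Per-pixel intensity as the mean of the three color channels."""
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in img]


def combine(maps, weights=None):
    """Combine same-sized per-pixel maps by a weighted sum.

    Equal weights are used by default; the weighting scheme is an assumption.
    """
    weights = weights or [1.0 / len(maps)] * len(maps)
    height, width = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[y][x] for w, m in zip(weights, maps)) for x in range(width)]
        for y in range(height)
    ]


if __name__ == "__main__":
    image = [[(255, 0, 0), (0, 0, 255)]]  # one red pixel, one blue pixel
    gabor = [[0.0, 0.0]]  # placeholder for a precomputed Gabor response map
    maps = [red_map(image), blue_map(image), intensity_map(image), gabor]
    print(combine(maps))
```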

Claim 19:
Rejected for the same reasons as Claim 7

Claim 20:
Rejected for the same reasons as Claim 8


Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hori (US20210247201) in view of Hoppenot (US20220316906) further in view of Mayster (US9464914).

Claim 10:
Hori in combination with the references relied upon in Claim 9 teach those respective limitations. Hori does not explicitly teach the following limitations in full. However, Mayster, in the same field of endeavor of navigation, teaches:
wherein the vehicle control module is configured to: generate a histogram of multiple terms according to the text output corresponding to each of the multiple segments; and
(Mayster) – “Satisfaction of the one or more entropic criteria can be based, for example, on a feature being infrequent (e.g., the only tree in an area or the only high-rise building in an area). Thus, in one example, clustering or other algorithmic techniques can be used to determine a rarity or infrequency associated with each feature, which can then be used to guide selection of features for use as landmarks. As one example, for each location, an area around the location can be analyzed to identify which features associated with the location are most rare (e.g., a histogram of semantic tags in an area around a location might reveal that an obelisk only occurs once while a mailbox occurs sixty times, thereby indicating that the obelisk would be better suited for use as a landmark). By way of further example, satisfaction of the one or more entropic criteria can include the distinctiveness of various characteristics of a feature with respect to other similar features in the area (e.g., although they may all be “buildings,” a small house on one corner of a four sided intersection will contrast with high-rise buildings on the other three corners).” (Para 0033)
“By way of example, clustering or other algorithmic techniques can be used to determine a rarity or infrequency associated with each feature, which can then be used to guide selection of features for use as landmarks. As one example, for each location, an area around the location can be analyzed to identify which features associated with the location are most rare (e.g., a histogram of semantic tags in an area around a vantage point location might determine that a monument with a rider mounted on a horse occurs once while a set of traffic lights occurs twenty times, thereby indicating that the monument has higher entropy value and is a better choice for use as a landmark).” (Para 0146)

select one of the multiple terms having a lowest frequency for display on the vehicle user interface in association with a next one of the sequence of vehicle navigation steps.
(Mayster) – “Satisfaction of the one or more entropic criteria can be based, for example, on a feature being infrequent (e.g., the only tree in an area or the only high-rise building in an area). Thus, in one example, clustering or other algorithmic techniques can be used to determine a rarity or infrequency associated with each feature, which can then be used to guide selection of features for use as landmarks. As one example, for each location, an area around the location can be analyzed to identify which features associated with the location are most rare (e.g., a histogram of semantic tags in an area around a location might reveal that an obelisk only occurs once while a mailbox occurs sixty times, thereby indicating that the obelisk would be better suited for use as a landmark). By way of further example, satisfaction of the one or more entropic criteria can include the distinctiveness of various characteristics of a feature with respect to other similar features in the area (e.g., although they may all be “buildings,” a small house on one corner of a four sided intersection will contrast with high-rise buildings on the other three corners).” (Para 0033)
“By way of example, clustering or other algorithmic techniques can be used to determine a rarity or infrequency associated with each feature, which can then be used to guide selection of features for use as landmarks. As one example, for each location, an area around the location can be analyzed to identify which features associated with the location are most rare (e.g., a histogram of semantic tags in an area around a vantage point location might determine that a monument with a rider mounted on a horse occurs once while a set of traffic lights occurs twenty times, thereby indicating that the monument has higher entropy value and is a better choice for use as a landmark).” (Para 0146)
“The computing device can then use the landmarks within the area to provide a set of instructions (e.g., audible instructions generated through a vehicle loudspeaker and/or textual or graphical instructions provided on a display) to a user to assist in navigation.” (Para 0025)

Therefore, it would be obvious to one of ordinary skill in the art, before the effective filing date of the claimed invention, to have modified the navigation system of Hori with the navigation and geocoding of Mayster. One of ordinary skill in the art would have been motivated to make these modifications, with a reasonable expectation of success, because “the disclosed technology provides a variety of improvements in navigation and geocoding.” (Mayster Para 0025)
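Illustrative note: the Claim 10 steps (build a histogram of terms from the per-segment text output, then select the lowest-frequency term) can be sketched with the standard library. This is a minimal sketch, not the implementation of Mayster or of the application. `rarest_term` is a hypothetical name, and whitespace tokenization and first-seen tie-breaking are assumptions.

```python
from collections import Counter


def rarest_term(segment_texts):
    """Return (term, histogram) where term has the lowest frequency.

    `segment_texts` holds one text description per image segment. Terms are
    split on whitespace and lower-cased, which is an assumed tokenization.
    Ties resolve to the term seen first, because Counter keeps insertion
    order and min() returns the first minimal element.
    """
    histogram = Counter(
        term for text in segment_texts for term in text.lower().split()
    )
    return min(histogram, key=histogram.get), histogram


if __name__ == "__main__":
    texts = ["mailbox", "mailbox", "obelisk", "mailbox"]
    term, histogram = rarest_term(texts)
    print(term, dict(histogram))
```

With the obelisk and mailbox example quoted from Mayster (Para 0033), the rare term is the one selected as the landmark.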

Claim 12: Canceled

Response to Arguments
The Claim Objection of Claim 19 mailed 6/16/2025 has been withdrawn because the “amendment” and “remarks” filed 09/15/2025 satisfactorily overcome this objection. However, the discrepancy remains in Claim 7. As such Claim 7 remains objected to.

Applicant's arguments with respect to the 35 U.S.C. 102 and 103 rejections mailed 6/16/2025 have been fully considered but they are not persuasive. The rejection has been updated to reflect the amended claim language. Specifically, all claims are now rejected under 35 U.S.C. 103 at least over Hori in view of Hoppenot as necessitated by amendment. No claims are currently rejected under 35 U.S.C. 102. Examiner maintains that Hoppenot resolves any alleged deficiencies of the previously applied prior art, as fully evidenced in the updated rejection rationale necessitated by amendment.

As such, all pending claims remain rejected under 35 U.S.C. 103.



Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
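Illustrative note: the two outer bounds stated above (three months and six months from the mailing date) can be computed as sketched below. This is a rough sketch, not legal advice. It does not model the advisory-action provision, extension fees, or any weekend or holiday adjustment. `add_months` and `reply_window` are hypothetical names, and clamping to the last day of a shorter month is an assumption.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return `start` shifted forward by whole calendar months.

    If the target month is shorter, the day is clamped to its last day
    (an assumption made for this sketch).
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_window(mailing_date):
    """Return (shortened_period_end, six_month_limit) for a mailing date."""
    return add_months(mailing_date, 3), add_months(mailing_date, 6)


if __name__ == "__main__":
    # Mailing date taken from the prosecution timeline entry for the final rejection.
    shortened, limit = reply_window(date(2025, 12, 5))
    print(shortened.isoformat(), limit.isoformat())
```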

Any inquiry concerning this communication or earlier communications from the examiner should be directed to DAVID RUBEN PEDERSEN whose telephone number is (571) 272-9696. The examiner can normally be reached M-Th, 07:00-16:00 Eastern.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Vivek Koppikar, can be reached at 571-272-5109. The fax number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/DAVID RUBEN PEDERSEN/ Examiner, Art Unit 3667
/VIVEK D KOPPIKAR/ Supervisory Patent Examiner
Art Unit 3667
December 2, 2025
Prosecution Timeline

Sep 10, 2025: Examiner Interview Summary
Sep 10, 2025: Applicant Interview (Telephonic)
Sep 15, 2025: Response Filed
Dec 05, 2025: Final Rejection mailed (§102, §103)
Jan 16, 2026: Interview Requested
Jan 22, 2026: Applicant Interview (Telephonic)
Jan 22, 2026: Examiner Interview Summary
Feb 04, 2026: Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

17/877,855 (Patent 12637081): SYSTEM AND METHOD OF CONTROLLING AXLE HOP IN A VEHICLE. 3y 10m to grant; granted May 26, 2026.
18/487,633 (Patent 12620272): INFORMATION PROCESSING APPARATUS. 2y 6m to grant; granted May 05, 2026.
18/645,882 (Patent 12597267): METHOD AND SYSTEM FOR MULTI-OBJECT TRACKING AND NAVIGATION WITHOUT PRE-SEQUENCING. 1y 11m to grant; granted Apr 07, 2026.
17/877,569 (Patent 12589756): ASYMMETRIC FAILSAFE SYSTEM ARCHITECTURE. 3y 8m to grant; granted Mar 31, 2026.
18/455,705 (Patent 12590813): NAVIGATION INTERFACE DISPLAY METHOD AND APPARATUS, TERMINAL, AND STORAGE MEDIUM. 2y 7m to grant; granted Mar 31, 2026.
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 2-3
Grant Probability: 56%
With Interview (+50.0%): 99%
Median Time to Grant: 3y 0m (~5m remaining)
PTA Risk: Moderate
Based on 105 resolved cases by this examiner. Grant probability derived from career allowance rate.