Last updated: May 29, 2026
Application No. 18/121,551
SYSTEM AND METHOD FOR GENERATING A FUSED ENVIRONMENT REPRESENTATION FOR A VEHICLE

Final Rejection §103
Filed: Mar 14, 2023
Examiner: PATEL, PINALBEN V
Art Unit: 2673
Tech Center: 2600 (Communications)
Assignee: Mercedes-Benz Group AG
OA Round: 2 (Final)
Interview: Optional

The interview lift (+9.7%) is below the 15.0% threshold, so a written response is recommended.
Based on 551 resolved cases, 2023–2026
Examiner Intelligence

Examiner: PATEL, PINALBEN V
Career Allowance Rate: 89% (490 granted / 551 resolved), above average, +26.9% vs TC avg
Interview Lift: +9.7% (moderate), based on resolved cases with interview
Avg Prosecution (typical timeline): 2y 3m
Currently pending: 17
Total Applications (career history, across all art units): 570
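
The headline figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the decision rule (recommend an interview only when the lift reaches the threshold) is an assumption inferred from the recommendation text, not a documented formula.

```python
# Figures copied from the dashboard above.
GRANTED = 490
RESOLVED = 551
INTERVIEW_LIFT_PCT = 9.7
LIFT_THRESHOLD_PCT = 15.0  # cutoff quoted in the recommendation text


def allowance_rate_pct(granted: int, resolved: int) -> float:
    """Share of resolved cases that ended in a grant, in percent."""
    return 100.0 * granted / resolved


def recommend(lift_pct: float, threshold_pct: float) -> str:
    """Assumed rule: suggest an interview only if the lift reaches the threshold."""
    return "interview" if lift_pct >= threshold_pct else "written response"


rate = allowance_rate_pct(GRANTED, RESOLVED)
print(f"Allowance rate: {rate:.1f}%")
print("Recommendation:", recommend(INTERVIEW_LIFT_PCT, LIFT_THRESHOLD_PCT))
```

The computed rate rounds to the 89% shown in the profile, and the 9.7% lift falls on the written-response side of the 15.0% cutoff.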
Statute-Specific Performance

§101: 2.1% (-37.9% vs TC avg)
§103: 69.9% (+29.9% vs TC avg)
§102: 2.3% (-37.7% vs TC avg)
§112: 17.4% (-22.6% vs TC avg)
Tech Center averages are estimates. Based on career data from 551 resolved cases.
Office Action

§103: Claims 1-20

DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments
Applicant’s arguments with respect to claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

The argued feature, in which traditional sensor data and learned data are combined to create a hybrid BEV grid map of the surrounding environment, is disclosed by the newly cited prior art reference Cacas et al. (US Pub No. 20190382007 A1). See the sections below.

Therefore, the rejection of the claims is maintained.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Park et al. (US Pub No. 20210019521 A1, as provided) in view of Ditty et al. (US Pub No. 20190258251 A1, as provided) and further in view of Cacas et al. (US Pub No. 20190382007 A1). 

Regarding Claim 1,
Park discloses A computing system for automated or assisted driving, the computing system comprising: one or more processors; a memory storing instructions that, when executed by the one or more processors, cause the computing system to:  (Park, [0113-0115], discloses control module 11, the input module 13, and the output module 15 may include a controller. The controller may process and calculate various information and control components of the modules. The controller may be physically provided in the form of an electronic circuit which processes an electrical signal. The modules may physically include only a single controller but, to the contrary, may include a plurality of controllers. As an example, the controller may be one or more processors installed in one computing means. As another example, the controller may be provided as processors which are installed on a server and a terminal physically separate from each other and cooperate with each other through communication)

receive, by a traditional sensor data processing module and a learned sensor data processing module, raw sensor data from a sensor suite of a vehicle, the sensor suite comprising a plurality of sensor types; (Park, [0120-0121], Fig. 5, discloses a ship sensor system according to an exemplary embodiment of the present invention. Referring to FIG. 5, during sailing of a ship, an obstacle detection sensor, such as radar, light detection and ranging (LiDAR), or an ultrasonic detector, and an automatic identification system (AIS) are used in general; it is possible to detect an obstacle using a camera. Referring to FIG. 5, examples of a camera include a monocular camera, a binocular camera, an infrared (IR) camera, and a time-of-flight (TOF) camera but are not limited thereto; a sensor suite with multiple sensor types is disclosed, including traditional and learned sensors, to process raw data)

generate, by the traditional sensor data processing module, a traditional bird's eye view (BEV) grid map or volume based on the raw sensor data; (Park, [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; three-dimensional volume data is captured by traditional sensors (lidar or radar) and represented as volume grid map)

generate, by the learned sensor data processing module, a learned BEV grid map or volume based on the raw sensor data; (Park, [0002], [0056], [0347-0348], discloses a situation awareness method and device using image segmentation and more particularly, to a situation awareness method and device using a neural network which performs image segmentation; a model which converts an image into a bird's eye view may be provided. For example, the model may receive a perspective-view image and output a bird's-eye view image; the model which converts an image into a bird's eye view may be trained with training data including an input perspective-view image and an image obtained by converting the input image into a bird's eye view. For example, the model may be trained on the basis of output data outputted from the model receiving the input perspective-view image and the image obtained by converting the input image into a bird's eye view; an autonomous navigation method of a ship based on a marine image and a neural network may include an operation of obtaining a training image including a plurality of pixel values and labeling data corresponding to the training image and including a plurality of labeling values which are determined by reflecting type information and distance information of a training obstacle included in the training image, an operation in which the neural network receives the training image and outputs output data including a plurality of output values corresponding to the labeling values, an operation of training the neural network using an error function in which differences between the labeling values and the output values are taken into consideration, an operation of obtaining the marine image from a camera installed on the ship, an operation of obtaining type information and distance information of an obstacle included in the marine image using the neural network, the obstacle including a plurality of pixel values, an operation of obtaining direction 
information of the obstacle included in the marine image on the basis of locations of the pixel values of the obstacle on the marine image, an operation of obtaining a location of the obstacle included in the marine map on an obstacle map using the distance information and the direction information of the obstacle included in the marine image, an operation of generating the obstacle map using the location on the obstacle map, an operation of generating a following path followed by the ship using the obstacle map and ship status information including location information and position information of the ship, and an operation of generating a control signal including a propeller control signal and a heading control signal using the following path so that the ship may follow the following path; view of surroundings is captured by learned camera image sensors and converted to bird’s eye view)

combine the learned BEV grid map or volume and the traditional BEV grid map or volume (Park, [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; traditional camera sensor image data and three-dimensional image data is captured by learned sensors (lidar or radar) and represented as volume grid map are combined (fused) together to obtain representation of environment around the ship (vehicle)) and 

process the hybrid BEV representation of the surrounding environment to derive a fused representation of the surrounding environment of the vehicle. (Park, [0097], discloses that, for autonomous navigation of a moving object according to an exemplary embodiment of the present invention, the moving object may perform situation awareness, path planning, and path-following operations. The moving object may determine surrounding obstacles and/or a navigable region through situation awareness, generate a path on the basis of the surrounding obstacles and/or navigable region, and then travel by itself along the generated path; the environment surrounding the ship is processed and represented by data captured by sensors and fused together)

Park does not explicitly disclose a vehicle.
Ditty discloses a vehicle (Ditty, [0033], [0124], discloses two distinctly different approaches have been proposed for autonomous vehicles. The first approach, computer vision, is the process of automatically perceiving, analyzing, understanding, and/or interpreting visual data. Such visual data may include any combination of videos, images, real-time or near real-time data captured by any type of camera or video recording device. Computer vision applications implement computer vision algorithms to solve high-level problems. For example, an ADAS system can implement real-time object detection algorithms to detect pedestrians/bikes, recognize traffic signs, and/or issue lane departure warnings based on visual data captured by an in-vehicle camera or video recording device; Controller (100) provides autonomous driving outputs in response to an array of sensor inputs including, for example: one or more ultrasonic sensors (66), one or more RADAR sensors (68), one or more Light Detection and Ranging (LIDAR) sensors (70), one or more surround cameras (72) (typically such cameras are located at various places on vehicle body (52) to image areas all around the vehicle body), one or more stereo cameras (74) (in preferred embodiments, at least one such stereo camera faces forward to provide depth-perception for object detection and object recognition in the vehicle path), one or more infrared cameras (75), GPS unit (76) that provides location coordinates, a steering sensor (78) that detects the steering angle, speed sensors (80) (one for each of the wheels (54)), an inertial sensor or inertial measurement unit (IMU) (82) that monitors movement of vehicle body (52) (this sensor can be for example an accelerometer(s) and/or a gyro-sensor(s) and/or a magnetic compass(es)), tire vibration sensors (85), and microphones (102) placed around and inside the vehicle. 
Other sensors may be used, as is known to persons of ordinary skill in the art; discloses visual images and data captured with array of sensors (suite of sensors) surrounding the environment around the vehicle to navigate)

Park discloses the claimed invention except for the vehicle. Ditty teaches that it is known to capture sensor data around a vehicle and to process that data for obstacle and/or lane and traffic detection for safe autonomous navigation of the vehicle. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify Park, which captures data using an array of sensors including radar or LiDAR and camera image data and combines them for object detection and navigation of a ship, by substituting a road vehicle for the ship, as taught by Ditty, in order to provide obstacle detection, traffic pattern detection, lane keeping, and lane centering for safe navigation of an autonomous vehicle.

	The combination of Park and Ditty does not explicitly disclose generate a hybrid BEV representation of a surrounding environment of the vehicle by combining the separate learned BEV grid map or volume and the traditional BEV grid map or volume along spatial dimensions; 
	Cacas discloses generate a hybrid BEV representation of a surrounding environment of the vehicle by combining the separate learned BEV grid map or volume and the traditional BEV grid map or volume along spatial dimensions; (Cacas, [0089], Fig. 3, discloses a graphical representation 260 of example sensor data (e.g., sensor data 204) provided to a machine-learned intent model 210 in the form of voxelized LIDAR data in birds-eye view form according to example embodiments of the present disclosure. With more particular reference to the graphical representation 260 of a 3D point cloud, it should be appreciated that standard convolutional neural networks (CNNs) perform discrete convolutions, assuming a grid structured input. Sensor data (e.g., sensor data 204) corresponding to point clouds in bird's eye view (BEV) can be represented as a 3D tensor, treating height as a channel dimension. This input parametrization has several key advantages: (i) computation efficiency due to dimensionality reduction (made possible as vehicles drive on the ground), (ii) non-overlapping targets (contrary to camera-view representations, where objects can overlap), (iii) preservation of the metric space (undistorted view) that eases the creation of priors regarding vehicles' sizes, and (iv) this representation also facilitates the fusion of LIDAR and map features as both are defined in bird's eye view. Multiple consecutive LIDAR sweeps (corrected by ego-motion) can be utilized as this can be helpful to accurately estimate both intention and motion forecasting. Height and time dimensions can be stacked together into the channel dimension as this allows the use of 2D convolutions to fuse time information; traditional sensor data and learned sensor data (autonomous grid map) are combined to create hybrid BEV grid map of the surrounding environment)
	
The combination of Park and Ditty discloses the claimed invention except for generating the hybrid BEV representation by combining the separate learned and traditional BEV grid maps or volumes along spatial dimensions. Cacas teaches that it is known to capture and learn specific features from the sensor data around the vehicle and to process the data for obstacle and/or lane and traffic detection for safe autonomous navigation of the vehicle. It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to modify the combination of Park and Ditty to fuse the BEV grid map derived from traditional sensor data, whether two- or three-dimensional or in any other form, with the learned feature-map data to create hybrid BEV map data of the surroundings, as taught by Cacas, in order to navigate accurately in autonomous driving applications.
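
For readers less familiar with the representation discussed in the Cacas citation, the following minimal sketch shows a point cloud voxelized into a bird's-eye-view tensor with height treated as the channel dimension, consecutive sweeps stacked into channels, and the result concatenated with a second, spatially aligned BEV map. It is an illustration under stated assumptions, not the method of Park, Ditty, Cacas, or the application: the function names, grid extents, and cell size are hypothetical, the "learned" map is a zero-filled placeholder, and reading "combining along spatial dimensions" as cell-aligned concatenation is only one possible interpretation.

```python
import numpy as np


def voxelize_bev(points, x_range=(0.0, 8.0), y_range=(-4.0, 4.0),
                 z_range=(0.0, 2.0), cell=1.0):
    """Rasterize an (N, 3) point cloud into a binary (Z, X, Y) tensor.

    Height bins form the leading axis so they can act as channels.
    Extents and cell size are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64).reshape(-1, 3)
    lo = np.array([x_range[0], y_range[0], z_range[0]])
    hi = np.array([x_range[1], y_range[1], z_range[1]])
    shape = np.round((hi - lo) / cell).astype(int)      # (nx, ny, nz)
    idx = np.floor((points - lo) / cell).astype(int)    # per-point cell index
    keep = np.all((idx >= 0) & (idx < shape), axis=1)   # drop out-of-range points
    idx = idx[keep]
    grid = np.zeros((shape[2], shape[0], shape[1]), dtype=np.float32)
    grid[idx[:, 2], idx[:, 0], idx[:, 1]] = 1.0
    return grid


def stack_sweeps(sweeps):
    """Stack T sweeps of shape (Z, X, Y) into one (T*Z, X, Y) tensor."""
    return np.concatenate(sweeps, axis=0)


def fuse_bev(traditional, learned):
    """Concatenate two BEV tensors channel-wise; X and Y extents must match."""
    if traditional.shape[1:] != learned.shape[1:]:
        raise ValueError("BEV maps must share the same spatial dimensions")
    return np.concatenate([traditional, learned], axis=0)


# Two toy sweeps; the second point of the first sweep lies outside the grid.
sweep_t0 = voxelize_bev([[0.5, -3.5, 1.5], [100.0, 0.0, 0.0]])
sweep_t1 = voxelize_bev([[1.5, -3.5, 0.5]])
traditional = stack_sweeps([sweep_t0, sweep_t1])     # shape (4, 8, 8)
learned = np.zeros((4, 8, 8), dtype=np.float32)      # placeholder for a model output
hybrid = fuse_bev(traditional, learned)              # shape (8, 8, 8)
print(hybrid.shape)
```

Because both inputs share the same X and Y extents, every cell of the fused tensor carries the channels of both maps, which is what allows ordinary 2D convolutions to operate on it.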

Regarding Claim 2, 
The combination of Park, Ditty and Cacas further discloses wherein the plurality of sensor types comprises any combination of LIDAR sensors, image sensors, radar sensors, or ultrasonic sensors. (Park, [0120-0121], Fig. 5, discloses a ship sensor system according to an exemplary embodiment of the present invention. Referring to FIG. 5, during sailing of a ship, an obstacle detection sensor, such as radar, light detection and ranging (LiDAR), or an ultrasonic detector, and an automatic identification system (AIS) are used in general; it is possible to detect an obstacle using a camera. Referring to FIG. 5, examples of a camera include a monocular camera, a binocular camera, an infrared (IR) camera, and a time-of-flight (TOF) camera but are not limited thereto; a sensor suite with multiple sensor types is disclosed, including traditional and learned sensors, to process raw data). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 3, 
The combination of Park, Ditty and Cacas further discloses wherein the traditional sensor data processing module performs inverse sensor modeling based on sensor measurements from the sensor suite to generate the traditional BEV grid map. (Ditty, [0564], discloses that, to the extent possible, some example embodiments consider yaw calibration a task for self-calibration of the steering rack. In any case, example non-limiting embodiments can use the vehicle odometry to determine how far the vehicle has driven and how much it may have turned. Example non-limiting embodiments can then use points on any of the lane lines (see block 3160) to project forward from one point in time to another and require that the points line up with the corresponding lane line at a second moment in time. This involves inverse projection through the unknown yaw of the camera, onto the ground and back through the unknown yaw of the camera at the second moment in time. This allows solving for the yaw of a camera in a way that is very closely tied to the control loop of the vehicle. The effect of yaw on the image of a point that the vehicle has travelled closer to is different due to the distance change. For example, if the distance has shrunk to half, the force-applying lever arms differ by a factor of two, resulting in yaw moving the image point differently by a factor of two; inverse projection of sensor data is performed to obtain a grid map of the environment). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.
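
As background on the claim term, the sketch below shows inverse sensor modeling as it is commonly formulated in occupancy grid mapping: a single range measurement lowers the occupancy log-odds of the cells the beam passes through and raises it for the cell containing the return. This is a textbook-style illustration, not taken from Park, Ditty, Cacas, or the application; the function names, the 0.7/0.3 probabilities, the half-cell ray step, and the grid size are all assumptions.

```python
import math

import numpy as np

# Assumed inverse sensor model probabilities, expressed as log-odds.
L_OCC = math.log(0.7 / 0.3)    # evidence that a cell is occupied
L_FREE = math.log(0.3 / 0.7)   # evidence that a cell is free


def cell_of(x, y, cell):
    """Grid index of the cell containing world point (x, y)."""
    return int(math.floor(x / cell)), int(math.floor(y / cell))


def update_ray(log_odds, origin, angle, measured_range, cell=1.0, max_range=10.0):
    """Apply one range measurement to a 2D log-odds grid, in place."""
    hit = measured_range < max_range
    end = min(measured_range, max_range)
    hit_cell = cell_of(origin[0] + end * math.cos(angle),
                       origin[1] + end * math.sin(angle), cell) if hit else None
    visited = set()
    r = 0.0
    while r < end:  # march along the beam in half-cell steps
        ij = cell_of(origin[0] + r * math.cos(angle),
                     origin[1] + r * math.sin(angle), cell)
        inside = 0 <= ij[0] < log_odds.shape[0] and 0 <= ij[1] < log_odds.shape[1]
        if inside and ij != hit_cell and ij not in visited:
            log_odds[ij] += L_FREE   # beam passed through: more likely free
            visited.add(ij)
        r += cell / 2.0
    if hit and 0 <= hit_cell[0] < log_odds.shape[0] and 0 <= hit_cell[1] < log_odds.shape[1]:
        log_odds[hit_cell] += L_OCC  # beam ended here: more likely occupied


def to_probability(log_odds):
    """Convert log-odds back to occupancy probabilities."""
    return 1.0 / (1.0 + np.exp(-log_odds))


grid = np.zeros((10, 10))            # log-odds 0 means probability 0.5 (unknown)
update_ray(grid, origin=(0.5, 0.5), angle=0.0, measured_range=3.2)
prob = to_probability(grid)
print(np.round(prob[:5, 0], 2))
```

Repeating the update for every beam of every scan accumulates evidence per cell, which is how a grid map is typically built up from range sensors such as LiDAR or radar.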

Regarding Claim 4, 
The combination of Park, Ditty and Cacas further discloses wherein the learned sensor data processing module generates a set of feature maps using the raw sensor data and generates the learned BEV grid map or volume using the set of feature maps. (Ditty, [0096], Fig. 46, discloses output from a point detector for tracking features over multiple images produced from sensors, in accordance with embodiments of the present technology; features are extracted from image frames and processed to obtain feature maps). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 5, 
The combination of Park, Ditty and Cacas further discloses wherein the executed instructions cause the computing system to process the hybrid BEV representation of the surrounding environment to derive aspects of road infrastructure of a travel route on which the vehicle operates in real-time, the aspects of the road infrastructure include one or more of road topology, lane topology, lane boundaries, road markings, crosswalks, sidewalks, parking spaces, bicycle lanes, road and traffic signage, traffic signals, or right-of-way rules.  (Ditty, [0609], [0640], [0647], discloses relevant visual landmarks are road markings (parallel solid lines, dashes and dots, perpendicular lines, road images), road boundaries (points where drivable surface transitions into non-drivable surface that can be considered stationary), vertical landmarks such as poles, traffic signs and traffic lights. The visual landmarks are detected in some embodiments by a neural network with appropriate post processing; [0640] To determine the basic lane segments, each infinitesimal portion of a drive (which has a point and a direction) is modeled as a vote for that point and direction in the three-dimensional space of point direction. The embodiments apply a wavelet shaped kernel which falls off at a portion of a lane width parallel to the direction and at a small angle, then suppressing points and directions up to some minimal lane width and angular separation, and then goes to zero. The embodiments then search for connected sets of points in the point direction space that are parallel at a local maxima to the direction and that have positive accumulation of kernel votes (possibly with hysteresis); Multi-way stops (essentially intersections with a stopping behavior without traffic lights where no one has complete right of way) are handled by having a center area connected to each stopping point. 
This area can to some extent be perceived live or estimated as the drivable area in front of a stopping point, but is also reinforced through mapping. Other traffic is detected and tracked in the center area as well as a contender area surrounding the center area. If other traffic (vehicles, bicyclists, pedestrians) are detected in this area and pointing towards the center area, then they are considered contenders. If they are present and stopped or already proceeding when the vehicle stops, they are judged to be ahead of the vehicle. If they are still moving in a way that is not directly threatening and are not yet adjacent to the center area, the vehicle is ahead of them; road markings or lane segments for crossings or traffic lights are detected for road topology or infrastructure). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 6, 
The combination of Park, Ditty and Cacas further discloses wherein the executed instructions cause the computing system to process the hybrid BEV representation of the surrounding environment to perform scene understanding tasks. (Park, [0113], [0146], discloses location information may be presented as a plurality of categories having a certain range. For example, distance information may be presented as short distance, middle distance, long distance, and the like, and direction information may be presented as a left direction, a forward direction, a right direction, and the like. It is also possible to combine location information and direction information and present the combination as short distance to the left side, long distance to the right side, and the like; control module 11, the input module 13, and the output module 15 may include a controller. The controller may process and calculate various information and control components of the modules. The controller may be physically provided in the form of an electronic circuit which processes an electrical signal. The modules may physically include only a single controller but, to the contrary, may include a plurality of controllers. As an example, the controller may be one or more processors installed in one computing means. As another example, the controller may be provided as processors which are installed on a server and a terminal physically separate from each other and cooperate with each other through communication; data is processed using processors to obtain a hybrid grid map of the environment for situation awareness of the surroundings). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 7, 
The combination of Park, Ditty and Cacas further discloses wherein the scene understanding tasks comprise at least one of object detection, object classification, instance segmentation, motion prediction, or traffic rule determination tasks. (Park, [0277-0280], discloses image segmentation results are visualized, results showing an object detected through the image segmentation may be output through a display panel. For example, the sea, a ship, geographical features, a navigation mark, etc. included in an image may be presented in different colors; it is possible to output obstacle characteristics including the distance, speed, danger, size, and collision probability of an obstacle. Obstacle characteristics may be output using color, which may vary according to the distance, speed, danger, size, and collision probability of an obstacle; Obstacle characteristics sensed by an obstacle sensor may be output together. For example, when a region is observed through image segmentation, image segmentation results of the region may be output, and when a region is not observed, sensing results of the region obtained by an obstacle sensor may be output; obstacle map, the following path, a path history, and/or the like may be visualized. For example, a ship-oriented obstacle map, an existing path, and an obstacle-avoiding path may be output in a bird’s eye view; object detection, obstacle classification, image region segmentation, motion detection, and traffic rules associated with speed are determined). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 8, 
The combination of Park, Ditty and Cacas further discloses wherein the traditional BEV grid map or volume comprises one of a two-dimensional BEV grid map, a three-dimensional grid volume, or any n-dimensional discretized space. (Park, [0127], [0189], discloses that any kind of image may be used to perform image segmentation. The image may be an image captured by a camera. Referring to FIG. 5, it is possible to use images obtained from various cameras, such as a monocular camera, a binocular camera, an IR camera, and a TOF camera. Also, the image is not limited to a two-dimensional (2D) image and may be a three-dimensional (3D) image or the like; Surroundings may be sensed in a way other than image segmentation. For example, it is possible to sense surroundings using an obstacle detection sensor, such as a radar, a LiDAR, and an ultrasonic detector, or information obtained with the obstacle detection sensor may be processed through a neural network to sense surroundings. Hereinafter, location information obtained as results of image segmentation will be referred to as first location information, and location information obtained in a way other than image segmentation will be referred to as second location information; two- or three-dimensional image data or volume data is captured using traditional sensors such as radar or lidar). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 9, 
The combination of Park, Ditty and Cacas further discloses wherein the learned BEV grid map or volume comprises a sensor-fused, learned BEV grid map or volume based on the raw sensor data from the plurality of sensor types. (Park, [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; traditional camera sensor image data and three-dimensional image data captured by learned sensors (lidar or radar) and represented as a volume grid map are combined (fused) together to obtain a representation of the environment around the ship). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 10, 
The combination of Park, Ditty and Cacas further discloses wherein the learned BEV grid map or volume is generated using image data, and wherein the traditional BEV grid map or volume is generated using at least one of LIDAR data or radar data.  ((Park, [0077], [0220-0221], [0347-0348], discloses obtaining a target maritime image generated from a camera, the camera being installed on a port or a vessel and monitoring surroundings thereof; and determining a distance of a target vessel in the surroundings based on the distance level index of the maritime information being outputted from the neural network which receives the target maritime image and having the first type index object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship;; discloses a model which converts an image into a bird's eye view may be provided. For example, the model may receive a perspective-view image and output a bird's-eye view image; the model which converts an image into a bird's eye view may be trained with training data including an input perspective-view image and an image obtained by converting the input image into a bird's eye view. 
For example, the model may be trained on the basis of output data outputted from the model receiving the input perspective-view image and the image obtained by converting the input image into a bird's eye view; three-dimensional volume data is captured by traditional sensors (lidar or radar) and represented as a volume grid map; a view of the surroundings is captured by learned camera image sensors and converted to a bird’s eye view). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 11, 
The combination of Park, Ditty and Cacas further discloses wherein vehicle comprises an autonomous vehicle, and wherein the executed instructions further cause the computing system to: dynamically analyze the fused representation of the surrounding environment to autonomously operate a set of control mechanisms of the autonomous vehicle along a travel route.   (Park, [0097], [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; for autonomous navigation of a moving object according to an exemplary embodiment of the present invention, the moving object may perform situation awareness, path planning, and path-following operations. The moving object may determine surrounding obstacles and/or a navigable region through situation awareness, generate a path on the basis of the surrounding obstacles and/or navigable region, and then travel by itself along the generated path; various sensor inputs are combined (fused) to generate hybrid surrounding information using various sensors including camera, lidar, radar or ultrasound sensors and analyzed to determine autonomous navigation of a ship or vehicle). 
Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.
 
Regarding Claim 12,
The combination of Park, Ditty and Cacas further discloses wherein the learned BEV grid map or volume is generated based on sensor data from one or more sensor types of the plurality of sensor types, and wherein the traditional BEV grid map or volume is generated based on sensor data from one or more sensor types of the plurality of sensor types. (Park, [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; image data captured by learned camera sensors and three-dimensional data captured by traditional sensors (lidar or radar) and represented as a volume grid map are combined (fused) together to obtain a representation of the environment around the ship). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.

Regarding Claim 13, 
The combination of Park, Ditty and Cacas further discloses dynamically analyze the fused representation of the surrounding environment to assist a driver of the vehicle during operation of the vehicle by the driver.  (Park, [0097], [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; for autonomous navigation of a moving object according to an exemplary embodiment of the present invention, the moving object may perform situation awareness, path planning, and path-following operations. The moving object may determine surrounding obstacles and/or a navigable region through situation awareness, generate a path on the basis of the surrounding obstacles and/or navigable region, and then travel by itself along the generated path; various sensor inputs are combined (fused) to generate hybrid surrounding information using various sensors including camera, lidar, radar or ultrasound sensors and analyzed to determine autonomous navigation of a ship or vehicle). (Ditty, [0033], [0141], [0121], discloses two distinctly different approaches have been proposed for autonomous vehicles. 
The first approach, computer vision, is the process of automatically perceiving, analyzing, understanding, and/or interpreting visual data. Such visual data may include any combination of videos, images, real-time or near real-time data captured by any type of camera or video recording device. Computer vision applications implement computer vision algorithms to solve high-level problems. For example, an ADAS system can implement real-time object detection algorithms to detect pedestrians/bikes, recognize traffic signs, and/or issue lane departure warnings based on visual data captured by an in-vehicle camera or video recording device; each controller is essentially one or more onboard supercomputers that can operate in real-time to process sensor signals, and output autonomous operation commands to self-drive vehicle (50) and/or assist the human vehicle driver in driving. Each vehicle may have any number of distinct controllers for functional safety and additional features. For example, Controller (100(1)) may serve as the primary computer for autonomous driving functions, Controller (100(2)) may serve as a secondary computer for functional safety functions, Controller (100(3)) may provide artificial intelligence functionality for in-camera sensors, and Controller (100(4)) (not shown) may provide infotainment functionality and provide additional redundancy for emergency situations; [0141] Self-driving vehicle (50) preferably includes one or more LIDAR sensors (70), which are often used for object and pedestrian detection, emergency braking, and collision avoidance. LIDAR sensors measure distances by measuring the Time of Flight (“ToF”) that it takes a short laser pulse to travel from the sensor to an object and back, calculating the distance from the known speed of light. LIDAR detects smaller objects and is effective at detecting distance under relatively clear atmospheric conditions. 
However, LIDAR does not work well in adverse weather conditions, and is not particularly effective at detecting non-reflective objects, such as muddy or dusty objects. Thus, unlike RADAR, LIDAR sensors typically must have a clear unobstructed line of sight: the sensors cannot be obscured by dirt, dust, or other obstruction; various sensor inputs are fused together and processed to detect objects in the path, obstacles and traffic in order to automatically control the vehicle for safe navigation). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.
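
The time-of-flight relationship described in the Ditty passage quoted above (distance computed from the round-trip time of a laser pulse and the known speed of light) can be written out as a short sketch. The function name `tof_distance_m` is hypothetical and is not drawn from Ditty; the only physics assumed is that the pulse travels to the object and back, so the one-way distance is half the round-trip path.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # defined value of c in vacuum


def tof_distance_m(round_trip_seconds: float) -> float:
    """One-way distance to the reflecting object, given the measured
    round-trip travel time of the pulse (hypothetical helper)."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse covers the sensor-to-object distance twice.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2.0


# Illustrative value: a 1 microsecond round trip is roughly 150 m.
distance = tof_distance_m(1e-6)
```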
 
Regarding Claim 14, 
The combination of Park, Ditty and Cacas further discloses wherein the executed instructions cause the ADAS to assist the driver of the vehicle by automatically performing one or more of the following: adaptive cruise control, emergency brake assist, lane-keeping, lane centering, highway assist, autonomous obstacle avoidance, or autonomous parking tasks. (Park, [0097], [0220-0221], discloses object information is not limited to object information obtained through image segmentation, and object information obtained through another sensor, such as a radar or a LiDAR, may also be an input to the obstacle map update operation. It is also possible to combine all or some of the pieces of object information; obstacle map refers to a means for presenting object information. As an example, the obstacle map may be a grid map. In the grid map, the space may be divided into unit regions, and object information may be displayed according to each unit region. As another example, the obstacle map may be a vector map. The obstacle map is not limited to being two dimensional and may be a 3D obstacle map. Meanwhile, the obstacle map may be a global map which presents all zones related to ship sailing from a starting point to a destination or a local map which presents certain zones around the ship; for autonomous navigation of a moving object according to an exemplary embodiment of the present invention, the moving object may perform situation awareness, path planning, and path-following operations. The moving object may determine surrounding obstacles and/or a navigable region through situation awareness, generate a path on the basis of the surrounding obstacles and/or navigable region, and then travel by itself along the generated path; various sensor inputs are combined (fused) to generate hybrid surrounding information using various sensors including camera, lidar, radar or ultrasound sensors and analyzed to determine autonomous navigation of a ship or vehicle). 
(Ditty, [0137-0138], [0156], [0335], discloses Long-Range RADAR is often used for ACC functionality; short and medium-range RADAR is often used for cross-traffic alerts (for front-facing RADAR), blind spot detection, and rear collision warnings. Suitable long-range RADAR systems include, without limitation, RADAR systems that provide a broad field of view realized by two independent scans, with approximately 250 m range. Example embodiments can include sensors that distinguish between static and moving objects, and can be used in conventional ADAS for Emergency brake Assist or Forward Collision Warning; Rear cameras may be used for park assistance, surround view, rear collision warnings, and creating and updating the occupancy grid. A wide variety of cameras may be used, including, cameras that are also suitable as a front-facing camera. Rear camera may also be a stereo camera (74) of the type discussed above; the camera types provided herein are examples provided without limitation. Almost any type of digital camera may be adapted for use with the technology. Alternate cameras can be any available type including (without limitation) 60 fps and global shutter. Preferably, the color filter pattern is RCCB, and Clear Pixel cameras are used to increase sensitivity. The technology can also include cameras installed to perform known ADAS functions as part of a redundant or fail-safe design, as discussed below. 
For example, a Multi-Function Mono Camera may be installed to provide functions including lane departure warning, traffic sign assist and intelligent headlamp control; A suitable ADAS SoC is designed to be used for Lane Departure Warning (LDW), alerting the driver of unintended/unindicated lane departure; Forward Collision Warning (FCW), indicating that under the current dynamics relative to the vehicle ahead, a collision is imminent, Automatic Emergency Braking (AEB) identifying imminent collision, Adaptive Cruise Control (ACC), Lane Keeping Assist (LKA), and Lane Centering (LC); brake assist, park assist, lane keeping, lane centering assist controls are determined based on processed data for automatic safe navigation of vehicle). Additionally, the rationale and motivation to combine the references Park, Ditty and Cacas as applied in the rejection of claim 1 apply to this claim.
 
Claims 15-19 recite a computer-readable medium with instructions corresponding to the system elements recited in Claims 1-5, respectively. Therefore, the recited instructions of computer-readable medium Claims 15-19 are mapped to the proposed combination in the same manner as the corresponding elements of Claims 1-5, respectively. Additionally, the rationale and motivation to combine the Park, Ditty and Cacas references presented in the rejection of Claim 1 apply to these claims.
Furthermore, the combination of Park and Ditty further discloses A non-transitory computer-readable medium storing instructions that, when executed by one or more processors of a computing system, cause the computing system (Park [0349], discloses methods according to exemplary embodiments may be implemented in the form of program instructions, which are executable by various computer means, and stored in a computer-readable recording medium. The computer-readable recording medium may include program instructions, data files, and data structures independently or in combination. The program instructions stored in the computer-readable recording medium may be specially designed and constructed for the exemplary embodiments or may be well-known to those of ordinary skill in the computer software field. Examples of the computer-readable recording medium may include magnetic media, such as a hard disk, a floppy disk, and a magnetic tape, optical media, such as a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD), magneto-optical media, such as a floptical disk, and hardware devices, such as a ROM, a random access memory (RAM), and a flash memory, which are specifically constructed to store and execute program instructions. Examples of the program instructions include high-level language code executable by a computer using an interpreter or the like as well as machine language code made by a compiler. The hardware devices may be configured to operate as one or more software modules or vice versa to perform operations of the exemplary embodiments). 

Claim 20 recites a method with steps corresponding to the system elements recited in Claim 1. Therefore, the recited steps of method Claim 20 are mapped to the proposed combination in the same manner as the corresponding elements of Claim 1. Additionally, the rationale and motivation to combine the Park, Ditty and Cacas references presented in the rejection of Claim 1 apply to this claim.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
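
As a rough aid for docketing, the THREE-MONTH and SIX-MONTH dates described above can be computed from the mailing date. The sketch below is illustrative only: the helper names `add_months` and `reply_deadlines` are hypothetical, the May 6, 2026 mailing date is the one shown in this page's prosecution timeline, and the code deliberately ignores the advisory-action provision, extension fees, and any adjustment for weekends or federal holidays. It should not be relied on in place of the Office's own date calculation.

```python
import calendar
from datetime import date
from typing import Tuple


def add_months(d: date, months: int) -> date:
    """Same day-of-month N months later, clamped to the month's last day."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def reply_deadlines(mailing_date: date) -> Tuple[date, date]:
    """Return (end of 3-month shortened period, 6-month outer limit)."""
    return add_months(mailing_date, 3), add_months(mailing_date, 6)


# Mailing date taken from the prosecution timeline on this page.
shortened, outer_limit = reply_deadlines(date(2026, 5, 6))
```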

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

Kang et al. (US Pub No. 20250322540 A1, A method of determining a position of a vehicle in which a reference map is provided. The reference map comprises segmentations of a reference image with landmarks. A measurement image of a vehicle environment is captured and segmentations of the measurement image and neighborhood graphs are determined to obtain a measurement map, wherein a segmentation is represented by a vertex and where a neighborhood graph comprises the vertex and edges containing information to identify neighboring vertices of the vertex. Segmentations of the reference image are compared with the segmentations represented by the vertices of the measurement image and the neighborhood graphs, and segmentations contained in the reference image and in measurement image are determined. The vehicle's position is estimated with reference to the reference map during its movement along a road based on a result of the comparison)

KR-102618443-B1 (A method and system for determining the geographic location and orientation of a vehicle traveling through a road network are disclosed. The method includes obtaining, from one or more cameras associated with a vehicle traveling through the road network, a sequence of images reflecting the environment of the road network in which the vehicle is traveling, each of the images having an associated camera position at which the image was recorded. A local map representation representing the area of the road network over which the vehicle is traveling is then generated using at least some of the acquired images and the associated camera positions. The generated local map representation is compared to a section of a reference map, the reference map section covering the area of the road network on which the vehicle is traveling, and the geographic location and orientation of the vehicle within the road network are determined based on the comparison. A method and system for generating and/or updating an electronic map using data acquired by vehicles traveling on a road network represented by the electronic map is also disclosed)

Any inquiry concerning this communication or earlier communications from the examiner should be directed to PINALBEN V PATEL whose telephone number is (571)270-5872. The examiner can normally be reached M-F: 10am - 8pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Wills-burns Chineyere, can be reached at 571-272-9752. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Pinalben Patel/Examiner, Art Unit 2673
Prosecution Timeline

Mar 14, 2023: Application Filed
Sep 16, 2025: Non-Final Rejection mailed (§103)
Jan 13, 2026: Response Filed
May 06, 2026: Final Rejection mailed (§103, current)
Precedent Cases

Applications granted by this same examiner with similar technology

18/159,670 (Patent 12633128): SYSTEMS AND METHODS FOR TARGET ASSIGNMENT FOR END-TO-END THREE-DIMENSIONAL (3D) DETECTION. Granted May 19, 2026; 3y 3m to grant.
18/043,607 (Patent 12620130): LOCATING METHOD AND APPARATUS FOR ROBOT, AND STORAGE MEDIUM. Granted May 05, 2026; 3y 2m to grant.
18/061,775 (Patent 12614293): MONOCULAR WORLD MESHING. Granted Apr 28, 2026; 3y 4m to grant.
18/248,617 (Patent 12614305): TARGET OBJECT DETECTION METHOD AND APPARATUS, AND ELECTRONIC DEVICE, STORAGE MEDIUM AND PROGRAM. Granted Apr 28, 2026; 3y 0m to grant.
18/311,014 (Patent 12602824): SUBSTRATE TREATING APPARATUS AND SUBSTRATE TREATING METHOD. Granted Apr 14, 2026; 2y 11m to grant.
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 89%
With Interview (+9.7%): 99%
Median Time to Grant: 2y 3m (~0m remaining)
PTA Risk: Moderate
Based on 551 resolved cases by this examiner. Grant probability derived from career allowance rate.
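
The projection figures can be cross-checked against the counts reported in this page's examiner profile (490 granted out of 551 resolved). The sketch below is an assumption about how the displayed numbers relate, not the site's documented method: it treats the grant probability as the career allowance rate and the "With Interview" figure as that rate plus the 9.7-point lift, capped at 100%. The function names are hypothetical.

```python
def allowance_rate_pct(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    if resolved <= 0:
        raise ValueError("resolved must be positive")
    return 100.0 * granted / resolved


def with_interview_pct(base_pct: float, lift_points: float) -> float:
    """Assumed model: add the interview lift in percentage points."""
    return min(100.0, base_pct + lift_points)


base = allowance_rate_pct(490, 551)           # counts from the examiner profile
boosted = with_interview_pct(base, 9.7)       # lift reported on this page
rounded = (round(base), round(boosted))
```

Under these assumptions the rounded values agree with the 89% and 99% shown above.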