Prosecution Insights
Last updated: April 19, 2026
Application No. 18/178,705

SYSTEMS AND METHODS FOR DETECTING AND LABELING OBSTACLES ALONG TRAJECTORIES OF AUTONOMOUS VEHICLES

Status: Non-Final OA (§103)
Filed: Mar 06, 2023
Examiner: ROBARGE, TYLER ROGER
Art Unit: 3658
Tech Center: 3600 — Transportation & Electronic Commerce
Assignee: Kodiak Robotics Inc.
OA Round: 3 (Non-Final)
Grant Probability: 77% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 8m
With Interview: 86%

Examiner Intelligence

Career Allow Rate: 77% (17 granted / 22 resolved; +25.3% vs TC avg; above average)
Interview Lift: +9.1% (moderate lift; based on resolved cases with interview)
Typical Timeline: 2y 8m average prosecution; 34 applications currently pending
Career History: 56 total applications across all art units

Statute-Specific Performance

§101: 13.6% (-26.4% vs TC avg)
§103: 56.7% (+16.7% vs TC avg)
§102: 12.3% (-27.7% vs TC avg)
§112: 16.2% (-23.8% vs TC avg)
Tech Center averages are estimates. Based on career data from 22 resolved cases.

Office Action

§103
DETAILED ACTION

This Office Action is taken in response to Applicant's Amendment and Remarks filed on 02/09/2026 regarding Application No. 18/178,705, originally filed on 03/06/2023. Claims 1-20 are pending for consideration.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

The applicant argues "Christie does not disclose the claimed per obstacle process of using LiDAR data corresponding to a detected obstacle to define an image region in the camera image and then performing obstacle-specific color and shape queries on that image region as part of determining the obstacle label. Christie's discussion of 'classification based on one or more of the detected object's size, shape, and colour' is not the same as the claimed obstacle specific query operations tied to a LiDAR-projected image region..." [Remarks, p. 11].

The examiner respectfully disagrees. To the extent applicant's argument is directed to the newly added limitation of projecting LiDAR data corresponding to a detected obstacle into the image to define an image region corresponding to the obstacle, that argument is moot in view of the present rejection, which additionally relies on Hölzel for that limitation. Christie still discloses object classification using image-based size, shape, and colour information (as per "object classification is performed using a predefined lookup table of vehicle types. In particular, an object is classified by accessing a predefined classification based on one or more of the detected object's size, shape, and colour (from the camera array)" in P20L1-10; as per "These object detections are input to the dynamic object model, which combines the data to estimate a classification, size, distance, direction, velocity, or direction of travel of the object" in P15L20-33). Hölzel, as set forth in the present rejection, supplies the additional projecting/image-region aspect now recited. Thus, once Hölzel's projected LiDAR-to-image region is used, Christie's image-based classification framework teaches using the resulting image information, including shape and colour, for obstacle classification, meeting the claimed color and shape query framework in the present combination.

The applicant argues "However, Saranin does not remedy the deficiencies of Christie… Saranin's pedestrian-related cluster heuristics (e.g., clusters above certain size thresholds being 'unlikely' to be pedestrians) do not teach assigning an affirmative 'not a pedestrian' indication as part of the obstacle label, and do not teach doing so jointly with a vegetation indication for the same obstacle label... Christie and Saranin, alone or in combination, fail to teach or suggest each and every element of at least independent claims 1, 8, and 15." [Remarks, p. 11].

The examiner respectfully disagrees. Saranin discloses a labeling framework that assigns object class information together with image-derived and cluster-derived information to detected LiDAR-based objects (as per "A point label refers to an indication or description associated with a LiDAR data point that includes information or data particular to that LiDAR data point. For instance, a point label may include an object class identifier (e.g., a vehicle class identifier, a pedestrian class identifier, a tree class identifier, and/or a building class identifier), a color (e.g., an RGB value), at least one unique identifier (e.g., for the object, corresponding image pixel(s), and/or LiDAR data point), and/or an object instance identifier..." in ¶115; as per "Low importance label is assigned to LiDAR data points with, for example, object class identifiers that are associated with static object classes (e.g., a building class, a foliage class, a construction barrier class, and/or a signage class)" in ¶116; as per "Clusters with a height above 2.0-2.5 meters are unlikely to be associated with pedestrians. Clusters over 1 meter in length are unlikely to be associated with pedestrians. Clusters with a length-to-width ratio above 4.0 often tend to be associated with buildings and are unlikely associated with pedestrians..." in ¶186). Thus, Saranin teaches distinguishing vegetation-class objects from pedestrian objects, and further teaches using shape-based criteria to identify clusters that are not pedestrians. Under the broadest reasonable interpretation, Saranin teaches or at least suggests vegetation-class labeling and non-pedestrian determination for the detected object, and in the cited combination teaches determining a label indicating the obstacle is vegetation and not a pedestrian. It would have been obvious to apply the teachings of Christie, Saranin, and Hölzel to enable another standard means of classifying detected obstacles using projected image information together with vegetation and pedestrian/non-pedestrian labeling criteria. Applicant's traversal is therefore unpersuasive, and the obviousness rejection of the claims should be maintained.
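To make the disputed heuristics concrete, here is a minimal sketch of the shape query that Saranin's ¶186 describes, assuming simple axis-aligned cluster dimensions; the dataclass and function names are illustrative, not taken from the reference:

```python
# Sketch of Saranin-style (¶186) cluster shape heuristics. Only the
# numeric thresholds come from the quoted passage; the names and the
# dataclass packaging are assumptions made for illustration.
from dataclasses import dataclass

@dataclass
class Cluster:
    height_m: float  # vertical extent of the LiDAR cluster
    length_m: float  # longest horizontal extent
    width_m: float   # shortest horizontal extent

def unlikely_pedestrian(c: Cluster) -> bool:
    """Shape query: True when the cluster geometry makes a pedestrian unlikely."""
    if c.height_m > 2.5:   # above the 2.0-2.5 m range cited for pedestrians
        return True
    if c.length_m > 1.0:   # clusters over 1 meter in length
        return True
    if c.width_m > 0 and c.length_m / c.width_m > 4.0:
        return True        # building-like length-to-width ratio
    return False

print(unlikely_pedestrian(Cluster(height_m=1.7, length_m=0.5, width_m=0.5)))  # False
print(unlikely_pedestrian(Cluster(height_m=1.2, length_m=4.8, width_m=0.9)))  # True
```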
Claim Interpretation

The following is a quotation of 35 U.S.C. 112(f):

(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:

An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.

This application includes one or more claim limitations that do not use the word "means," but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: "a processor configured to…" in claim 8 and "a computing device, …, configured to…" in claim 15.

Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

Claims 1-4, 7-11, 14-18, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Christie (WO Pub. No. 2023025777) in view of Saranin (US Pub. No. 20220129684) and further in view of Hölzel (US Pub. No. 20220269900).

As per Claim 1, Christie discloses a method using sensor fusion of radar, lidar, and camera systems (as per Abstract), the method comprising: generating one or more data points from one or more sensors coupled to a vehicle (as per "combining the output from the sensors shown in figure 1 (e.g. using the system shown in figure 2), the fused sensor system can be obtained." in P10L25-30), wherein the one or more sensors comprise a Light Detection and Ranging (LiDAR) sensor and a camera (as per "Current sensors typically employed in autonomous systems or advanced driver-assistance systems (ADAS) in vehicles typically include one or more of: radar, LIDAR, and Stereo Cameras (or a camera array)." in P2L20-30), and the one or more data points comprise a LiDAR point cloud generated by the LiDAR sensor (as per "To put it differently, the LIDAR is used to scan a wide area and the data from the LIDAR is fused with point cloud data from the fast, long-range radar and the camera array before being analyzed in the infrastructure model 8210." in P15L30-35, P16L1-5) and an image captured by the camera (as per "a fast scan radar and camera array technology are used as the platform backbone" in P13L20-25); using a processor: detecting one or more obstacles within the LiDAR point cloud (as per "At this point partial object identification can be performed. Object identification may be based on the number, or the relative shape, of reflected signals returned from an area." in P13L1-10; as per "Detection using radar sensors therefore provides more time to take action or alert the driver upon the detection of an obstacle ahead." in P13L30-35, P14L1-5); performing a color query on the image region for each of the one or more detected obstacles (as per "an object is classified by accessing a predefined classification based on one or more of the detected object's size, shape, and colour (from the camera array)." in P20L4-10); performing a shape query on the image region for each of the one or more detected obstacles (as per the same passage in P20L4-10); and controlling one or more operations of the vehicle (as per "determining a risk classification, wherein the risk classification is indicative of the risk of collision with the detected object; and if the risk classification meets a predetermined criterion, switching to a third mode wherein a vehicle safety sensor system controls the vehicle to avoid the detected object." in P8C25-33) based on the label for each of the one or more detected obstacles (as per "These object detections are input to the dynamic object model, which combines the data to estimate a classification, size, distance, direction, velocity, or direction of travel of the object" in P15L20-33; as per "object classification is performed using a predefined lookup table of vehicle types. In particular, an object is classified by accessing a predefined classification based on one or more of the detected object's size, shape, and colour (from the camera array)" in P20L1-10).

Christie fails to expressly disclose: for each of the one or more detected obstacles, projecting LiDAR data corresponding to the obstacle into the image to define an image region corresponding to the obstacle; for each of the one or more detected obstacles, determining a label for the obstacle based on one or more of the color query and the shape query, wherein the label indicates whether each of the one or more obstacles is a piece of vegetation or not a pedestrian; and, for each of the one or more detected obstacles, labeling the obstacle with the label.

Saranin discloses camera-lidar fused object detection with segment filtering, comprising: for each of the one or more detected obstacles, determining a label for the obstacle (as per "A point label refers to an indication or description associated with a LiDAR data point that includes information or data particular to that LiDAR data point. For instance, a point label may include an object class identifier (e.g., a vehicle class identifier, a pedestrian class identifier, a tree class identifier, and/or a building class identifier), a color (e.g., an RGB value), at least one unique identifier (e.g., for the object, corresponding image pixel(s), and/or LiDAR data point), and/or an object instance identifier (e.g., if there are many objects of the same class detected in an image)" in ¶115) based on one or more of: the color query (as per "Points of the LiDAR point cloud are projected into a monocular camera frame in order to transfer pixel information to each point in the LiDAR point cloud. The pixel information includes, but is not limited to, a color, an object type and an object instance" in ¶100); and the shape query (as per "These additional cluster features include a cluster feature H representing a cluster height, a cluster feature L representing a cluster length, and a cluster feature LTW representing a length-to-width ratio for a cluster" in ¶185); wherein the label indicates whether each of the one or more obstacles is: a piece of vegetation (as per "Low importance label is assigned to LiDAR data points with, for example, object class identifiers that are associated with static object classes (e.g., a building class, a foliage class, a construction barrier class, and/or a signage class)" in ¶116); not a pedestrian (as per "Clusters with a height above 2.0-2.5 meters are unlikely to be associated with pedestrians. Clusters over 1 meter in length are unlikely to be associated with pedestrians" in ¶186); and, for each of the one or more detected obstacles, labeling the obstacle with the label (as per ¶115, quoted above; as per "the detecting comprises using the projection score PS to verify that the given merged segment is part of a particular detected object that is associated with the given detection mask." in ¶9).

In this way, Saranin operates to improve LVS based algorithm(s) that eliminate or minimize the merging of close objects (¶136). Like Christie, Saranin is concerned with autonomous driving systems. It would have been obvious for one of ordinary skill in the art before the effective filing date to have modified the automotive sensor fusion as taught by Christie with the camera-lidar fused object detection of Saranin to enable another standard means of labeling and classifying objects based on color and shape information (¶115).
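For reference, a rough sketch of the per-point label structure that Saranin's ¶115 describes; the field names are editorial assumptions, and only the kinds of fields come from the quoted passage:

```python
# Hypothetical shape of a Saranin-style point label (¶115). Field names
# are illustrative; the reference only enumerates the kinds of fields.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class PointLabel:
    object_class: str                  # e.g. "vehicle", "pedestrian", "tree", "building"
    rgb: Tuple[int, int, int]          # color transferred from the corresponding pixel
    point_id: int                      # unique identifier for the LiDAR data point
    instance_id: Optional[int] = None  # separates multiple objects of the same class

label = PointLabel(object_class="tree", rgb=(34, 139, 34), point_id=10452, instance_id=3)
print(label)
```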
Christie and Saranin fail to expressly disclose: for each of the one or more detected obstacles, projecting LiDAR data corresponding to the obstacle into the image to define an image region corresponding to the obstacle.

Hölzel discloses low-level sensor fusion based on lightweight semantic segmentation of 3D point clouds, comprising: for each of the one or more detected obstacles, projecting LiDAR data corresponding to the obstacle into the image to define an image region corresponding to the obstacle (as per "This sensor fusion is achieved by transforming one or more points of the ROI of the LIDAR sensor to a corresponding 3D point in the coordinate system of the camera sensor, transforming the 3D points to 2D pixels in an image frame of the camera sensor, drawing a 2D bounding box or a polygon around the 2D points in the image frame of the camera sensor and deriving objects in the camera sensor by performing a cropping operation on the camera sensor's pixels with the bounding box" in ¶8; as per "transforming one or more points of the ROI of the LIDAR sensor to a corresponding 3D point in the coordinate system of the camera sensor; transforming the 3D points to 2D pixels in an image frame of the camera sensor; drawing a 2D bounding box or a polygon around the 2D points in the image frame of the camera sensor; and deriving pixels derived in the camera sensor by performing a cropping operation on the camera sensor's pixels with the bounding box" in Claim 5; as per "Points in the blob from the 3D Lidar coordinate system may first be transformed to corresponding 3D point in the camera/radar coordinates. In one example, this may be done with the help of an extrinsic calibration matrix between the sensor systems, i.e., between LIDAR and the camera/radar/etc" in ¶65).

In this way, Hölzel operates to improve low-level sensor fusion by transforming LiDAR points corresponding to a detected ROI into camera-image pixels and defining a corresponding image region with a 2D bounding box or polygon for subsequent object derivation (as per ¶8). Like Christie and Saranin, Hölzel is concerned with autonomous driving systems. It would have been obvious for one of ordinary skill in the art before the effective filing date to have modified the system(s) of Christie and Saranin with the low-level sensor fusion of Hölzel to enable another standard means of projecting LiDAR data corresponding to a detected obstacle into an image to define an image region corresponding to the obstacle (Claim 5; ¶8). Such modification also improves the operation of sensor fusion by permitting obstacle-specific image-region generation from projected LiDAR data (¶65).
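To make the projection step concrete, here is a minimal numpy sketch of the Hölzel-style pipeline (¶8, ¶65): an obstacle's LiDAR points are transformed into the camera frame with an extrinsic calibration matrix, projected to 2D pixels, and enclosed in a 2D bounding box. The matrix values, function name, and pinhole intrinsics are illustrative assumptions, not taken from Hölzel:

```python
# Sketch of projecting one obstacle's LiDAR points into an image region,
# in the manner Hölzel's ¶8/¶65 describe. All concrete values here are
# placeholders for illustration.
import numpy as np

def project_obstacle(points_lidar: np.ndarray,
                     T_cam_lidar: np.ndarray,   # 4x4 extrinsic calibration matrix
                     K: np.ndarray) -> tuple:   # 3x3 camera intrinsic matrix
    """Return (u_min, v_min, u_max, v_max), the 2D box around the projected points."""
    n = points_lidar.shape[0]
    homo = np.hstack([points_lidar, np.ones((n, 1))])   # Nx4 homogeneous points
    pts_cam = (T_cam_lidar @ homo.T).T[:, :3]           # into camera coordinates
    pts_cam = pts_cam[pts_cam[:, 2] > 0]                # keep points in front of camera
    uvw = (K @ pts_cam.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]                       # perspective divide -> pixels
    u_min, v_min = uv.min(axis=0)
    u_max, v_max = uv.max(axis=0)
    return u_min, v_min, u_max, v_max                   # bounding box of the image region

# Toy usage: identity extrinsics, a simple pinhole intrinsic matrix.
K = np.array([[800.0, 0.0, 640.0], [0.0, 800.0, 360.0], [0.0, 0.0, 1.0]])
pts = np.array([[0.2, -0.1, 10.0], [0.5, 0.3, 10.5], [-0.1, 0.0, 9.8]])
print(project_obstacle(pts, np.eye(4), K))
```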
As per Claim 2, the combination of Christie, Saranin, and Hölzel teaches or suggests all limitations of Claim 1. Christie further discloses wherein indicating whether an obstacle is not a pedestrian is based on a color query (as per "an object is classified by accessing a predefined classification based on one or more of the detected object's size, shape, and colour (from the camera array)" in P20L5-10; as per "selecting a classification comprises extracting the classification from a look up table using one or more of: a calculated size, shape, or colour of the object." in P30L1-10). Christie fails to expressly disclose wherein a label indicating whether an obstacle, of the one or more detected obstacles, is not a pedestrian is based on a shape query. See Claim 1 for teachings of Saranin. Saranin further discloses wherein a label indicating whether an obstacle, of the one or more detected obstacles, is not a pedestrian is based on a shape query (as per "Clusters with a height above 2.0-2.5 meters are unlikely to be associated with pedestrians. Clusters over 1 meter in length are unlikely to be associated with pedestrians. Clusters with a length-to-width ratio above 4.0 often tend to be associated with buildings and are unlikely associated with pedestrians. Clusters with high cylinder convolution score are likely to be associated with pedestrians" in ¶186; as per "A point label refers to an indication or description associated with a LiDAR data point that includes information or data particular to that LiDAR data point. For instance, a point label may include an object class identifier (e.g., a vehicle class identifier, a pedestrian class identifier, a tree class identifier, and/or a building class identifier), a color (e.g., an RGB value), at least one unique identifier (e.g., for the object, corresponding image pixel(s), and/or LiDAR data point), and/or an object instance identifier (e.g., if there are many objects of the same class detected in an image)" in ¶115; as per "The pixel information includes, but is not limited to, a color, an object type and an object instance." in ¶100). In this way, Saranin operates to improve LVS based algorithm(s) that eliminate or minimize the merging of close objects (¶136). Like Christie and Hölzel, Saranin is concerned with autonomous driving systems. It would have been obvious for one of ordinary skill in the art before the effective filing date to have modified the system(s) of Christie and Hölzel with the camera-lidar fused object detection of Saranin to enable another standard means of labeling and classifying objects based on color and shape information (¶115).

As per Claim 3, the combination of Christie, Saranin, and Hölzel teaches or suggests all limitations of Claim 1. Christie further discloses using the processor: for each of the one or more detected obstacles, based on the label of the obstacle, determining one or more vehicle actions for the vehicle to perform (as per "The scaled value is received by the APT platform and used to output an appropriate response or warning based on a sliding scale." in P23L1-5); and causing the vehicle to perform the one or more actions (as per "where risk scores < 60 display NORMAL, 60-80 display CAUTION and > 80 display WARNING." in P23L4-8).

As per Claim 4, the combination of Christie, Saranin, and Hölzel teaches or suggests all limitations of Claim 3. Christie further discloses wherein the one or more actions comprises one or more of: increasing a speed of the vehicle (as per "and if the risk classification meets a predetermined criterion, switching to a third mode wherein a vehicle safety sensor system controls the vehicle to avoid the detected object." in P8L25-35); decreasing a speed of the vehicle (as per "giving an early warning to drivers to be cautious, and enabling them to safely slow down without causing any undue risk." in P27L14-20); stopping the vehicle (as per "The 5D perception engine may then take preventative action based on this determination - for example warning the driver early of the potential danger or by employing automatic braking or other measures to slow or stop the vehicle." in P23L20-25); and adjusting a trajectory of the vehicle (as per "and if the risk classification meets a predetermined criterion, switching to a third mode wherein a vehicle safety sensor system controls the vehicle to avoid the detected object." in P8L25-35).

As per Claim 7, the combination of Christie, Saranin, and Hölzel teaches or suggests all limitations of Claim 1. Christie further discloses, for each of the one or more detected obstacles (as per "Semantic segmentation (which is also referred to as Infrastructure modelling), while not being a safety critical function, enables classification of the objects which appear in the immediate surroundings of the vehicle. For example, semantic segmentation allows the system to classify objects as road kerbs, fencing, side walls, road markings, signage etc." in P15L1-10), based on the label of the obstacle, determining whether or not the obstacle is an obstacle that the vehicle can hit (as per "Other exemplary scenarios where a suitably trained artificial intelligence or machine learning engine can aid the correct classification of objects include: • Low detection road boundaries - some road boundaries, such as flat concrete walls, tend to 'reflect' detection signals away from the radar, unless the wall is perpendicular to the radar. • Safe to drive over debris - early identification of objects that are low and narrow enough to pass between the wheels, or can be safely driven over such as plastic bottles etc. • Cyclists, motorbikes and pedestrians - the consideration of situations where these road users are most vulnerable e.g., cyclists between lanes of traffic etc" in P25L15-30; as per "APT Lite system of the present disclosure makes an assessment based on one or more of the following weighted parameters. Preferably, the weighting (or priority) applied to a parameter aligns with the order of the parameter in the list - i.e. the priority of a parameter preferably decreases the lower it appears in the following list." in P21L15-33).

Claims 8 and 15 are rejected using the same rationale, mutatis mutandis, applied to Claim 1 above. Claims 9 and 16 are rejected using the same rationale applied to Claim 2 above. Claims 10 and 17 are rejected using the same rationale applied to Claim 3 above. Claims 11 and 18 are rejected using the same rationale applied to Claim 4 above. Claims 14 and 20 are rejected using the same rationale applied to Claim 7 above.

Claims 5-6, 12-13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Christie (WO Pub. No. 2023025777) in view of Saranin (US Pub. No. 20220129684), in view of Hölzel (US Pub. No. 20220269900), and further in view of Singh (US Pub. No. 20220388535).

As per Claim 5, the combination of Christie, Saranin, and Hölzel teaches or suggests all limitations of Claim 1. Christie further discloses: generating a patch for each of the one or more detected obstacles (as per "When a number of points are detected in the same region (angle, radial range and velocity) of each sensor, they are clustered into blocks. The spatial extent of the blocks in each axis can be interpreted as a vehicle type" in P19L20-25); and projecting the LiDAR point cloud into the image (as per "This optical data may be fused with the radar data 8130. An AI or machine learning engine can be used to fuse the radar data and the optical data." in P14L20-30; as per "the LIDAR is used to scan a wide area and the data from the LIDAR is fused… with point cloud data from the fast, long-range radar and the camera array before being analysed in the infrastructure model 8210." in P15L30-34 & P16L1-5), wherein: each patch represents a region of the image for each of the one or more detected obstacles (as per "In the first scanning mode, the infrastructure point cloud is 'fused' with available LIDAR detections. The first mode is used to identify potential hazards. To put it differently, the LIDAR is used to scan a wide area and the data from the LIDAR is fused with point cloud data from the fast, long-range radar and the camera array before being analysed in the infrastructure model 8210." in P15L30-34 & P16L1-10; as per Fig. 7); and each patch forms a bounding box on the image (as per Fig. 7).

Christie, Saranin, and Hölzel fail to expressly disclose cropping the region of the image within the bounding box, forming a cropped image. Singh discloses image annotation for deep neural networks, further comprising cropping the region of the image within the bounding box, forming a cropped image (as per "The cropped second image can be transformed to a higher spatial resolution by super resolution. The cropped second image can be transformed to include motion blurring. The cropped second image can be transformed by zooming the cropped second image. The cropped second image can be transformed to include hierarchical pyramid processing to obtain image data including the second object at a plurality of spatial resolutions." in ¶17; as per "At block 720 the cropped image 600 can be modified using super resolution, blurring, zooming, or hierarchical pyramid processing." in ¶59). In this way, Singh operates to improve annotation of images by modifying the cropped image using super resolution, blurring, zooming, and hierarchical pyramid cropping to improve the training dataset (as per ¶15). Like Christie, Saranin, and Hölzel, Singh is concerned with neural networks. It would have been obvious for one of ordinary skill in the art before the effective filing date to have modified the system(s) of Christie, Saranin, and Hölzel with the image annotation for deep neural networks of Singh to enable another standard means of cropping and resizing the image within the bounding box (¶59). Such modification also improves the operation of neural networks (as per ¶40).

As per Claim 6, the combination of Christie, Saranin, Hölzel, and Singh teaches or suggests all limitations of Claim 1. Christie further discloses: performing the color query comprises performing the color query on the resized image (as per "an object is classified by accessing a predefined classification based on one or more of the detected object's size, shape, and colour (from the camera array)." in P20L4-10); and performing the shape query comprises performing a shape query on the resized image (as per the same passage in P20L4-10). Christie fails to expressly disclose resizing the cropped image, forming a resized image. See Claim 5 for teachings of Singh. Singh further discloses resizing the cropped image, forming a resized image (as per "The cropped second image can be transformed to a higher spatial resolution by super resolution. The cropped second image can be transformed to include motion blurring. The cropped second image can be transformed by zooming the cropped second image. The cropped second image can be transformed to include hierarchical pyramid processing to obtain image data including the second object at a plurality of spatial resolutions." in ¶17; as per "At block 720 the cropped image 600 can be modified using super resolution, blurring, zooming, or hierarchical pyramid processing." in ¶59). In this way, Singh operates to improve annotation of images by modifying the cropped image using super resolution, blurring, zooming, and hierarchical pyramid cropping to improve the training dataset (as per ¶15). Like Christie, Saranin, and Hölzel, Singh is concerned with neural networks. It would have been obvious for one of ordinary skill in the art before the effective filing date to have modified the system(s) of Christie, Saranin, and Hölzel with the image annotation for deep neural networks of Singh to enable another standard means of cropping and resizing the image within the bounding box (¶59). Such modification also improves the operation of neural networks (as per ¶40).

Claim 12 is rejected using the same rationale, mutatis mutandis, applied to Claim 5 above. Claim 13 is rejected using the same rationale applied to Claim 6 above.

As per Claim 19, the combination of Christie, Saranin, and Hölzel teaches or suggests all limitations of Claim 15. Christie further discloses wherein the programming instructions are further configured, when executed by the processor, to cause the processor to: generate a patch for each of the one or more detected obstacles (as per "When a number of points are detected in the same region (angle, radial range and velocity) of each sensor, they are clustered into blocks. The spatial extent of the blocks in each axis can be interpreted as a vehicle type" in P19L20-25); and project the LiDAR point cloud into the image (as per "This optical data may be fused with the radar data 8130. An AI or machine learning engine can be used to fuse the radar data and the optical data." in P14L20-30; as per "the LIDAR is used to scan a wide area and the data from the LIDAR is fused… with point cloud data from the fast, long-range radar and the camera array before being analysed in the infrastructure model 8210." in P15L30-34 & P16L1-5), wherein: each patch represents a region of the image for each of the one or more detected obstacles (as per "In the first scanning mode, the infrastructure point cloud is 'fused' with available LIDAR detections. The first mode is used to identify potential hazards. To put it differently, the LIDAR is used to scan a wide area and the data from the LIDAR is fused with point cloud data from the fast, long-range radar and the camera array before being analysed in the infrastructure model 8210." in P15L30-34 & P16L1-10; as per Fig. 7); each patch forms a bounding box on the image (as per Fig. 7); the performing the color query comprises performing the color query on the resized image (as per "an object is classified by accessing a predefined classification based on one or more of the detected object's size, shape, and colour (from the camera array)." in P20L4-10); and the performing the shape query comprises performing a shape query on the resized image (as per the same passage in P20L4-10).

Christie, Saranin, and Hölzel fail to expressly disclose: crop the region of the image within the bounding box, forming a cropped image; and resize the cropped image, forming a resized image. Singh discloses image annotation for deep neural networks, further comprising: crop the region of the image within the bounding box, forming a cropped image (as per "An image 600 can be cropped by starting with an image 500 from FIG. 5 that includes a bounding box 508 and object 502, for example. All the image data outside of the bounding box 508 is deleted, leaving only image 600 including object 502. The cropped image 600 can be input to a DNN 200 to detect the object 502 included in the cropped image." in ¶45); and resize the cropped image, forming a resized image (as per "The cropped second image can be transformed to a higher spatial resolution by super resolution. The cropped second image can be transformed to include motion blurring. The cropped second image can be transformed by zooming the cropped second image. The cropped second image can be transformed to include hierarchical pyramid processing to obtain image data including the second object at a plurality of spatial resolutions." in ¶17; as per "At block 720 the cropped image 600 can be modified using super resolution, blurring, zooming, or hierarchical pyramid processing." in ¶59). In this way, Singh operates to improve annotation of images by modifying the cropped image using super resolution, blurring, zooming, and hierarchical pyramid cropping to improve the training dataset (as per ¶15). Like Christie, Saranin, and Hölzel, Singh is concerned with neural networks. It would have been obvious for one of ordinary skill in the art before the effective filing date to have modified the system(s) of Christie, Saranin, and Hölzel with the image annotation for deep neural networks of Singh to enable another standard means of cropping and resizing the image within the bounding box (¶59). Such modification also improves the operation of neural networks (as per ¶40).
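For context on the crop-and-resize step attributed to Singh, here is a dependency-free sketch: slice the pixels inside the 2D bounding box, then resample to a fixed size. The nearest-neighbor resize is an editorial stand-in chosen to keep the sketch self-contained; Singh itself describes super resolution, blurring, zooming, and pyramid processing rather than this exact operation:

```python
# Sketch of cropping an image region within a bounding box and resizing
# the crop. Names and the resize method are illustrative assumptions.
import numpy as np

def crop_and_resize(image: np.ndarray, box: tuple, out_hw=(64, 64)) -> np.ndarray:
    """box = (u_min, v_min, u_max, v_max) in pixels; returns a fixed-size patch."""
    u0, v0, u1, v1 = (int(round(x)) for x in box)
    patch = image[max(v0, 0):v1, max(u0, 0):u1]   # crop: rows are v, columns are u
    h, w = patch.shape[:2]
    rows = (np.arange(out_hw[0]) * h / out_hw[0]).astype(int)
    cols = (np.arange(out_hw[1]) * w / out_hw[1]).astype(int)
    return patch[rows][:, cols]                   # nearest-neighbor resize

img = np.zeros((720, 1280, 3), dtype=np.uint8)    # placeholder camera frame
patch = crop_and_resize(img, (600, 300, 700, 420))
print(patch.shape)                                # (64, 64, 3)
```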
Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Brunner (US Pub. No. 20200219264) discloses using light detection and ranging (lidar) to train camera and imaging radar deep learning networks.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to TYLER R ROBARGE whose telephone number is (703) 756-5872. The examiner can normally be reached Monday - Friday, 8:00 am - 5:00 pm EST. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Ramon Mercado, can be reached at (571) 270-5744. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/T.R.R./
Examiner, Art Unit 3658

/Ramon A. Mercado/
Supervisory Patent Examiner, Art Unit 3658

Prosecution Timeline

Mar 06, 2023 — Application Filed
Jun 09, 2025 — Non-Final Rejection (§103)
Sep 15, 2025 — Response Filed
Dec 02, 2025 — Final Rejection (§103)
Feb 09, 2026 — Request for Continued Examination
Feb 28, 2026 — Response after Non-Final Action
Mar 06, 2026 — Non-Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12583117
WORKPIECE PROCESSING APPARATUS
2y 5m to grant · Granted Mar 24, 2026
Patent 12552029
CONTROLLING MOVEMENT TO AVOID RESONANCE
2y 5m to grant · Granted Feb 17, 2026
Patent 12485922
SYSTEM AND METHOD FOR MODIFYING THE LONGITUDINAL POSITION OF A VEHICLE WITH RESPECT TO ANOTHER VEHICLE TO INCREASE PRIVACY
2y 5m to grant · Granted Dec 02, 2025
Patent 12459129
METHOD FOR MOTION OPTIMIZED DEFECT INSPECTION BY A ROBOTIC ARM USING PRIOR KNOWLEDGE FROM PLM AND MAINTENANCE SYSTEMS
2y 5m to grant · Granted Nov 04, 2025
Patent 12456343
SYSTEMS AND METHODS FOR SUPPLYING ENERGY TO AN AUTONOMOUS VEHICLE VIA A VIRTUAL INTERFACE
2y 5m to grant · Granted Oct 28, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 77%
With Interview: 86% (+9.1%)
Median Time to Grant: 2y 8m
PTA Risk: High
Based on 22 resolved cases by this examiner. Grant probability derived from career allow rate.
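These projection figures reduce to simple arithmetic over the examiner's career data shown above; a short sketch of the presumed derivation (treating the interview lift as additive is an assumption about the dashboard's model, not a documented formula):

```python
# Presumed derivation of the dashboard's headline numbers from the
# 22 resolved cases; the additive interview lift is an assumption.
granted, resolved = 17, 22
allow_rate = granted / resolved                  # 17/22 ≈ 0.773 -> shown as 77%
interview_lift = 0.091                           # +9.1 percentage points
with_interview = allow_rate + interview_lift     # ≈ 0.864 -> shown as 86%
print(f"{allow_rate:.1%}  {with_interview:.1%}")  # 77.3%  86.4%
```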
