DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Remarks
The claims being considered in this application are those submitted on 09/05/2024. Claims 1-20 are pending.
Priority
The applicant’s claim to priority of provisional application 63/590,899, filed on 10/17/2023, is acknowledged.
Information Disclosure Statement
The information disclosure statements filed on 01/14/2025 and 04/10/2025 have been annotated and considered.
Claim Interpretation
The following is a quotation of 35 U.S.C. 112(f):
(f) Element in Claim for a Combination. – An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph:
An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof.
The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked.
As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph:
(A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function;
(B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as “configured to” or “so that”; and
(C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function.
Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function.
Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function.
Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action.
Claim 20 recites:
means for obtaining image information from at least one camera module disposed on a vehicle;
means for obtaining target information from at least one radar module disposed on the vehicle;
means for generating a first detection representation with a first signal path based on the image information and the target information;
means for generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and
means for outputting the first detection representation and the second detection representation.
Structure and support for these limitations of Claim 20 that invoke 35 U.S.C. 112(f) were found in at least ¶0062-¶0066 of the specification.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 8-12, 15, and 20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Nilsson et al. (US 20220297706 A1).
Regarding Claim 1, Nilsson discloses:
An apparatus, comprising: at least one memory; at least one camera module; at least one radar module; at least one processor communicatively coupled to the at least one memory, the at least one camera module, and the at least one radar module, and configured to: (See at least Figure 15C via Data Store(s); Camera Modules: 1568, 1570, 1572, 1574, 1598; Radar Sensor(s) 1560; and Processor(s))
obtain image information from the at least one camera module disposed on a vehicle; obtain target information from the at least one radar module disposed on the vehicle; (See at least ¶0028 via "The processes of executing the systems and architectures described herein may include generating and/or receiving sensor data" and ¶0029 via "As such, the sensor data may include, without limitation, sensor data from any of the sensors of the ego-machine 1500 including, for example and with reference to FIGS. 15A-15C, RADAR sensor(s) 1560, ultrasonic sensor(s) 1562, LIDAR sensor(s) 1564, stereo camera(s) 1568, wide-view camera(s) 1570 (e.g., fisheye cameras), infrared camera(s) 1572, surround camera(s) 1574 (e.g., 360 degree cameras), long-range and/or mid-range camera(s) 1578,…")
generate a first detection representation with a first signal path based on the image information and the target information; generate a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and (See at least Figure 2 via sensors 202A and 202B, and the first signal path corresponding to the first detection representation: rule-based sensor fusion, and Figure 3 via sensors 202A and 202B, and the second signal path corresponding to the second detection representation: learned sensor fusion; **Wherein sensors 202A and 202B include camera(s) and radar(s))
[Embedded image: media_image1.png, 400 x 488, greyscale]
output the first detection representation and the second detection representation (See at least Figures 2-3 and ¶0035 via "For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output").
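For illustration of the mapped two-path architecture only (this sketch is not part of the claim mapping or the Nilsson reference), the following minimal Python example shows camera image information and radar target information being processed by two different signal paths, a rule-based path and a learned path, each producing its own detection representation, with both representations output. All function, variable, and field names are hypothetical assumptions.

# Hypothetical sketch of two parallel detection signal paths (illustrative only).

def rule_based_path(image_info, target_info):
    """First signal path: a toy rule-based fusion of camera and radar data."""
    detections = []
    for target in target_info:
        # Keep radar targets that roughly align with an image detection (toy rule).
        if any(abs(target["azimuth"] - det["azimuth"]) < 2.0 for det in image_info):
            detections.append({"source": "rule_based", **target})
    return detections

def learned_path(image_info, target_info):
    """Second signal path: stand-in for a learned (e.g., neural-network) fusion."""
    # A real system would run a trained model; here each target is simply scored.
    return [{"source": "learned", "score": 0.9, **target} for target in target_info]

def generate_detection_representations(image_info, target_info):
    first = rule_based_path(image_info, target_info)   # first detection representation
    second = learned_path(image_info, target_info)     # second detection representation
    return first, second                               # output both representations

if __name__ == "__main__":
    image_info = [{"azimuth": 1.0, "label": "vehicle"}]
    target_info = [{"azimuth": 0.5, "range_m": 30.0, "velocity_mps": 5.0}]
    first, second = generate_detection_representations(image_info, target_info)
    print(first)
    print(second)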
Regarding Claim 15, Nilsson discloses:
A method for generating object representations with multiple signal paths, comprising: (See at least ¶0024 via "Systems and methods are disclosed related to combining rule-based and learned sensor fusion for autonomous machine applications" and Figures 13-14)
obtaining image information from at least one camera module disposed on a vehicle; obtaining target information from at least one radar module disposed on the vehicle; (See at least ¶0028 via "The processes of executing the systems and architectures described herein may include generating and/or receiving sensor data" and ¶0029 via "As such, the sensor data may include, without limitation, sensor data from any of the sensors of the ego-machine 1500 including, for example and with reference to FIGS. 15A-15C, RADAR sensor(s) 1560, ultrasonic sensor(s) 1562, LIDAR sensor(s) 1564, stereo camera(s) 1568, wide-view camera(s) 1570 (e.g., fisheye cameras), infrared camera(s) 1572, surround camera(s) 1574 (e.g., 360 degree cameras), long-range and/or mid-range camera(s) 1578,…")
generating a first detection representation with a first signal path based on the image information and the target information; generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and (See at least Figure 2 via sensors 202A and 202B, and the first signal path corresponding to the first detection representation: rule-based sensor fusion, and Figure 3 via sensors 202A and 202B, and the second signal path corresponding to the second detection representation: learned sensor fusion; **Wherein sensors 202A and 202B include camera(s) and radar(s))
[Embedded image: media_image1.png, 400 x 488, greyscale]
outputting the first detection representation and the second detection representation (See at least Figures 2-3 and ¶0035 via "For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output").
Regarding Claim 20, Nilsson discloses:
An apparatus for generating object representations with multiple signal paths, comprising: (See at least Figures 15A-15C)
means for obtaining image information from at least one camera module disposed on a vehicle; means for obtaining target information from at least one radar module disposed on the vehicle; (See at least ¶0028 via "The processes of executing the systems and architectures described herein may include generating and/or receiving sensor data" and ¶0029 via "As such, the sensor data may include, without limitation, sensor data from any of the sensors of the ego-machine 1500 including, for example and with reference to FIGS. 15A-15C, RADAR sensor(s) 1560, ultrasonic sensor(s) 1562, LIDAR sensor(s) 1564, stereo camera(s) 1568, wide-view camera(s) 1570 (e.g., fisheye cameras), infrared camera(s) 1572, surround camera(s) 1574 (e.g., 360 degree cameras), long-range and/or mid-range camera(s) 1578,…")
means for generating a first detection representation with a first signal path based on the image information and the target information; means for generating a second detection representation with a second signal path based on the image information and the target information, wherein the second signal path is different than the first signal path; and (See at least Figure 2 via sensors 202A and 202B, and the first signal path corresponding to the first detection representation: rule-based sensor fusion, and Figure 3 via sensors 202A and 202B, and the second signal path corresponding to the second detection representation: learned sensor fusion; **Wherein sensors 202A and 202B include camera(s) and radar(s))
[Embedded image: media_image1.png, 400 x 488, greyscale]
means for outputting the first detection representation and the second detection representation (See at least Figures 2-3 and ¶0035 via "For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output").
Regarding Claim 8, Nilsson discloses the apparatus of Claim 1.
Furthermore, Nilsson discloses: wherein the at least one processor is further configured to: receive the first detection representation via the first signal path and the second detection representation via the second signal path; (See at least Figures 2-3 which illustrate the first and second signal paths and ¶0035 via "…For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output.")
generate one or more object lists based at least in part on the first detection representation and the second detection representation; and (See at least ¶0035 via "For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output." **Wherein the fused output, being derived from the first and second detection representations that represent detected objects, is a collection of detected objects with their corresponding attributes, and therefore is an object list. Furthermore, see at least ¶0113 via "The neural network may take as its input at least some subset of parameters, such as bounding box dimensions, ground plane estimate obtained (e.g. from another subsystem), inertial measurement unit (IMU) sensor 1566 output that correlates with the vehicle 1500 orientation, distance, 3D location estimates of the object obtained from the neural network and/or other sensors (e.g., LIDAR sensor(s) 1564 or RADAR sensor(s) 1560), among others.")
output the one or more object lists (See at least ¶0073 which describes the outputting of the detected objects (the object list(s)) via: "One or more of the controller(s) 1536 may receive inputs (e.g., represented by input data) from an instrument cluster 1532 of the vehicle 1500 and provide outputs (e.g., represented by output data, display data, etc.) via a human-machine interface (HMI) display 1534, an audible annunciator, a loudspeaker, and/or via other components of the vehicle 1500. The outputs may include information such as vehicle velocity, speed, time, map data (e.g., the HD map 1522 of FIG. 15C), location data (e.g., the vehicle's 1500 location, such as on a map), direction, location of other vehicles (e.g., an occupancy grid), information about objects and status of objects as perceived by the controller(s) 1536, etc. For example, the HMI display 1534 may display information about the presence of one or more objects (e.g., a street sign, caution sign, traffic light changing, etc.), and/or information about driving maneuvers the vehicle has made, is making, or will make (e.g., changing lanes now, taking exit 34B in two miles, etc.).").
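For illustration of the Claim 8 mapping only (not part of the record), the following is a minimal Python sketch of merging the first and second detection representations into an object list that is then output; the matching rule and all names (e.g., range_m, match_tol_m) are hypothetical assumptions.

# Hypothetical sketch: fuse two detection representations into an object list.

def generate_object_lists(first_repr, second_repr, match_tol_m=2.0):
    """Merge detections received from both signal paths into one object list."""
    object_list = list(first_repr)
    for det in second_repr:
        # Add a second-path detection only if no first-path detection is nearby.
        if not any(abs(det["range_m"] - obj["range_m"]) < match_tol_m for obj in object_list):
            object_list.append(det)
    return [object_list]   # output the one or more object lists

if __name__ == "__main__":
    first = [{"range_m": 30.0, "velocity_mps": 5.0}]
    second = [{"range_m": 30.5, "velocity_mps": 5.1}, {"range_m": 60.0, "velocity_mps": 0.0}]
    print(generate_object_lists(first, second))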
Regarding Claim 9, Nilsson discloses the apparatus of Claim 8.
Furthermore, Nilsson discloses: wherein the one or more object lists includes an object track list indicating a location and velocity of an object (See at least ¶0035, ¶0073, via "…location of other vehicles (e.g., an occupancy grid), information about objects and status of objects as perceived by the controller(s) 1536, etc…", and ¶0081 via "An alternative stereo camera(s) 1568 may include a compact stereo vision sensor(s) that may include two camera lenses (one each on the left and right) and an image processing chip that may measure the distance from the vehicle to the target object and use the generated information (e.g., metadata) to activate the autonomous emergency braking and lane departure warning functions.", ¶0113 via "…3D location estimates of the object…", and also ¶0139 via "The RADAR sensor(s) 1560 may help in distinguishing between static and moving objects, and may be used by ADAS systems for emergency brake assist and forward collision warning." **Wherein determining that an object is static is determining that the velocity of the object is 0**)
Regarding Claim 10, Nilsson discloses the apparatus of Claim 9.
Furthermore, Nilsson discloses: wherein the object track list indicates a shape of the object (See at least ¶0113 via "…bounding box dimensions…" as well as ¶0052 via "…detecting and/or classifying vehicles and pedestrians…")
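For illustration of Claims 9-10 only (not part of the record), the following is a minimal Python sketch of an object track list entry indicating a location, a velocity (with 0.0 denoting a static object), and a shape expressed as bounding-box dimensions; all field names are hypothetical assumptions.

from dataclasses import dataclass

# Hypothetical object-track-list entry (illustrative only).
@dataclass
class ObjectTrack:
    x_m: float            # location, longitudinal
    y_m: float            # location, lateral
    velocity_mps: float   # 0.0 indicates a static object
    length_m: float       # shape: bounding-box length
    width_m: float        # shape: bounding-box width

if __name__ == "__main__":
    object_track_list = [
        ObjectTrack(x_m=30.0, y_m=-1.5, velocity_mps=12.3, length_m=4.5, width_m=1.8),
        ObjectTrack(x_m=12.0, y_m=3.0, velocity_mps=0.0, length_m=0.5, width_m=0.5),  # static object
    ]
    print(object_track_list)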
Regarding Claim 11, Nilsson discloses the apparatus of Claim 9.
Furthermore, Nilsson discloses: wherein the one or more object lists includes static object information (See at least ¶0139 via "The RADAR sensor(s) 1560 may help in distinguishing between static and moving objects, and may be used by ADAS systems for emergency brake assist and forward collision warning." and ¶0073 via "For example, the HMI display 1534 may display information about the presence of one or more objects (e.g., a street sign, caution sign, traffic light changing, etc.)").
Regarding Claim 12, Nilsson discloses the apparatus of Claim 8.
Furthermore, Nilsson discloses: wherein the at least one processor is further configured to output the one or more object lists to an environment model (See at least ¶0073 via "…provide outputs (e.g., represented by output data, display data, etc.) via a human-machine interface (HMI) display 1534, an audible annunciator, a loudspeaker, and/or via other components of the vehicle 1500. The outputs may include information such as vehicle velocity, speed, time, map data (e.g., the HD map 1522 of FIG. 15C), location data (e.g., the vehicle's 1500 location, such as on a map), direction, location of other vehicles (e.g., an occupancy grid), information about objects and status of objects as perceived by the controller(s) 1536, etc…").
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2-5, 7, and 16-19 are rejected under 35 U.S.C. 103 as being unpatentable over Nilsson et al. (US 20220297706 A1) in view of Zeng et al. (US 20140035775 A1).
Regarding Claim 2 and Claim 16 respectively, Nilsson discloses the apparatus of Claim 1 and the method of Claim 15.
Furthermore, Nilsson discloses: wherein the first detection representation includes a parametric representation for a target object, and the second detection representation (See at least Figures 2-3 and ¶0002 via "The processing may include detection (e.g., detecting objects), classification (e.g., classifying detected objects), tracking, another task, or a combination thereof. For example, a vehicle may detect other vehicles, pedestrians, intersections, wait conditions, etc., and/or may classify vehicles by type, intersections by type corresponding to associated wait conditions, etc.").
However, although Nilsson discloses non-parametric representations, such as an occupancy grid in at least ¶0073, Nilsson does not explicitly disclose this non-parametric information as a detection representation.
Nevertheless, Zeng--who is directed towards fusion of obstacle detection using radar and camera--discloses: detection representation includes a non-parametric representation for the target object (See at least ¶0007 via " Objects are captured in a field of view by an imaging system. Objects in a substantially same field of view a radar device are sensed. The substantially same field of view sensed by the radar device is partitioned into an occupancy grid having a plurality of observation cells" as well as ¶0006 via "The fusion module extracts features from each corresponding cell using sensor data from the radar device and imaging data from the imaging system. A primary classifier determines whether an extracted feature extracted from a respective observation cell is an obstacle.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nilsson to include the non-parametric representation as a detection representation, as taught by Zeng, in order to provide a complementary detection representation: "The fusion module extracts features from each corresponding cell using sensor data from the radar device and imaging data from the imaging system. A primary classifier determines whether an extracted feature extracted from a respective observation cell is an obstacle" [Zeng ¶0006], i.e., an alternative detection method that yields predictable results.
Regarding Claim 3 and Claim 17 respectively, Modified Nilsson discloses the apparatus of Claim 2 and the method of Claim 16.
Furthermore, Nilsson discloses: wherein the parametric representation for the target object includes coordinate information for the target object and dimension information for the target object (See at least ¶0081 via "An alternative stereo camera(s) 1568 may include a compact stereo vision sensor(s) that may include two camera lenses (one each on the left and right) and an image processing chip that may measure the distance from the vehicle to the target object…" as well as ¶0113 via "The neural network may take as its input at least some subset of parameters, such as bounding box dimensions, ground plane estimate obtained (e.g. from another subsystem), inertial measurement unit (IMU) sensor 1566 output that correlates with the vehicle 1500 orientation, distance, 3D location estimates of the object obtained from the neural network and/or other sensors (e.g., LIDAR sensor(s) 1564 or RADAR sensor(s) 1560), among others.").
Regarding Claim 4 and Claim 18 respectively, Modified Nilsson discloses the apparatus of Claim 2 and the method of Claim 16.
Furthermore, Zeng discloses the non-parametric representation, and further discloses: wherein the non-parametric representation for the target object is an occupancy map (See at least ¶0007 via " Objects are captured in a field of view by an imaging system. Objects in a substantially same field of view a radar device are sensed. The substantially same field of view sensed by the radar device is partitioned into an occupancy grid having a plurality of observation cells").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nilsson to include the non-parametric representation of an occupancy map/grid as a detection representation, as taught by Zeng, in order to provide a complementary detection representation: "The fusion module extracts features from each corresponding cell using sensor data from the radar device and imaging data from the imaging system. A primary classifier determines whether an extracted feature extracted from a respective observation cell is an obstacle" [Zeng ¶0006], i.e., an alternative detection method that yields predictable results.
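For illustration of the occupancy-map concept only (not part of the record, and not Zeng's actual implementation), the following is a minimal Python sketch in which the sensed field of view is partitioned into grid cells and a cell is marked occupied when a radar target falls inside it; grid dimensions and all names are hypothetical assumptions.

import numpy as np

# Hypothetical occupancy map (illustrative only): a non-parametric representation
# in which per-cell occupancy stands in for Zeng-style per-cell classification.

def build_occupancy_map(radar_targets_xy_m, grid_size=(20, 20), cell_m=1.0):
    grid = np.zeros(grid_size, dtype=np.uint8)
    for x_m, y_m in radar_targets_xy_m:
        row, col = int(y_m // cell_m), int(x_m // cell_m)
        if 0 <= row < grid_size[0] and 0 <= col < grid_size[1]:
            grid[row, col] = 1   # cell marked as occupied
    return grid

if __name__ == "__main__":
    occupancy = build_occupancy_map([(3.2, 4.7), (10.1, 2.3)])
    print(int(occupancy.sum()), "occupied cells")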
Regarding Claim 5 and Claim 19 respectively, Modified Nilsson discloses the apparatus of Claim 2 and the method of Claim 16.
Furthermore, Nilsson discloses: wherein the first signal path includes at least a first machine learning model and the at least one processor is further configured to generate the parametric representation based at least in part on the image information and the target information, and (See at least Figure 3 which illustrates learned sensor fusion (LSF) that is implemented via neural networks/machine learning. Also see ¶0029 via the sensors (cameras/radars), and ¶0113)
[Embedded image: media_image2.png, 204 x 508, greyscale]
the second signal path (See at least Figure 2 which illustrates another signal path).
However, Nilsson does not explicitly disclose the non-parametric representation.
Nevertheless, Zeng discloses the non-parametric representation (See at least ¶0023 via "The occupancy grid 40 is projected such that each observation cell geographically located in the radar data corresponds to the same geographical location in the capture image. The logistic classifier is used to determine whether a feature in a respective cell may be an object or no object. The logistic classifier may also be trained for not only determining whether a feature in each cell is an object, but may be used to further distinguish the object as a pedestrian, vehicle, or other obstacle. The logistic classifier determines a posterior probability of each respective cell being occupied by cooperatively analyzing both the radar parameters of the cell and the image parameters of the cell.").
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nilsson in view of Zeng's machine learning model in order to help characterize objects/obstacles: "The classifier, (e.g., support vector machine or other type of classifier) can be used for classifying whether the respective feature is an object in the captured data" [Zeng ¶0019], and to add a complementary detection representation based on occupancy grid detection: "The fusion module extracts features from each corresponding cell using sensor data from the radar device and imaging data from the imaging system. A primary classifier determines whether an extracted feature extracted from a respective observation cell is an obstacle" [Zeng ¶0006], thereby providing an alternative detection method that yields predictable results.
Regarding Claim 7, Modified Nilsson discloses the apparatus of Claim 5.
Furthermore, Nilsson discloses: wherein the first machine learning model utilizes at least a first backbone, and (See at least Figure 3 via Learned Sensor Fusion (LSF) via the neural network which utilizes a first backbone)
However, Nilsson does not explicitly disclose, but Zeng discloses: the second machine learning model utilizes at least a second backbone (See at least Figure 1 via Classifier 34 and ¶0019 via "The classifier, (e.g., support vector machine or other type of classifier) can be used for classifying whether the respective feature is an object in the captured data" and ¶0020 which utilizes a second backbone).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nilsson in view of Zeng's machine learning model that utilizes a backbone in order to help characterize objects/obstacles: "The classifier, (e.g., support vector machine or other type of classifier) can be used for classifying whether the respective feature is an object in the captured data" [Zeng ¶0019], and to add a complementary detection representation based on occupancy grid detection: "The fusion module extracts features from each corresponding cell using sensor data from the radar device and imaging data from the imaging system. A primary classifier determines whether an extracted feature extracted from a respective observation cell is an obstacle" [Zeng ¶0006], thereby providing an alternative detection method that yields predictable results.
Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over Nilsson et al. (US 20220297706 A1) and Zeng et al. (US 20140035775 A1) in view of Urtasun et al. (US 20210012116 A1).
Regarding Claim 6, Modified Nilsson discloses the apparatus of Claim 5.
Furthermore, Nilsson discloses the first machine learning model (See at least Figure 3 which illustrates learned sensor fusion (LSF) that is implemented via neural networks/machine learning) and Zeng discloses the second machine learning model (See at least ¶0023 via "logistic classifier").
However, Modified Nilsson does not explicitly disclose the common backbone.
Nevertheless, Urtasun--who is directed towards systems and methods for identifying unknown instances in a vehicle environment--discloses: wherein the first machine learning model and the second machine learning model utilize a common backbone (See at least ¶0026 via "The instance detection system can feed the sensor point cloud input data into machine-learned model(s) to identify one or more known and unknown instances within an environment. As described in further detail below, the machine-learned model(s) can include a backbone feature network (e.g., a machine-learned feature embedding model) with two branches. A first branch can include a machine-learned instance scoring model (e.g., a scoring head) configured to detect known instances (e.g., instances associated with known semantic labels) within an environment. A second branch can include a machine-learned category-agnostic instance model (e.g., an embedding head) configured to provide point embeddings for each point in the sensor point cloud input data. For example, the machine-learned category-agnostic instance model can branch into three outputs. A first output can include a class embedding (e.g., a BEV “thing” embedding) used as a prototypical instance embedding for known classes; a second output can include an instance embedding (e.g., an instance-aware point embedding); and a third output can include a background embedding (e.g., a “stuff” embedding) for known background classes." as well as ¶0087 via "…The three components can include a shared backbone feature extractor such as a machine-learned feature embedding model 205; a detection head such as a machine-learned instance scoring model 215 configured to detect anchors representing instances of known things; and/or an embedding head such as a machine-learned category-agnostic instance model 225 configured to predict instance-aware features for each point as well as prototypes for each object anchor and/or background class…")
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Modified Nilsson's two machine learning models, which produce complementary outputs, in view of Urtasun's machine learning models that utilize a common backbone, because sharing a backbone across the models' outputs is a predictable implementation that supports generating multiple outputs for later processing by the vehicle system.
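For illustration of the common-backbone concept only (not part of the record, and not Urtasun's actual network), the following is a minimal Python/NumPy sketch in which a single shared feature extractor feeds two heads, one producing a parametric (bounding-box) output and one producing a non-parametric (occupancy) output; all weights, shapes, and names are hypothetical assumptions.

import numpy as np

# Hypothetical sketch: two models sharing a common backbone (illustrative only).
rng = np.random.default_rng(0)
W_backbone = rng.normal(size=(8, 16))    # shared backbone weights
W_box_head = rng.normal(size=(16, 4))    # head 1: box parameters (x, y, w, h)
W_occ_head = rng.normal(size=(16, 100))  # head 2: 10 x 10 occupancy logits

def backbone(features):
    return np.tanh(features @ W_backbone)       # common feature embedding

def first_model(features):
    return backbone(features) @ W_box_head      # parametric representation

def second_model(features):
    logits = backbone(features) @ W_occ_head    # non-parametric representation
    return logits > 0

if __name__ == "__main__":
    sensor_features = rng.normal(size=(8,))
    print(first_model(sensor_features))
    print(int(second_model(sensor_features).sum()), "cells predicted occupied")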
Claims 13-14 are rejected under 35 U.S.C. 103 as being unpatentable over Nilsson et al. (US 20220297706 A1).
Regarding Claim 13, Nilsson discloses the apparatus of Claim 8.
Furthermore, Nilsson discloses: further comprising at least one lidar module disposed on the vehicle, wherein the at least one processor is further configured to: receive further target information from the at least one lidar module via a secondary path (See at least Fig 15C and ¶0028 via "The processes of executing the systems and architectures described herein may include generating and/or receiving sensor data from one or more sources (e.g., sensors of a vehicle 1500…" as well as ¶0029 via "As such, the sensor data may include, without limitation, sensor data from any of the sensors of the ego-machine 1500 including, for example and with reference to FIGS. 15A-15C, RADAR sensor(s) 1560, ultrasonic sensor(s) 1562, LIDAR sensor(s) 1564, stereo camera(s)…" **Wherein the LiDAR is a separate/distinct sensor module)
generate object detection information based on the further target information; and output the object detection information (See at least ¶0035 via "For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output." and ¶0073 via "…provide outputs (e.g., represented by output data, display data, etc.) via a human-machine interface (HMI) display 1534, an audible annunciator, a loudspeaker, and/or via other components of the vehicle 1500. The outputs may include information such as vehicle velocity, speed, time, map data (e.g., the HD map 1522 of FIG. 15C), location data (e.g., the vehicle's 1500 location, such as on a map), direction, location of other vehicles (e.g., an occupancy grid), information about objects and status of objects as perceived by the controller(s) 1536, etc…").
However, Nilsson does not explicitly disclose receiving the LiDAR target information through a signal path that is separate from the first and second detection/classification algorithms (paths). Nevertheless, Nilsson discloses utilizing multiple detection/classification algorithms that generate respective detection outputs based on sensor data in at least ¶0035. Nilsson further discloses that the sensor data can be received from camera(s), radar(s), and LiDAR sensor(s) in ¶0029/Figure 15C. Additionally, Nilsson discloses that the various sensor data can be processed by various components of the system in ¶0073 and ¶0152, and that the detection representations/outputs are then used for additional processing by other components of the vehicle system. In view of this, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nilsson to implement a predictable detection process as a separate path for the LiDAR target information, which complements Nilsson's teachings of using multiple detection processes with multiple data sources.
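For illustration of a separate LiDAR path only (not part of the record, and not Nilsson's implementation), the following is a minimal Python sketch in which LiDAR target information is processed by its own secondary path, independent of the camera/radar paths, to generate and output object detection information; the clustering rule and all names are hypothetical assumptions.

# Hypothetical secondary LiDAR path (illustrative only).

def lidar_secondary_path(lidar_points_xy_m, min_points=5):
    """Generate object detection information from LiDAR returns only."""
    clusters = {}
    for x_m, y_m in lidar_points_xy_m:
        key = (round(x_m), round(y_m))            # crude ~1 m grid clustering
        clusters.setdefault(key, []).append((x_m, y_m))
    # Output object detection information based on the further target information.
    return [{"center": key, "num_points": len(pts)}
            for key, pts in clusters.items() if len(pts) >= min_points]

if __name__ == "__main__":
    points = [(10.1, 2.0), (10.2, 2.1), (10.0, 1.9), (10.3, 2.2), (10.1, 2.2), (40.0, 5.0)]
    print(lidar_secondary_path(points))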
Regarding Claim 14, Modified Nilsson discloses the apparatus of Claim 13.
Furthermore, Nilsson discloses: wherein the at least one processor is further configured to: generate the object detection information based on the target information and the image information; and output the object detection information (See at least ¶0035 via "For example, a RADAR sensor may generate RADAR data and a LiDAR sensor may generate LiDAR data, and the RADAR data and the LiDAR data may be processed by a detection and/or classification algorithm to generate one or more detections. Late sensor fusion (e.g., as illustrated in FIGS. 2 and 3) may correspond to fusing together detections from two or more detection and/or classification algorithms—e.g., rule-based or learned processing components, such as DNNs, Kalman filters, etc.—to generate fused detections. For example, a first detection/classification algorithm may generate a first output and a second detection/classification algorithm may generate a second output, and the first output and the second output may be processed to generate a third, fused output." and Figure 15C as well as ¶0029 via "…the sensor data may include, without limitation, sensor data from any of the sensors of the ego-machine 1500 including, for example and with reference to FIGS. 15A-15C, RADAR sensor(s) 1560, ultrasonic sensor(s) 1562, LIDAR sensor(s) 1564, stereo camera(s) 1568, wide-view camera(s) 1570 (e.g., fisheye cameras), infrared camera(s) 1572…", and additionally ¶0073 disclosing the outputting via "…provide outputs (e.g., represented by output data, display data, etc.) via a human-machine interface (HMI) display 1534, an audible annunciator, a loudspeaker, and/or via other components of the vehicle 1500. The outputs may include information such as vehicle velocity, speed, time, map data (e.g., the HD map 1522 of FIG. 15C), location data (e.g., the vehicle's 1500 location, such as on a map), direction, location of other vehicles (e.g., an occupancy grid), information about objects and status of objects as perceived by the controller(s) 1536, etc…").
However, Nilsson does not explicitly disclose generating the object detection information based on the target information from both the radar and LiDAR sensors as well as the image information from a camera sensor. Nevertheless, Nilsson discloses utilizing multiple detection/classification algorithms that generate respective detection outputs based on sensor data in at least ¶0035. Nilsson further discloses that the sensor data can be received from camera(s), radar(s), and LiDAR sensor(s) in ¶0029/Figure 15C. Additionally, Nilsson discloses that the various sensor data can be processed by various components of the system in ¶0073 and ¶0152, and that the detection representations/outputs are then used for additional processing by other components of the vehicle system. In view of this, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Nilsson to implement a predictable detection process based on a combination of the radar, LiDAR, and camera sensors' target information and image information, consistent with Nilsson's teachings of using multiple sensor data types within the detection/classification algorithms.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KAYLA RENEE DOROS whose telephone number is (703)756-1415. The examiner can normally be reached M-F, 8:00-5:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Abby Lin can be reached on (571) 270-3976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/K.R.D./Examiner, Art Unit 3657
/ABBY LIN/Supervisory Patent Examiner, Art Unit 3657