Prosecution Insights
Last updated: April 19, 2026
Application No.: 18/756,419
Title: Cross-Domain Spatial Matching For Monocular 3D Object Detection And/Or Low-Level Sensor Fusion
Status: Non-Final OA (§103, §112)
Filed: Jun 27, 2024
Examiner: DHILLON, PUNEET S
Art Unit: 2488
Tech Center: 2400 (Computer Networks)
Assignee: Aptiv Technologies AG
OA Round: 1 (Non-Final)

Grant Probability: 83% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 6m
With Interview: 99%

Examiner Intelligence

Career Allow Rate: 83% (above average), 232 granted / 281 resolved, +24.6% vs TC avg
Interview Lift: +18.4% (strong), comparing resolved cases with and without an interview
Typical Timeline: 2y 6m average prosecution, 41 applications currently pending
Career History: 322 total applications across all art units
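
The headline figure can be reproduced from the raw counts. A minimal Python sketch, assuming the allow rate is simply grants divided by resolved cases and that the "+24.6% vs TC avg" delta is a plain difference against a Tech Center baseline that is not shown on this page:

# Sanity check of the career allow rate shown above.
# Assumptions: allow rate = granted / resolved; the reported delta is a simple
# difference against the Tech Center average, so the baseline can be backed out.
granted = 232
resolved = 281

allow_rate = granted / resolved          # 0.8256..., displayed as 83%
implied_tc_avg = allow_rate - 0.246      # roughly 0.58, i.e. about 58%

print(f"Career allow rate: {allow_rate:.1%}")
print(f"Implied TC average: {implied_tc_avg:.1%}")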

Statute-Specific Performance

§101: 5.4% (-34.6% vs TC avg)
§103: 49.1% (+9.1% vs TC avg)
§102: 17.5% (-22.5% vs TC avg)
§112: 24.9% (-15.1% vs TC avg)
Tech Center averages are estimates • Based on career data from 281 resolved cases
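
The per-statute deltas follow the same pattern. A small illustrative check, assuming each "vs TC avg" value is the simple difference between the examiner's rate and the Tech Center estimate:

# Per-statute rates and reported deltas vs the Tech Center average (in percent).
# Assumption: delta = examiner_rate - tc_average, so the baseline can be backed out.
stats = {
    "§101": (5.4, -34.6),
    "§103": (49.1, +9.1),
    "§102": (17.5, -22.5),
    "§112": (24.9, -15.1),
}

for statute, (rate, delta) in stats.items():
    implied_tc_avg = rate - delta
    print(f"{statute}: examiner {rate}% vs implied TC average {implied_tc_avg:.0f}%")

All four deltas back out to roughly the same 40% baseline, which suggests the page applies a single Tech Center estimate across statutes rather than per-statute averages.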

Office Action

§103 §112
DETAILED ACTION Notice of Pre-AIA or AIA Status The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA . Election/Restrictions Applicant's election with traverse of claims 9-20 in the reply filed on 10/16/2025 is acknowledged. After reconsideration, the previously withdrawn claims 1-8 will be reinstated for examination. Claims 1-20 are now under examination. Claim Rejections - 35 USC § 112 The following is a quotation of 35 U.S.C. 112(b): (b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention. The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph: The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention. Claims 11-15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention. Claim 11 recites the limitation "… a point cloud processing network … the point cloud processing system …" (emphasis added to accentuate insufficient antecedent basis). For the purposes of examination, the limitation is interpreted as the following: “… a point cloud processing network … the point cloud processing network …”. Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. Claims 1-7, 9-20 are rejected under 35 U.S.C. 103 as being unpatentable over Carvalho et al., hereinafter referred to as Carvalho (US 2023/0053785 A1) in view of Karasev et al., hereinafter referred to as Karasev (US 2023/0260266 A1). As per claim 1, Carvalho discloses a computer-implemented method of a vehicle system for performing Cross-Domain Spatial matching (CDSM) for monocular object detection in 3D free space of a physical environment surrounding a vehicle (Carvalho: Abstract; [0047].), the method comprising: receiving into an image processing network, from a camera of the vehicle, at least one input image including a 2D array of pixel information of the physical environment (Carvalho: Para. [0058] discloses backbone networks 200 which receive respective images as input, process the raw pixels included in the images 202A-202H.); determining, from the 2D array of pixel information of the input image, a set of 2D image features of potential targets (objects included in images) in the physical environment (Carvalho: Paras. [0029]-[0030], [0058] disclose backbone networks 200 may include weighted bi-directional feature pyramid networks (BiFPN). 
Output of the BiFPNs may represent multi-scale [set of 2D image] features determined based on the images 202A-202H.); transforming the set of 2D image features into a set of 3D image features of the potential targets by applying a CDSM rotation to the 2D image features that aligns a lateral axis and a vertical axis of the 2D image features with, respectively, a tensor height axis and a tensor width axis of a 3D birds eye view (BEV) (Carvalho: Fig. 3A & Paras. [0062]-[0063], [0071]-[0072], [0107] disclose a transformer network engine 302 is trained to project the received vision information 204 [2D image features] into a birds-eye view camera space [3D image features], wherein the BEV space includes static three-dimensional graphics of objects [potential targets] which are positioned in a real-world environment in the lateral and/or longitudinal distance.), and by applying a CDSM aggregation (Carvalho: [0037] discloses the feature queue may be spatially indexed such that information is aggregated over the previous threshold distance.) However, Carvalho does not explicitly disclose “… transforming the set of 2D image features to a set of 3D image features … that aligns a lateral axis and a vertical axis of the 2D image features with, respectively, a tensor height axis and a tensor width axis of a 3D birds eye view (BEV) grid, and by applying a CDSM aggregation to the 3D image features to extrapolate depth information for a tensor channel axis that is normal to the tensor width axis and the tensor height axis; generating a set of aggregated 3D features to the potential targets including the depth information extrapolated for the tensor channel axis; and detecting, based on the aggregated 3D features, one or more objects associated with the potential targets.” Further, Karasev is in the same field of endeavor and teaches transforming the set of 2D image features to a set of 3D image features (projects 2D image features in 3D space to obtain the “set of 3D image features”) that aligns a lateral axis (pixel w) and a vertical axis (pixel h) of the 2D image features with, respectively, a tensor height axis (Cartesian coordinate x) and a tensor width axis (Cartesian coordinate y) of a 3D birds eye view (BEV) grid (Fig. 3 – 3D BEV grid 350), and by applying a CDSM aggregation to the 3D image features to extrapolate depth information (supplement each pixel w, h, described by a feature vector FV(c)w,h with depth information from depth distributions) for a tensor channel axis (CFT(c, d, w, h)) that is normal to the tensor width axis (Cartesian coordinate y) and the tensor height axis (Karasev: Paras. [0047]-[0051], [0054] disclose projecting 2D image features in 3D space to obtain the set of 3D image features. Then a lift transformation combines depth distributions and the camera features to obtain feature tensors. Feature tensors FT (c, d)w,h computed for individual pixels can then be used to obtain a combined feature tensor for the whole image, e.g., by concatenating feature tensors for different pixels: {FT(c, d)w,h}→CFT(c, d, w, h). The combined feature tensor CFT(c, d, w, h) has dimensions C×D×W×H. Then a 2D mapping is performed, where the 2D mapping having perspective coordinates d, w, h can be transformed into 3D Cartesian coordinates d, w, h→x, y, z. 
The 2D mapping projects the combined feature tensor expressed in the new coordinates, CFT(c, x, y, z) to obtain projected (BEV) feature tensors (e.g., PCT (c, x, y)=ΣiCFT(c, x, y, zi)), which characterizes objects and their locations in the BEV grid.); generating a set of aggregated 3D features to the potential targets including the depth information extrapolated for the tensor channel axis (Karasev: Paras. [0051], [0087] disclose generating the projected feature tensor or BEV grid [set of aggregated 3D features] by summing/aggregating the combined feature tensor. This grid includes the depth information integrated during the lift/projection steps.); and detecting, based on the aggregated 3D features, one or more objects associated with the potential targets (Karasev: Paras. [0051], [0062] disclose the detection head 242 can classify boxes of voxels with emphasis on detecting objects and can further perform instance aggregation. Examples of objects include agents (e.g., other vehicles [one or more objects associated with the potential targets]), pedestrians, etc.). Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Carvalho and Karasev before him or her, to modify the autonomous vehicle birds-eye view camera system of Carvalho to include the BEV grid and aggregation of the 3D image features feature as described in Karasev. The motivation for doing so would have been to improve object detection by providing transformation techniques that reduce perspective distortions. As per claim 2, Carvalho-Karasev disclose the computer-implemented method according to claim 1, further comprising: determining that at least one of the one or more objects lies along a travel path of a vehicle (Karasev: Fig. 1 & Paras. [0037]-[0038] disclose data processing system 120 determines that at least one of the one or more objects associated with the potential targets lies along a travel path of a vehicle.). As per claim 3, Carvalho-Karasev disclose the computer-implemented method according to claim 2, further comprising: controlling the vehicle to avoid a collision with the at least one of the one or more objects (Carvalho: Para. [0047] discloses based on the objects, the processor system 120 may adjust one or more driving characteristics or features. For example, the processor system 120 may cause the vehicle 100 to turn, slow down, brake, speed up, and so on.). As per claim 4, Carvalho-Karasev disclose the computer-implemented method according to claim 1, wherein receiving the at least one input image includes centering a cartesian global camera coordinate system (GCCS) on a camera sensor field of view of the camera (Carvalho: Paras. [0058], [0061] disclose backbone networks 200 which receive respective images as input and process the raw pixels, perform rectification and transformation [centering or aligning the camera coordinate system on the camera sensor field of view/parameters] based on the camera’s extrinsic and/or intrinsic parameters to normalize the images.). As per claim 5, Carvalho-Karasev disclose the computer-implemented method according to claim 4, wherein the GCCS includes an X-axis defined along a vehicle travel path and Y-axis that is orthogonal to the X-axis, the Y-axis defining a width of the vehicle, and a Z-axis that define the tensor channel axis (Carvalho: Para. 
[0049] discloses “input data may be in the form of a three-dimensional matrix or tensor (e.g., two-dimensional data across multiple input channels) … output data may be across multiple output channels.” This corresponds to the claimed Z-axis that defines the tensor channel axis (the third dimension of the 3D matrix accommodating the channels) and Para. [0063] further discloses “the birds-eye view may extend laterally by about 70 meters … longitudinally by about 80 meters … birds-eye view may include static objects … in the lateral and/or longitudinal distance.” This corresponds to the claimed X-axis extending along a vehicle travel path (longitudinally) and Y-axis defining a width of the vehicle that is orthogonal to the X-axis (laterally).). As per claim 6, Carvalho-Karasev disclose the computer-implemented method according to claim 5, wherein the tensor height axis corresponds to the X-axis of the GCCS and the tensor width axis corresponds to the Y-axis of the GCCS (Carvalho: Para. [0063] discloses “The output tensors … combined (e.g., fused) together into a virtual camera space … In the example described herein, the virtual camera space is a birds-eye view … extend laterally … extend longitudinally.” For example, a vector space tensor having dimensions corresponding to lateral and longitudinal axes of the real-world environment. In standard coordinate systems (GCCS) utilized in such vehicular spatial contexts, the longitudinal direction corresponds to the X-axis and the lateral direction corresponds to the Y-axis. Further, Para. [0096] discloses “image 440 … movement to the left … movement to the right … movement up … movement down,” further indicating the alignment of the image/tensor axes (height/width) with the lateral and longitudinal directions of the environment.). As per claim 7, Carvalho-Karasev disclose the computer-implemented method according to claim 5, wherein applying the CDSM rotation to the 2D image features includes aligning the 2D image features with the GCCS by rotating the 2D image features about the Z-axis a first distance and rotating the GCCS about the Y-axis a second distance (Carvalho: Paras. [0060]-[0061] disclose that the transformation/rectification addresses differences in roll, pitch, and/or yaw based on camera parameters. Rotation about a Z-axis (yaw) and Y-axis (pitch/roll) is the standard geometric operation to align a camera coordinate system with a global vehicle coordinate system (GCCS).). As per claim 9, Carvalho discloses a computer system for performing Cross-Domain Spatial matching (CDSM) for monocular object detection in 3D free space of a physical environment surrounding a vehicle (Carvalho: Abstract; [0047].), the computer system comprising: an image processing network for receiving into a network backbone, from a camera of the vehicle, at least one input image including a 2D array of pixel information of the physical environment (Carvalho: Para. [0058] discloses backbone networks 200 which receive respective images as input, process the raw pixels included in the images 202A-202H.); a bidirectional feature pyramid network for determining, from the 2D array of pixel information of the input image, a set of 2D image features of potential targets (objects included in images) in the physical environment (Carvalho: Paras. [0029]-[0030], [0058] disclose backbone networks 200 may include weighted bi-directional feature pyramid networks (BiFPN). 
Output of the BiFPNs may represent multi-scale [set of 2D image] features determined based on the images 202A-202H.); and a CDSM system for transforming the set of 2D image features to a set of 3D image features of the potential targets by applying a CDSM rotation to the 2D image features that aligns a lateral axis and a vertical axis of the 2D image features with, respectively, a tensor height axis and a tensor width axis of a 3D birds eye view (BEV) (Carvalho: Fig. 3A & Paras. [0062]-[0063], [0071]-[0072], [0107] disclose a transformer network engine 302 is trained to project the received vision information 204 [2D image features] into a birds-eye view camera space [3D image features], wherein the BEV space includes static three-dimensional graphics of objects [potential targets] which are positioned in a real-world environment in the lateral and/or longitudinal distance.), and by applying a CDSM aggregation (Carvalho: [0037] discloses the feature queue may be spatially indexed such that information is aggregated over the previous threshold distance.) However, Carvalho does not explicitly disclose “… transforming the set of 2D image features to a set of 3D image features … that aligns a lateral axis and a vertical axis of the 2D image features with, respectively, a tensor height axis and a tensor width axis of a 3D birds eye view (BEV) grid, and by applying a CDSM aggregation to the 3D image features to extrapolate depth information for a tensor channel axis that is normal to the tensor width axis and the tensor height axis.” Further, Karasev is in the same field of endeavor and teaches transforming the set of 2D image features to a set of 3D image features (projects 2D image features in 3D space to obtain the “set of 3D image features”) that aligns a lateral axis (pixel w) and a vertical axis (pixel h) of the 2D image features with, respectively, a tensor height axis (Cartesian coordinate x) and a tensor width axis (Cartesian coordinate y) of a 3D birds eye view (BEV) grid (Fig. 3 – 3D BEV grid 350), and by applying a CDSM aggregation to the 3D image features to extrapolate depth information (supplement each pixel w, h, described by a feature vector FV(c)w,h with depth information from depth distributions) for a tensor channel axis (CFT(c, d, w, h)) that is normal to the tensor width axis (Cartesian coordinate y) and the tensor height axis (Karasev: Paras. [0047]-[0051], [0054] disclose projecting 2D image features in 3D space to obtain the set of 3D image features. Then a lift transformation combines depth distributions and the camera features to obtain feature tensors. Feature tensors FT (c, d)w,h computed for individual pixels can then be used to obtain a combined feature tensor for the whole image, e.g., by concatenating feature tensors for different pixels: {FT(c, d)w,h}→CFT(c, d, w, h). The combined feature tensor CFT(c, d, w, h) has dimensions C×D×W×H. Then a 2D mapping is performed, where the 2D mapping having perspective coordinates d, w, h can be transformed into 3D Cartesian coordinates d, w, h→x, y, z. The 2D mapping projects the combined feature tensor expressed in the new coordinates, CFT(c, x, y, z) to obtain projected (BEV) feature tensors (e.g., PCT (c, x, y)=ΣiCFT(c, x, y, zi)), which characterizes objects and their locations in the BEV grid.). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Carvalho and Karasev before him or her, to modify the autonomous vehicle birds-eye view camera system of Carvalho to include the BEV grid and aggregation of the 3D image features feature as described in Karasev. The motivation for doing so would have been to improve object detection by providing transformation techniques that reduce perspective distortions. As per claim 10, Carvalho-Karasev disclose the computer system according to claim 9, wherein the CDSM system generates a set of aggregated 3D features of the potential targets including the depth information extrapolated for the tensor channel axis, and detects, based on the aggregated 3D features, one or more objects associated with the potential targets (Karasev: Paras. [0051], [0062] disclose the detection head 242 can classify boxes of voxels with emphasis on detecting objects and can further perform instance aggregation. Examples of objects include agents (e.g., other vehicles [one or more objects associated with the potential targets]), pedestrians, etc.). As per claim 11, Carvalho-Karasev disclose the computer system according to claim 10, further comprising: a point cloud processing network connected to at least one of a lidar system and a radar system, the point cloud processing system receiving from the at least one of the lidar system and the radar system, 3D point cloud data representing a physical environment around the vehicle (Karasev: Figs. 1, 2A & Paras. [0051]-[0053] disclose the point cloud processing system 132 receiving the set of radar points that represent a 3D radar point cloud.). As per claim 12, Carvalho-Karasev disclose the computer system according to claim 11, further comprising: a voxel feature extractor (VFE) coupled to the point cloud processing network, the point cloud processing network and the VFE generating a plurality of 3D feature maps of the physical environment around the vehicle (Karasev: Fig. 1 & Paras. [0039], [0044] disclose the input data 201 can further include roadgraph data stored by (or accessible to) perception system 130, e.g., as part of map information 124. Roadgraph data can include any two-dimensional maps of the roadway and its surrounding, three-dimensional maps (including any suitable mapping of stationary objects, e.g., identification of bounding boxes of such objects). The perception system 130 can use global mapping that maps an entire region of 3D environment around the vehicle.). As per claim 13, Carvalho-Karasev disclose the computer system according to claim 12, wherein the CDSM system fuses the 2D image features with the 3D feature maps in a CDSM fusion block to generate the set of aggregated 3D features of the potential targets (Carvalho: Paras. [0063], [0071] disclose output tensors from the backbone networks 200 may be combined (e.g., fused) together into a virtual camera space (e.g., a vector space) via the birds-eye view network 210. The birds-eye view network/transformer 302 or 402 corresponds to the claimed CDSM fusion block, and the projection into the virtual/BEV space corresponds to fusing 2D image features with 3D feature maps.). As per claim 14, Carvalho-Karasev disclose the computer system according to claim 13, wherein the CDSM system generates a unified output prediction in the BEV grid of the potential targets (Carvalho: Paras. 
[0030]-[0031], [0063] disclose the machine learning model may effectuate a unified prediction, doing the job of stitching these different views internally by interpreting all images as one. Output tensors are combined (e.g., fused) together into a virtual camera space, wherein the virtual camera space is a birds-eye view; and Karasev: Para. [0054] discloses the BEV grid.). As per claim 15, Carvalho-Karasev disclose the computer system according to claim 13, wherein the CDSM system aligns spatial information in a first domain representing the 2D image features with spatial information in a second domain in a CDSM alignment block, that is distinct from the first domain, representing the 3D feature maps to fuse the 2D image features with the 3D feature maps (Carvalho: Paras. [0063], [0071] disclose that output tensors from backbone networks (representing 2D image features in a first domain) are combined or fused into a virtual camera space (distinct second domain) via a transformer network engine (alignment block) which is trained to project the information and perform multi-camera fusion. Further, Karasev: Fig. 1 & Paras. [0039], [0044] disclose the input data 201 can include roadgraph data stored by (or accessible to) perception system 130, e.g., as part of map information 124. Roadgraph data can include any two-dimensional maps of the roadway and its surrounding, three-dimensional maps (including any suitable mapping of stationary objects, e.g., identification of bounding boxes of such objects). The perception system 130 can use global mapping that maps an entire region of 3D environment around the vehicle.). As per claim 16, Carvalho-Karasev disclose the computer system according to claim 10, wherein the CDSM system determines that at least one of the one or more objects associated with the potential targets lies along a travel path of a vehicle (Karasev: Fig. 1 & Paras. [0037]-[0038] disclose data processing system 120 determines that at least one of the one or more objects associated with the potential targets lies along a travel path of a vehicle.). As per claim 17, Carvalho-Karasev disclose the computer system according to claim 16, wherein the CDSM system controls the vehicle to avoid a collision with the at least one of the one or more objects (Carvalho: Para. [0047] discloses based on the objects, the processor system 120 may adjust one or more driving characteristics or features. For example, the processor system 120 may cause the vehicle 100 to turn, slow down, brake, speed up, and so on.). As per claim 18, Carvalho-Karasev disclose the computer system according to claim 17, wherein the CDSM system centers cartesian global camera coordinate system (GCCS) on a camera sensor field of view of the camera when receiving the at least one input image into the network backbone (Carvalho: Paras. [0058], [0061] disclose backbone networks 200 which receive respective images as input and process the raw pixels, perform rectification and transformation [centering or aligning the camera coordinate system on the camera sensor field of view/parameters] based on the camera’s extrinsic and/or intrinsic parameters to normalize the images.). As per claim 19, Carvalho-Karasev disclose the computer system according to claim 18, wherein the CDSM system defines the GCCS as an X-axis extending along a vehicle travel path and Y-axis defining a width of the vehicle that is orthogonal to the X-axis, and a Z-axis that defines the tensor channel axis (Carvalho: Para. 
[0049] discloses “input data may be in the form of a three-dimensional matrix or tensor (e.g., two-dimensional data across multiple input channels) … output data may be across multiple output channels.” This corresponds to the claimed Z-axis that defines the tensor channel axis (the third dimension of the 3D matrix accommodating the channels) and Para. [0063] further discloses “the birds-eye view may extend laterally by about 70 meters … longitudinally by about 80 meters … birds-eye view may include static objects … in the lateral and/or longitudinal distance.” This corresponds to the claimed X-axis extending along a vehicle travel path (longitudinally) and Y-axis defining a width of the vehicle that is orthogonal to the X-axis (laterally).). As per claim 20, Carvalho-Karasev disclose the computer system according to claim 19, wherein the tensor height axis in the CDSM corresponds to the X-axis of the GCCS and the tensor width axis in the CDSM corresponds to the Y-axis of the GCCS (Carvalho: Para. [0063] discloses “The output tensors … combined (e.g., fused) together into a virtual camera space … In the example described herein, the virtual camera space is a birds-eye view … extend laterally … extend longitudinally.” A vector space tensor has dimensions corresponding to lateral and longitudinal axes of the real-world environment. In standard coordinate systems (GCCS) utilized in such vehicular spatial contexts, the longitudinal direction corresponds to the X-axis and the lateral direction corresponds to the Y-axis. Further, Para. [0096] discloses “image 440 … movement to the left … movement to the right … movement up … movement down,” further indicating the alignment of the image/tensor axes (height/width) with the lateral and longitudinal directions of the environment.). Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Carvalho in view of Karasev in further view of Rukhovich et al., hereinafter referred to as Rukhovich (US 2023/0121534 A1). As per claim 8, Carvalho-Karasev disclose the computer-implemented method according to claim 7 (Carvalho: Paras. [0060]-[0061].), (Karasev: Para. [0054] discloses the BEV grid.). However, Carvalho-Karasev do not explicitly disclose “… further comprising: applying 2D convolutional layers to the BEV grid to generate refined 2D image features; and passing the refined 2D image features to 3D prediction head to process the refined 2D image features into 3D image features.”. Furthermore, Rukhovich is in the same field of endeavor and teaches further comprising: applying 2D convolutional layers to the BEV grid to generate refined 2D image features (Rukhovich: Para. [0049] discloses the 2D object detection in the BEV plane is implemented by passing the 2D representation of 3D feature maps through the outdoor object detecting part 2D Cony including parallel 2D convolutional layers for classification and location.); and passing the refined 2D image features to 3D prediction head to process the refined 2D image features into 3D image features (Rukhovich: Para. [0049] discloses the 2D object detection in the BEV plane is implemented by passing the 2D representation of 3D feature maps through the outdoor object detecting part 2D Cony including parallel 2D convolutional layers for classification and location.). 
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention, and having the teachings of Carvalho-Karasev and Rukhovich before him or her, to modify the autonomous vehicle birds-eye view camera system of Carvalho-Karasev to include the application of 2D convolutional layers feature as described in Rukhovich. The motivation for doing so would have been to improve versatility of object detection networks by providing a configuration that works in both monocular and multi-view settings. Conclusion The prior art made of record and not relied upon is considered pertinent to applicant's disclosure and can be viewed in the list of references. Any inquiry concerning this communication or earlier communications from the examiner should be directed to PEET DHILLON whose telephone number is (571)270-5647. The examiner can normally be reached M-F: 5am-1:30pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Sath V. Perungavoor can be reached at 571-272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /PEET DHILLON/Primary Examiner Art Unit: 2488 Date: 12-12-2025
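
The §103 mapping above hinges on the camera-to-BEV feature transformation described in Karasev (lift each per-pixel feature vector with a per-pixel depth distribution into a C×D×W×H tensor, remap perspective coordinates to Cartesian ones, and collapse everything onto a BEV grid) followed by 2D convolutional detection heads over that grid, as in Rukhovich. The sketch below is a minimal, illustrative PyTorch rendering of that pipeline under simplifying assumptions (random features, uniform depth bins, a toy index mapping in place of a calibrated projection, made-up tensor sizes); it is not the applicant's CDSM method or either reference's actual implementation.

import torch
import torch.nn as nn

C, D, H, W = 64, 48, 32, 96      # feature channels, depth bins, image-feature height/width
X, Y = 128, 128                  # BEV grid cells (forward x, lateral y)

# 1) "Lift" (cf. Karasev): weight the per-pixel feature vector FV(c) by a per-pixel
#    depth distribution to obtain a combined feature tensor CFT(c, d, h, w).
feats = torch.randn(1, C, H, W)                       # 2D image features (e.g. BiFPN output)
depth_prob = torch.randn(1, D, H, W).softmax(dim=1)   # per-pixel depth distribution
cft = feats.unsqueeze(2) * depth_prob.unsqueeze(1)    # shape (1, C, D, H, W)

# 2) "Splat"/aggregate: assign each (d, h, w) source cell to a BEV cell (x, y) and sum
#    everything that lands in the same cell; image rows (h) are collapsed by the sum.
#    The index mapping here is a toy stand-in for a calibrated camera-to-ground projection.
d_idx, h_idx, w_idx = torch.meshgrid(
    torch.arange(D), torch.arange(H), torch.arange(W), indexing="ij")
x_idx = (d_idx.float() / D * (X - 1)).long()          # depth bin    -> forward (x) cell
y_idx = (w_idx.float() / W * (Y - 1)).long()          # image column -> lateral (y) cell
cell = (x_idx * Y + y_idx).reshape(-1)                # flat BEV cell index per source voxel

bev = torch.zeros(1, C, X * Y)
bev.index_add_(2, cell, cft.reshape(1, C, -1))        # PCT(c, x, y) = sum_i CFT(c, x, y, z_i)
bev = bev.reshape(1, C, X, Y)

# 3) BEV detection head (cf. Rukhovich): plain 2D convolutions over the BEV grid with
#    separate classification and box-regression branches.
trunk = nn.Sequential(nn.Conv2d(C, C, 3, padding=1), nn.ReLU())
cls_head = nn.Conv2d(C, 3, 1)                         # e.g. 3 object classes per cell
box_head = nn.Conv2d(C, 7, 1)                         # e.g. x, y, z, l, w, h, yaw per cell

refined = trunk(bev)
print(cls_head(refined).shape, box_head(refined).shape)   # (1, 3, 128, 128), (1, 7, 128, 128)

Real systems learn the depth distribution and use calibrated extrinsics for the (d, w) to (x, y) mapping, but the bookkeeping shown here (a C×D×H×W lift, an index-based splat onto an X×Y grid, then 2D convolutional heads) is the portion of the references that the rejection maps onto the claimed CDSM rotation and aggregation.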

Prosecution Timeline

Jun 27, 2024
Application Filed
Dec 12, 2025
Non-Final Rejection — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12598346
A DISPLAY DEVICE AND OPERATION METHOD THEREOF
Granted Apr 07, 2026 • 2y 5m to grant
Patent 12567263
IMAGING SYSTEM
Granted Mar 03, 2026 • 2y 5m to grant
Patent 12548338
OBJECT SAMPLING METHOD AND IMAGE ANALYSIS APPARATUS
Granted Feb 10, 2026 • 2y 5m to grant
Patent 12536812
CAMERA PERCEPTION TECHNIQUES TO DETECT LIGHT SIGNALS OF AN OBJECT FOR DRIVING OPERATION
Granted Jan 27, 2026 • 2y 5m to grant
Patent 12537911
VIDEO PROCESSING APPARATUS
Granted Jan 27, 2026 • 2y 5m to grant
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 83%
With Interview: 99% (+18.4%)
Median Time to Grant: 2y 6m
PTA Risk: Low
Based on 281 resolved cases by this examiner. Grant probability derived from career allow rate.
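
The "with interview" projection appears to be the base grant probability plus the interview lift, capped just below certainty. The exact model is not disclosed on this page, so the sketch below is only one plausible reading:

# One plausible reading of the "With Interview" projection above.
# Assumption: with-interview probability = base probability + interview lift, capped at 99%.
base_grant_prob = 0.83      # career allow rate (232 / 281, rounded)
interview_lift = 0.184      # +18.4% lift among resolved cases with an interview

with_interview = min(base_grant_prob + interview_lift, 0.99)
print(f"Grant probability with interview: {with_interview:.0%}")   # 99%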
