Prosecution Insights
Last updated: April 19, 2026
Application No. 18/889,907

ESTIMATION DEVICE AND ESTIMATION METHOD

Status: Final Rejection (§103)
Filed: Sep 19, 2024
Examiner: HILAIRE, CLIFFORD
Art Unit: 2488
Tech Center: 2400 — Computer Networks
Assignee: Mirise Technologies Corporation
OA Round: 2 (Final)
Grant Probability: 72% (Favorable)
Expected OA Rounds: 3-4
Estimated Time to Grant: 2y 8m
Grant Probability With Interview: 87%

Examiner Intelligence

Career Allow Rate: 72% (above average; 313 granted / 438 resolved; +13.5% vs TC avg)
Interview Lift: +15.7% (strong), comparing resolved cases with and without an interview
Avg Prosecution: 2y 8m (typical timeline), with 32 applications currently pending
Total Applications: 470 across all art units (career history)

Statute-Specific Performance

§101: 3.1% (-36.9% vs TC avg)
§103: 47.9% (+7.9% vs TC avg)
§102: 19.6% (-20.4% vs TC avg)
§112: 28.9% (-11.1% vs TC avg)
Comparison baseline: Tech Center average estimate. Based on career data from 438 resolved cases.

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Applicant(s) Response to Official Action

The response filed on 01/28/2026 has been entered and made of record.

Response to Arguments/Amendments

Presented arguments have been fully considered, but are rendered moot in view of the new ground(s) of rejection necessitated by amendment(s) initiated by the applicant(s). Examiner fully addresses below any arguments that were not rendered moot.

Claim Rejections - 35 USC § 103

Summary of Arguments:

Regarding amended claims 1, 11 and 12, Applicant argues that Qi FPN does not disclose or suggest generating the frustum-shaped point cloud by using a depth map that is generated based on three-dimensional coordinates calculated by the visual odometry, nor does it disclose generating the BEV feature based on such a frustum-shaped point cloud. It would therefore not have been obvious to combine Sambo and Qi FPN. Furthermore, even if Sambo and Qi FPN were combined, the resulting system would not generate a frustum-shaped point cloud using a depth map generated from three-dimensional coordinates calculated via VO, as recited in claims 1, 11 and 12. Thus, nothing in Sambo discloses that the feature obtaining unit is configured to "generate, using the three-dimensional coordinates, a depth map of each of the two-dimensional images used to calculate the three-dimensional coordinates" and "generate a frustum-shaped point cloud of the two-dimensional images captured by each of the plurality of cameras by using the image features of the two-dimensional images captured by the corresponding one of the plurality of cameras and the depth maps of the corresponding two-dimensional images," as recited in amended claim 1. Claims 11 and 12 are allowable for reasons analogous to the reasons given above for claim 1.

Regarding amended claim 5, Applicant argues that Qi '245 does not disclose or suggest generating RGB-D data by fusing a two-dimensional image and a depth map generated by using three-dimensional coordinates calculated by the visual odometry. Furthermore, Qi '245 does not disclose or suggest using such RGB-D data as input to a plurality of machine learning models to obtain RGB-D data features for BEV feature generation. Furthermore, although Qi '245 teaches generating RGB-D data, it only broadly discloses generating RGB-D data, and does not teach or suggest using a depth map generated by the three-dimensional coordinates calculated via VO from two-dimensional images. Therefore, even if Sambo and Qi '245 were combined, such a combination would not show it to be obvious to generate RGB-D data by fusing two-dimensional images and a depth map, generated by the three-dimensional coordinates calculated via VO from the two-dimensional images, corresponding to the two-dimensional images, as recited in amended claim 5. Thus, nothing in Sambo or Qi '245, alone or in combination, discloses or suggests that the feature obtaining unit is configured to "generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image," as recited in amended claim 5.

Regarding amended claim 8, Applicant argues that Liu does not disclose or suggest using three-dimensional coordinates calculated by the visual odometry for the generation of the second BEV feature. In other words, neither Sambo nor Liu discloses or suggests obtaining a three-dimensional feature by inputting the three-dimensional coordinates calculated by the visual odometry to a second machine learning model trained with the three-dimensional coordinates calculated by the visual odometry as input, generating a second BEV feature based on the three-dimensional feature, and fusing the second BEV feature with a first BEV feature to generate a fused feature for generating a bird's-eye view. Neither Sambo nor Liu discloses or teaches fusion using the VO-calculated three-dimensional coordinates. Thus, even if Sambo and Liu were combined, such a combination would not render it obvious to reach the integration of the first and second machine learning models, which enables LiDAR-less BEV generation, as recited in amended claim 8. Thus, nothing in Sambo or Liu, alone or in combination, discloses or suggests that the feature obtaining unit is configured to "generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image," as recited in amended claim 8.

Examiner's Response:

Examiner respectfully disagrees. Regarding amended claims 1, 5, 8, 11 and 12, in response to applicant's arguments against the references individually, one cannot show non-obviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986). MPEP 707.07(f).

Regarding claim 5, Examiner respectfully disagrees. The claim limitation "generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image" is patentably indistinguishable from Qi '245 by adding/concatenating another channel (i.e. three-dimensional depth data D from a depth sensor) to the well-known format of a RGB image (i.e. image from a camera) as clearly suggested by Qi '245 (i.e. two-dimensional image data from an optical camera- ¶0009… the autonomous robotic system is an autonomous vehicle. In some embodiments, the two-dimensional image data is RGB image data… In some embodiments, the depth sensor comprises a LiDAR- ¶0010… In some embodiments, the three-dimensional depth data comprises RGB-D data comprising a red-green-blue (RBG) image with the corresponding three-dimensional depth data comprising an image channel in which each pixel relates to a distance between the image plane and the corresponding object in the RGB image- ¶0051, Qi '245).

Regarding claim 8, Examiner did not find the limitation "generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image". In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., [generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image]) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). MPEP 707.07(f).

Accordingly, Examiner maintains the rejections.
Claim Construction Exemplary claim 1 recites “a coordinate calculation unit configured to calculate three-dimensional coordinates of an object present around a vehicle based on two-dimensional images representing outside of the vehicle captured by a plurality of cameras mounted on the vehicle, by using a self-position estimation method including a visual odometry which calculates the three-dimensional coordinates of the object in sequential two-dimensional images captured by a same camera, which is one of the plurality of cameras”. Let us denote vn as a set “sequential two-dimensional images” captured by corresponding camera cn (n is a natural number between 0 and N included, N>1). Each cn belongs to the set of the set C containing the “plurality of cameras mounted on the vehicle”, C={c0, …, cN}. Each vn belong to the set V containing the “two-dimensional images representing outside of the vehicle captured by a plurality of cameras mounted on the vehicle”, V={v0, …,vn}. The “sequential two-dimensional images captured by a same camera, which is one of the plurality of cameras” that includes “the object” can be denoted as vobj ∈ vn for any one value of n as defined above. A function fsel that selects vobj from V can be denoted as vobj=fsel(V). The “a self-position estimation method including a visual odometry” as fego=fodo(g(V))=pobj wherein the fego represent the “self-position estimation method”, fodo represents the “visual odometry” and pobj is the “three-dimensional coordinates of the object present around the vehicle”. Overall, Examiner will contemplate interpreting the first limitations of exemplary claim 1 as fego=fodo(vobj)=pobj. CLAIM INTERPRETATION The following is a quotation of 35 U.S.C. 112(f): (f) ELEMENT IN CLAIM FOR A COMBINATION.—An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. The following is a quotation of pre-AIA 35 U.S.C. 112, sixth paragraph: An element in a claim for a combination may be expressed as a means or step for performing a specified function without the recital of structure, material, or acts in support thereof, and such claim shall be construed to cover the corresponding structure, material, or acts described in the specification and equivalents thereof. The claims in this application are given their broadest reasonable interpretation using the plain meaning of the claim language in light of the specification as it would be understood by one of ordinary skill in the art. The broadest reasonable interpretation of a claim element (also commonly referred to as a claim limitation) is limited by the description in the specification when 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is invoked. As explained in MPEP § 2181, subsection I, claim limitations that meet the following three-prong test will be interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 
112, sixth paragraph: (A) the claim limitation uses the term “means” or “step” or a term used as a substitute for “means” that is a generic placeholder (also called a nonce term or a non-structural term having no specific structural meaning) for performing the claimed function; (B) the term “means” or “step” or the generic placeholder is modified by functional language, typically, but not always linked by the transition word “for” (e.g., “means for”) or another linking word or phrase, such as "configured to" or "so that"; and (C) the term “means” or “step” or the generic placeholder is not modified by sufficient structure, material, or acts for performing the claimed function. Use of the word “means” (or “step”) in a claim with functional language creates a rebuttable presumption that the claim limitation is to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites sufficient structure, material, or acts to entirely perform the recited function. Absence of the word “means” (or “step”) in a claim creates a rebuttable presumption that the claim limitation is not to be treated in accordance with 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. The presumption that the claim limitation is not interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, is rebutted when the claim limitation recites function without reciting sufficient structure, material or acts to entirely perform the recited function. Claim limitations in this application that use the word “means” (or “step”) are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. Conversely, claim limitations in this application that do not use the word “means” (or “step”) are not being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, except as otherwise indicated in an Office action. This application includes one or more claim limitations that do not use the word “means,” but are nonetheless being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, because the claim limitation(s) uses a generic placeholder that is coupled with functional language without reciting sufficient structure to perform the recited function and the generic placeholder is not preceded by a structural modifier. Such claim limitation(s) is/are: “coordinate calculation unit” (claim 1), “feature obtaining unit” (claim 1) and “bird’s-eye view generation unit” (claim 1). Because this/these claim limitation(s) is/are being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, it/they is/are being interpreted to cover the corresponding structure described in the specification as performing the claimed function, and equivalents thereof. If applicant does not intend to have this/these limitation(s) interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph, applicant may: (1) amend the claim limitation(s) to avoid it/them being interpreted under 35 U.S.C. 112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph (e.g., by reciting sufficient structure to perform the claimed function); or (2) present a sufficient showing that the claim limitation(s) recite(s) sufficient structure to perform the claimed function so as to avoid it/them being interpreted under 35 U.S.C. 
112(f) or pre-AIA 35 U.S.C. 112, sixth paragraph. Claim Rejections - 35 USC § 103 The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made. The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows: 1. Determining the scope and contents of the prior art. 2. Ascertaining the differences between the prior art and the claims at issue. 3. Resolving the level of ordinary skill in the pertinent art. 4. Considering objective evidence present in the application indicating obviousness or nonobviousness. This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention. Claims 1, 11 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Francesco Sambo et al. [US 20220044024 A1: already of record] in view of Charles R. Qi et. al [Frustum PointNets for 3D Object Detection from RGB-D Data: already of record]. Regarding claim 1, Sambo teaches: 1. An estimation device (i.e. FIGS. 1A-1J are diagrams of one or more example implementations 100 associated with utilizing machine learning models to reconstruct a vehicle accident scene from video- ¶0111) comprising: a coordinate calculation unit configured to calculate three-dimensional coordinates of an object (i.e. sparse point cloud- ¶0014) present around a vehicle based on two-dimensional images representing outside of the vehicle (i.e. As shown in FIG. 1B, and by reference number 130, reconstruction system 115 may generate a sparse point cloud of the location associated with the accident. For example, reconstruction system 115 may process the video data, with a simultaneous localization and mapping (SLAM) model, to generate the sparse point cloud of the location associated with the accident. The SLAM model may include a model that constructs a three-dimensional map (e.g., a point cloud) of an environment based on the video data and that identifies a location of the first vehicle 110-1 within the point cloud- ¶0014) captured by a plurality of cameras mounted on the vehicle (i.e. Vehicle device 105 includes one or more devices capable of receiving, generating, storing, processing, and/or providing information, such as information described herein. 
For example, vehicle device 105 may include a device included in a vehicle (e.g., vehicle 110) for obtaining video data associated with vehicle 110 traveling along a route. For example, vehicle device 105 may include a video camera, a dash camera, a parking assist camera, a backup assist camera, a thermal camera, lidar, radar, and/or the like. In some implementations, vehicle device 105 may include a device for obtaining other types of data associated with vehicle 110 traveling along a route. For example, vehicle device 105 may include an inertial measurement unit, a three-axis accelerometer, a gyroscope, a global positioning system (GPS) device, an on-board diagnostics (OBD) device, a vehicle tracking unit, an engine control unit (ECU), and/or the like- ¶0065), by using a self-position estimation method including a visual odometry which calculates the three-dimensional coordinates of the object (i.e. In some implementations, a direct sparse odometry (DSO) model is utilized as the SLAM model. A DSO model may include a direct and a sparse model for visual odometry. The DSO model may combine a fully direct probabilistic model (e.g., that minimizes a photometric error) with a consistent, joint optimization of all model parameters, including geometry (e.g., represented as an inverse depth in a reference frame) and camera motion- ¶0014) in sequential two-dimensional images captured by a same camera, which is one of the plurality of cameras (i.e. The DSO model may select particular points in each frame of the video data based on a gradient of pixel intensity in each frame (e.g., corners and edges may be selected due to pixel intensity). The DSO model may utilize the particular points in consecutive frames of the video data to estimate a camera pose in every frame of the video data, with respect to a camera position in a first frame of the video data. After this, the DSO may generate a three-dimensional point cloud that corresponds to a sparse three-dimensional map of the location of the accident based on the particular points, and may generate camera poses, for all frames of the video data, that correspond to a trajectory of the first vehicle 110-1 with vehicle device 105- ¶0015); a feature obtaining unit configured to obtain a bird’s-eye view (BEV) feature (i.e. As shown in FIG. 1F, and by reference number 150, reconstruction system 115 may process the dense semantic point cloud, with a voxelization model, to determine a dense semantic overhead view (BEV) of the location associated with the accident. A simple top-down projection is insufficient to generate the dense semantic BEV from the dense semantic point cloud- ¶0020), which is a feature in a BEV space (i.e. a ground plane- ¶0016… a dense semantic overhead view (BEV) of the location associated with the accident- ¶0020), based on the three-dimensional coordinates and at least one of the two-dimensional images representing outside of the vehicle by using a BEV estimation algorithm (i.e. As shown in FIG. 1C, and by reference number 135, reconstruction system 115 may process the video data, with a convolutional neural network (CNN) model, to generate depth maps for the frames of the video data- ¶0017… As shown in FIG. 1D, and by reference number 140, reconstruction system 115 may utilize the depth maps with the sparse point cloud to generate a dense point cloud. For example, reconstruction system 115 may utilize information associated with the depth maps to enrich the sparse point cloud with additional points- ¶0018… As shown in FIG. 
1E, and by reference number 145, reconstruction system 115 may generate a dense semantic point cloud. For example, reconstruction system 115 may process the video data, with a semantic segmentation model, to generate the dense semantic point cloud. In some implementations, reconstruction system 115 may store, in the dense point cloud, information indicating what each point represents. Such information may be needed in order to correctly handle the points in the dense point cloud. In some implementations, the semantic segmentation model may store such information in the dense point cloud to generate the dense semantic point cloud- ¶0019); and a bird’s-eye view generation unit configured to generate a bird’s-eye view (i.e. Reconstruction system 115 may project the three-dimensional point cloud onto a ground plane to obtain a sparse overhead view (e.g., a bird's eye view (BEV)) representation of the location of the accident, and may project the camera poses onto the sparse overhead view to obtain the trajectory of the first vehicle 110-1. However, the resulting overhead view may include only points that the SLAM model selected and tracked, and thus may be sparse (e.g., contain less than a threshold quantity of points)- ¶0016), as a top-down perspective image of the vehicle, based on the BEV feature (i.e. As shown in FIG. 1I, and by reference number 165, reconstruction system 115 may augment the dense semantic BEV and trajectories, with additional data, to generate a final BEV. For example, reconstruction system 115 may augment the dense semantic BEV with additional data, such as satellite images, data identifying road names, data identifying vehicle speeds, data identifying vehicle distances, and/or the like- ¶0024). wherein the feature obtaining unit is configured to: generate, using the three-dimensional coordinates, a depth map of each of the two-dimensional images used to calculate the three-dimensional coordinates (i.e. refining the predicted depth maps with a direct sparse odometry model to generate the depth maps for the frames of the video data- ¶0102); obtain image features (i.e. dense semantic point- fig. 1E), which are features in the two-dimensional images, by inputting the two-dimensional images captured by each of the plurality of cameras (i.e. The device may process the video data, with a third model, to generate a dense semantic point cloud, and may process the dense semantic point cloud, with a fourth model, to determine a dense semantic overhead view of the location associated with the accident. The device may perform actions based on the dense semantic overhead view- Abstract… As shown in FIG. 1E, and by reference number 145, reconstruction system 115 may generate a dense semantic point cloud. For example, reconstruction system 115 may process the video data, with a semantic segmentation model, to generate the dense semantic point cloud- ¶0019) into a corresponding one of a plurality of machine learning models, each of the plurality of machine learning models having been trained to take therein the two-dimensional images captured by a corresponding one of the plurality of cameras as input and to output the image features (i.e. FIG. 2 is a diagram illustrating an example of training a machine learning model and applying a trained machine learning model to a new observation- ¶0003… FIGS. 
1A-1J are diagrams of one or more example implementations 100 associated with utilizing machine learning models to reconstruct a vehicle accident scene from video… Reconstruction system 115 may include a system that utilizes machine learning and other models to reconstruct a vehicle accident scene from video- ¶0011); However, Sambo does not teach explicitly: generate a frustum-shaped point cloud of the two-dimensional images captured by each of the plurality of cameras by using the image features of the two-dimensional images captured by the corresponding one of the plurality of cameras and the depth maps of the corresponding two-dimensional images; and generate the BEV feature based on the frustum-shaped point cloud. In the same field of endeavor, Charles teaches: generate a frustum-shaped point cloud of the two-dimensional images captured by each of the plurality of cameras (i.e. point cloud in frustum (in point)- fig. 2) by using the image features of the two-dimensional images captured by the corresponding one of the plurality of cameras (i.e. image input- fig. 2) and the depth maps of the corresponding two-dimensional images (i.e. depth input- fig. 2… We call this entire procedure for extracting frustum point clouds from RGB-D data frustum proposal generation- section 4.1); and generate the BEV feature based on the frustum-shaped point cloud (i.e. bird’s eye view of the LiDAR points in the extruded frustum from 2D box, where we see a wide spread of points with both foreground occluder (bikes) and background clutter (building)- fig. 3). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo with the teachings of Charles to efficiently localize objects in point clouds of large-scale scenes (Charles- Abstract). Regarding claim 11, method claim 11 corresponds to apparatus claim 1, and therefore is also rejected for the same reason of obviousness as listed above. Regarding claim 12, apparatus claim 12 is drawn to the apparatus using/performing the same method as claimed in claim 11. Therefore, apparatus claim 12 corresponds to method claim 11, and is rejected for the same reason of obviousness as used above. Claims 3-4 are rejected under 35 U.S.C. 103 as being unpatentable over Francesco Sambo et al. [US 20220044024 A1: already of record] in view of Charles R. Qi et al. [Frustum PointNets for 3D Object Detection from RGB-D Data: already of record] and further in view of Erick Lavoie et al. [US 20220260993 A1: already of record]. Regarding claim 3, Sambo and Charles teach all the limitations of claim 1. However, Sambo and Charles do not teach explicitly: wherein the self-position estimation method includes a visual inertial odometry. In the same field of endeavor, Erick teaches: wherein the self-position estimation method includes a visual inertial odometry (i.e. The sensors may include, for example, an accelerometer, a gyroscope, and a magnetometer. Changes to the calibrated pose of the camera 122 may be determined with inertial measurements from the IMU to estimate the current pose and location of the camera 122 in real time. For example, visual inertia odometry (VIO) or current odometry and mapping (OCM) may be used- ¶0046). 
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo and Charles with the teachings of Charles to detect a location, speed, and heading of vehicle and/or mobile device (Erick- ¶0032). Regarding claim 4, Sambo, Charles and Erick teach all the limitations of claim 3. However, Sambo and Charles do not teach explicitly: wherein the self-position estimation method further includes estimation using a detection value of a wheel speed sensor. In the same field of endeavor, Erick teaches: wherein the self-position estimation method further includes estimation using a detection value of a wheel speed sensor (i.e. Inertial sensors 180 may be configured to detect a location, speed, and heading of vehicle 100 and/or mobile device 120. Inertial sensors may include accelerometers, gyroscopes, wheel speed sensors, compasses or directional sensors, combinations thereof, and the like. Other sensors include those that may be electrically coupled to the vehicle computer 112 or mobile device 120 such that information can be transmitted and received. For example, data collected by the vehicle computer 112 and/or the mobile device 120 may include weather data from weather stations, and the like- ¶0046). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo and Charles with the teachings of Charles to detect a location, speed, and heading of vehicle and/or mobile device (Erick- ¶0032). Claim 5 is rejected under 35 U.S.C. 103 as being unpatentable over Francesco Sambo et al. [US 20220044024 A1: already of record] in view of Charles R. Qi et. al. [US 20190147245 A1: already of record]. Regarding claim 5, Sambo teaches: 5. An estimation device (i.e. FIGS. 1A-1J are diagrams of one or more example implementations 100 associated with utilizing machine learning models to reconstruct a vehicle accident scene from video- ¶0111) comprising: a coordinate calculation unit configured to calculate three-dimensional coordinates of an object (i.e. sparse point cloud- ¶0014) present around a vehicle based on two-dimensional images representing outside of the vehicle (i.e. As shown in FIG. 1B, and by reference number 130, reconstruction system 115 may generate a sparse point cloud of the location associated with the accident. For example, reconstruction system 115 may process the video data, with a simultaneous localization and mapping (SLAM) model, to generate the sparse point cloud of the location associated with the accident. The SLAM model may include a model that constructs a three-dimensional map (e.g., a point cloud) of an environment based on the video data and that identifies a location of the first vehicle 110-1 within the point cloud- ¶0014) captured by a plurality of cameras mounted on the vehicle (i.e. Vehicle device 105 includes one or more devices capable of receiving, generating, storing, processing, and/or providing information, such as information described herein. For example, vehicle device 105 may include a device included in a vehicle (e.g., vehicle 110) for obtaining video data associated with vehicle 110 traveling along a route. For example, vehicle device 105 may include a video camera, a dash camera, a parking assist camera, a backup assist camera, a thermal camera, lidar, radar, and/or the like. 
In some implementations, vehicle device 105 may include a device for obtaining other types of data associated with vehicle 110 traveling along a route. For example, vehicle device 105 may include an inertial measurement unit, a three-axis accelerometer, a gyroscope, a global positioning system (GPS) device, an on-board diagnostics (OBD) device, a vehicle tracking unit, an engine control unit (ECU), and/or the like- ¶0065), by using a self-position estimation method including a visual odometry which calculates the three-dimensional coordinates of the object (i.e. In some implementations, a direct sparse odometry (DSO) model is utilized as the SLAM model. A DSO model may include a direct and a sparse model for visual odometry. The DSO model may combine a fully direct probabilistic model (e.g., that minimizes a photometric error) with a consistent, joint optimization of all model parameters, including geometry (e.g., represented as an inverse depth in a reference frame) and camera motion- ¶0014) in sequential two-dimensional images captured by a same camera, which is one of the plurality of cameras (i.e. The DSO model may select particular points in each frame of the video data based on a gradient of pixel intensity in each frame (e.g., corners and edges may be selected due to pixel intensity). The DSO model may utilize the particular points in consecutive frames of the video data to estimate a camera pose in every frame of the video data, with respect to a camera position in a first frame of the video data. After this, the DSO may generate a three-dimensional point cloud that corresponds to a sparse three-dimensional map of the location of the accident based on the particular points, and may generate camera poses, for all frames of the video data, that correspond to a trajectory of the first vehicle 110-1 with vehicle device 105- ¶0015); a feature obtaining unit configured to obtain a bird’s-eye view (BEV) feature (i.e. As shown in FIG. 1F, and by reference number 150, reconstruction system 115 may process the dense semantic point cloud, with a voxelization model, to determine a dense semantic overhead view (BEV) of the location associated with the accident. A simple top-down projection is insufficient to generate the dense semantic BEV from the dense semantic point cloud- ¶0020), which is a feature in a BEV space (i.e. a ground plane- ¶0016… a dense semantic overhead view (BEV) of the location associated with the accident- ¶0020), based on the three-dimensional coordinates and at least one of the two-dimensional images representing outside of the vehicle by using a BEV estimation algorithm (i.e. As shown in FIG. 1C, and by reference number 135, reconstruction system 115 may process the video data, with a convolutional neural network (CNN) model, to generate depth maps for the frames of the video data- ¶0017… As shown in FIG. 1D, and by reference number 140, reconstruction system 115 may utilize the depth maps with the sparse point cloud to generate a dense point cloud. For example, reconstruction system 115 may utilize information associated with the depth maps to enrich the sparse point cloud with additional points- ¶0018… As shown in FIG. 1E, and by reference number 145, reconstruction system 115 may generate a dense semantic point cloud. For example, reconstruction system 115 may process the video data, with a semantic segmentation model, to generate the dense semantic point cloud. 
In some implementations, reconstruction system 115 may store, in the dense point cloud, information indicating what each point represents. Such information may be needed in order to correctly handle the points in the dense point cloud. In some implementations, the semantic segmentation model may store such information in the dense point cloud to generate the dense semantic point cloud- ¶0019); and a bird’s-eye view generation unit configured to generate a bird’s-eye view(i.e. Reconstruction system 115 may project the three-dimensional point cloud onto a ground plane to obtain a sparse overhead view (e.g., a bird's eye view (BEV)) representation of the location of the accident, and may project the camera poses onto the sparse overhead view to obtain the trajectory of the first vehicle 110-1. However, the resulting overhead view may include only points that the SLAM model selected and tracked, and thus may be sparse (e.g., contain less than a threshold quantity of points)- ¶0016), as a top-down perspective image of the vehicle, based on the BEV feature (i.e. As shown in FIG. 1I, and by reference number 165, reconstruction system 115 may augment the dense semantic BEV and trajectories, with additional data, to generate a final BEV. For example, reconstruction system 115 may augment the dense semantic BEV with additional data, such as satellite images, data identifying road names, data identifying vehicle speeds, data identifying vehicle distances, and/or the like- ¶0024). wherein the feature obtaining unit is configured to: generate, using the three-dimensional coordinates, a depth map of each of the two-dimensional images used to calculate the three-dimensional coordinates (i.e. refining the predicted depth maps with a direct sparse odometry model to generate the depth maps for the frames of the video data- ¶0102); However, Sambo does not teach explicitly: generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image; obtain an RGB-D data feature, which is a feature of the RGB-D data, by inputting the RGB-D data generated based on the two-dimensional image captured by each of the plurality of cameras into a corresponding one of a plurality of machine learning models, each of the plurality of machine learning models having been trained to take therein the RGB-D data generated based on the two-dimensional image captured by a corresponding one of the plurality of cameras as input and output the RGB-D data feature; and generate the BEV feature based on the RGB-D data feature. In the same field of endeavor, Charles teaches: generate RGB-D data by fusing the two-dimensional image and the depth map corresponding to the two-dimensional image (i.e. the three-dimensional depth data comprises RGB-D data comprising a red-green-blue (RBG) image with the corresponding three-dimensional depth data comprising an image channel in which each pixel relates to a distance between the image plane and the corresponding object in the RGB image- ¶0051… The devices, system, and methods herein outperform previous state-of-the-art methods by a large margin. While MV3D uses multi-view feature aggregation and sophisticated multi-sensor fusion strategy, the methods provided herein based on PointNet (v1) and PointNet++(v2) backbone are much cleaner in design- ¶0146); obtain an RGB-D data feature, which is a feature of the RGB-D data (i.e. point cloud in frustum (in point)- fig. 
2), by inputting the RGB-D data generated based on the two-dimensional image captured by each of the plurality of cameras into a corresponding one of a plurality of machine learning models (i.e. Provided herein are methods and systems for implementing three-dimensional perception in an autonomous robotic system comprising an end-to-end neural network architecture that directly consumes large-scale raw sparse point cloud data and performs such tasks as object localization, boundary estimation, object classification, and segmentation of individual shapes or fused complete point cloud shapes- Abstract), each of the plurality of machine learning models having been trained to take therein the RGB-D data (i.e. Some methods involve applying a convolutional network to the collected three-dimensional depth data through deep net architectures or PointNets, to classify objects and semantic segmentations. Although PointNets are capable of classifying a three-dimensional depth data to predicting semantic class, such a technology has yet to be used for instance-level 3D object detection. The challenge with such a task, however, is that the computational complexity of object recognition and classification grows cubically with respect to resolution, which may be too expensive and time intensive for use in large three dimensional scenes- ¶0004… In some embodiments, the deep learning model comprises a PointNet. In some embodiments, the deep learning model comprises a three-dimensional convolutional neural network on voxelized volumetric grids of the point cloud in frustum. In some embodiments, the deep learning model comprises a two-dimensional convolutional neural network on bird's eye view projection of the point cloud in frustum. In some embodiments, the deep learning model comprises a recurrent neural network on the sequence of the three-dimensional points from close to distant- ¶0053…one or more of the learning modules are trained by supervised learning with provided ground-truth objects of interest, attention regions, and oriented three-dimensional boundaries- ¶0055…Architecture is illustrated for PointNet++(v2) models with set abstraction layers and feature propagation layers (for instance of the algorithm, following the same rationale as in the frustum proposal step- ¶0134) generated based on the two-dimensional image captured by a corresponding one of the plurality of cameras as input and output the RGB-D data feature (i.e. The entire procedure for extracting frustum point clouds from RGB-D data is termed frustum proposal generation- ¶0128); and generate the BEV feature based on the RGB-D data feature (i.e. the deep learning model comprises a two-dimensional convolutional neural network on bird's eye view projection of the point cloud in frustum- ¶0010). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo with the teachings of Charles to improve segmentation performance (Charles- Abstract). Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Francesco Sambo et al. [US 20220044024 A1: already of record] in view of Charles R. Qi et. al. [US 20190147245 A1: already of record] and further in view of Erick Lavoie et al. [US 20220260993 A1: already of record]. Regarding claim 6, Sambo and Charles teach all the limitations of claim 5. However, Sambo and Charles do not teach explicitly: wherein the self-position estimation method includes a visual inertial odometry. 
In the same field of endeavor, Erick teaches: wherein the self-position estimation method includes a visual inertial odometry (i.e. The sensors may include, for example, an accelerometer, a gyroscope, and a magnetometer. Changes to the calibrated pose of the camera 122 may be determined with inertial measurements from the IMU to estimate the current pose and location of the camera 122 in real time. For example, visual inertia odometry (VIO) or current odometry and mapping (OCM) may be used- ¶0046). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo and Charles with the teachings of Erick to detect a location, speed, and heading of vehicle and/or mobile device (Erick- ¶0032). Regarding claim 7, Sambo, Charles and Erick teach all the limitations of claim 6. However, Sambo and Charles do not teach explicitly: wherein the self-position estimation method further includes estimation using a detection value of a wheel speed sensor. In the same field of endeavor, Erick teaches: wherein the self-position estimation method further includes estimation using a detection value of a wheel speed sensor (i.e. Inertial sensors 180 may be configured to detect a location, speed, and heading of vehicle 100 and/or mobile device 120. Inertial sensors may include accelerometers, gyroscopes, wheel speed sensors, compasses or directional sensors, combinations thereof, and the like. Other sensors include those that may be electrically coupled to the vehicle computer 112 or mobile device 120 such that information can be transmitted and received. For example, data collected by the vehicle computer 112 and/or the mobile device 120 may include weather data from weather stations, and the like- ¶0046). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo and Charles with the teachings of Erick to detect a location, speed, and heading of vehicle and/or mobile device (Erick- ¶0032). Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Francesco Sambo et al. [US 20220044024 A1: already of record] in view of Zhijian Liu et al. [BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation: already of record]. Regarding claim 8, Sambo teaches: 8. An estimation device (i.e. FIGS. 1A-1J are diagrams of one or more example implementations 100 associated with utilizing machine learning models to reconstruct a vehicle accident scene from video- ¶0111) comprising: a coordinate calculation unit configured to calculate three-dimensional coordinates of an object (i.e. sparse point cloud- ¶0014) present around a vehicle based on two-dimensional images representing outside of the vehicle (i.e. As shown in FIG. 1B, and by reference number 130, reconstruction system 115 may generate a sparse point cloud of the location associated with the accident. For example, reconstruction system 115 may process the video data, with a simultaneous localization and mapping (SLAM) model, to generate the sparse point cloud of the location associated with the accident. The SLAM model may include a model that constructs a three-dimensional map (e.g., a point cloud) of an environment based on the video data and that identifies a location of the first vehicle 110-1 within the point cloud- ¶0014) captured by a plurality of cameras mounted on the vehicle (i.e. 
Vehicle device 105 includes one or more devices capable of receiving, generating, storing, processing, and/or providing information, such as information described herein. For example, vehicle device 105 may include a device included in a vehicle (e.g., vehicle 110) for obtaining video data associated with vehicle 110 traveling along a route. For example, vehicle device 105 may include a video camera, a dash camera, a parking assist camera, a backup assist camera, a thermal camera, lidar, radar, and/or the like. In some implementations, vehicle device 105 may include a device for obtaining other types of data associated with vehicle 110 traveling along a route. For example, vehicle device 105 may include an inertial measurement unit, a three-axis accelerometer, a gyroscope, a global positioning system (GPS) device, an on-board diagnostics (OBD) device, a vehicle tracking unit, an engine control unit (ECU), and/or the like- ¶0065), by using a self-position estimation method including a visual odometry which calculates the three-dimensional coordinates of the object (i.e. In some implementations, a direct sparse odometry (DSO) model is utilized as the SLAM model. A DSO model may include a direct and a sparse model for visual odometry. The DSO model may combine a fully direct probabilistic model (e.g., that minimizes a photometric error) with a consistent, joint optimization of all model parameters, including geometry (e.g., represented as an inverse depth in a reference frame) and camera motion- ¶0014) in sequential two-dimensional images captured by a same camera, which is one of the plurality of cameras (i.e. The DSO model may select particular points in each frame of the video data based on a gradient of pixel intensity in each frame (e.g., corners and edges may be selected due to pixel intensity). The DSO model may utilize the particular points in consecutive frames of the video data to estimate a camera pose in every frame of the video data, with respect to a camera position in a first frame of the video data. After this, the DSO may generate a three-dimensional point cloud that corresponds to a sparse three-dimensional map of the location of the accident based on the particular points, and may generate camera poses, for all frames of the video data, that correspond to a trajectory of the first vehicle 110-1 with vehicle device 105- ¶0015); a feature obtaining unit configured to obtain a bird’s-eye view (BEV) feature (i.e. As shown in FIG. 1F, and by reference number 150, reconstruction system 115 may process the dense semantic point cloud, with a voxelization model, to determine a dense semantic overhead view (BEV) of the location associated with the accident. A simple top-down projection is insufficient to generate the dense semantic BEV from the dense semantic point cloud- ¶0020), which is a feature in a BEV space (i.e. a ground plane- ¶0016… a dense semantic overhead view (BEV) of the location associated with the accident- ¶0020), based on the three-dimensional coordinates and at least one of the two-dimensional images representing outside of the vehicle by using a BEV estimation algorithm (i.e. As shown in FIG. 1C, and by reference number 135, reconstruction system 115 may process the video data, with a convolutional neural network (CNN) model, to generate depth maps for the frames of the video data- ¶0017… As shown in FIG. 1D, and by reference number 140, reconstruction system 115 may utilize the depth maps with the sparse point cloud to generate a dense point cloud. 
For example, reconstruction system 115 may utilize information associated with the depth maps to enrich the sparse point cloud with additional points- ¶0018… As shown in FIG. 1E, and by reference number 145, reconstruction system 115 may generate a dense semantic point cloud. For example, reconstruction system 115 may process the video data, with a semantic segmentation model, to generate the dense semantic point cloud. In some implementations, reconstruction system 115 may store, in the dense point cloud, information indicating what each point represents. Such information may be needed in order to correctly handle the points in the dense point cloud. In some implementations, the semantic segmentation model may store such information in the dense point cloud to generate the dense semantic point cloud- ¶0019); and a bird’s-eye view generation unit configured to generate a bird’s-eye view(i.e. Reconstruction system 115 may project the three-dimensional point cloud onto a ground plane to obtain a sparse overhead view (e.g., a bird's eye view (BEV)) representation of the location of the accident, and may project the camera poses onto the sparse overhead view to obtain the trajectory of the first vehicle 110-1. However, the resulting overhead view may include only points that the SLAM model selected and tracked, and thus may be sparse (e.g., contain less than a threshold quantity of points)- ¶0016), as a top-down perspective image of the vehicle, based on the BEV feature (i.e. As shown in FIG. 1I, and by reference number 165, reconstruction system 115 may augment the dense semantic BEV and trajectories, with additional data, to generate a final BEV. For example, reconstruction system 115 may augment the dense semantic BEV with additional data, such as satellite images, data identifying road names, data identifying vehicle speeds, data identifying vehicle distances, and/or the like- ¶0024). However, Sambo do not teach explicitly: wherein the feature obtaining unit is configured to: obtain image features, which are features in the two-dimensional images, by inputting the two-dimensional images captured by each of the plurality of cameras into a corresponding one of a plurality of first machine learning models each trained to take therein the two-dimensional images captured by a corresponding one of the plurality of cameras as input and output the image features; generate a first BEV feature, which is a feature in the BEV space, based on the image features; obtain a three-dimensional feature, which is a feature of the three-dimensional coordinates, by inputting the three-dimensional coordinates into a second machine learning model having been trained to take therein the three-dimensional coordinates as input and output the three-dimensional feature; and generate a second BEV feature, which is a feature in the BEV space, based on the three-dimensional feature, and the bird’s-eye view generation unit is configured to generate the bird’s-eye view based on a fused feature obtained by fusing the first BEV feature and the second BEV feature. In the same field of endeavor, Zhijian teaches: wherein the feature obtaining unit is configured to: obtain image features (i.e. camera features- fig. 
2), which are features in the two-dimensional images, by inputting the two-dimensional images captured by each of the plurality of cameras into a corresponding one of a plurality of first machine learning models each trained to take therein the two-dimensional images captured by a corresponding one of the plurality of cameras as input and output the image features (i.e. camera encoder- fig. 2… Unlike existing approaches [54, 55, 1] that freeze the camera encoder, we train the entire model in an end-to-end manner. We apply both image and LiDAR data augmentations to prevent overfitting. Optimization is carried out using AdamW [34] with a weight decay of 10-2.); generate a first BEV feature, which is a feature in the BEV space, based on the image features (i.e. camera feat. (in BEV)); obtain a three-dimensional feature (i.e. LiDAR Features- fig. 2), which is a feature of the three-dimensional coordinates, by inputting the three-dimensional coordinates (i.e. LiDAR point cloud- Fig. 2) into a second machine learning model having been trained to take therein the three-dimensional coordinates as input and output the three-dimensional feature(i.e. LiDAR encoder- fig. 2…); and generate a second BEV feature (Lidar feat. (in BEV)- fig. 2), which is a feature in the BEV space, based on the three-dimensional feature, and the bird’s-eye view generation unit is configured to generate the bird’s-eye view based on a fused feature obtained by fusing the first BEV feature and the second BEV feature (i.e. fused BEV features- fig. 2). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo with the teachings of Zhijian to accelerate BEV pooling with precomputation and interval reduction (Zhijian- section 3). Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Francesco Sambo et al. [US 20220044024 A1: already of record] in view of Zhijian Liu et al. [BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird’s-Eye View Representation: already of record] and further in view of Erick Lavoie et al. [US 20220260993 A1: already of record]. Regarding claim 9, Sambo and Zhijian teach all the limitations of claim 8. However, Sambo and Zhijian do not teach explicitly: wherein the self-position estimation method includes a visual inertial odometry. In the same field of endeavor, Erick teaches: wherein the self-position estimation method includes a visual inertial odometry (i.e. The sensors may include, for example, an accelerometer, a gyroscope, and a magnetometer. Changes to the calibrated pose of the camera 122 may be determined with inertial measurements from the IMU to estimate the current pose and location of the camera 122 in real time. For example, visual inertia odometry (VIO) or current odometry and mapping (OCM) may be used- ¶0046). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo and Zhijian with the teachings of Erick to detect a location, speed, and heading of vehicle and/or mobile device (Erick- ¶0032). Regarding claim 10, Sambo, Zhijian and Erick teach all the limitations of claim 9. However, Sambo and Zhijian do not teach explicitly: wherein the self-position estimation method further includes estimation using a detection value of a wheel speed sensor. 
In the same field of endeavor, Erick teaches: wherein the self-position estimation method further includes estimation using a detection value of a wheel speed sensor (i.e. Inertial sensors 180 may be configured to detect a location, speed, and heading of vehicle 100 and/or mobile device 120. Inertial sensors may include accelerometers, gyroscopes, wheel speed sensors, compasses or directional sensors, combinations thereof, and the like. Other sensors include those that may be electrically coupled to the vehicle computer 112 or mobile device 120 such that information can be transmitted and received. For example, data collected by the vehicle computer 112 and/or the mobile device 120 may include weather data from weather stations, and the like- ¶0046). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify the teachings of Sambo and Zhijian with the teachings of Erick to detect a location, speed, and heading of vehicle and/or mobile device (Erick- ¶0032). Conclusion THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action. Any inquiry concerning this communication or earlier communications from the examiner should be directed to CLIFFORD HILAIRE whose telephone number is (571)272-8397. The examiner can normally be reached 5:30-1400. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, SATH V PERUNGAVOOR can be reached at (571)272-7455. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. CLIFFORD HILAIRE Primary Examiner Art Unit 2488 /CLIFFORD HILAIRE/Primary Examiner, Art Unit 2488
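The claim-construction discussion in the Office Action introduces set notation that is hard to follow in running text. The block below is a cleaned-up restatement of the examiner's notation, assuming the undefined g(V) was intended to be the selection function f_sel; it is illustrative only and not an official construction.

    \begin{align*}
      C &= \{c_0, \dots, c_N\}, \quad N > 1
        && \text{the plurality of cameras mounted on the vehicle} \\
      v_n &= \text{sequential 2-D images captured by camera } c_n,
        \quad V = \{v_0, \dots, v_N\} \\
      v_{\mathrm{obj}} &= f_{\mathrm{sel}}(V)
        && \text{the single-camera image sequence that contains the object} \\
      p_{\mathrm{obj}} &= f_{\mathrm{odo}}(v_{\mathrm{obj}})
        && \text{visual odometry } f_{\mathrm{odo}}\text{, within the self-position estimation method } f_{\mathrm{ego}}\text{, yields the 3-D coordinates}
    \end{align*}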
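To make the disputed limitations of claims 1 and 5 concrete, here is a minimal, hypothetical numpy sketch of the pipeline as the claim language reads: visual-odometry 3-D points are projected into each camera to form per-image depth maps, an image is fused with its depth map into RGB-D data (claim 5), per-pixel image features are lifted along camera rays into a frustum-shaped point cloud (claim 1), and the points are pooled into a BEV feature. All function and variable names are invented for illustration; this is not the applicant's implementation or any cited reference's method.

    import numpy as np

    def depth_map_from_vo_points(points_3d, K, T_cam_from_world, hw):
        """Project visual-odometry 3-D points into one camera to get a sparse depth map.
        points_3d: (N, 3) world-frame coordinates from visual odometry.
        K: (3, 3) intrinsics.  T_cam_from_world: (4, 4) extrinsics.  hw: (H, W)."""
        H, W = hw
        depth = np.zeros((H, W), dtype=np.float32)
        pts_h = np.hstack([points_3d, np.ones((len(points_3d), 1))])   # homogeneous coords
        cam = (T_cam_from_world @ pts_h.T).T[:, :3]                    # camera frame
        cam = cam[cam[:, 2] > 0]                                       # keep points in front of camera
        uv = (K @ cam.T).T
        u, v, z = uv[:, 0] / uv[:, 2], uv[:, 1] / uv[:, 2], cam[:, 2]
        keep = (u >= 0) & (u < W) & (v >= 0) & (v < H)
        depth[v[keep].astype(int), u[keep].astype(int)] = z[keep]      # sparse depth map
        return depth

    def fuse_rgbd(image, depth):
        """Claim-5 style fusion: concatenate the depth map as a fourth channel (RGB-D)."""
        return np.concatenate([image, depth[..., None]], axis=-1)      # (H, W, 4)

    def frustum_point_cloud(image_features, depth, K):
        """Claim-1 style lifting: back-project each pixel with depth along its viewing
        ray, carrying its per-pixel image feature, giving a frustum-shaped point cloud."""
        v, u = np.nonzero(depth)                                       # pixels with a depth value
        z = depth[v, u]
        x = (u - K[0, 2]) * z / K[0, 0]
        y = (v - K[1, 2]) * z / K[1, 1]
        xyz = np.stack([x, y, z], axis=-1)                             # (M, 3) camera-frame points
        feats = image_features[v, u]                                   # (M, C) point features
        return xyz, feats

    def bev_feature(xyz, feats, grid=(200, 200), cell=0.5):
        """Pool point features onto a top-down grid to obtain a BEV feature map."""
        Hg, Wg = grid
        bev = np.zeros((Hg, Wg, feats.shape[1]), dtype=np.float32)
        gx = np.clip((xyz[:, 0] / cell + Wg / 2).astype(int), 0, Wg - 1)
        gy = np.clip((xyz[:, 2] / cell).astype(int), 0, Hg - 1)        # z is the forward axis here
        np.maximum.at(bev, (gy, gx), feats)                            # max-pool features per cell
        return bev

In this reading, the point of contention is the first step: the depth map comes from coordinates the visual odometry itself computed, not from a depth sensor.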
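Claim 8's architecture differs: per-camera image features feed first machine-learning models that yield a first BEV feature, the VO-calculated 3-D coordinates feed a second model that yields a second BEV feature, and the two are fused before the bird's-eye view is generated. The following is a hypothetical sketch of that two-branch structure only; the names are invented, and it is neither BEVFusion's actual architecture nor the applicant's code.

    import numpy as np

    def first_bev_branch(images, image_encoders, lift_to_bev):
        """One first model per camera: images -> image features -> first BEV feature."""
        feats = [enc(img) for img, enc in zip(images, image_encoders)]
        return lift_to_bev(feats)                        # (H, W, C) camera-derived BEV feature

    def second_bev_branch(vo_points_3d, point_encoder, splat_to_bev):
        """Second model: VO 3-D coordinates -> 3-D feature -> second BEV feature."""
        point_feats = point_encoder(vo_points_3d)        # (N, C) per-point features
        return splat_to_bev(vo_points_3d, point_feats)   # (H, W, C) geometry-derived BEV feature

    def fused_birds_eye_view(bev_cam, bev_geo, decoder):
        """Fuse both BEV features, then decode a top-down image of the vehicle's surroundings."""
        fused = np.concatenate([bev_cam, bev_geo], axis=-1)   # simple channel-wise fusion
        return decoder(fused)

The applicant's emphasized distinction is that the second branch consumes VO-calculated coordinates rather than LiDAR points, which is what would make the fused, LiDAR-less BEV generation possible.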

Prosecution Timeline

Sep 19, 2024: Application Filed
Oct 26, 2025: Non-Final Rejection — §103
Jan 21, 2026: Examiner Interview Summary
Jan 21, 2026: Applicant Interview (Telephonic)
Jan 28, 2026: Response Filed
Feb 26, 2026: Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602591
TRAINING REINFORCEMENT LEARNING AGENTS USING AUGMENTED TEMPORAL DIFFERENCE LEARNING
2y 5m to grant · Granted Apr 14, 2026
Patent 12596427
REWARD GENERATING METHOD FOR REDUCING PEAK LOAD OF POWER CONSUMPTION AND COMPUTING DEVICE FOR PERFORMING THE SAME
2y 5m to grant · Granted Apr 07, 2026
Patent 12576797
ROTATING DEVICE FOR DISPLAY OF VEHICLE
2y 5m to grant · Granted Mar 17, 2026
Patent 12573211
SYSTEMS AND METHODS FOR MANEUVER IDENTIFICATION FROM CONDENSED REPRESENTATIONS OF VIDEO
2y 5m to grant · Granted Mar 10, 2026
Patent 12568310
TARGET TRACKING DEVICE, TARGET TRACKING METHOD, AND RECORDING MEDIUM FOR STORING TARGET TRACKING PROGRAM
2y 5m to grant · Granted Mar 03, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

AI Strategy Recommendation

Get an AI-powered prosecution strategy using examiner precedents, rejection analysis, and claim mapping.

Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 72%
With Interview: 87% (+15.7%)
Median Time to Grant: 2y 8m
PTA Risk: Moderate
Based on 438 resolved cases by this examiner. Grant probability derived from career allow rate.
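
For readers wondering how the headline numbers relate, a minimal sketch of the assumed arithmetic is below. The tool's actual model is not documented on this page, so treat the derivation (simple ratio plus additive interview lift) as an assumption rather than a description of its method.

    # Assumed derivation of the projection figures from the career stats shown above.
    granted, resolved = 313, 438                  # examiner's granted and resolved cases
    base_rate = granted / resolved                # ~0.715; the dashboard displays 72%
    interview_lift = 0.157                        # reported lift for cases with an interview
    with_interview = base_rate + interview_lift   # ~0.872; the dashboard displays 87%
    print(f"base {base_rate:.1%}, with interview {with_interview:.1%}")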
