Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Information Disclosure Statement

The information disclosure statement filed has the following deficiencies: The application does not include a copy of KR100232670B1. The references cited in lines 781-830 list the reference as, for example, "WO 19/191578" when it should be listed as WO 2019/191578A1. The issues noted above fail to comply with 37 CFR 1.98(a)(2), which requires a legible copy of each cited foreign patent document; each non-patent literature publication or that portion which caused it to be listed; and all other information or that portion which caused it to be listed. It has been placed in the application file, but the information referred to therein has not been considered.

Drawings

The drawings are objected to under 37 CFR 1.84(a)(1). The drawings are in grayscale; black and white drawings should be submitted. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as "amended." If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either "Replacement Sheet" or "New Sheet" pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification

The disclosure is objected to because of the following informalities: Paragraph 0059 describes Figure 3F as follows: "Figure 3F illustrates images 350 which represent a view about ego (e.g., a vehicle) ... is illustrated along with an indication of nearby objects (e.g., the shopping carts 352). Advantageously, a ground truth is included in red along with an indication of a model estimate for the location of the nearby objects." However, the drawings do not show any color. Appropriate correction is required.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

Claims 6 and 17-21 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 6 recites the limitation "the error" in line 2. There is insufficient antecedent basis for this limitation in the claim.

Claim 17 recites the limitation "cause the computers to perform operations comprising:" in line 2. This limitation renders the claim indefinite because the claim does not recite any operations, so the scope of what the operations comprise is unclear. The dependent claims inherit this issue and are rejected on the same basis. (Examiner Note: The limitations missing from claim 17 may be the same as those of independent claim 1; for compact prosecution purposes, claim 17 will be examined as analogous to claim 1.)

Claim 18 recites the limitations "the user interface," "the end-user vehicle," "the objects," "the location," and "the output." There is insufficient antecedent basis for these limitations in the claim.

Claim 19 recites the limitation "the error" in line 2. There is insufficient antecedent basis for this limitation in the claim.

Claim 20 recites the limitation "the user interface" in line 2. There is insufficient antecedent basis for this limitation in the claim.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-21 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The claims do not recite additional elements that amount to significantly more than the abstract idea. The subject matter eligibility test for products and processes is described below for claim 1 in view of the dependent claims.

Regarding claim 1:
Step 1: Is the claim to a process, machine, manufacture, or composition of matter? Yes - claim 1 recites a method, which falls under the statutory categories.
Step 2A Prong 1: Does the claim recite an abstract idea, law of nature, or natural phenomenon? Yes - the claim recites "determining values associated with metrics based on the obtained output," which is a mental process of determining values associated with metrics (see MPEP 2106.04(a)(2), subsection III).
Step 2A Prong 2: Does the claim recite additional elements that integrate the judicial exception into a practical application? No - the claim includes the following additional elements:
"A method implemented by a system of one or more processors, the method comprising: obtaining information associated with a machine learning (ML) model, wherein the ML model is associated with autonomous or semi-autonomous operation of a vehicle;" - This additional element is insignificant extra-solution activity, namely mere data gathering by obtaining information associated with a ML model. See MPEP 2106.05(g).
"obtaining validation data, wherein the validation data includes one or more video sequences obtained from image sensors of an end-user vehicle;" - This additional element is insignificant extra-solution activity, namely mere data gathering by obtaining validation data. See MPEP 2106.05(g).
"obtaining output via computing forward pass-through ML model using validation data, wherein the output indicates, at least, location information associated with objects detected via the ML model in the validation data;" - This additional element amounts to "apply it," i.e., using a generic computer to obtain output by processing the validation data with the ML model. See Mere Instructions to Apply an Exception (MPEP 2106.05(f)).
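(Examiner Note: For illustration only, the following Python sketch shows, in general terms, the kind of forward-pass, metric-determination, and user-interface-information steps recited in the claim elements discussed above and below. All identifiers are hypothetical and are not taken from Applicant's disclosure or from any cited reference.)

# Illustrative only: hypothetical names, not Applicant's implementation or
# code from any cited reference.
from typing import Callable, Dict, List, Tuple

Location = Tuple[float, float]  # (x, y) location of a detected object

def run_validation(model: Callable[[object], List[Location]],
                   frames: List[object],
                   ground_truth: List[List[Location]]) -> Dict[str, object]:
    """Forward-pass a model over validation frames, determine a simple
    location-error metric, and package values for user interface display."""
    # Forward pass: one list of predicted object locations per frame.
    outputs = [model(frame) for frame in frames]

    # Metric: distance from each ground-truth object to its nearest prediction.
    errors = []
    for preds, gts in zip(outputs, ground_truth):
        for gx, gy in gts:
            if preds:
                errors.append(min(((gx - px) ** 2 + (gy - py) ** 2) ** 0.5
                                  for px, py in preds))

    metrics = {"mean_location_error": sum(errors) / len(errors) if errors else None,
               "frames_evaluated": len(frames)}
    # "User interface information": metric values plus per-frame output to render.
    return {"metrics": metrics, "per_frame_output": outputs}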
"generating user interface information based on one or more of the determined values or obtained output" - This additional element amounts to "apply it," i.e., using a generic computer to generate user interface information. See Mere Instructions to Apply an Exception (MPEP 2106.05(f)).
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception? No - the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Viewed as a whole, the claim is directed to obtaining results using a machine learning model and displaying the results. As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of obtaining and generating amount to using a generic computer to apply the exception and to mere data gathering. The method does not improve the functioning of a computer, does not transform an article to a different state or thing, and is not applied with a particular machine; the claim is therefore not patent eligible.

Regarding claim 2:
Step 2A Prong 2, Step 2B: The additional element "The method of claim 1, wherein the validation data further includes one or more of velocities of the end-user vehicle" is insignificant extra-solution activity. See MPEP 2106.05(g).

Regarding claim 3:
Step 2A Prong 2, Step 2B: The additional element "The method of claim 1, wherein the user interface information includes a graphical representation of objects detected via the ML model" is insignificant extra-solution activity. See MPEP 2106.05(g). The additional element does not integrate the judicial exception into a practical application, does not provide an improvement, and does not provide an inventive concept.

Regarding claim 4:
Step 2A Prong 2, Step 2B: The additional element "The method of claim 3, wherein the user interface information further includes error information associated with the objects" is insignificant extra-solution activity. See MPEP 2106.05(g). The additional element does not integrate the judicial exception into a practical application, does not provide an improvement, and does not provide an inventive concept.

Regarding claim 5:
Step 2A Prong 2, Step 2B: The additional elements "The method of claim 1, wherein the user interface: presents a graphical depiction of the end-user vehicle; presents ground truth locations of the objects which are proximate to the end-user vehicle;" amount to "apply it," i.e., using a generic computer to present a graphical depiction of the end-user vehicle and present ground truth locations. See Mere Instructions to Apply an Exception (MPEP 2106.05(f)). The additional element "adjusts individual presentations of the ground truth locations to reflect individual errors associated with the location information indicated in the output" likewise amounts to "apply it," i.e., using a generic computer to adjust individual presentations of the ground truth locations to reflect individual errors. See Mere Instructions to Apply an Exception (MPEP 2106.05(f)). The additional elements do not integrate the judicial exception into a practical application, do not provide an improvement, and do not provide an inventive concept.
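(Examiner Note: For illustration only, the following Python sketch shows one generic way a user interface could present ground truth locations around an ego vehicle and adjust each presentation based on a per-object error, as recited in claim 5 and in claim 6 addressed below. The identifiers are hypothetical and are not taken from Applicant's disclosure or from any cited reference.)

# Illustrative only: hypothetical rendering scheme, not Applicant's user interface.
def ground_truth_markers(gt_locations, predicted_locations,
                         base_radius=0.5, scale=1.0):
    """Build marker specifications for a top-down view around the ego vehicle.
    Each ground-truth object is drawn as a red circle whose radius grows with
    the distance between its ground-truth and predicted locations (the error)."""
    markers = [{"center": (0.0, 0.0), "shape": "ego-vehicle"}]  # depiction of the ego vehicle
    for (gx, gy), (px, py) in zip(gt_locations, predicted_locations):
        error = ((gx - px) ** 2 + (gy - py) ** 2) ** 0.5
        markers.append({
            "center": (gx, gy),                     # ground truth location
            "radius": base_radius + scale * error,  # radius reflects the error
            "color": "red",                         # ground truth shown in red
        })
    return markers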
Regarding claim 6:
Step 2A Prong 2, Step 2B: The additional element "The method of claim 5, wherein each adjusted presentation includes a color whose radius is selected based on the error" amounts to "apply it," i.e., using a generic computer to include a color in the presentation. See Mere Instructions to Apply an Exception (MPEP 2106.05(f)). The additional element does not integrate the judicial exception into a practical application, does not provide an improvement, and does not provide an inventive concept.

Regarding claim 7:
Step 2A Prong 2, Step 2B: The additional element "The method of claim 5, wherein the user interface presents a particular video sequence and wherein the ground truth locations of the objects are updated based on the video sequence" is insignificant extra-solution activity. See MPEP 2106.05(g). The additional element does not integrate the judicial exception into a practical application, does not provide an improvement, and does not provide an inventive concept.

Regarding claim 8:
Step 2A Prong 2, Step 2B: The additional element "The method of claim 7, wherein the individual presentations of the ground truth locations are updated based on the video sequence" is insignificant extra-solution activity. See MPEP 2106.05(g). The additional element does not integrate the judicial exception into a practical application, does not provide an improvement, and does not provide an inventive concept.

Claims 9-16 recite a system and are analogous to method claims 1-8. Therefore, the rejections of claims 1-8 above apply to claims 9-16. Claims 17-21 recite a computer readable medium product and are analogous to method claims 1 and 5-8. Therefore, the rejections of claims 1 and 5-8 above apply to claims 17-21.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 9-11, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over Zaremba et al. (US11919545B2) ("Zaremba") in view of Jarquin Arroyo et al. (US20220111864A1) ("Arroyo").

Regarding claim 1 and analogous claims 9 and 17, Zaremba teaches A method implemented by a system of one or more processors, the method comprising: obtaining information associated with a machine learning (ML) model, wherein the ML model is associated with autonomous or semi-autonomous operation of a vehicle (Zaremba col 1 lines 33-44, An autonomous vehicle uses different types of sensors to receive input describing the surroundings (or environment) of the autonomous vehicle while driving through traffic. For example, an autonomous vehicle may perceive the surroundings using camera images and lidar scans.
The autonomous vehicle determines whether an object in the surroundings is stationary, for example, buildings or trees, or the object is non-stationary, for example, a pedestrian, a vehicle, and so on. The autonomous vehicle system predicts the motion of non-stationary objects to make sure that the autonomous vehicle is able to navigate through non-stationary obstacles in the traffic. Col 2 lines 25-43, In some embodiments, the system extracts video frames that provides a comprehensive coverage of various types of traffic scenarios to train/validate the ML models configured to predict hidden context attributes of traffic entities in various traffic scenarios that the autonomous vehicle may encounter. The system classifies the video frames according to the traffic scenarios depicted in the videos, each scenario associated with a filter based on one or more attributes including (1) vehicle attributes such as speed, turn direction, and so on and (2) traffic attributes describing behavior of traffic entities, for example, whether a pedestrian crossed the street, and (3) road attributes, for example, whether there is an intersection or a cross walk coming up. The system has access to a large volume of video frames, so it is not practical to train the ML models with all of the video frames. Instead, filters can be applied to video frames to identify subsets of video frames representing the various traffic scenarios to ensure proper training for the various traffic scenarios [obtaining information associated with a machine learning (ML) model, wherein the ML model is associated with autonomous or semi-autonomous operation of a vehicle]. Col 24 lines 2-10, The example computer system 800 includes a processor 802 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), one or more application specific integrated circuits (ASICs), one or more radio-frequency integrated circuits (RFICs), or any combination of these), a main memory 804, and a static memory 806, which are configured to communicate with each other via a bus 808. Col 24 lines 20-29, The storage unit 816 includes a machine-readable medium 822 on which is stored instructions 824 (e.g., software) embodying any one or more of the methodologies or functions described herein. The instructions 824 (e.g., software) may also reside, completely or at least partially, within the main memory 804 or within the processor 802 (e.g., within a processor's cache memory) during execution thereof by the computer system 800, the main memory 804 and the processor 802 also constituting machine-readable media [A method implemented by a system of one or more processors]); obtaining validation data, wherein the validation data includes one or more video sequences obtained from image sensors of an end-user vehicle (Col line 50-58, FIG. 5 represents a flowchart illustrating the process of filtering video frames according to various traffic scenarios, according to an embodiment. The system receives 500 an input set of video frames for processing. The video frames may have been captured by one or more vehicles driving through traffic. The system also receives 505 vehicle attributes for each video frame. For example, the system may receive the vehicle speed, location, and any other parameters describing the vehicle. Col 18 lines 63-67 and Col 19 lines 1-9, The system performs the steps 515 and 520 for each subexpression. The system evaluates 515 the subexpression for each input video frame of the set.
The system filters 520 the set of video frames to exclude video frames that do not satisfy the subexpression. Accordingly, the set of video frames may reduce in each iteration as video frames are filtered from the set. The evaluation of a subexpression may involve executing an ML model using the video frame to determine values of an attribute of the video frame. The system determines the value of the attribute and determines whether the value satisfies the subexpression. If the value satisfies the subexpression, the system keeps the video frame for further evaluation of subexpressions or else the system removes the video frame from the set for the next iteration [ wherein the validation data includes one or more video sequences obtained from image sensors of an end-user vehicle ]. Col 19 line 10-17, The system uses 525 the filtered set of video frames for model training and validation. The use of different filtered sets corresponding to different traffic scenarios for model training ensures that the model is properly trained using training dataset representing a variety of traffic scenarios. The use of different filtered sets corresponding to different traffic scenarios for model validation ensures that the model is properly validated across a variety of traffic scenarios [ obtaining validation data ].) ; obtaining output via computing forward pass-through ML model using validation data, wherein the output indicates, at least, location information associated with objects detected via the ML model in the validation data (Zaremba Col 7 line 1-9, The traffic includes one or more traffic entities, for example, a pedestrian. The vehicle computing system 120 analyzes the sensor data 160 and identifies various traffic entities in the scene, for example, pedestrians, bicyclists, other vehicles, and so on. The vehicle computing system 120 determines various parameters associated with the traffic entity, for example, the location (represented as x and y coordinates), a motion vector describing the movement of the traffic entity, skeletal pose, and so on. Col 7 line 63-67, The traffic scenario system 116 extracts videos/video frames representing different traffic scenarios from videos for training and validating ML models. The system determines different types of scenarios that are extracted for training/validation of models so as to provide a comprehensive coverage of various types of traffic scenarios. The system receives a filter based on various attributes including (1) vehicle attributes such as speed, turn direction, and so on and (2) traffic attributes describing behavior of traffic entities, for example, whether a pedestrian has intent to cross the street, and (3) road attributes, for example, whether there is an intersection or a cross walk coming up. The system applies the filter to video frames to identify sets of video frames representing different scenarios. The video frames classified according to various traffic scenarios are used for ML model validation or training [ wherein the output indicates, at least, location information associated with objects detected via the ML model in the validation data; ]. Col 19 line 10-22, The system uses 525 the filtered set of video frames for model training and validation. The use of different filtered sets corresponding to different traffic scenarios for model training ensures that the model is properly trained using training dataset representing a variety of traffic scenarios. 
The use of different filtered sets corresponding to different traffic scenarios for model validation ensures that the model is properly validated across a variety of traffic scenarios. In some embodiments, if there are traffic scenarios in which the outputs of the ML model causes unnatural movements of the autonomous vehicle, the filtering criteria can be used to identify additional video frames corresponding to these traffic scenarios to be used for further training [obtaining output via computing forward pass-through ML model using validation data].); Zaremba does not explicitly teach: determining values associated with metrics based on the obtained output; and generating user interface information based on one or more of the determined values or obtained output. However, Arroyo teaches determining values associated with metrics based on the obtained output (Arroyo para 0061, The fourth of the four depicted subcomponents of the model-instance-and-dataset-evaluation system 200 is the dataset-training-and-evaluation subsystem 226. As shown, the user 210 may utilize the model-instance-training API 216 to interact with a model-instance trainer 254, and may use the model-instance-evaluation API 218 to interact with a model-instance evaluator 258 to produce test results 260. The test results 260 could take any suitable form, including numerical output, statistical-test results, line graphs, bar graphs, other graphs, and/or the like. Those of skill in the art can certainly choose what type of machine-learning-model instance metrics they would like to see during model development, evaluation, and the like [associated with metrics based on the obtained output]. Para 0143, In an embodiment, model-validation metrics reflect how well a given instance of a given machine-learning model is fulfilling a given sensing (or perception) task, which in various embodiments can include one or more of detection, classification, prediction, and/or the like. One or more model-training metrics may also be provided. Some example model-training metrics include amount of time to ingest the data, amount of time to update the internal hyperparameters, consumed computational resources, how fast it is converging to ground-truth performance, and/or the like [determining values]); and generating user interface information based on one or more of the determined values or obtained output (Arroyo para 0039, In various embodiments, a user 210 may interact with the system via any suitable user interface (e.g., graphical, text-based, command-line, and/or the like). In the embodiments that are primarily described in the present disclosure, the user 210 interacts with the model-instance-and-dataset-evaluation system 200 using command-line instructions, as described more fully below. In different embodiments, the user 210 may make choices with respect to type of machine-learning model (including parameters such as number of layers and the like), and may select a given one of the automated-driving datasets automated driving dataset 202 for processing. Furthermore, as is also described below, the user 210 may use the model-instance-and-dataset-evaluation system 200 to explore datasets, map labels in datasets to a common representation of an automated-driving dataset that is provided by various embodiments. Further functions available via the model-instance-and-dataset-evaluation system 200 are further discussed throughout the present disclosure [generating user interface information].
Para 0061, The fourth of the four depicted subcomponents of the model-instance-and-dataset-evaluation system 200 is the dataset-training-and-evaluation subsystem 226. As shown, the user 210 may utilize the model-instance-training API 216 to interact with a model-instance trainer 254, and may use the model-instance-evaluation API 218 to interact with a model-instance evaluator 258 to produce test results 260. The test results 260 could take any suitable form, including numerical output, statistical-test results, line graphs, bar graphs, other graphs, and/or the like. Those of skill in the art can certainly choose what type of machine-learning-model instance metrics they would like to see during model development, evaluation, and the like [based on one or more of the determined values or obtained output].)). Zaremba and Arroyo are considered to be analogous to the claimed invention because they are in the same field of machine learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zaremba in view of Arroyo to incorporate determining values associated with metrics. Doing so would show how well a given instance of a machine-learning model is fulfilling a given sensing task (Arroyo para 0143, In an embodiment, model-validation metrics reflect how well a given instance of a given machine-learning model is fulfilling a given sensing (or perception) task, which in various embodiments can include one or more of detection, classification, prediction, and/or the like. One or more model-training metrics may also be provided. Some example model-training metrics include amount of time to ingest the data, amount of time to update the internal hyperparameters, consumed computational resources, how fast it is converging to ground-truth performance, and/or the like.). Regarding claim 2 and analogous claim 10, Zaremba and Arroyo teach the method according to claim 1. Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17. Zaremba further teaches wherein the validation data further includes one or more of velocities of the end-user vehicle (Zaremba Col 7 lines 63-67, The traffic scenario system 116 extracts videos/video frames representing different traffic scenarios from videos for training and validating ML models. The system determines different types of scenarios that are extracted for training/validation of models so as to provide a comprehensive coverage of various types of traffic scenarios. The system receives a filter based on various attributes including (1) vehicle attributes such as speed, turn direction, and so on and (2) traffic attributes describing behavior of traffic entities, for example, whether a pedestrian has intent to cross the street, and (3) road attributes, for example, whether there is an intersection or a cross walk coming up. The system applies the filter to video frames to identify sets of video frames representing different scenarios. The video frames classified according to various traffic scenarios are used for ML model validation or training [wherein the validation data further includes one or more of velocities of the end-user vehicle].). Regarding claim 3 and analogous claim 11, Zaremba and Arroyo teach the method according to claim 1. Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17.
Arroyo teaches wherein the user interface information includes a graphical representation of objects detected via the ML model (Arroyo Para 0095, Moreover, in embodiments of the present disclosure, the model-instance-and-dataset-evaluation system 200 enables users to provide training labels in the frame of reference of the sensing algorithms in either 2D or 3D. The following portion of this disclosure illustrates an example of extracting a 3D object (box) (identified by, as an example, the camera 312) and a 2D object (Rect) targeted to the local coordinate frame of reference of the image 328 associated with the camera 308. Example Input Para 0100, >>rect=geo.Rect.from_annotation(annotation, data_frame.transform_tree, 'IMAGE_328') Para 0101, >>print(rect) Example Output Para 0102, Rect (frame=IMAGE_328, label=None, instance_id=179, xc=591.46, yc=-68.85, w=12.83, h=7.66, orientation=0.00) Para 0103, A visualization of the resulting 2D/3D labels is displayed in the example visualization output 800 of FIG. 8. In that visualization are 2D and 3D label annotations rendered after applied to sensor input from the perspective of image 328, which is the projection onto the 2D image plane of the 3D location of the camera 308. Certainly many other examples could be provided as well [includes a graphical representation of objects detected via the ML model].). Claims 4 and 12 are rejected under 35 U.S.C. 103 as being unpatentable over Zaremba in view of Arroyo and further in view of Wang et al. (US 20170185868 A1) ("Wang"). Regarding claim 4 and analogous claim 12, Zaremba and Arroyo teach the method according to claim 1. Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17. Zaremba does not explicitly teach wherein the user interface information further includes error information associated with the objects. However, Wang teaches wherein the user interface information further includes error information associated with the objects (Wang para 0054 lines 7-13, The screenshot 710 displays an LPR image 714 in which there is a lack of illumination. The user may view this image 714 to verify the classification of the fault. In addition, the erroneous registration identifier 716 and the suggested corrected registration identifier 718 may be displayed to allow the user to verify or choose the correct number [associated with the objects]. Para 0057, FIG. 9 shows an exemplary user interface screenshot 902. The user interface includes an upper panel 904, a left panel 906, a central panel 908 and a right panel 910. Upper panel 904 enables the user to pick a particular day (or date) to inspect. Left panel 906 displays a list of sensors ordered by the error rate calculated by the present framework. The user may select one of the displayed sensors for further inspection. Central panel 908 presents an array of erroneous LPR records and suggested corrections determined by the present framework of the selected sensor. The user may select one of the displayed records for inspection and verification. Right panel 910 displays lane distributions 912a-c. Lane distribution 912a represents proportions of un-recognized, falsely-recognized and other types of errors; lane distribution 912b represents proportions of unrecognized errors; and lane distribution 912c represents proportions of falsely-recognized errors [user interface information further includes error information].).
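(Examiner Note: For illustration only, the following Python sketch shows a generic way of computing the kind of per-sensor error-rate information that Wang describes displaying in a user interface. The identifiers are hypothetical and are not taken from Wang or from Applicant's disclosure.)

# Illustrative only: hypothetical error-rate summary for display in a UI panel.
from collections import defaultdict

def sensor_error_rates(records):
    """records: iterable of (sensor_id, is_error) pairs.
    Returns (sensor_id, error_rate) pairs sorted by error rate, highest first."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for sensor_id, is_error in records:
        totals[sensor_id] += 1
        if is_error:
            errors[sensor_id] += 1
    rates = {s: errors[s] / totals[s] for s in totals}
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)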
Zaremba and Wang are considered to be analogous to the claimed invention because they are in the same field of machine learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zaremba in view of Wang to include generating error information. Doing so would allow the user to determine the correct identification (Wang para 0054 lines 7-13, The screenshot 710 displays an LPR image 714 in which there is a lack of illumination. The user may view this image 714 to verify the classification of the fault. In addition, the erroneous registration identifier 716 and the suggested corrected registration identifier 718 may be displayed to allow the user to verify or choose the correct number.). Claims 5, 7, 8, 13, 15, 16, 18, 20, and 21 are rejected under 35 U.S.C. 103 as being unpatentable over Zaremba in view of Arroyo and further in view of Xie, Enze, et al. "M^2BEV: Multi-camera joint 3D detection and segmentation with unified birds-eye view representation." arXiv preprint arXiv:2204.05088 (2022) ("Xie"). Regarding claim 5 and analogous claims 13 and 18, Zaremba and Arroyo teach the method according to claim 1. Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17. Zaremba does not explicitly teach wherein the user interface: presents a graphical depiction of the end-user vehicle; presents ground truth locations of the objects which are proximate to the end-user vehicle; and adjusts individual presentations of the ground truth locations to reflect individual errors associated with the location information indicated in the output. However, Xie teaches wherein the user interface: presents a graphical depiction of the end-user vehicle; presents ground truth locations of the objects which are proximate to the end-user vehicle (Xie Page 5, Fig. 3. Page 8, 2D Auxiliary Supervision. As detailed in Fig. 3, after obtaining the image features, we add a 2D detection head on the features at different scales and calculating the losses with 2D GT boxes generated from the 3D boxes in ego-car coordinate. The 2D detection head is implemented in the same way as proposed in FCOS [17]. It is worth noting that the auxiliary head is only used during the training phases and will be removed during the inference phase. As a result, it does not introduce additional computation cost in inference. Fig. 5c illustrates how 2D GT boxes are generated from 3D annotations. The 3D GT boxes from the ego-car coordinates are back-projected to the 2D image space with the camera intrinsic parameters. In this way, 2D box GTs can be obtained without additional efforts [presents a graphical depiction of the end-user vehicle]. Page 13 Fig. 7, [presents ground truth locations of the objects which are proximate to the end-user vehicle]); and adjusts individual presentations of the ground truth locations to reflect individual errors associated with the location information indicated in the output (Xie Page 3 Introduction para 5, Specifically, to make the framework usable in real-world scenarios with limited computational budget, we propose several empirical designs to significantly improve the accuracy and the GPU memory efficiency. The first one is an efficient BEV encoder, which uses a "Spatial to Channel" (S2C) operator to transform a 4D voxel tensor to a 3D BEV tensor, thus avoiding the usage of memory expensive 3D convolutions.
The second one is the dynamic box assignment dedicated for 3D detection tasks. It uses a learning-to-match strategy to assign ground-truth 3D boxes with anchors. The third one is BEV centerness specially designed for BEV segmentation. It is motivated by the fact that longer distance area in BEV has less pixels in image. We thus re-weight pixels according to the distance to the ego-car and assign larger weights to farther samples. The last one is the 2D detection pre-training and auxiliary supervision on the 2D image encoder. It can significantly speed up the training convergence and improve the performance of 3D tasks. Page 8, BEV Centerness. The concept of "centerness" is commonly used in 2D detectors [17, 31] to re-weight positive samples. Here we extend the concept of "centerness" in a non-trivial distance-aware manner, from 2D image coordinate to 3D BEV coordinate. This process is illustrated in Fig. 5a in more details. The motivation is that area in BEV space farther away from the ego car correspond to fewer pixels in the images.... We use sqrt here to slow down the increase of the centerness. The BEV centerness ranges from 1 to 2 and is used as a loss weight in Eq. 5. Thus, errors in predictions for samples far away from the center are punished more. We show that BEV centerness improves BEV segmentation in different ranges in Fig. 5a. The distance is farther, the IoU improvement is higher [to reflect individual errors associated with the location information indicated in the output]. Page 15, B More Visualizations para 3, For multi-camera 3D detection, an obvious challenge is that extra post-processing and care are needed for objects appearing between different cameras. In Fig. 10, M2BEV correctly localizes buses that appear between two different cameras, which shows the advantage from having a unified BEV representation. Page 16 Fig. 10 [adjusts individual presentations of the ground truth locations]). Zaremba and Xie are considered to be analogous to the claimed invention because they are in the same field of machine learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Zaremba in view of Xie to adjust presentation of locations by using ground-truth information. Doing so would achieve state-of-the-art results in both 3D object detection and BEV segmentation (Xie Abstract lines 8-20, Our framework further contains four important designs that benefit both accuracy and efficiency: (1) An efficient BEV encoder design that reduces the spatial dimension of a voxel feature map. (2) A dynamic box assignment strategy that uses learning-to-match to assign ground-truth 3D boxes with anchors. (3) A BEV centerness re-weighting that reinforces with larger weights for more distant predictions, and (4) Large-scale 2D detection pre-training and auxiliary supervision. We show that these designs significantly benefit the ill-posed camera-based 3D perception tasks where depth information is missing. M2BEV is memory efficient, allowing significantly higher resolution images as input, with faster inference speed. Experiments on nuScenes show that M2BEV achieves state-of-the-art results in both 3D object detection and BEV segmentation, with the best single model achieving 42.5 mAP and 57.0 mIoU in these two tasks, respectively.). Regarding claim 7 and analogous claims 15 and 20, Zaremba, Arroyo, and Xie teach the method according to claim 5 and analogous claims 13 and 18.
Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17. Zaremba and Xie are combined with the same rationale used in claim 5 and analogous claims 13 and 18. Zaremba teaches wherein the user interface presents a particular video sequence (Zaremba Col 5 lines 1-8, The server 106 can be any type of computer system capable of (1) hosting information (such as image, video and text information) and delivering it to a user terminal (such as client device 108), (2) recording responses of multiple users (or human observers) to the information, and (3) delivering such information and accompanying responses (such as responses input via client device 108) back to the network 104. Col 13 lines 49-53, The client device(s) 108 prompt the human observers to predict how the traffic entities shown in the derived stimulus will act, and upon viewing the displayed stimulus, the observers input their responses corresponding to their predictions.) Xie further teaches wherein the ground truth locations of the objects are updated based on the video sequence (4 Experiments 4.1 Implementation Details, Dataset. We evaluate M2BEV on the nuScenes dataset [19]. nuScenes contains 1000 video sequences collected in Boston and Singapore with 700/150/150 scenes for training/validation/testing. Each sample consists of a LiDAR scan and images from 6 cameras: front left, front, front right, back left, back, back right. nuScenes includes 10 categories for 3D bounding boxes. B More Visualizations, We additionally visualize 6 groups of ground-truth and predicted results in Fig. 9, Fig. 10 and Fig. 11. Fig. 9 is a night driving scene. The result indicates that M2BEV learns to see in the dark. For example, a very tiny car in a far distance can be successfully detected by M2BEV, while it is missed to be annotated as ground-truth by human annotators. For multi-camera 3D detection, an obvious challenge is that extra post-processing and care are needed for objects appearing between different cameras. In Fig. 10, M2BEV correctly localizes buses that appear between two different cameras, which shows the advantage from having a unified BEV representation. Fig. 11 shows under a very crowded scene, M2BEV can still detect most of the objects and segment maps although part of these objects and maps are heavily occluded. [wherein the ground truth locations of the objects are updated based on the video sequence] (Examiner Note: The detected objects are updated based on the video that is rendered). Regarding claim 8 and analogous claims 16 and 21, Zaremba, Arroyo, and Xie teach the method according to claim 7 and analogous claims 15 and 20. Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17. Zaremba and Xie are combined with the same rationale used in claim 5 and analogous claims 13 and 18. Xie teaches wherein the individual presentations of the ground truth locations are updated based on the video sequence (Xie page 1 Fig. 1, (Examiner Note: The birds-eye view of the video sequence is updated based on the representation as the video runs) Page 17 Fig. 10, [presentations of the ground truth locations are updated based on the video sequence]). Claims 6, 14, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Zaremba in view of Arroyo and further in view of Xie and Z. Huang, W. Li, X.-G. Xia and R.
Tao, "A General Gaussian Heatmap Label Assignment for Arbitrary-Oriented Object Detection," in IEEE Transactions on Image Processing , vol. 31, pp. 1895-1910, 2022, doi: 10.1109/TIP.2022.3148874 (“Huang”) . Regarding claim 6 and analogous claims 14 and 19, Zaremba and Arroyo teach the method according to claim 1. Zaremba and Arroyo are combined with the same rationale used in claim 1 and analogous claims 9 and 17. Zaremba and Xie are combined with the same rationale used in claim 6 and analogous claims 14 and 19. Zaremba does not teach explicitly wherein each adjusted presentation includes a color whose radius is selected based on the error (Huang page 1899 Fig. 5, Page 1899 A. OLA Strategy para 7, Third, the Spatial and Scale Extents of the Candidate Regions Using the Above Strategy Need to be Carefully Studied: First, a bounding box centered at the Gaussian peak location (called C-BBox) is computed based on the assigned labels. Then, it is assumed that many bounding boxes of different sizes centered at the other Gaussian candidate locations are generated. At a location, if there exists a bounding box whose Intersection over Union (IoU) with the C-BBox is greater than the threshold TIoU , this location is selected as a positive location. As shown in Fig. 5, these positive locations form a subset of the original Gaussian candidate locations (appearing as a smaller ellipse that is co-centered with the original Gaussian ellipse), Page 1902, Therefore, the classification sub-task is affected by the OBB regression error. In the training process, in order to obtain a higher classification accuracy, the model parameters will be jointly adjusted to approach the optimal results of not only the classification sub-task but also the OBB regression task [ whose radius is selected based on the error ]. Page 1905, Page 1905, Third, based on the baseline, i.e., Vanilla-AF, the proposed OLA and ORC are used to make the positive candidate region conform to the shape and direction characteristics of the objects. This improvement makes the mAP increase by 2.56. The object candidates of ORC are further improved by OWAM, i.e., ORC-OWAM, and mAP is further improved by 0.54. For the non-Gaussian center prior objects analyzed previously, like the harbor (HA), the performance improves more. The visualized feature maps of the CNN output layer in Fig. 9 verify this claim. Further using the proposed JOL, the mAP increases by 1.52 [ wherein each adjusted presentation includes a color ]) . Zaremba and Huang are considered to be analogous to the claim invention because they are in the same field of machine learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filling date of the claimed invention to have modified Zaremba in view of Huang to properly represent an object shape and direction on a heatmap . Doing so to fit the Gaussian center prior to fit the characteristics of different object thought neural network learning (Huang Abstract, Recently, many arbitrary-oriented object detection (AOOD) methods have been proposed and attracted widespread attention in many fields. However, most of them are based on anchor-boxes or standard Gaussian heatmaps. Such label assignment strategy may not only fail to reflect the shape and direction characteristics of arbitrary-oriented objects, but also have high parameter-tuning efforts. In this paper, a novel AOOD method called General Gaussian Heatmap Label Assignment (GGHL) is proposed. 
Specifically, an anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates based on two-dimensional (2D) oriented Gaussian heatmaps, which reflect the shape and direction features of arbitrary-oriented objects. Based on OLA, an oriented-bounding-box (OBB) representation component (ORC) is developed to indicate OBBs and adjust the Gaussian center prior weights to fit the characteristics of different objects adaptively through neural network learning. Moreover, a joint-optimization loss (JOL) with area normalization and dynamic confidence weighting is designed to refine the misalign optimal results of different subtasks. Extensive experiments on public datasets demonstrate that the proposed GGHL improves the AOOD performance with low parameter-tuning and time costs. Furthermore, it is generally applicable to most AOOD methods to improve their performance including lightweight models on embedded platforms.).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ALFREDO CAMPOS whose telephone number is (571) 272-4504. The examiner can normally be reached 7:00 - 4:00 pm M - F. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Michael J. Huntley, can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/ALFREDO CAMPOS/
Examiner, Art Unit 2129

/MICHAEL J HUNTLEY/
Supervisory Patent Examiner, Art Unit 2129