DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Status of Claims
This Office Action is in response to the communication filed on 17 November 2025.
Claims 1-14 are being considered on the merits.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 17 February 2026 has been considered. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, initialed and dated copies of Applicant's IDS form 1449 are attached to the instant Office action.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-14 are rejected under 35 U.S.C. 103 as being unpatentable over Kristensen et al. (US 2021/0286923 A1; hereinafter, “Kristensen”) in view of Brebner, David (US 2020/0005523 A1; hereinafter, “Brebner”).
Claim 1—Kristensen teaches:
A computer-implemented method for training a machine learnable model for classification of objects in spatial sensor data (Kristensen, para. 0036: “Generally, training data for a sensor model may be generated at least in part from real-world data. As such, one or more vehicles 102 may collect sensor data from one or more sensors of the vehicle(s) 102 in real-world (e.g., physical) environments”), wherein the objects are classifiable into different object classes (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.).”) by combining content information and location information contained in the spatial sensor data, (Kristensen, para. 0027: “Data from any of these sensors may be used to generate a representation of a scene configuration, which may be used to drive a sensor model. For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”) the method comprising the following steps:
accessing training data, the training data including instances of spatial sensor data, the instances of the spatial sensor data including objects belonging to different object classes; (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). In some embodiments, an encoded input scene configuration may include labeled or annotated the sensor data 102 (e.g., images, depth maps, point clouds, etc.) with bounding shapes and/or corresponding class labels (e.g., vehicle, pedestrian, building, airplane, watercraft, street sign, etc.). As such, object data such as object properties and/or classification data may be generated and associated with other data (such as corresponding image(s), LIDAR data, and/or RADAR data), which may be used to encode the representation of a scene configuration.”)
providing the machine learnable model, wherein the machine learnable model includes a convolutional part comprising one or more convolutional layers for generating one or more feature maps from an instance of spatial sensor data, (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.” Examiner notes that deep neural networks by definition teach one or more layers and that a convolutional neural network teaches a convolutional part. Examiner also notes that the output of a convolutional layer of a convolutional neural network is a feature map).
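The relationship the Examiner notes above — that the output of a convolutional layer is a feature map whose spatial layout mirrors the input — may be sketched numerically as follows. This is an illustrative example only and is not drawn from either reference; the function name and the averaging kernel are hypothetical.

```python
import numpy as np

def conv2d_feature_map(image, kernel):
    """Valid-mode 2D cross-correlation of a single-channel image.

    Illustrative only: the output (a feature map) has spatial dimensions
    derived from the input, so an activation at (i, j) represents content
    at the corresponding location in the input.
    """
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((3, 3)) / 9.0   # simple averaging filter (hypothetical)
fmap = conv2d_feature_map(image, kernel)
```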
wherein the one or more feature maps have spatial dimensions which represent spatial dimensions of the spatial sensor data, (Kristensen, para. 0420: “In embodiments, the abstract instances of classes may also be spatially and temporally arranged in a data structure.”) wherein an activation in the one or more feature maps at a particular location represents an occurrence of a feature representing content at the particular location and providing a first classification part and a second classification part in the machine learnable model; (Kristensen, para. 0420: “In embodiments, the nodes may also be partitioned in a plurality of dimensions, such as four dimensions based on the node properties (e.g., time and x, y, z location, or x, y, z, location and viewing angle)”)
generating, as part of the training of the machine learnable model, a content information-specific feature map by removing location information from the one or more feature maps, and training the first classification part on the content information-specific feature map to obtain a content classification part; (Kristensen, para. 0029: “For the corresponding input scene configurations, sensor data—such as LIDAR data and/or camera image(s)—may be processed and/or encoded into a suitable representation. For example, images from any number of cameras may be segmented, classified, and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above).
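One way the recited step of "removing location information" from a stack of feature maps may be realized is global average pooling over the spatial axes, which collapses the spatial dimensions and leaves a per-channel content descriptor. The sketch below is illustrative only and is not asserted to be the approach of either reference; the function name is hypothetical.

```python
import numpy as np

def remove_location(feature_maps):
    """(C, H, W) feature maps -> length-C content vector.

    Averaging over the spatial axes discards where each feature fired,
    keeping only the content information (which features fired).
    """
    return feature_maps.mean(axis=(1, 2))

fmaps = np.arange(24, dtype=float).reshape(2, 3, 4)  # 2 channels, 3x4 grid
content_vector = remove_location(fmaps)
# The spatial axes are collapsed: one feature vector remains per instance.
```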
generating a location information-specific feature map by removing content information from the one or more feature maps, and training the second classification part on the location information-specific feature map to obtain a location classification part; (Kristensen, paras. 0029 and 0111: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing content information), as well as a CNN as set forth above).
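The complementary recited step, "removing content information," may be sketched as aggregating over the channel axis so that only a spatial activity map remains. Again, this is an illustrative example, not the approach of either reference; the function name is hypothetical.

```python
import numpy as np

def remove_content(feature_maps):
    """(C, H, W) feature maps -> (1, H, W) location map.

    Averaging across channels discards which feature fired, keeping only
    the spatial layout of activity (the location information).
    """
    return feature_maps.mean(axis=0, keepdims=True)

fmaps = np.arange(24, dtype=float).reshape(2, 3, 4)
location_map = remove_content(fmaps)
# The result keeps two spatial dimensions with a single value channel.
```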
wherein modifying the one or more previously generated feature maps includes at least one of: removing feature information from the previously generated feature maps, pseudo-randomly shuffling locations of the feature information in the previously generated feature maps; mixing the feature information between feature maps of different object classes; swapping the feature information at different locations in the previously generated feature maps, and training the outlier detection part on the pseudo outlier feature map. (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
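Two of the modifications recited in this limitation — pseudo-randomly shuffling locations of the feature information and swapping the feature information at different locations — may be sketched as follows on a (C, H, W) feature-map stack. This is an illustrative example only; the function names are hypothetical and the sketch is not asserted to be the approach of either reference.

```python
import numpy as np

def shuffle_locations(feature_maps, rng):
    """Pseudo-randomly shuffle spatial positions within each channel.

    Channel statistics are preserved while the spatial arrangement is
    scrambled, mimicking an out-of-distribution (pseudo outlier) input.
    """
    c, h, w = feature_maps.shape
    flat = feature_maps.reshape(c, h * w).copy()
    for channel in flat:
        rng.shuffle(channel)          # in-place per-channel permutation
    return flat.reshape(c, h, w)

def swap_locations(feature_maps, loc_a, loc_b):
    """Swap the feature vectors at two spatial coordinates."""
    out = feature_maps.copy()
    out[:, loc_a[0], loc_a[1]], out[:, loc_b[0], loc_b[1]] = (
        feature_maps[:, loc_b[0], loc_b[1]].copy(),
        feature_maps[:, loc_a[0], loc_a[1]].copy(),
    )
    return out

rng = np.random.default_rng(0)
fmaps = np.arange(24, dtype=float).reshape(2, 3, 4)
pseudo_outlier = shuffle_locations(fmaps, rng)
swapped = swap_locations(fmaps, (0, 0), (2, 3))
```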
Kristensen does not explicitly disclose:
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data; and
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model,
However, Brebner teaches:
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data; and (Brebner, para. 0507: “In embodiments, the generative content system 1100 may generate an abstract representation of the signal profile using the class-specific executable classes. The generative content system 1100 may then perform conformance simulation on the abstract representation. During this process, the generative content system 1100 may remove outliners and/or obvious errors from the abstract representation until the abstract representation converges on one or more fitness criteria”)
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model, (Brebner, para. 0009: “The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Brebner into Kristensen. Kristensen teaches systems and methods for learning a sensor model and verifying one or more features of a real-world system using the learned sensor model; Brebner teaches a content generation system that generates content based on data collected from one or more data sources. One of ordinary skill would have been motivated to combine the teachings of Brebner into Kristensen in order to deploy machine learning models to perform location-based services in areas that are not well served by traditional location techniques (Brebner, para. 0484).
Claim 2—Kristensen as modified teaches claim 1 above. Kristensen further teaches:
wherein the method further comprises generating the pseudo outlier feature map for the location-and-content outlier detection part by modifying the feature information which is contained in the one or more previously generated feature maps and which is associated with both the location information and the content information. (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
Kristensen does not explicitly disclose:
The method according to claim 1, wherein the machine learnable model includes a location-and-content outlier detection part, and
However, Brebner teaches:
The method according to claim 1, wherein the machine learnable model includes a location-and-content outlier detection part, and (Brebner, para. 0009: “The method also includes generating, by the processing system, a graph based on the plurality of sample nodes. The graph connects the sample nodes to point of interest nodes, wherein each point of interest node corresponds to a point of interest within the area. The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest…The machine-learned classification model is configured to receive signal profiles indicating detected network identifiers and measured signal strengths corresponding to the detected network identifiers and to output one or more candidate locations within the area and, for each candidate location, a confidence score corresponding to the candidate location, wherein each candidate location corresponds to a different point of interest within the area”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Brebner into Kristensen as set forth above with respect to claim 1.
Claim 3—Kristensen as modified teaches claim 1 above. Kristensen further teaches:
wherein the method further comprises generating the pseudo outlier feature map for the location outlier detection part by modifying the feature information which is contained in the one or more previously generated feature maps and which is associated with the location information. (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
Kristensen does not explicitly disclose:
The method according to claim 1, wherein the machine learnable model includes a location outlier detection part, and
However, Brebner teaches:
The method according to claim 1, wherein the machine learnable model includes a location outlier detection part, and (Brebner, para. 0009: “The method also includes generating, by the processing system, a graph based on the plurality of sample nodes. The graph connects the sample nodes to point of interest nodes, wherein each point of interest node corresponds to a point of interest within the area. The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest…The machine-learned classification model is configured to receive signal profiles indicating detected network identifiers and measured signal strengths corresponding to the detected network identifiers and to output one or more candidate locations within the area and, for each candidate location, a confidence score corresponding to the candidate location, wherein each candidate location corresponds to a different point of interest within the area”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Brebner into Kristensen as set forth above with respect to claim 1.
Claim 4—Kristensen as modified teaches claim 3 above. Kristensen further teaches:
...by providing the pseudo outlier feature map to the location classification part (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”) as part of a separate outlier object class to be learned. (Kristensen, para. 0006: “a sensor model may be learned for any number of sensor types, SKUs, sensor installation locations, and/or the like. As such, one or more sensor models may be used as virtual sensors in any of a variety of applications, such as in a simulated environment to test one or more autonomous or semi-autonomous driving software stacks that may include a multitude of DNNs, in a re-simulation system that uses physical sensor data in combination with virtual sensor data to train, test, verify, and/or validate one or more DNNs for use in software stacks, or otherwise.”)
Kristensen does not explicitly disclose:
The method according to claim 3, wherein the location outlier detection part is implemented by the location classification part…
However, Brebner teaches:
The method according to claim 3, wherein the location outlier detection part is implemented by the location classification part… (Brebner, para. 0009: “The method also includes generating, by the processing system, a graph based on the plurality of sample nodes. The graph connects the sample nodes to point of interest nodes, wherein each point of interest node corresponds to a point of interest within the area. The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest…The machine-learned classification model is configured to receive signal profiles indicating detected network identifiers and measured signal strengths corresponding to the detected network identifiers and to output one or more candidate locations within the area and, for each candidate location, a confidence score corresponding to the candidate location, wherein each candidate location corresponds to a different point of interest within the area”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Brebner into Kristensen as set forth above with respect to claim 1.
Claim 5—Kristensen as modified teaches claim 1 above. Kristensen further teaches:
wherein the method further comprises generating the pseudo outlier feature map for the content outlier detection part by modifying the feature information which is contained in the one or more previously generated feature maps and which is associated with the content information. (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
Kristensen does not explicitly disclose:
The method according to claim 1, wherein the machine learnable model includes a content outlier detection part, and
However, Brebner teaches:
The method according to claim 1, wherein the machine learnable model includes a content outlier detection part, and (Brebner, para. 0009: “The method also includes generating, by the processing system, a graph based on the plurality of sample nodes. The graph connects the sample nodes to point of interest nodes, wherein each point of interest node corresponds to a point of interest within the area. The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest…The machine-learned classification model is configured to receive signal profiles indicating detected network identifiers and measured signal strengths corresponding to the detected network identifiers and to output one or more candidate locations within the area and, for each candidate location, a confidence score corresponding to the candidate location, wherein each candidate location corresponds to a different point of interest within the area”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine the teachings of Brebner into Kristensen as set forth above with respect to claim 1.
Claim 6—Kristensen as modified teaches claim 1 above. Kristensen further teaches:
The method according to claim 1, wherein each of the one or more feature maps generated by the convolutional part has at least two spatial dimensions associated with the location information and wherein feature values of the one or more feature maps at each respective spatial coordinate together form a feature vector representing content information at the respective spatial coordinate, and wherein: (Kristensen, paras. 0005 and 0119: “For example, a sensor model may include a generative adversarial network (GANs), a variational autoencoder (VAE), and/or another type of deep neural network (DNN) or machine learning model. At a high level, a sensor model may accept some encoded representation of a scene configuration as an input using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.) and may output virtual sensor data. Real-world data and/or virtual data may be collected and used to derive training data (e.g., input scene configurations and/or ground truth sensor data), which may be used to train the sensor model to predict virtual sensor data for a given scene configuration.” and “GNNS sensors (e.g., GPS sensors) may be simulated within the simulation space to generate real-world coordinates. In order to this, noise functions may be used to approximate inaccuracy. As with any virtual sensors described herein, the virtual sensor data may be generated using a learned sensor model or otherwise”)
the removing of the location information from the one of the one or more feature maps includes aggregating the one or more feature maps over the spatial dimensions to form a content information-specific feature map comprising one feature vector; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
the removing of the content information from the one of the one or more feature maps includes aggregating the feature values per spatial coordinate over the one or more feature maps to form the location information-specific feature map having at least two spatial dimensions and one feature value channel. (Kristensen, paras. 0035 and 0042, and Fig. 2: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors…For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor” and “By way of illustration, FIG. 2 is a visualization of sample RADAR data generated by a RADAR sensor(s). FIG. 2 shows an example 3D world space with a ground plane 220 and an example coordinate system defined by a first axis 240 and a second axis 250. Generally, a RADAR system may include a transmitter that emits radio waves. The radio waves reflect off of certain objects and materials, and a RADAR sensor (which may correspond to the origin of the coordinate system in FIG. 2) may detect these reflections and reflection characteristics such as bearing, azimuth, elevation, range (e.g., time of beam flight), intensity, Doppler velocity, RADAR cross section (RCS), reflectivity, SNR, and/or the like. Generally, reflections and reflection characteristics may depend on the objects in a scene, speeds, materials, sensor mounting position and orientation, etc. Reflection data may be combined with position and orientation data (e.g., from GNSS and IMU sensors) to generate point clouds. In FIG. 2, each of the RADAR points 212 represents the location of a detected reflection in the world space. Collectively, the RADAR points 212 may form a point cloud representing detected reflections in the scene”)
Claim 7—Kristensen as modified teaches claim 1 above. Kristensen further teaches:
The method according to claim 1, wherein the machine learnable model is a deep neural network, (Kristensen, para. 0006: “a sensor model may be learned for any number of sensor types, SKUs, sensor installation locations, and/or the like. As such, one or more sensor models may be used as virtual sensors in any of a variety of applications, such as in a simulated environment to test one or more autonomous or semi-autonomous driving software stacks that may include a multitude of DNNs, in a re-simulation system that uses physical sensor data in combination with virtual sensor data to train, test, verify, and/or validate one or more DNNs for use in software stacks, or otherwise.”)
wherein the convolutional part is a convolutional part of the deep neural network and (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.”)
wherein the content classification part and the location classification part are respective classification heads of the deep neural network. (Kristensen, paras. 0029 and 0111: “For the corresponding input scene configurations, sensor data—such as LIDAR data and/or camera image(s)—may be processed and/or encoded into a suitable representation. For example, images from any number of cameras may be segmented, classified, and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.)…Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing content information), as well as a CNN as set forth above).
Claim 8—Kristensen teaches:
A computer-implemented method for classifying objects in spatial sensor data, (Kristensen, para. 0036: “Generally, training data for a sensor model may be generated at least in part from real-world data. As such, one or more vehicles 102 may collect sensor data from one or more sensors of the vehicle(s) 102 in real-world (e.g., physical) environments”), wherein the objects are classifiable into different object classes (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.).”) by combining content information and location information contained in the spatial sensor data, (Kristensen, para. 0027: “Data from any of these sensors may be used to generate a representation of a scene configuration, which may be used to drive a sensor model. For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”) the method comprising the following steps:
accessing a machine learned model, wherein the machine learned model is a machine learnable model trained by: accessing training data, the training data including instances of spatial sensor data, the instances of the spatial sensor data including objects belonging to different object classes, (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). In some embodiments, an encoded input scene configuration may include labeled or annotated the sensor data 102 (e.g., images, depth maps, point clouds, etc.) with bounding shapes and/or corresponding class labels (e.g., vehicle, pedestrian, building, airplane, watercraft, street sign, etc.). As such, object data such as object properties and/or classification data may be generated and associated with other data (such as corresponding image(s), LIDAR data, and/or RADAR data), which may be used to encode the representation of a scene configuration.”)
providing the machine learnable model, wherein the machine learnable model includes a convolutional part comprising one or more convolutional layers for generating one or more feature maps from an instance of spatial sensor data, and (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.” Examiner notes that deep neural networks by definition teach one or more layers and that a convolutional neural network teaches a convolutional part. Examiner also notes that the output of a convolutional layer of a convolutional neural network is a feature map).
providing a first classification part and a second classification part in the machine learnable model, (Kristensen, para. 0035: “In some embodiments, spatially-adaptive normalization may be applied in which an input image such as a segmentation map may be fed into a normalization layer to modulate layer activations. These are meant simply as examples, as any suitable architecture may be implemented within the scope of the present disclosure.”)
generating, as part of the training of the machine learnable model, a content information-specific feature map by removing location information from the one or more feature maps, and training the first classification part on the content information-specific feature map to obtain a content classification part, (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
generating a location information-specific feature map by removing content information from the one or more feature maps, and training the second classification part on the location information-specific feature map to obtain a location classification part, (Kristensen, para. 0029 and 0111: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing), as well as a CNN as set forth above).
wherein modifying the one or more previously generated feature maps includes at least one of: removing feature information from the previously generated feature maps, pseudo-randomly shuffling locations of the feature information in the previously generated feature maps, mixing the feature information between feature maps of different object classes, swapping the feature information at different locations in the previously generated feature maps, and training the outlier detection part on the pseudo outlier feature map; (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
accessing first input data, the first input data including an instance of spatial sensor data, the instance of the spatial sensor data including an object to be classified; (Kristensen, para. 0029: “For the corresponding input scene configurations, sensor data—such as LIDAR data and/or camera image(s)—may be processed and/or encoded into a suitable representation. For example, images from any number of cameras may be segmented, classified, and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). LIDAR may be used to identify reflections and values for the reflections such as lateral bearing, elevation, range (e.g., time of beam flight), reflectivity, signal-to-noise ratio (SNR), some combination thereof, and/or the like. This reflection data may be combined with position and orientation data (e.g., from GNSS and/or IMU sensors) to generate LIDAR point clouds. Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.). In another example, geometric description(s) of a scene may be encoded into a suitable network input(s). For example, two or three dimensional geometric model(s) may be arranged in a scene and rendered (e.g., from a desired point of view for the particular sensor being modeled) to form an image, which serve as an encoded scene configuration (or a portion thereof). The encoded scene configuration(s) may be used as input data for a training dataset.”)
applying the convolutional part of the machine learned model to the first input data to generate one or more first feature maps, (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.”).
wherein the one or more feature maps have spatial dimensions which represent spatial dimensions of the spatial sensor data, (Kristensen, para. 0420: “In embodiments, the abstract instances of classes may also be spatially and temporally arranged in a data structure.”) wherein an activation in the one or more feature maps at a particular location represents an occurrence of a feature representing content at the particular location; (Kristensen, para. 0420: “In embodiments, the nodes may also be partitioned in a plurality of dimensions, such as four dimensions based on the node properties (e.g., time and x, y, z location, or x, y, z, location and viewing angle)”)
generating a first content information-specific feature map by removing location information from one of the one or more first feature maps, and applying the content classification part to the first content information-specific feature map to obtain a content-based object classification result; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
generating a first location information-specific feature map by removing content information from one of the one or more first feature maps, and applying the location classification part to the first location information-specific feature map to obtain a location-based object classification result; (Kristensen, para. 0029 and 0111: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing), as well as a CNN as set forth above).
classifying the object in the spatial sensor data in accordance with the content-based object classification result, the location-based object classification result and the outlier detection result, (Kristensen, para. 0027: “Generally, an autonomous or semi-autonomous vehicle may use a variety of sensors to measure and/or derive a representation of a scene in the real-world at a given point in time. Data from any of these sensors may be used to generate a representation of a scene configuration, which may be used to drive a sensor model. For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like. Generally, a sensor model may learn to predict virtual sensor data from a representation of a scene configuration. As such, the architecture for a sensor model may be selected to fit the shape of the desired input and output data.”)
wherein the classifying includes classifying the first input data in accordance with an object class when the content-based object classification result and the location-based object classification result both indicate the object class (Kristensen, para. 0027: “For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”) and…does not indicate a presence of an outlier. (Kristensen, para. 0069: “The validation/verification sub-system may verify and/or validate performance, accuracy, and/or other criteria associated with the sensor model.”)
Kristensen does not explicitly disclose:
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data, and
applying the outlier detection part to one or more previously generated first feature maps which are generated for the instance of the spatial sensor data, to obtain an outlier detection result; and
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model,
…when the outlier detection result…
However, Brebner teaches:
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data, and (Brebner, para. 0507: “In embodiments, the generative content system 1100 may generate an abstract representation of the signal profile using the class-specific executable classes. The generative content system 1100 may then perform conformance simulation on the abstract representation. During this process, the generative content system 1100 may remove outliners and/or obvious errors from the abstract representation until the abstract representation converges on one or more fitness criteria”)
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model, (Brebner, para. 0009: “The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.”)
…when the outlier detection result… (Brebner, para. 0009: “wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.” Examiner notes that Brebner teaches outlier detection)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Brebner with those of Kristensen as set forth above with respect to claim 1.
Claim 9—Kristensen as modified teaches claim 8 above. Kristensen further teaches:
The method according to claim 8, wherein the training of the machine learnable model is performed before (Kristensen, para. 0483: “In some embodiments, the generative content system 1100 may collect, structure, and generate training data that is used to train machine learned models used in various types of systems” Examiner notes that Kristensen teaches collecting and generating training data before using it to train machine learned models) using the machine learned model to classify the objects in the spatial sensor data. (Kristensen, para. 0036: “Generally, training data for a sensor model may be generated at least in part from real-world data. As such, one or more vehicles 102 may collect sensor data from one or more sensors of the vehicle(s) 102 in real-world (e.g., physical) environments”).
Claim 10—Kristensen teaches:
…for training a machine learnable model for classification of objects in spatial sensor data, (Kristensen, para. 0036: “Generally, training data for a sensor model may be generated at least in part from real-world data. As such, one or more vehicles 102 may collect sensor data from one or more sensors of the vehicle(s) 102 in real-world (e.g., physical) environments”), wherein the objects are classifiable into different object classes (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.).”) by combining content information and location information contained in the spatial sensor data, (Kristensen, para. 0027: “Data from any of these sensors may be used to generate a representation of a scene configuration, which may be used to drive a sensor model. For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”), the computer program, when executed by a computer, causing the computer to perform the following steps:
accessing training data, the training data including instances of spatial sensor data, the instances of the spatial sensor data including objects belonging to different object classes; (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). In some embodiments, an encoded input scene configuration may include labeled or annotated the sensor data 102 (e.g., images, depth maps, point clouds, etc.) with bounding shapes and/or corresponding class labels (e.g., vehicle, pedestrian, building, airplane, watercraft, street sign, etc.). As such, object data such as object properties and/or classification data may be generated and associated with other data (such as corresponding image(s), LIDAR data, and/or RADAR data), which may be used to encode the representation of a scene configuration.”)
providing the machine learnable model, wherein the machine learnable model includes a convolutional part comprising one or more convolutional layers for generating one or more feature maps from an instance of spatial sensor data, (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.” Examiner notes that deep neural networks by definition teach one or more layers and that a convolutional neural network teaches a convolutional part. Examiner also notes that the output of a convolutional layer of a convolutional neural network is a feature map).
wherein the one or more feature maps have spatial dimensions which represent spatial dimensions of the spatial sensor data, wherein an activation in the one or more feature maps at a particular location represents an occurrence of a feature representing content at the particular location and (Kristensen, para. 0420: “In embodiments, the abstract instances of classes may also be spatially and temporally arranged in a data structure.”) providing a first classification part and a second classification part in the machine learnable model; (Kristensen, para. 0035: “In some embodiments, spatially-adaptive normalization may be applied in which an input image such as a segmentation map may be fed into a normalization layer to modulate layer activations. These are meant simply as examples, as any suitable architecture may be implemented within the scope of the present disclosure.”)
generating, as part of the training of the machine learnable model, a content information-specific feature map by removing location information from the one or more feature maps and training the first classification part on the content information-specific feature map to obtain a content classification part; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
generating a location information-specific feature map by removing content information from the one or more feature maps, and training the second classification part on the location information-specific feature map to obtain a location classification part; (Kristensen, para. 0029 and 0111: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing), as well as a CNN as set forth above).
wherein modifying the one or more previously generated feature maps includes at least one of: removing feature information from the previously generated feature maps, pseudo-randomly shuffling locations of the feature information in the previously generated feature maps; mixing the feature information between feature maps of different object classes; swapping the feature information at different locations in the previously generated feature maps, and training the outlier detection part on the pseudo outlier feature map. (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
Kristensen does not explicitly disclose:
A non-transitory computer-readable medium on which is stored a computer program…
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data; and
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model,
However, Brebner teaches:
A non-transitory computer-readable medium on which is stored a computer program… (Brebner, para. 0632: “The processor, or any machine utilizing one, may include non-transitory memory that stores methods, codes, instructions and programs as described herein and elsewhere.”)
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data; and (Brebner, para. 0507: “In embodiments, the generative content system 1100 may generate an abstract representation of the signal profile using the class-specific executable classes. The generative content system 1100 may then perform conformance simulation on the abstract representation. During this process, the generative content system 1100 may remove outliners and/or obvious errors from the abstract representation until the abstract representation converges on one or more fitness criteria”)
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model, (Brebner, para. 0009: “The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Brebner with those of Kristensen as set forth above with respect to claim 1.
Claim 11—Kristensen teaches:
training a machine learnable model for classification of objects in spatial sensor data, (Kristensen, para. 0036: “Generally, training data for a sensor model may be generated at least in part from real-world data. As such, one or more vehicles 102 may collect sensor data from one or more sensors of the vehicle(s) 102 in real-world (e.g., physical) environments”), wherein the objects are classifiable into different object classes (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.).”) by combining content information and location information contained in the spatial sensor data, (Kristensen, para. 0027: “Data from any of these sensors may be used to generate a representation of a scene configuration, which may be used to drive a sensor model. For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”),
the training data including instances of spatial sensor data, the instances of the spatial sensor data including objects belonging to different object classes; (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). In some embodiments, an encoded input scene configuration may include labeled or annotated the sensor data 102 (e.g., images, depth maps, point clouds, etc.) with bounding shapes and/or corresponding class labels (e.g., vehicle, pedestrian, building, airplane, watercraft, street sign, etc.). As such, object data such as object properties and/or classification data may be generated and associated with other data (such as corresponding image(s), LIDAR data, and/or RADAR data), which may be used to encode the representation of a scene configuration.”)
a processor subsystem (Kristensen, para. 0172: “The GPU(s) 1108 may include a high bandwidth memory (HBM) and/or a 16 GB HBM2 memory subsystem to provide, in some examples, about 900 GB/second peak memory bandwidth”) configured to: provide the machine learnable model, wherein the machine learnable model includes a convolutional part comprising one or more convolutional layers for generating one or more feature maps from an instance of spatial sensor data, (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.” Examiner notes that deep neural networks by definition teach one or more layers and that a convolutional neural network teaches a convolutional part. Examiner also notes that the output of a convolutional layer of a convolutional neural network is a feature map).
wherein the one or more feature maps have spatial dimensions which represent spatial dimensions of the spatial sensor data, (Kristensen, para. 0420: “In embodiments, the abstract instances of classes may also be spatially and temporally arranged in a data structure.”)
wherein an activation in the one or more feature maps at a particular location represents an occurrence of a feature representing content at the particular location and providing a first classification part and a second classification part in the machine learnable model; (Kristensen, para. 0420: “In embodiments, the nodes may also be partitioned in a plurality of dimensions, such as four dimensions based on the node properties (e.g., time and x, y, z location, or x, y, z, location and viewing angle)”)
generate, as part of the training of the machine learnable model, a content information-specific feature map by removing location information from the one or more feature maps and training the first classification part on the content information-specific feature map to obtain a content classification part; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
generate a location information-specific feature map by removing content information from the one or more feature maps, and training the second classification part on the location information-specific feature map to obtain a location classification part; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and para. 0111: “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing content information) as well as a CNN as set forth above).
wherein modifying the one or more previously generated feature maps includes at least one of: removing feature information from said feature maps; pseudo-randomly shuffling locations of feature information in said feature maps; mixing feature information between feature maps of different object classes; swapping feature information at different locations in said feature maps, and training the outlier detection part on the pseudo outlier feature map; and (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
an output interface configured to output machine learned model data representing the machine learnable model after training. (Kristensen, para. 0102: “In order to increase accuracy in SIL embodiments, the vehicle simulator component(s) 420 may be configured to communicate over one or more virtual connection types and/or communication protocols that are not standard in computing environments. For example, a virtual CAN interface, virtual LVDS interface, virtual USB interface, virtual Ethernet interface, and/or other virtual interfaces may be used by the computer(s) 440, CPU(s), and/or GPU(s) of the vehicle simulator component(s) 420 to provide for communication (e.g., over one or more communication protocols, such as LVDS) between the software stack(s) 116 and the simulation software 438 within the simulation system 400. For example, the virtual interfaces may include middleware that may be used to provide a continuous feedback loop with the software stack(s) 116. As such, the virtual interfaces may simulate or emulate the communications between the vehicle hardware 104 and the physical vehicle using one or more software protocols, hardware (e.g., CPU(s), GPU(s), computer(s) 440, etc.), or a combination thereof.”)
Kristensen does not explicitly disclose:
A tangible system for…
the tangible system comprising: an input interface configured to access training data,
provide, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data; and
generate, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model,
However, Brebner teaches:
A tangible system for… (Brebner, para. 0642: “The methods and systems described herein may transform physical and/or intangible items from one state to another. The methods and systems described herein may also transform data representing physical and/or intangible items from one state to another”)
the tangible system comprising: an input interface configured to access training data, (Brebner, para. 0066: “The application system 100 may include various internal and external communication facilities 184, such as using various networking and software communication protocols (including network interfaces, application programming interfaces, database interfaces, search capabilities, and the like).”)
provide, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data; and (Brebner, para. 0507: “In embodiments, the generative content system 1100 may generate an abstract representation of the signal profile using the class-specific executable classes. The generative content system 1100 may then perform conformance simulation on the abstract representation. During this process, the generative content system 1100 may remove outliners and/or obvious errors from the abstract representation until the abstract representation converges on one or more fitness criteria”)
generate, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model, (Brebner, para. 0009: “The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.”)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Brebner with Kristensen as set forth above with respect to claim 1.
Claim 12—Kristensen teaches:
classifying objects in spatial sensor data, (Kristensen, para. 0036: “Generally, training data for a sensor model may be generated at least in part from real-world data. As such, one or more vehicles 102 may collect sensor data from one or more sensors of the vehicle(s) 102 in real-world (e.g., physical) environments”), wherein the objects are classifiable into different object classes (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.).”) by combining content information and location information contained in the spatial sensor data, (Kristensen, para. 0027: “Data from any of these sensors may be used to generate a representation of a scene configuration, which may be used to drive a sensor model. For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”),
the first input data including an instance of spatial sensor data, the instance of the spatial sensor data including an object to be classified; (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). In some embodiments, an encoded input scene configuration may include labeled or annotated the sensor data 102 (e.g., images, depth maps, point clouds, etc.) with bounding shapes and/or corresponding class labels (e.g., vehicle, pedestrian, building, airplane, watercraft, street sign, etc.). As such, object data such as object properties and/or classification data may be generated and associated with other data (such as corresponding image(s), LIDAR data, and/or RADAR data), which may be used to encode the representation of a scene configuration.”)
a processor subsystem (Kristensen, para. 0172: “The GPU(s) 1108 may include a high bandwidth memory (HBM) and/or a 16 GB HBM2 memory subsystem to provide, in some examples, about 900 GB/second peak memory bandwidth”) configured to: access a machine learned model, wherein the machine learned model is a machine learnable model trained by: accessing training data, the training data including instances of spatial sensor data, the instances of the spatial sensor data including objects belonging to different object classes, (Kristensen, para. 0047: “In some embodiments, objects may be classified and/or categorized such as by labeling differing portions of real-world data based on class (e.g., for an image of a landscape, portions of the image—such as pixels or groups of pixels—may be labeled as car, sky, tree, road, building, water, waterfall, vehicle, bus, truck, sedan, etc.). In some embodiments, an encoded input scene configuration may include labeled or annotated the sensor data 102 (e.g., images, depth maps, point clouds, etc.) with bounding shapes and/or corresponding class labels (e.g., vehicle, pedestrian, building, airplane, watercraft, street sign, etc.). As such, object data such as object properties and/or classification data may be generated and associated with other data (such as corresponding image(s), LIDAR data, and/or RADAR data), which may be used to encode the representation of a scene configuration.”)
providing the machine learnable model, wherein the machine learnable model includes a convolutional part comprising one or more convolutional layers for generating one or more feature maps from an instance of spatial sensor data, and (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.” Examiner notes that deep neural networks by definition teach one or more layers and that a convolutional neural network teaches a convolutional part. Examiner also notes that the output of a convolutional layer of a convolutional neural network is a feature map).
providing a first classification part and a second classification part in the machine learnable model, (Kristensen, para. 0420: “In embodiments, the nodes may also be partitioned in a plurality of dimensions, such as four dimensions based on the node properties (e.g., time and x, y, z location, or x, y, z, location and viewing angle)”)
generating, as part of the training of the machine learnable model, a content information-specific feature map by removing location information from the one or more feature maps, and training the first classification part on the content information-specific feature map to obtain a content classification part, (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
generating a location information-specific feature map by removing content information from the one or more feature maps, and training the second classification part on the location information-specific feature map to obtain a location classification part, (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and para. 0111: “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing content information) as well as a CNN as set forth above).
wherein modifying the one or more previously generated feature maps includes at least one of: removing feature information from the previously generated feature maps, pseudo-randomly shuffling locations of the feature information in the previously generated feature maps, mixing the feature information between feature maps of different object classes, swapping the feature information at different locations in the previously generated feature maps, and training the outlier detection part on the pseudo outlier feature map; (Kristensen, para. 0035: “Further, some neural network architectures—such as GANs—may include multiple input channels, which may be used to accept and evaluate multiple input images and/or input vectors. Some generative techniques such as conditional image synthesis may be applied to generate an output such as a photorealistic image conditioned on some input data. Any or all of these techniques may be applied and/or combined to generate an architecture for the sensor model 120. For example, different input layers, channels, and/or networks may be used to encode different features (e.g., vectors, tensors, etc.) that may be combined using another layer, network, and/or some other operation. In this manner, any number of inputs may be combined. Any number of layers, networks, and/or other operations may be applied to normalize, re-shape, and/or otherwise output virtual sensor data for a desired sensor.”)
apply the convolutional part of the machine learned model to the first input data to generate one or more first feature maps, (Kristensen, para. 0035: “Generally, an architecture for the sensor model 120 may be selected to fit the shape of the desired input and output data. Some non-limiting examples of DNNs include perceptron, feed-forward, radial basis, deep feed forward, recurrent, long/short term memory, gated recurrent unit, autoencoder, variational autoencoder, convolutional, deconvolutional, and generative adversarial, to name a few. Some DNNs like GANs may include a convolutional neural network that accepts and evaluates an input image.”).
wherein the one or more feature maps have spatial dimensions which represent spatial dimensions of the spatial sensor data, (Kristensen, para. 0420: “In embodiments, the abstract instances of classes may also be spatially and temporally arranged in a data structure.”) wherein an activation in the one or more feature maps at a particular location represents an occurrence of a feature representing content at the particular location; (Kristensen, para. 0420: “In embodiments, the nodes may also be partitioned in a plurality of dimensions, such as four dimensions based on the node properties (e.g., time and x, y, z location, or x, y, z, location and viewing angle)”)
generate a first content information-specific feature map by removing location information from one of the one or more first feature maps, and apply the content classification part to the first content information-specific feature map to obtain a content-based object classification result; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” Examiner notes that Kristensen teaches segmentation masking of matrices (i.e. feature maps) and encoding of object-specific properties as well as a CNN as set forth above)
generate a first location information-specific feature map by removing content information from one of the one or more first feature maps, and apply the location classification part to the first location information-specific feature map to obtain a location-based object classification result; (Kristensen, para. 0029: “Any of this LIDAR data, images from one or more cameras, properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene (e.g., segmentation masks corresponding to the images), and/or other types of data may be encoded into a suitable representation of a scene configuration using any number of data structures and/or channels (e.g., concatenated vectors, matrices, tensors, images, etc.)” and para. 0111: “As described herein, for vehicles or other objects that may be far away and may not have an impact on a current sensor(s), the system may choose not to apply physics for those objects and only determine locations and/or instantaneous motion vectors.” Examiner notes that Kristensen teaches encoding any types of data in the scene and ignoring object properties (i.e., removing content information) as well as a CNN as set forth above).
classify the object in the spatial sensor data in accordance with the content-based object classification result, the location-based object classification result and the outlier detection result, wherein the classifying includes classifying the first input data in accordance with an object class when the content-based object classification result and the location-based object classification result both indicate the object class (Kristensen, para. 0027: “For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”) and…does not indicate a presence of an outlier. (Kristensen, para. 0069: “The validation/verification sub-system may verify and/or validate performance, accuracy, and/or other criteria associated with the sensor model.”)
Kristensen does not explicitly disclose:
A tangible system for…
the tangible system comprising: an input interface for accessing first input data,
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data, and
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model,
apply the outlier detection part to one or more previously generated first feature maps which are generated for the instance of the spatial sensor data, to obtain an outlier detection result;
…when the outlier detection result…
However, Brebner teaches:
A tangible system for… (Brebner, para. 0642: “The methods and systems described herein may transform physical and/or intangible items from one state to another. The methods and systems described herein may also transform data representing physical and/or intangible items from one state to another”)
the tangible system comprising: an input interface for accessing first input data, (Brebner, para. 0066: “The application system 100 may include various internal and external communication facilities 184, such as using various networking and software communication protocols (including network interfaces, application programming interfaces, database interfaces, search capabilities, and the like).”)
providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data, and (Brebner, para. 0507: “In embodiments, the generative content system 1100 may generate an abstract representation of the signal profile using the class-specific executable classes. The generative content system 1100 may then perform conformance simulation on the abstract representation. During this process, the generative content system 1100 may remove outliners and/or obvious errors from the abstract representation until the abstract representation converges on one or more fitness criteria”)
generating, as part of the training of the machine learnable model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of the spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model, (Brebner, para. 0009: “The method further includes generating, by the processing system, a simulated sample node based on an analysis of the graph, wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.”)
apply the outlier detection part to one or more previously generated first feature maps which are generated for the instance of the spatial sensor data, to obtain an outlier detection result; (Brebner, para. 0507: “In embodiments, the generative content system 1100 may generate an abstract representation of the signal profile using the class-specific executable classes. The generative content system 1100 may then perform conformance simulation on the abstract representation. During this process, the generative content system 1100 may remove outliners and/or obvious errors from the abstract representation until the abstract representation converges on one or more fitness criteria”)
…when the outlier detection result… (Brebner, para. 0009: “wherein the synthesized sample node indicates simulated sample data generated in response to identifying a missing signal strength or an outlier signal strength corresponding to a particular point of interest.” Examiner notes that Brebner teaches outlier detection)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Brebner with Kristensen as set forth above with respect to claim 12.
Claim 13—Kristensen as modified teaches claim 12 above. Kristensen further teaches:
The system according to claim 12, wherein the input interface is a sensor interface to a sensor, (Kristensen, para. 0085: “For example, similar interfaces used in the physical vehicle 102 may need to be used by the vehicle simulator component(s) 406 to communicate with the vehicle hardware 104. In some examples, the interfaces may include: (1) CAN interfaces, including a PCAN adapter, (2) Ethernet interfaces, including RAW UDP sockets with IP address, origin, VLA, and/or source IP all preserved, (3) Serial interfaces, with a USB to serial adapter, (4) camera interfaces, (5) InfiniBand (IB) interfaces, and/or other interface types.”)
wherein the sensor is configured to acquire the spatial sensor data. (Kristensen, para. 0027: “For example, a representation of a scene configuration may include sensor data (e.g., LIDAR data, RADAR data, ultrasonic sensor data, camera image(s), etc.), properties of objects in the scene such as positions or dimensions (e.g., depth maps), classification data identifying objects in the scene, some combination thereof, and/or the like.”)
Claim 14—Kristensen as modified teaches claim 12 above. Kristensen further teaches:
The system according to claim 12, wherein the system is a control system configured to adjust a control parameter based on the classification of the object. (Kristensen, para. 0125: “The one or more operations or commands may be transmitted to the simulation engine 630 which may update the behavior of one or more of the virtual objects based on the operations and/or commands. For example, the simulation engine 630 may use the AI engine 632 to update the behavior of the AI agents as well as the virtual objects in the simulated environment 628. The simulation engine 630 may then update the object data and characteristics (e.g., within the asset data store(s) 636), may update the GI (and/or other aspects such as reflections, shadows, etc.), and then may generate and provide updated sensor inputs to the GPU platform 624. This process may repeat until a simulation is completed.”)
Response to Applicant Remarks/Argument
35 USC § 112(b)
In light of applicant’s amendments, the previously asserted rejection of claim 6 under 35 USC § 112(b) has been withdrawn.
35 USC § 103
Applicant Remarks:
At the top of page 14 of applicant’s remarks, applicant argues that Kristensen para. 0029 does not teach the limitation, “generating, as part of the training of the machine learnable model, a content information-specific feature map by removing location information from the one or more feature maps, and training the first classification part on the content information-specific feature map to obtain a content classification part”. Applicant specifically argues that, “A POSITA would not have agreed that the above blurb from [0029] of Kristensen teaches any location removal whatsoever, much less the sort recited in the claim limitation” (emphasis added).
Examiner Response:
In this case, the claim language does not recite “location removal”; rather, the claim limitation recites “removing location information”. Location information is reasonably interpreted as any information relating to location, including masking objects that may provide location information, e.g., an object serving as a geographic landmark, or relational information such as whether one object is nearer or farther than another, or perspective. Kristensen teaches not only segmentation but also masking, such that the masked data is not taken into consideration, i.e., is removed.
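By way of illustration only, and using hypothetical values not drawn from Kristensen or the claims, the interpretation above may be sketched in a few lines of Python: a global pooling operation is one example of removing location information from a feature map, since it retains which features occurred (content) while discarding where they occurred (location).

```python
import numpy as np

# Hypothetical feature map with shape (channels, height, width)
feature_map = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)

# "Removing location information": a global max pool keeps which features
# occurred in the instance while discarding their spatial indices.
content_specific = feature_map.max(axis=(1, 2))

# The result has no spatial dimensions left, only per-feature content.
print(content_specific.shape)
```

A classifier trained on such a pooled vector can only rely on content information, since no location information survives the pooling.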
Applicant Remarks:
Towards the bottom of page 14 of applicant’s remarks, applicant argues Kristensen fails to disclose “generating a location information-specific feature map by removing content information from one or more feature maps, and training the second classification part on the location information-specific feature map to obtain a location classification part”. Applicant specifically argues that the blurb does not teach content removal but rather “it simply describes leaving certain content alone (‘may choose not to apply physics for those objects’)”.
Examiner Response:
In this case, Merriam-Webster includes “to get rid of” as a definition of “remove”, and Kristensen teaches “getting rid of” object physics from consideration or processing. The physics of an object, i.e., how the object behaves within the world, is information about the object, i.e., content information. “Leaving certain content alone” would mean leaving the real-world physics of an object alone, i.e., not “getting rid of” the object physics.
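By way of illustration only, and with hypothetical values not drawn from the record, the complementary operation, removing content information while preserving location information, may be sketched as a binarization of a feature map: the magnitudes (the “what”) are discarded while the activation positions (the “where”) are kept.

```python
import numpy as np

# Hypothetical feature map: magnitudes encode content, indices encode location
feature_map = np.array([[0.0, 0.9],
                        [0.2, 0.0]])

# "Removing content information": binarizing discards feature magnitudes
# while preserving which spatial positions were activated.
location_specific = (feature_map > 0).astype(float)
```

A classifier trained on such a binarized map can only rely on where activations occur, not on what the activations encode.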
Applicant Remarks:
At the top of page 16, applicant argues that Kristensen does not disclose “wherein modifying the one or more previously generated feature maps includes at least one of…pseudo-randomly shuffling locations of the feature information in the previously generated feature maps”. Applicant argues that the word “pseudo” does not appear in the cited portion of the reference.
Examiner Response:
Kristensen at para. 0035 teaches the use of any or all techniques, “and/or some other operation”, i.e., including techniques such as pseudo-randomly shuffling locations of feature information in previously generated feature maps. Kristensen broadly teaches combining any number of inputs and features using any and all techniques.
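For illustration only (this code appears in neither reference nor the claims), a pseudo-random shuffle of spatial locations in a feature map can be sketched as follows; the seeded generator is what makes the shuffle "pseudo"-random (deterministic and reproducible), and all names here are hypothetical:

```python
import numpy as np

def shuffle_locations(feature_map: np.ndarray, seed: int = 0) -> np.ndarray:
    """Permute spatial positions with a seeded (pseudo-random) generator,
    destroying location information while leaving the multiset of
    feature values in each channel intact."""
    c, h, w = feature_map.shape
    rng = np.random.default_rng(seed)   # seeded, hence pseudo-random
    perm = rng.permutation(h * w)       # one shared spatial permutation
    flat = feature_map.reshape(c, h * w)
    return flat[:, perm].reshape(c, h, w)

fm = np.arange(12, dtype=float).reshape(3, 2, 2)
shuffled = shuffle_locations(fm, seed=42)
# Same values per channel, different locations; same seed -> same shuffle.
```

Because the generator is seeded, the same call reproduces the same permutation, which is the defining property of a pseudo-random (as opposed to truly random) shuffle.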
Applicant Remarks:
Towards the top of page 17, applicant argues that Brebner does not disclose “providing, as part of the machine learnable model, at least one outlier detection part for being trained for detecting outliers in input data of the machine learnable model which do not fit a distribution of the training data”. Applicant argues that the cited paragraph, 0507, “fails to even mention ‘outlier’”.
Examiner Response:
A person of ordinary skill in the art would recognize the typographical error in Brebner, where “outliner” is written instead of “outlier”, i.e., in the context of “outliners [sic] and other obvious errors”. As a result, a person of ordinary skill in the art would readily recognize that Brebner at 0507 teaches detecting (and removing) outliers.
Applicant Remarks:
In the middle of page 18 of applicant’s remarks, applicant argues that Brebner does not disclose, “generating, as part of the training of the machine learning model, a pseudo outlier feature map by modifying one or more previously generated feature maps which are generated for the instance of spatial sensor data, to mimic a presence of an actual outlier in the input data of the machine learnable model”.
Examiner Response:
As set forth in the rejection, Kristensen teaches generating feature maps, and Brebner at paragraph 0009 teaches “the synthesized sample node indicates simulated sample data generated in response to identifying missing signal strength or an outlier signal strength corresponding to a particular point of interest”. Brebner thus teaches simulated sample data, i.e., pseudo data (where “pseudo” means “not genuine”), generated in response to an outlier signal strength. Therefore, Kristensen and Brebner together teach a feature map generated from simulated data corresponding to outlier data, i.e., a pseudo outlier feature map.
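For illustration only (this sketch appears in neither reference nor the claims), generating a "pseudo outlier" feature map by modifying a genuine feature map can be pictured as follows; the perturbation scale is a hypothetical choice, not taken from either reference:

```python
import numpy as np

def make_pseudo_outlier(feature_map: np.ndarray, scale: float = 10.0,
                        seed: int = 0) -> np.ndarray:
    """Add large, seeded noise to a genuine feature map so the result
    mimics data that does not fit the training distribution."""
    rng = np.random.default_rng(seed)
    return feature_map + scale * rng.standard_normal(feature_map.shape)

genuine = np.zeros((4, 4))                       # stand-in genuine feature map
pseudo_outlier = make_pseudo_outlier(genuine, scale=10.0, seed=1)
# The modified map is "not genuine" (pseudo) and deviates strongly
# from the statistics of the original, mimicking an actual outlier.
```

The sketch only illustrates the concept of a modified, not-genuine feature map standing in for an actual outlier; neither reference is represented as disclosing this particular perturbation.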
Conclusion
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Sally T. Ley whose telephone number is (571)272-3406. The examiner can normally be reached Monday - Thursday, 10:00am - 6:00pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo can be reached at (571) 270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/STL/Examiner, Art Unit 2147
/VIKER A LAMARDO/Supervisory Patent Examiner, Art Unit 2147