Prosecution Insights
Last updated: April 19, 2026
Application No. 18/236,037

METHOD FOR PARAMETERIZING AN IMAGE SYNTHESIS FROM A 3-D MODEL

Final Rejection (§103)

Filed: Aug 21, 2023
Examiner: TRUONG, KARL DUC
Art Unit: 2614
Tech Center: 2600 (Communications)
Assignee: Dspace GmbH
OA Round: 2 (Final)

Grant Probability: 52% (Moderate) • OA Rounds: 3-4 • To Grant: 2y 7m • With Interview: 83%

Examiner Intelligence

Career Allow Rate: 52% (15 granted / 29 resolved; -10.3% vs TC avg)
Interview Lift: +31.0% (strong; among resolved cases with interview)
Avg Prosecution: 2y 7m (45 applications currently pending)
Total Applications: 74 (across all art units)

Statute-Specific Performance

§101: 3.2% (-36.8% vs TC avg)
§103: 85.3% (+45.3% vs TC avg)
§102: 9.5% (-30.5% vs TC avg)
§112: 2.1% (-37.9% vs TC avg)

Comparisons are against the Tech Center average estimate • Based on career data from 29 resolved cases

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

This action is in response to the amendment filed on 28th August, 2025. Claims 1, 8, and 13 have been amended. Claims 14-20 have been added. Claims 1-20 remain rejected. Applicant's amendments to the specification have overcome each and every objection previously set forth in the non-final Office action mailed 5th May, 2025.

Response to Arguments

Applicant's arguments filed on 28th August, 2025 with respect to the rejection of Claims 1 and 13 under 35 U.S.C. § 103 assert that the prior art does not teach "parameterizing the program logic according to the output parameter set, wherein the neural network is designed as an autoencoder." The proposed amended claim limitations have been fully considered, but they are not persuasive. In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., "the autoencoder is not the neural network from which the intermediate layer activation levels are read") are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993). Additionally, if the prior art structure is capable of performing the intended use, then it meets the claim. Therefore, applicant's remark cannot be considered persuasive. In response to applicant's argument that the prior art does not teach "parameterizing the program logic according to the output parameter set, wherein the neural network is designed as an autoencoder" as recited in Claim 1, these limitations are taught by Taralova.
In particular, Taralova teaches the following:

- Paragraph [0098]: the machine learning algorithm outputting parameters, where "various parameters which may be output from such an algorithm include, but are not limited to, a size of the grid used as input to the SVM, brightness, exposure, distance, Bayer filtering, number of histogram bins, a bireflectance distribution function (BRDF), noise, optical components and/or distortion, Schott noise, dark current, etc."
- Paragraph [0063]: the machine learning model being stacked auto-encoders.

Therefore, applicant's remark cannot be considered persuasive.

Regarding the arguments directed to Claims 2-12 and 14-20, those claims depend directly or indirectly on independent Claims 1 and 13, respectively. Applicant does not argue anything beyond independent Claims 1 and 13. The limitations of those claims, in combination, were previously established as explained above.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-7 and 10-13 are rejected under 35 U.S.C. 103 as being unpatentable over Taralova (US 20210027111 A1, previously cited) in view of King (US 11494538 B1, previously cited).

Regarding Claim 1, Taralova discloses a method for parameterizing a program logic for image synthesis, which is designed to synthesize a photorealistic perspective representation of a 3D model (Taralova, FIG. 3 teaches a method for generating three types of data sets to train a machine learning model to generate a simulated environment), [Image: media_image1.png] the appearance of which depends on a variety of adjustable parameters (Taralova, [0028]: teaches "tuning one or more parameters <read on adjustable parameters> to improve the similarity (and thus, decrease the difference) between the real environment 100 and the simulated environment 114," where "the one or more parameters can include, but are not limited to, white balance (e.g., temperature, tint, etc.), color global (e.g., saturation, contrast, gamma, gain, offset, etc.), color shadow (e.g., saturation, contrast, gamma, gain, offset, shadow maximum, etc.), color midtones (e.g., saturation, contrast, gamma, gain, offset, etc.), color highlights (e.g., saturation, contrast, gamma, gain, offset, highlight minimum, etc.), blue correction, gamut, chromatic aberration (e.g., intensity, start offset, etc.), bloom (e.g., method, intensity, threshold, etc.), shutter speed, ISO, exposure (e.g., compensation, metering mode, minimum brightness, low percent, high percent, etc.), histogram log minimum, histogram log maximum, calibration constant, lens flare (e.g., intensity, Bokeh size, threshold, etc.), vignette intensity, grain (e.g., jitter, intensity, etc.), material properties, angles, distance, etc."), the method comprising: providing a digital photograph of a three-dimensional scene (Taralova, [0027]: teaches "the evaluating system 128 can input real data (an image) <read on digital photograph> and simulated data (a simulated image) into an artificial neural network and can compare activations of the Nth neural network layer 112N associated with the real data input to the network (e.g., of the real environment 100 <read on 3D scene>) with activations of the Nth neural network layer 126N associated with input of simulated data to the same network (e.g., of the simulated environment 114)
to determine a similarity between activations of the two neural network layers"); processing the digital photograph by a neural network (Taralova, [0027]: teaches inputting real data <read on digital photograph> to a neural network, which is associated with a real environment 100; [0028]: teaches "different parameters can be adjusted based on the type of output(s) analyzed <read on processing> (e.g., vision, LIDAR, etc.)"); extracting a first representation of the photograph from a selection of neurons from the neural network (Taralova, [0086]: teaches "the training system 246 can select a first intermediate output <read on extracting first representation from photograph> and a second intermediate output," where "the first intermediate output can be associated with a first perceived object in a first image associated with a real environment and the second intermediate output can be associated with a second perceived object in a second image associated with a real environment"; [0087]: teaches "the training system 246 can compare neural network activations <read on selected neurons> of the intermediate outputs, which are associated with a same layer of a neural network, to determine the similarity metric"); providing a digital three-dimensional model of the scene (Taralova, [0013]: teaches training data that includes pairs of images of real-world environments/real environments <read on 3D model of scene> with neural network activations associated with each image of the pairs of images); parameterizing the program logic according to an initial set of output parameters (Taralova, [0077]: teaches the training system 246 generating training data <read on initial set of output parameters>, "which can include pairs of neural network activations that are associated with “different,” “same,” and/or “similar” subjects (e.g., data (e.g., images), portions of data, objects, portions of objects, etc.)"; [0067]: teaches "a similarity score output from the trained SVM (or 
otherwise) can be used as an input to another machine learning algorithm and/or optimization," where "such an algorithm may incorporate the determined similarity metric as a loss function so that the model learns which parameters can be tuned <read on parameterize program logic> to create simulated data (e.g., simulated images, LIDAR, RADAR, etc.) which causes activations in neural networks used to evaluate real data"); synthesizing a synthetic image recreating the digital photograph via the parameterized program logic based on the three-dimensional model (Taralova, [0013]: teaches "the accuracy of a simulated environment can be analyzed using the machine-trained data model by comparing intermediate outputs of a neural network (e.g., activations) in response to an image associated with a real environment and a corresponding image associated with a simulated environment <read on synthetic image>"); processing the synthetic image by the neural network (Taralova, [0032]: teaches system 200 "tuning simulated data <read on processing synthetic image> for optimized neural network activation"); extracting a second representation of the synthetic image from the same selection of neurons from which the first representation is extracted (Taralova, [0092]: teaches "receiving <read on extracting> a pair of intermediate outputs, a first intermediate output of the pair of intermediate outputs being associated with a real environment and a second intermediate output of the pair of intermediate outputs <read on second representation of synthetic image> being associated with a corresponding simulated environment"; [0093]: teaches "the evaluating system 248 can analyze a first intermediate output of the vision system (e.g., based on a simulated environment) with a second intermediate output of the vision system (e.g., based on a corresponding real environment) and can determine a similarity metric (e.g., a difference) that can be representative of how similarly the simulated 
environment and the real environment activate a neural network <read on same selection of neurons>"; Note: it should be noted that although not expressly disclosed, one skilled in the art would understand that specific neurons in a neural network would be used for each input type, such as differentiating shape types); calculating a distance between the synthetic image and the photograph using a metric taking into account the first representation and the second representation (Taralova, [0093]: teaches comparing activations "by discretizing a region of an input space into corresponding grids and building histograms of activations in the associated grids for input data and comparison data," where "once determined, the histograms may be analyzed, for example, by a SVM, wherein a distance is used to determine how similar the two data sets <read on calculating distance between synthetic image and photograph> are"); producing an improved set of output parameters [[by an evolutionary algorithm]] with the following method steps a) to c) (Taralova, [0028]: teaches "tuning one or more parameters <read on improved set of output parameters> to improve the similarity (and thus, decrease the difference) between the real environment 100 and the simulated environment 114"): (a) producing a plurality of parameter sets by varying the output parameter set (Taralova, [0028]: teaches "tuning one or more parameters to improve the similarity (and thus, decrease the difference) between the real environment 100 and the simulated environment 114," where "different parameters can be adjusted <read on varying> based on the type of output(s) analyzed (e.g., vision, LIDAR, etc.)"); (b) for each set of parameters from the plurality of parameter sets:parameterizing the program logic according to the parameter set (Taralova, [0098]: teaches the machine learning algorithm outputting parameters <read on set of parameters>, where "various parameters which may be output <read on parameterizing program 
logic> from such an algorithm include, but are not limited to, a size of the grid used as input to the SVM, brightness, exposure, distance, Bayer filtering, number of histogram bins, a bireflectance distribution function (BRDF), noise, optical components and/or distortion, Schott noise, dark current, etc."); resynthesizing the synthetic image via the program logic parameterized according to the parameter set (Taralova, [0097]: teaches "the evaluating system 248 can tune <read on resynthesize synthetic image> one or more parameters, as illustrated in block 514" to tune "one or more parameters to observe changes to the one or more metrics"; [0096]: teaches the machine learning system utilizing "the simulated environment for training, testing, validation, etc., as illustrated in block 512"); [Image: media_image2.png] processing the new synthetic image by the neural network (Taralova, FIG. 5 teaches block 510, where if the difference meets or exceeds a threshold, the parameters are further tuned in block 514 and an updated simulated environment <read on new synthetic image> is generated in block 502); re-extracting the second representation of the new synthetic image from the same selection of neurons from which the first representation is extracted (Taralova, [0092]: teaches "receiving <read on re-extracting> a pair of intermediate outputs, a first intermediate output of the pair of intermediate outputs being associated with a real environment and a second intermediate output of the pair of intermediate outputs <read on second representation of new synthetic image> being associated with a corresponding simulated environment"; [0093]: teaches "the evaluating system 248 can analyze a first intermediate output of the vision system (e.g., based on a simulated environment) with a second intermediate output of the vision system (e.g., based on a corresponding real environment) and can determine a similarity metric (e.g., a difference) that can
be representative of how similarly the simulated environment and the real environment activate a neural network <read on same selection of neurons>"); and calculating the distance between the new synthetic image and the digital photograph (Taralova, [0093]: teaches comparing activations "by discretizing a region of an input space into corresponding grids and building histograms of activations in the associated grids for input data and comparison data," where "once determined, the histograms may be analyzed, for example, by a SVM, wherein a distance is used to determine how similar the two data sets <read on calculating distance between new synthetic image and photograph> are"); c) selecting a parameter set via which a synthetic image was synthesized in method step b) with a shorter distance than one synthesized via the output parameter set, as a new output parameter set (Taralova, FIG. 5 teaches determining whether the difference between distance meets or exceeds a threshold and if so, tune the parameters to generate an updated simulated environment; Note: it should be noted that over time, the difference in distance between the two data sets will get smaller until it no longer meets or exceeds a threshold); and repeating the method steps (a) to (c) until the distance between the synthetic image synthesized via the output parameter set and the photograph meets a termination criterion [[of the evolutionary algorithm]] (Taralova, [0097]: teaches "when the similarity metric (e.g., the difference) is below the threshold (e.g., the first intermediate output and the second intermediate output are similar) or some other stopping criterion is reached <read on meeting a termination criterion>, the evaluating system 248 can determine that the simulated environment and the real environment similarly activate the neural network," where "another stopping criterion can correspond to a change in difference (or other similarity metric) falling below a threshold"; FIG. 
5 teaches block 510, where if the difference does not meet or exceed a threshold, then the simulated environment is used for training, testing, and validation in block 512); and parameterizing the program logic according to the output parameter set (Taralova, [0098]: teaches the machine learning algorithm outputting parameters, where "various parameters which may be output <read on parameterizing program logic> from such an algorithm include, but are not limited to, a size of the grid used as input to the SVM, brightness, exposure, distance, Bayer filtering, number of histogram bins, a bireflectance distribution function (BRDF), noise, optical components and/or distortion, Schott noise, dark current, etc."), wherein the neural network is designed as an autoencoder (Taralova, [0063]: teaches the machine learning model being stacked auto-encoders).

However, Taralova does not expressly disclose producing an improved set of output parameters by an evolutionary algorithm with the following method steps a) to c); and repeating the method steps (a) to (c) until the distance between the synthetic image synthesized via the output parameter set and the photograph meets a termination criterion of the evolutionary algorithm.

King discloses producing an improved set of output parameters by an evolutionary algorithm with the following method steps a) to c) (King, [Column 4, Lines 8-12]: teaches a simulation computing system generating simulations of a driving environment using evolutionary algorithms); and repeating the method steps (a) to (c) until the distance between the synthetic image synthesized via the output parameter set and the photograph meets a termination criterion of the evolutionary algorithm (King, [Column 10, Lines 24-28]: teaches a simulation computing system using an evolutionary algorithm, where the system will perform the termination 116 step when a simulation violates a limitation <read on termination criteria>).
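For readers following the mapping, the claimed steps (a) to (c) amount to a simple evolutionary search over renderer parameters. The sketch below is illustrative only, not an implementation from the application or the cited references: the parameter names, the Gaussian mutation, and the scalar stand-in for "synthesize the image and measure its distance to the photograph" are all hypothetical.

```python
import random

def evolve_parameters(initial_params, distance_fn, n_variants=8,
                      sigma=0.1, max_iters=200, tol=1e-3, seed=0):
    """Steps (a)-(c) as a toy evolutionary search: vary the output
    parameter set, score each variant by the distance between its
    synthetic image and the photograph, and keep any variant that
    achieves a shorter distance than the current output set."""
    rng = random.Random(seed)
    best = dict(initial_params)
    best_dist = distance_fn(best)
    for _ in range(max_iters):
        if best_dist <= tol:  # termination criterion of the algorithm
            break
        # (a) produce a plurality of parameter sets by varying the set
        variants = [{k: v + rng.gauss(0.0, sigma) for k, v in best.items()}
                    for _ in range(n_variants)]
        # (b) parameterize the renderer, resynthesize, and measure each one
        scored = [(distance_fn(p), p) for p in variants]
        # (c) select a set with a shorter distance as the new output set
        cand_dist, cand = min(scored, key=lambda t: t[0])
        if cand_dist < best_dist:
            best_dist, best = cand_dist, cand
    return best, best_dist

# Hypothetical stand-in for "synthesize, then compare to the photograph":
# the distance is simply how far the parameters sit from a hidden target.
TARGET = {"exposure": 0.8, "gain": 1.2}
toy_distance = lambda p: sum((p[k] - TARGET[k]) ** 2 for k in p)
start = {"exposure": 0.0, "gain": 0.0}
best, best_dist = evolve_parameters(start, toy_distance)
```

The accept-only-if-shorter rule in step (c) is what distinguishes this loop from plain random search, and the `tol` check is one plausible reading of the claimed termination criterion; a stalled-progress check would serve equally well.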
King is analogous art with respect to Taralova because they are from the same field of endeavor, namely training neural networks for safe autonomous driving. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to train neural networks using an evolutionary/genetic algorithm for simulating autonomous driving through a simulated environment, as taught by King, in the teaching of Taralova. The motivation for doing so is that it would narrow down the number of machine learning agents, leaving only the safest and best-performing autonomous driving model(s). Therefore, it would have been obvious to combine King with Taralova.

Regarding Claim 13, it recites limitations that are similar in scope to those of Claim 1, but in a test bench setup. As shown in the rejection, the combination of Taralova and King discloses the limitations of Claim 1. Additionally, Taralova discloses a test bench setup to test a control system set up to feed image data into the control system via a camera model, on which a program logic is programmed for image synthesis that is designed to synthesize a photorealistic perspective representation of a 3D model (Taralova, [0048]: teaches a computing device <read on control system> including "a simulation system 244, a training system 246 <read on test bench setup>, an evaluating system 248, a map(s) storage 250 (e.g., storing one or more maps), a training data storage 252 (e.g., storing training data accessible to the training system 246), and a model(s) storage 254 (e.g., models output by the training system 246)"; [0044]: teaches the vehicle including cameras <read on camera model> to capture input data of the real world environment; [0072]: teaches "memory 218 and 242 can store an operating system and one or more software applications, instructions, programs, and/or data"; [0091]: teaches generating a simulated environment <read on image synthesis>), the appearance of
which depends on a variety of adjustable parameters (Taralova, [0028]: teaches "tuning one or more parameters <read on adjustable parameters> to improve the similarity (and thus, decrease the difference) between the real environment 100 and the simulated environment 114," where "the one or more parameters can include, but are not limited to, white balance (e.g., temperature, tint, etc.), color global (e.g., saturation, contrast, gamma, gain, offset, etc.), color shadow (e.g., saturation, contrast, gamma, gain, offset, shadow maximum, etc.), color midtones (e.g., saturation, contrast, gamma, gain, offset, etc.), color highlights (e.g., saturation, contrast, gamma, gain, offset, highlight minimum, etc.), blue correction, gamut, chromatic aberration (e.g., intensity, start offset, etc.), bloom (e.g., method, intensity, threshold, etc.), shutter speed, ISO, exposure (e.g., compensation, metering mode, minimum brightness, low percent, high percent, etc.), histogram log minimum, histogram log maximum, calibration constant, lens flare (e.g., intensity, Bokeh size, threshold, etc.), vignette intensity, grain (e.g., jitter, intensity, etc.), material properties, angles, distance, etc."), and on which a camera emulation is programmed to emulate the camera model, which is set up to read images synthesized by the program logic (Taralova, [0039]: teaches "the sensor system(s) 206 can include multiple instances of each of these or other types of sensors," which include camera sensors; [0046]: teaches "the vehicle computing device(s) 204, sensor system(s) 206, emitter(s) 208, and the communication connection(s) 210 can be implemented outside of an actual vehicle (i.e., not onboard the vehicle 202), for instance, as a simulated vehicle or as simulated systems <read on camera emulation>, for use in “traversing” a simulated environment") and to generate an image data stream and feed it into an image data input of the control system, the test bench setup (Taralova, [0078]: teaches 
generating a "different" data set, where "the training system 246 can build a "different" data set, representative of pairs of data (e.g., images <read on image data stream>) of different objects (or portions thereof)" and the images are analyzed <read on feed into image data input> for the purpose of comparisons) comprising: a computer program product (Taralova, [0072]: teaches "memory 218 and 242 can store an operating system and one or more software applications <read on computer program product>, instructions, programs, and/or data"):…

Thus, Claim 13 is met by Taralova according to the mapping presented in the rejection of Claim 1, given that the method corresponds to a test bench setup.

Regarding Claim 2, the combination of Taralova and King discloses the method of Claim 1. Additionally, Taralova further discloses wherein the selection of neurons comprises, at least proportionately, neurons from a hidden layer of the neural network (Taralova, FIG. 4 teaches the neural network comparing a first intermediate output of a pair of intermediate outputs with a second intermediate output of a pair of intermediate outputs; the intermediate outputs are being interpreted as neurons from hidden layers). [Image: media_image3.png]

Regarding Claim 3, the combination of Taralova and King discloses the method of Claim 1. Additionally, Taralova further discloses capturing the digital photograph with a camera model provided for feeding image data or camera raw data into a control system for controlling a robot, a semi-autonomous vehicle or an autonomous vehicle (Taralova, [0015]: teaches a real environment 100 perceived by an autonomous vehicle 102 <read on capturing digital photograph with camera model> as shown in FIG.
1A; [0015]: further teaches "the autonomous vehicle 102 can utilize a detector 108 to analyze the sensor data 104 to map and make determinations about the real environment 100 within which it is positioned," where "such an autonomous vehicle 102 can utilize the map of the real environment 100 to determine trajectories for driving within the real environment 100" and "can utilize the sensor data 104 to detect objects in the real environment 100, segment the real environment 100, localize its position in the real environment 100, classify objects in the real environment 100, etc."). [Image: media_image4.png]

Regarding Claim 4, the combination of Taralova and King discloses the method of Claim 3. Additionally, Taralova further discloses generating, after completion of the parameterization of the program logic, synthetic image data or camera raw data via the program logic (Taralova, [0016]: teaches "the detector 108 can represent a system that analyzes the sensor data 104 and generates one or more outputs 110 <read on synthetic image data/camera raw data> based on the sensor data 104," where "'image' can refer to any output whether the image is an image captured by a vision system or an aggregate presentation of data generated from another modality (e.g., LIDAR, RADAR, ToF systems, etc.)"; Note: it should be noted that the system determines if the generated output is based on a real image (photo) or a simulated image); and feeding the synthetic image data into the control system for testing or validation of the control system or training a neural network of the control system using the synthetic image data (Taralova, [0025]: teaches simulated environments being useful "for enhancing training, testing, and/or validating systems <read on testing/validating control system> (e.g., one or more components of an AI stack) onboard an autonomous vehicle").

Regarding Claim 5, the combination of Taralova and King discloses the method of Claim 1.
Additionally, Taralova further discloses wherein the first representation and the second representation are designed as a set of activation function values or activation function arguments of neurons from the selection of neurons (Taralova, [0067]: teaches a machine learning algorithm incorporating "the determined similarity metric as a loss function so that the model learns which parameters can be tuned to create simulated data <read on second representation> (e.g., simulated images, LIDAR, RADAR, etc.) which causes activations <read on set of activation function values> in neural networks used to evaluate real data <read on first representation>"; Note: it should be noted that one skilled in the art understands that an activation function in machine learning is used by neural network models to learn complex patterns and relationships in a given dataset).

Regarding Claim 6, the combination of Taralova and King discloses the method of Claim 1. Additionally, Taralova further discloses wherein the neural network is designed as a classifier for the recognition of at least one object type (Taralova, [0068]: teaches "a resulting data model can be provisioned to, or accessible by, the vehicle 202, and the vehicle 202 can utilize the data model for classifying objects <read on classifier of object type> in real-time (e.g., while driving or otherwise operating in the real environment)," where "the perception system 222 can utilize the data model (trained based on simulated data associated with a simulated environment) onboard in near real-time to classify objects").

Regarding Claim 7, the combination of Taralova and King discloses the method of Claim 1.
Additionally, Taralova further discloses training the neural network by contrastive learning (Taralova, [0082]: teaches training a neural network model, where "the training system 246 can leverage a two-class SVM to discriminate between same/similar and different data sets, a two-class SVM to discriminate <read on contrastive learning> between same and different/moving data sets, a three-class SVM, and/or a two-class SVM to discriminate between same and different data sets"; Note: it should be noted that although not expressly stated, one skilled in the art would understand that contrastive learning is discriminative, where the model learns to differentiate and find similarities between two or more datasets).

Regarding Claim 10, the combination of Taralova and King discloses the method of Claim 1. Additionally, Taralova further discloses calculating the distance by calculating a similarity between a first histogram of a frequency of vectors or scalars in the second representation (Taralova, [0031]: teaches the evaluating system 128 comparing "the histograms 134 and 138 to generate a vector resulting from the comparison of the histograms 134 and 138"; [0052]: teaches "discretizing a region of an input space (e.g., a region of input data, a region of an input image, etc.) into corresponding grids and building histograms of activations <read on first histogram of frequency of vectors> in the associated grids for input data and comparison data" by analyzing histograms of activations between real and simulated environments <read on second representation>, where "a distance (e.g., a statistical distance) is used to determine how similar the two data sets are") and a second histogram of a frequency of vectors or scalars in the first representation (Taralova, [0051]: teaches "the evaluating system 248 can analyze a first intermediate output of the vision system (e.g., based on a simulated environment) with a second intermediate output of the vision system (e.g., based on a corresponding real environment) and can determine a similarity metric (e.g., a difference) that can be representative of how similar the perception system 222 views the simulated environment when compared to the corresponding real environment <read on first representation> (e.g., how similar the networks are activated)"; [0052]: teaches "discretizing a region of an input space (e.g., a region of input data, a region of an input image, etc.) into corresponding grids and building histograms of activations <read on second histogram of frequency of vectors> in the associated grids for input data and comparison data" by analyzing histograms of activations between real <read on first representation> and simulated environments, where "a distance (e.g., a statistical distance) is used to determine how similar the two data sets are").

Regarding Claim 11, the combination of Taralova and King discloses the method of Claim 1. Additionally, Taralova further discloses calculating the distance by calculating at least one vector similarity or a distance between the second representation and the first representation (Taralova, [0052]: teaches "discretizing a region of an input space (e.g., a region of input data, a region of an input image, etc.)
into corresponding grids and building histograms of activations in the associated grids for input data and comparison data" by analyzing histograms of activations, where "a distance (e.g., a statistical distance) is used to determine how similar the two data sets <read on calculating distance between second and first representations> are"; Note: it should be noted that the data sets are being interpreted to be data from the real and simulated environments, which are being interpreted as first and second representations respectively).

Regarding Claim 12, the combination of Taralova and King discloses the method of Claim 1. Additionally, Taralova further discloses wherein the selection of neurons is designed as a selection of layers of the neural network and includes all neurons belonging to the respective layer from the selection of layers (Taralova, [0094]: teaches "the evaluating system 248 can compare each layer of the neural network or a sampling of the layers of the neural network," where the evaluating system 248 can select the pair of intermediate outputs <read on neurons> and representative layers (e.g., the last layer before an output or downsampling); [0094]: further teaches "the evaluating system 248 can select each layer <read on selection of layers> of the neural network layers for comparison").

Claims 8 and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Taralova (US 20210027111 A1, previously cited) in view of King (US 11494538 B1, previously cited) as applied to Claim 1 above, and further in view of Wiest et al. (US 11537134 B1, previously cited), hereinafter referenced as Wiest.

Regarding Claim 8, the combination of Taralova and King discloses the method of Claim 1.
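The histogram comparison mapped onto Claims 10 and 11 above can be made concrete. The sketch below is illustrative only: the binning range, the L1 distance between normalized activation histograms, and the synthetic sample data are all hypothetical choices (Taralova describes feeding such histograms to an SVM rather than taking a direct histogram distance).

```python
import random

def histogram(values, bins=16, lo=-1.0, hi=1.0):
    """Discretize a range of activation values into a normalized histogram
    (the 'grids' of binned activations described in the reference)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v < hi:
            counts[int((v - lo) / width)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def activation_distance(first_rep, second_rep, bins=16):
    """Distance between the first representation (photograph activations)
    and the second representation (synthetic-image activations): the L1
    difference of their activation histograms."""
    h1, h2 = histogram(first_rep, bins), histogram(second_rep, bins)
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Synthetic stand-ins for layer activations: a well-parameterized synthesis
# should activate the network like the photograph; a poor one should not.
rng = random.Random(0)
photo_acts = [rng.gauss(0.0, 0.3) for _ in range(2048)]
close_acts = [rng.gauss(0.0, 0.3) for _ in range(2048)]  # similar synthesis
far_acts = [rng.gauss(0.5, 0.3) for _ in range(2048)]    # poor parameters
```

A distance defined this way is scalar and monotone in dissimilarity, which is all the evolutionary selection step needs in order to rank candidate parameter sets.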
The combination of Taralova and King does not expressly disclose the limitations of Claim 8; however, Wiest discloses wherein the first representation is an encoded representation of the digital photograph extracted from at least one layer of the autoencoder arranged between an encoder part and a decoder part (Wiest, [Column 20, Lines 26-30]: teaches "a multi-layer encoding of at least a portion of the data sets, representing the driving environment state in a reduced-dimension or abstracted format (i.e., not just in the form of raw pixels) may be generated (element 904)"; [Column 20, Lines 40-44]: teaches "using the encodings as well as observations obtained from the environment <read on first representation> and/or representations <read on encoded representation> of latent variables (such as goals of drivers and other entities which may be obtained from external estimators), a deep neural network-based state prediction model may be trained," where the neural network model is a conditional variational auto-encoder (CVAE); Note: it should be noted that one skilled in the art would understand that it is common to extract features of an input in an encoder-decoder neural network model, which is performed between the encoder and decoder layers). Wiest is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely training neural networks for safe autonomous driving using generated simulated environments. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to utilize encodings and observations of the surrounding real-world environment as input for a conditional variational auto-encoder (CVAE) neural network model as taught by Wiest into the teaching of Taralova, in view of King. 
The suggestion for doing so would allow the neural network model to extract latent variables of the environment to predict changes in the surrounding environment, thereby teaching the model to adopt safer driving behavior and yielding desired results. Therefore, it would have been obvious to combine Wiest with Taralova, in view of King. Regarding Claim 9, the combination of Taralova, King, and Wiest discloses the method of Claim 8. The combination of Taralova and King does not expressly disclose the limitations of Claim 9; however, Wiest discloses training the autoencoder with the training goal of a perfect reconstruction by the decoder part of an image encoded by the encoder part (Wiest, [Column 17, Lines 32-34]: teaches "abstract representations of the vehicle operation environment may be generated in the form of encodings that are suitable for use as inputs to deep neural network models"; [Column 19, Lines 26-31]: teaches training the neural network model to recognize infrastructure objects; [Column 19, Lines 50-57]: teaches a neural network model processing tasks, such as "recognizing various infrastructure elements, e.g., by transforming raw input data representing infrastructure elements into internal representations, and then using the internal representations to reconstruct <read on perfect reconstruction by decoder> or learn the input representations of the infrastructure," where "such a transformation-reconstruction approach may be referred to as an autoencoder-decoder technique"). Wiest is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely training neural networks for safe autonomous driving using generated simulated environments. 
Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to utilize encodings and observations of the surrounding real-world environment as input for a conditional variational auto-encoder (CVAE) neural network model to generate an accurate simulated representation of the surrounding real-world environment as taught by Wiest into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network model to extract latent variables of the environment to predict and anticipate changes in the surrounding environment, thereby teaching the model to adopt safer driving behavior and yielding desired results. Therefore, it would have been obvious to combine Wiest with Taralova, in view of King.

Claims 14-16 are rejected under 35 U.S.C. 103 as being unpatentable over Taralova (US 20210027111 A1, previously cited) in view of King (US 11494538 B1, previously cited) as applied to Claim 1 above, and further in view of Lakshmi Narayanan et al. (US 20200089977 A1), hereinafter referenced as Lakshmi.

Regarding Claim 14, the combination of Taralova and King discloses the method of Claim 1. The combination of Taralova and King does not expressly disclose the limitations of Claim 14; however, Lakshmi discloses wherein the selection of neurons is formed exclusively of complete layers of the neural network (Lakshmi, [0063]: teaches a plurality of fully connected layers <read on complete layers> as shown in FIG. 4; Note: an alternative term for "complete layers" is "fully connected layers", where every neuron in each layer is connected to every neuron in the preceding layer; in addition, it is common in the art to create complete layers by selecting and connecting neurons from a current layer to a previous layer).
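For Claims 8 and 9, the cited arrangement (an encoder part, a latent layer between encoder and decoder from which the encoded representation is read, and training toward reconstruction of the input) can be illustrated with a minimal sketch. This hypothetical linear numpy autoencoder is a stand-in for Wiest's CVAE, not an implementation from the record; the dimensions and learning rate are assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
images = rng.uniform(0.0, 1.0, size=(64, 16))  # 64 tiny "photographs", 16 pixels each

# Encoder part (16 -> 4) and decoder part (4 -> 16); the 4-wide layer
# between them holds the encoded representation.
W_enc = rng.normal(0.0, 0.1, size=(16, 4))
W_dec = rng.normal(0.0, 0.1, size=(4, 16))

lr = 0.05
for _ in range(500):
    latent = images @ W_enc            # layer between encoder and decoder
    recon = latent @ W_dec
    err = recon - images               # training goal: reconstruct the input
    # Gradient steps on the mean squared reconstruction error.
    W_dec -= lr * latent.T @ err / len(images)
    W_enc -= lr * images.T @ (err @ W_dec.T) / len(images)

encoded = images @ W_enc               # extracted "first representation"
mse = float(np.mean((encoded @ W_dec - images) ** 2))
print(encoded.shape, round(mse, 4))    # reduced-dimension codes, reduced error
```

The Claim 9 limitation is the training objective (driving the reconstruction error toward zero, i.e. toward a perfect reconstruction); the Claim 8 limitation is where the representation is read, namely the layer between the two parts.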
Lakshmi is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely generating image views for autonomous vehicles. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement a fully connected network to obtain image features as taught by Lakshmi into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to associate objects with context temporally, thereby improving results. Therefore, it would have been obvious to combine Lakshmi with Taralova, in view of King.

Regarding Claim 15, the combination of Taralova, King, and Lakshmi discloses the method of Claim 14. The combination of Taralova and King does not expressly disclose the limitations of Claim 15; however, Lakshmi discloses wherein at least one of the complete layers is a hidden layer (Lakshmi, [0074]: teaches the convolutor 110 including processing layers, such as fully connected layers and hidden layers; [0081]: teaches an image sequence being passed through one or more processing layers, such as a fully connected layer or a hidden layer). Lakshmi is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely generating image views for autonomous vehicles. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement a fully connected network to obtain image features as taught by Lakshmi into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to associate objects with context temporally, thereby improving results. Therefore, it would have been obvious to combine Lakshmi with Taralova, in view of King.
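The layer-as-selection reading in Claims 12, 14, and 15 (selecting a layer automatically includes every neuron belonging to it, and a selected layer may be a hidden layer) can be illustrated with a hypothetical sketch; the tiny network and its dimensions are mine, not the references'.

```python
import numpy as np

rng = np.random.default_rng(1)

# A tiny MLP: input (8) -> hidden (6) -> hidden (4) -> output (2).
weights = [rng.normal(size=(8, 6)), rng.normal(size=(6, 4)), rng.normal(size=(4, 2))]

def forward_collect(x, weights):
    """Run the network and record each layer's full activation vector,
    so that 'selecting a layer' selects all neurons belonging to it."""
    activations = []
    for W in weights:
        x = np.maximum(x @ W, 0.0)   # ReLU layer
        activations.append(x)
    return activations

layer_acts = forward_collect(rng.normal(size=8), weights)

# Selection of layers: every hidden layer (all but the output layer).
selection = layer_acts[:-1]
sizes = [a.size for a in selection]
print(sizes)  # -> [6, 4]: each selected layer contributes all of its neurons
```

Reading the full activation vector of a hidden layer this way is also what Claim 16's "complete intermediate representation ... stored in the hidden layer" amounts to under the examiner's mapping.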
Regarding Claim 16, the combination of Taralova, King, and Lakshmi discloses the method of Claim 15. The combination of Taralova and King does not expressly disclose the limitations of Claim 16; however, Lakshmi discloses wherein the first representation includes at least one complete intermediate representation of a digital photograph stored in the hidden layer (Lakshmi, [0065]: teaches processor 102 extracting an image representation <read on complete intermediate representation of digital photograph> from a hidden layer <read on stored in hidden layer> of a CNN of the convolutor, where the hidden layer is a processing layer, which can be a fully connected layer). Lakshmi is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely generating image views for autonomous vehicles. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement a fully connected network to obtain image features as taught by Lakshmi into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to associate objects with context temporally, thereby improving results. Therefore, it would have been obvious to combine Lakshmi with Taralova, in view of King.

Claims 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over Taralova (US 20210027111 A1, previously cited) in view of King (US 11494538 B1, previously cited) as applied to Claim 1 above, and further in view of Wrenninge et al. (US 20190156151 A1), hereinafter referenced as Wrenninge.

Regarding Claim 17, the combination of Taralova and King discloses the method of Claim 1.
The combination of Taralova and King does not expressly disclose the limitations of Claim 17; however, Wrenninge discloses wherein the 3D model is a semantic description of a replica of a scene that can be read and processed by the program logic for image synthesis (Wrenninge, [0040]: teaches the output(s) of Block S100 includes the definition <read on semantic description> of each object property (e.g., defined by parameter values) as well as the layout of each object within the scene (e.g., defined by parameter values), which can be used to generate the three dimensional virtual representation <read on 3D model> in Block S200; the definition of each object property and scene layout is being interpreted as the semantic description of a replica of a scene; [0032]: teaches "individual object classes, such as geometry, materials, color, and placement can be parameterized, and a synthesized image <read on replica of scene> and its corresponding annotations (e.g., of each instance of an object class in a virtual scene) represent a sampling of that parameter space (e.g., multidimensional parameter space)"; [0055]: teaches outputting a plurality of synthesized virtual scenes via scene synthesis <read on image synthesis>). Wrenninge is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely processing input image data for vehicle-purposed neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement an automatic semantic labelling system for detected objects in a scene as taught by Wrenninge into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to recognize various types of detected objects, thereby improving the system. Therefore, it would have been obvious to combine Wrenninge with Taralova, in view of King. 
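The Claim 17 limitation at issue, a 3D model that is a semantic description readable by the program logic for image synthesis, can be illustrated with a hypothetical sketch. The object classes, parameter names, and the `synthesize` stub are mine; Wrenninge describes the general idea of parameterized object definitions consumed by scene synthesis, not this code.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    """One entry of the semantic description: an object class plus
    the parameter values that define it within the scene."""
    object_class: str
    parameters: dict = field(default_factory=dict)

# Semantic description of a replica of a scene (hypothetical content).
scene = [
    SceneObject("car",  {"position": (3.0, 0.0, 12.5), "orientation_deg": (0, 90, 0)}),
    SceneObject("tree", {"position": (-2.0, 0.0, 8.0), "orientation_deg": (0, 15, 0)}),
]

def synthesize(scene):
    """Stand-in for the program logic for image synthesis: it reads and
    processes the description (here it only reports what it would render)."""
    return [f"render {o.object_class} at {o.parameters['position']}" for o in scene]

for line in synthesize(scene):
    print(line)
```

The point of the mapping is that the scene exists as machine-readable definitions (object classes and parameter values) rather than as pixels, so the synthesis step can sample and re-render it.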
Regarding Claim 18, the combination of Taralova, King, and Wrenninge discloses the method of Claim 17. The combination of Taralova and King does not expressly disclose the limitations of Claim 18; however, Wrenninge discloses wherein the semantic description includes a list of graphic objects and an assignment of parameters to each graphic object (Wrenninge, [0037]: teaches determining a set of parameter values, such as DV parameters, of a parameter group, where each DV parameter is determined for each candidate object in a set of candidate objects <read on list of graphic objects with assigned parameters>). Wrenninge is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely processing input image data for vehicle-purposed neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement an automatic semantic labelling system for detected objects in a scene as taught by Wrenninge into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to recognize various types of detected objects, thereby improving the system. Therefore, it would have been obvious to combine Wrenninge with Taralova, in view of King.

Regarding Claim 19, the combination of Taralova, King, and Wrenninge discloses the method of Claim 18. The combination of Taralova and King does not expressly disclose the limitations of Claim 19; however, Wrenninge discloses wherein the parameters include a position and a spatial orientation (Wrenninge, [0035]: teaches determining a set of parameters associated with object classes to be depicted in a scene, where "these parameters can include geometric parameters (e.g., size, three-dimensional position, three-dimensional orientation <read on spatial orientation> or attitude, etc.)").
Wrenninge is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely processing input image data for vehicle-purposed neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement an automatic semantic labelling system for detected objects in a scene as taught by Wrenninge into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to recognize various types of detected objects, thereby improving the system. Therefore, it would have been obvious to combine Wrenninge with Taralova, in view of King.

Regarding Claim 20, the combination of Taralova, King, and Wrenninge discloses the method of Claim 18. The combination of Taralova and King does not expressly disclose the limitations of Claim 20; however, Wrenninge discloses wherein the program logic is designed to generate a suitable texture for each object (Wrenninge, [0058]: teaches generated geometry based on object classes and properties, such as placement, orientation, and certain texture and material aspects <read on generate suitable texture>, where these properties are "generated based on values determined via LDS sampling"). Wrenninge is analogous art with respect to Taralova, in view of King because they are from the same field of endeavor, namely processing input image data for vehicle-purposed neural networks. Before the effective filing date of the claimed invention, it would have been obvious to a person of ordinary skill in the art to implement an automatic semantic labelling system for detected objects in a scene as taught by Wrenninge into the teaching of Taralova, in view of King. The suggestion for doing so would allow the neural network to recognize various types of detected objects, thereby improving the system. Therefore, it would have been obvious to combine Wrenninge with Taralova, in view of King.
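Claims 18 through 20 add per-object parameter assignment (position, spatial orientation) and texture generation. A hypothetical extension of the same idea, where a fixed lookup table stands in for the procedural texture generation Wrenninge describes; the class names and table contents are assumptions:

```python
# Hypothetical class-to-texture table; real program logic would generate
# textures procedurally rather than look them up.
TEXTURES = {"car": "metallic_paint", "tree": "bark", "road": "asphalt"}

def assign_texture(object_class):
    """Produce a suitable texture for an object class, falling back to a
    neutral default for unknown classes."""
    return TEXTURES.get(object_class, "flat_grey")

# Each graphic object carries its assigned parameters (Claims 18-19).
objects = [
    {"class": "car",   "position": (3.0, 0.0, 12.5), "orientation_deg": (0, 90, 0)},
    {"class": "pylon", "position": (1.0, 0.0, 5.0),  "orientation_deg": (0, 0, 0)},
]

# The program logic assigns a texture to every object (Claim 20).
for obj in objects:
    obj["texture"] = assign_texture(obj["class"])

print([o["texture"] for o in objects])  # -> ['metallic_paint', 'flat_grey']
```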
Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Choi et al. (US 20210064954 A1) discloses a CNN that includes a front, back, and other layers that are connected between the front and back layers; and Shrivastava et al. (US 20200311548 A1) discloses a neural network that receives a dataset for generating dataset features.

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KARL TRUONG whose telephone number is (703) 756-5915. The examiner can normally be reached 7:30 AM - 5:00 PM. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Kent Chang, can be reached at (571) 272-7667.
The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would l

Prosecution Timeline

Aug 21, 2023: Application Filed
Apr 23, 2025: Non-Final Rejection (§103)
Aug 28, 2025: Response Filed
Sep 10, 2025: Final Rejection (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12573149: DATA PROCESSING METHOD AND APPARATUS, DEVICE, COMPUTER-READABLE STORAGE MEDIUM, AND COMPUTER PROGRAM PRODUCT (granted Mar 10, 2026; 2y 5m to grant)
Patent 12561875: ANIMATION FRAME DISPLAY METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM (granted Feb 24, 2026; 2y 5m to grant)
Patent 12494013: AUTODECODING LATENT 3D DIFFUSION MODELS (granted Dec 09, 2025; 2y 5m to grant)
Patent 12456258: SYSTEMS AND METHODS FOR GENERATING A SHADOW MESH (granted Oct 28, 2025; 2y 5m to grant)
Patent 12444020: FLEXIBLE IMAGE ASPECT RATIO USING MACHINE LEARNING (granted Oct 14, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 52%
With Interview: 83% (+31.0%)
Median Time to Grant: 2y 7m
PTA Risk: Moderate
Based on 29 resolved cases by this examiner. Grant probability derived from career allow rate.
