Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
This Office Action is in response to the amendment filed on 12/1/2025. Claims 1, 3, 21, and 27-28 are amended. Claims 1-28 are presently pending and are presented for examination.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-28 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. Independent claim 1 is directed toward a machine, independent claim 21 is directed to a machine, independent claim 27 is directed to a method, and independent claim 28 is directed to a method. Independent claims 27 and 28, along with their corresponding dependent claims, are directed to a statutory category of invention under Step 1. Independent claims 1 and 21 are not directed to statutory categories of invention, with explanation provided at the end of the 35 USC 101 section.
Under Step 2A, Prong 1, the claims are analyzed to determine whether one or more of the claims recites subject matter that falls within one of the following groups of abstract ideas: (1) mental processes, (2) certain methods of organizing human activity, and/or (3) mathematical concepts. In this case, independent claims 1, 21, 27, and 28 are directed to an abstract idea without significantly more. Specifically, the claims, under their broadest reasonable interpretation, cover certain mental processes. The language of independent claim 27 is used for illustration:
A method, comprising:
receiving first sensor data comprising a plurality of frames corresponding to a first environment, wherein the first sensor data is generated from a plurality of sensors;
generating, from a first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data (A human with access to sensor data, e.g. camera and camera viewing direction and location data, could use the data to construct a detailed map suitable for navigation by hand.); and
executing a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task.
Independent claims 1 and 27 further recite execute a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task (A human with access to HD map data could execute a perception, localization, or planning task, i.e. visually perceiving objects, determining a location of the vehicle corresponding to other visual data, and planning a route for the vehicle. This is therefore a mental process.).
As explained above, independent claim 27 recites at least one abstract idea. The other independent claims, claims 1, 21, and 28, which are similar in scope to claim 27, likewise recite at least one abstract idea under Step 2A, Prong 1.
Under Step 2A, Prong 2, the claims are analyzed to determine whether the claim, as a whole, integrates the abstract idea into a practical application. As noted in the 2019 PEG, it must be determined whether any additional elements in the claim beyond the abstract idea integrate the exception into a practical application in a manner that imposes a meaningful limit on the judicial exception. The courts have indicated that additional elements such as merely using a computer to implement an abstract idea, adding insignificant extra-solution activity, or generally linking use of a judicial exception to a particular technological environment or field of use do not integrate a judicial exception into a "practical application"; see at least MPEP 2106.04(d).
In this case, the mental processes are not integrated into a practical application. Independent claims 1, 21, 27, and 28 recite additional elements. These limitations amount to implementing the abstract idea on a computer, add insignificant extra-solution activity, and/or generally link use of the judicial exception to a particular technological environment or field of use; see at least MPEP 2106.04(d). More specifically,
one or more memories; and one or more processors… found in independent claims 1 and 21. This limitation amounts to implementing the abstract idea on a computer.
receive first sensor data comprising a plurality of frames corresponding to a first environment, wherein the first sensor data is generated from a plurality of sensors… found with slight variations in independent claims 1, 21, and 27. This limitation amounts to insignificant extra-solution activity.
a first neural implicit surface network… found in independent claims 1, 21, 27, and 28. This limitation amounts to implementing the abstract idea on a computer.
determine a location of a vehicle… found in independent claims 21 and 28. This limitation amounts to generally linking the use of the abstract idea to a particular technological environment or field of use.
render, based on the one or more output modalities, one or more two-dimensional representations… found in independent claims 21 and 28. This limitation amounts to insignificant extra-solution activity.
select[ing] a first neural implicit surface network… found in independent claims 21 and 28. A human could select from a plurality of models representing an environment based on sensor data, e.g. camera or location data, making this limitation a mental process and also an abstract idea.
Therefore, taken alone, the additional elements do not integrate the abstract idea into a practical application. Furthermore, looking at the additional limitations as an ordered combination or as a whole, the limitations add nothing significant that is not already present when looking at the elements taken individually. Because the additional elements do not integrate the abstract idea into a practical application by imposing meaningful limits on practicing the abstract idea, independent claims 1, 21, 27, and 28 are directed to an abstract idea.
Under Step 2B, the claims do not include any additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application in Step 2A, Prong Two, the additional element of limiting the use of the idea to one particular environment employs generic computer functions to execute an abstract idea and, therefore, does not add significantly more. Mere instruction to apply an exception using generic computer components and limiting the use of the abstract idea to a particular environment or field of use cannot provide an inventive concept. Additionally, as discussed above, the remaining limitation(s) as recited above is/are considered insignificant extra-solution activity.
A conclusion that an additional element is insignificant extra-solution activity in Step 2A must be re-evaluated in Step 2B to determine if the element is more than what is well-understood, routine, and conventional in the field. In this case, the additional limitation of one or more memories; and one or more processors… is well-understood, routine, and conventional activity, because the specification does not provide any indication that the one or more memories and one or more processors are anything more than conventional computers. Additionally, the remaining elements have been deemed insignificant extra-solution activity by one or more courts; see at least MPEP 2106.05(d) and MPEP 2106.05(g):
receive first sensor data comprising a plurality of frames corresponding to a first environment, wherein the first sensor data is generated from a plurality of sensors… is considered well-understood, routine, and conventional activity under CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011) (mere data gathering in conjunction with a law of nature or abstract idea such as a step of obtaining information so that the information can be analyzed by an abstract mental process.).
render, based on the one or more output modalities, one or more two-dimensional representations… is considered well-understood, routine, and conventional activity under TLI Communications, 823 F.3d at 612-13, 118 USPQ2d at 1747-48 (Gathering and analyzing information using conventional techniques and displaying the result.).
Because the claims fail to recite anything sufficient to amount to significantly more than the judicial exception, independent claims 1, 21, 27, and 28 are patent ineligible under 35 U.S.C. 101.
Dependent claims 2-20 and 22-26 have been given the full two-part analysis, including analyzing the additional limitations, both individually and in combination. Dependent claims 2-20 and 22-26, when analyzed both individually and in combination, are also patent ineligible under 35 U.S.C. 101 based on the same analysis as above. The additional limitations recited in the dependent claims fail to establish that the dependent claims are not directed to an abstract idea. The additional limitations of the dependent claims, when considered individually and as an ordered combination, do not amount to significantly more than the abstract idea. Accordingly, claims 2-20 and 22-26 are patent ineligible under 35 U.S.C. 101.
Claims 1 and 21 are directed to an apparatus comprising one or more memories and one or more processors configured to cause the apparatus to perform specified functions. This indicates that computer instructions are stored in a memory. There is no indication that the memory is non-transitory. Claims 1 and 21 and their corresponding dependent claims are therefore directed to a signal per se, which is not a statutory category of invention.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-5, 8-12, and 15 are rejected under 35 U.S.C. 103 as being obvious over NPL document “Large-Scale 3D Semantic Reconstruction for Automated Driving Vehicles with Adaptive Truncated Signed Distance Function”, hereinafter “Hu”, NPL document “DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation”, hereinafter “Park”, US 20230136492 A1, hereinafter “Park2”, NPL document “In-Place Scene Labelling and Understanding with Implicit Scene Representation”, hereinafter “Zhi”, and US 20220254165 A1, hereinafter “Yokota”.
Regarding claim 1, Hu, in the same field of endeavor and solving a related problem, discloses An apparatus (See page 5 column 2 paragraph 1-page 7 paragraph 1 and Figs. 7-13, the method is run on several data sets. The method was necessarily run on a computer, which is an apparatus.), comprising:
one or more memories (See page 5 column 2 paragraph 1-page 7 paragraph 1 and Figs. 7-13, the method is run on several data sets. The method was necessarily run on a computer, which inherently comprises a memory.); and
one or more processors, coupled to the one or more memories (See page 5 column 2 paragraph 1-page 7 paragraph 1 and Figs. 7-13, the method is run on several data sets. The method was necessarily run on a computer, which inherently comprises a processor coupled to memory. These were necessarily configured to execute the program described.), configured to cause the apparatus to:
receive first sensor data comprising a plurality of data corresponding to a first environment, wherein the first sensor data is generated from a plurality of sensors (See page 5 column 2 paragraph 2, the method is tested on a dataset recorded with the authors’ experimental vehicle equipped with LIDAR and cameras. This is a plurality of sensors. Camera data necessarily comprises a plurality of frames. LIDAR sensors and cameras inherently gather data corresponding to an environment.); and generate, from an implicit surface model, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data (See page 3 column 2 paragraph 1, Fig. 3, and page 4 column 1 paragraph 2, the method uses the measurements to generate an implicit surface model, specifically one based on a signed distance function. See page 1 column 1 paragraph 3-column 2 paragraph 1, the semantic mapped 3D models are used to extract semantic HD maps for automated driving vehicles. See page 3 column 2 paragraph 1-page 5 column 1 paragraph 1, the sensor data is used to create the implicit surface representation of the environment, perform texture mapping, and perform semantic mapping, i.e. labeling. These are all characteristics corresponding to the sensor data and therefore the environment.); and
execute a task, the task comprising at least one of a perception, localization, or planning task (See page 1 column 1 paragraph 3, the 3D environment representation is used for localization, i.e. vehicle localization, and trajectory planning, i.e. a route planning operation.).
Hu does not explicitly disclose generate, from a first neural implicit surface network.
Park, in the same field of endeavor and solving a related problem, renders obvious generate, from a first neural implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model based on a signed distance function disclosed by Hu to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
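For context on the kind of network Park describes, the following is a minimal illustrative sketch of a coordinate MLP that maps a 3D query point to a predicted signed distance. The class name, layer sizes, and framework choice (PyTorch) are assumptions for illustration only and are not drawn from Park.

```python
# Illustrative sketch only: a coordinate MLP of the general kind Park describes,
# mapping a 3D point to a predicted signed distance. Names and sizes are assumed.
import torch
import torch.nn as nn

class SDFNetwork(nn.Module):
    def __init__(self, hidden_dim: int = 256, num_layers: int = 4):
        super().__init__()
        layers = []
        in_dim = 3  # (x, y, z) query point
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, hidden_dim), nn.ReLU()]
            in_dim = hidden_dim
        layers.append(nn.Linear(hidden_dim, 1))  # scalar signed distance
        self.mlp = nn.Sequential(*layers)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (N, 3) -> (N, 1) signed distance; the zero level set is the implicit surface
        return self.mlp(points)
```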
Hu combined with Park does not explicitly disclose the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone or based on the first HD map.
Park2, in the same field of endeavor and solving a related problem, discloses based on the first HD map (See [0006], the method converts 3D feature point information from HD map data to a 2D format, i.e. generates a two-dimensional representation, and compares the converted data to 2D feature point information captured by an image capturing device in order to perform localization. This is using the HD map for localization.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu and Park to include converting 3D model information to 2D in order to perform localization based on acquired 2D sensor data of Park2. One of ordinary skill in the art would have been motivated to make this modification in order to improve accuracy of localization, as suggested by Park2 at [0003]-[0004].
Hu combined with Park and Park2 does not explicitly disclose the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone.
Zhi renders obvious the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone (See Fig. 2 and page 3 column 1 paragraph 3-column 2 paragraph 3, the system comprises several heads, specifically volume density, semantic labeling, and RGB output heads, attached to a backbone, i.e. the shared common layers of the network. See page 2 column 2 paragraph 1-4, the system uses neural radiance fields (NeRF) implemented on a multi-layer perceptron, i.e. a neural network based implicit representation of the geometry of the scene. It would be obvious to try combining a similar architecture comprising a backbone and plurality of output heads with a neural implicit surface network.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu, Park, and Park2 to include the multi-headed network of Zhi. One of ordinary skill in the art would have been motivated to make this modification because joint learning of the geometry and semantics can improve labeling, as suggested by Zhi at page 4 column 2 paragraph 1.
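For context on the claimed backbone-plus-heads arrangement as mapped to Zhi, the following is a minimal illustrative sketch of a shared backbone with separate output heads for geometry, semantics, and color. All names, dimensions, and the particular head set are assumptions for illustration, not Zhi's actual architecture.

```python
# Illustrative sketch only: a shared backbone with several output heads appended to it.
import torch
import torch.nn as nn

class MultiHeadImplicitNetwork(nn.Module):
    def __init__(self, hidden_dim: int = 256, num_classes: int = 20):
        super().__init__()
        # Shared "backbone": common layers producing a feature for each 3D query point.
        self.backbone = nn.Sequential(
            nn.Linear(3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Output heads appended to the backbone.
        self.sdf_head = nn.Linear(hidden_dim, 1)                  # geometry (signed distance)
        self.semantic_head = nn.Linear(hidden_dim, num_classes)   # per-class semantic logits
        self.color_head = nn.Linear(hidden_dim, 3)                # RGB

    def forward(self, points: torch.Tensor) -> dict:
        features = self.backbone(points)
        return {
            "sdf": self.sdf_head(features),
            "semantics": self.semantic_head(features),
            "rgb": torch.sigmoid(self.color_head(features)),
        }
```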
Hu combined with Park, Park2, and Zhi does not explicitly disclose frames.
Yokota renders obvious frames (See Abstract, the invention synchronizes data gathered by vehicle sensors. See [0060], the invention can synchronize vehicle camera and LIDAR data.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu, Park, Park2, and Zhi to include synchronizing the sensor data of Yokota. One of ordinary skill in the art would have been motivated to make this modification so that data gathered from multiple sensors, including sensors used for localization and vehicle pose estimation, can have accurate relative timestamps for processing, leading to more accurate results when the data are processed together, as suggested by Yokota at [0010] and [0016]-[0019].
Regarding claim 2, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1. Hu further discloses wherein the one or more characteristics comprise geometry information, appearance information, and semantic information corresponding to the first environment (See page 3 column 2 paragraph 1-page 5 column 1 paragraph 1, the sensor data is used to create the implicit surface representation of the environment, which is geometry information and appearance information, perform texture mapping, which is appearance information, and perform semantic mapping, i.e. labeling, which is semantic information.).
Regarding claim 3, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1. Hu further discloses determine, based on the first sensor data, the one or more characteristics (See page 3 column 2 paragraph 1-page 5 column 1 paragraph 1, the sensor data is used to create the implicit surface representation of the environment, perform texture mapping, and perform semantic mapping. The implicit surface representation, texture mapping, and semantic labeling are all characteristics.); encode, with the first implicit surface model, the one or more characteristics (See page 3 column 2 paragraph 1-page 4 column 1 paragraph 2, the implicit surface is encoded with voxel blocks. See page 4 column 1 paragraph 3-page 5 column 2 paragraph 1, the texture and semantic mapping are defined on faces, which are defined by the voxels. All characteristics are therefore encoded by the implicit surface model.);
generate, with the first implicit surface model, one or more predicted output modalities (See page 3 column 2 paragraph 1-page 4 column 1 paragraph 2, the implicit surface is encoded with voxel blocks. See Fig. 5, the voxel blocks are used to create the reconstruction. The blocks are necessarily outputted from the data structure storing them for this operation and they are therefore an output modality. They correspond to predicting the shape of a surface given sensor data and are therefore a predicted output modality.);
render, based on the one or more predicted output modalities, one or more two-dimensional representations of the first environment, each of the one or more two-dimensional representations comprising respective per-pixel modality information (See Fig. 1, geometric, textured, and semantic data, all of which are predicted output modalities, are outputted by the model. See page 3 column 2 paragraph 1-page 4 column 1 paragraph 2, the implicit surface is encoded with voxel blocks. See page 4 column 1 paragraph 3-page 5 column 2 paragraph 1, the texture and semantic mapping are defined on faces, which are defined by the voxels. These are images, and therefore two-dimensional and rendered. Each pixel necessarily comprises modality information because the voxels, i.e. the predicted output modality, are used to define the image and therefore each pixel.); and
determine one or more losses corresponding to the one or more two-dimensional representations (See page 4 column 1 paragraph 3-column 2 paragraph 1 and page 5 column 1 paragraph 1-column 2 paragraph 1, the texture mapping and semantic mapping are determined by minimizing one or more cost functions, i.e. losses. The resulting solutions determine in part the two-dimensional representations and therefore the losses correspond to the two-dimensional representations.).
Park, in the same field of endeavor and solving a related problem, renders obvious the first neural implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.); and adjust one or more weights of the first neural implicit surface network to reduce the one or more losses (See Figure 2, two 2D cross-sections of the signed distance function as learned by the neural network and a 2D rendering of the 3D surface corresponding to the SDF learned by the network are rendered. Each pixel of the image is determined by the learned SDF and therefore comprises modality information, i.e. information from the SDF. See page 3 column 2 paragraph 5-page 4 column 1 paragraph 4, the network is trained using a loss function to predict the signed distance function; the SDF learned by the network determines the 2D representations and therefore corresponds to them, and the network is trained by adjusting the weights of the network to minimize the loss function.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Zhi renders obvious with one or more of the plurality of output heads (See Fig. 2 and page 3 column 1 paragraph 3-column 2 paragraph 3, the system comprises several heads, specifically volume density, semantic labeling, and RGB output heads, attached to a backbone, i.e. the shared common layers of the network. See page 2 column 2 paragraph 1-4, the system uses neural radiance fields (NeRF) implemented on a multi-layer perceptron, i.e. a neural network based implicit representation of the geometry of the scene. It would be obvious to try combining a similar architecture comprising a backbone and plurality of output heads with a neural implicit surface network.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu, Park, Park2, and Yokota to include the multi-headed network of Zhi. One of ordinary skill in the art would have been motivated to make this modification because joint learning of the geometry and semantics can improve labeling, as suggested by Zhi at page 4 column 2 paragraph 1.
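For context on the claim 3 limitations of determining losses and adjusting weights, the following is a minimal illustrative sketch of one optimization step that compares a predicted output modality against supervision and updates the network weights. The function name, the choice of an L1 loss, and the assumption that the network returns a dictionary with an "sdf" entry (as in the earlier sketch) are illustrative only and not drawn from the cited references.

```python
# Illustrative sketch only: one training step that determines a loss and adjusts weights.
import torch

def training_step(network, optimizer, points, sdf_targets):
    optimizer.zero_grad()
    outputs = network(points)  # predicted output modalities (dict with an "sdf" entry assumed)
    loss = torch.nn.functional.l1_loss(outputs["sdf"], sdf_targets)  # one possible loss choice
    loss.backward()            # gradients of the loss with respect to the network weights
    optimizer.step()           # adjust the weights to reduce the loss
    return loss.item()
```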
Regarding claim 4, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 3. Park renders obvious a first predicted output modality of the one or more predicted output modalities comprises a predicted signed distance to closest surfaces of the first environment represented by the one or more characteristics (See Figure 2, two 2D cross-sections of the signed distance function as learned by the neural network and a 2D rendering of the 3D surface corresponding to the SDF learned by the network are rendered. Each pixel of the image is determined by the learned SDF and therefore comprises modality information. See page 3 column 2 paragraph 5-page 4 column 1 paragraph 4, the signed distance function is a function that gives a point's distance to the closest surface. The network is trained using a loss function to predict the signed distance function. The SDF learned by the network determines the 2D representations and therefore corresponds to them.); and the one or more two-dimensional representations comprise a signed distance field based on the predicted signed distance to the closest surfaces (See Figure 2, two 2D cross-sections of the signed distance function as learned by the neural network and a 2D rendering of the 3D surface corresponding to the SDF learned by the network are rendered. Each pixel of the image is determined by the learned SDF and therefore comprises modality information. See page 3 column 2 paragraph 5-page 4 column 1 paragraph 4, the signed distance function is a function that gives a point's distance to the closest surface. The network is trained using a loss function to predict the signed distance function. The SDF learned by the network determines the 2D representations and therefore corresponds to them.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu, Park2, Zhi, and Yokota to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Regarding claim 5, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 3. Hu further discloses wherein each of the plurality of sensors corresponding to a different perception modality (See page 5 column 2 paragraph 2, the method is tested on a dataset recorded with the authors’ experimental vehicle equipped with LIDAR and cameras. This is a plurality of sensors corresponding to different perception modalities.).
Regarding claim 8, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 3. Hu further discloses a semantic head configured to generate a first predicted output modality of the one or more predicted output modalities, the first predicted output modality comprising semantic class probabilities (See page 5 column 1 paragraph 2-column 2 paragraph 1, a loss function based on the probability that a face belongs to a given class is used to perform semantic labeling. Semantic class probabilities are therefore outputted as part of evaluating the loss function, and corresponding structure, i.e. a semantic head configured to output the predicted output modality, exists in the network. See Fig. 11-12, the semantic map is rendered in a 2D format.); and
the one or more two-dimensional representations comprise a semantic segmentation map based on the class probabilities (See page 5 column 1 paragraph 2-column 2 paragraph 1, a loss function based on the probability that a face belongs to a given class is used to perform semantic labeling. Semantic class probabilities are therefore outputted as part of evaluating the loss function, and corresponding structure, i.e. a semantic head configured to output the predicted output modality, exists in the network. See Fig. 11-12, the semantic map is rendered in a 2D format.).
Zhi renders obvious logits (See Fig. 2 and page 3 column 1 paragraph 4-column 2 paragraph 1, the network outputs semantic logits.). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu combined with Park, Park2, and Yokota to include the semantic logits of Zhi. One of ordinary skill in the art would have been motivated to make this modification in order to assign semantic labels to help mapping, as suggested by Zhi at page 3 column 1 paragraph 4.
Regarding claim 9, Hu combined with Park, Park2, Zhi and Yokota renders obvious the limitations of claim 8. Hu discloses the first sensor data comprises image data comprising a plurality of images (See page 5 column 2 paragraph 2, the method is tested on a dataset recorded with the authors’ experimental vehicle equipped with LIDAR and cameras. Camera data necessarily comprises images. Examiner asserts that the dataset contained more than one, i.e. a plurality, of images.).
Zhi renders obvious the one or more processors are configured to further cause the apparatus to receive a set of semantic labels for a subset of images of the plurality of images (See page 3 column 2 paragraph 2, the system performs supervised semantic training using ground truth labels. The system necessarily receives the semantic labels used in training.); and
to determine the one or more losses comprises to determine a multi-class cross-entropy loss between the semantic segmentation map and the set of semantic labels (See page 3 column 2 paragraph 2, the system performs supervised semantic training using ground truth labels using multi-class cross-entropy loss.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu combined with Park, Park2, and Yokota to include the supervised semantic label training of Zhi. One of ordinary skill in the art would have been motivated to make this modification in order to combine training of a 3D representation of an object with semantic labeling to improve accuracy on sparse data, as suggested by Zhi at page 2 column 2 paragraph 4.
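For context on the claim 9 limitation as mapped to Zhi, the following is a minimal illustrative sketch of a multi-class cross-entropy loss between per-pixel semantic logits and ground-truth labels. The shapes and names are assumptions and the snippet is not drawn from Zhi.

```python
# Illustrative sketch only: multi-class cross-entropy between a predicted semantic
# segmentation map (as logits) and a set of ground-truth semantic labels.
import torch
import torch.nn.functional as F

def semantic_loss(pred_logits: torch.Tensor, gt_labels: torch.Tensor) -> torch.Tensor:
    # pred_logits: (N, num_classes) per-pixel class logits from the semantic head
    # gt_labels:   (N,) integer class indices from the received semantic labels
    return F.cross_entropy(pred_logits, gt_labels)
```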
Regarding claim 10, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 3. Hu further discloses the one or more processors are configured to further cause the apparatus to obtain a viewing direction with respect to the first environment (See Figs. 1-2, 4, and 7-13, the method renders a 2D reconstruction of the environment, i.e. an image. This necessarily comprises obtaining a viewing direction.);
the first implicit surface model comprises a color head configured to generate a first predicted output modality of the one or more predicted output modalities, the first predicted output modality comprising per-pixel color values corresponding to the viewing direction (See page 4 column 1 paragraph 1-page 5 column 1 paragraph 1, the method performs texture mapping, i.e. prediction of what texture corresponds to a given location. The texture defines the color values. Exactly one color value exists for each pixel in the corresponding 2D rendering, i.e. the predicted output is per-pixel color values corresponding to the viewing direction. The section of code that outputs the texture for each face is therefore a color head configured to generate the color values. The texture selection and color adjustments further take place by minimizing the parameters of several cost functions, meaning the parameters of the model are analogous to the weights of a neural network trained to accomplish a similar task.), and
the one or more two-dimensional representations comprise a color map based on the per-pixel color values (See Figs. 1-2, 4, and 7-13, the method renders a 2D reconstruction of the environment, i.e. an image. The images map each pixel location to a color value, i.e. are color maps.).
Park renders obvious neural implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model based on a signed distance function disclosed by Hu, Park2, Zhi, and Yokota to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Regarding claim 11, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 10. Hu discloses wherein to determine the one or more losses comprises to determine a photometric loss between a ground truth image and the color map (See page 4 column 1 paragraph 3-page 4 column 2 paragraph 2, faces of the implicit surface representation are textured with patches from the gathered images. The best images are selected by minimizing equation 5. The first term of equation 5, i.e. equation 6, integrates the gradient magnitude of an image patch over all pixels in the region obtained by projecting the face onto the image. Because this is a loss function involving images, it is a photometric loss. Because it compares the image patch to be selected by the model to a projection of the model’s representation of the corresponding area into the real image, i.e. the ground truth, it is a loss between the model’s color map and a ground truth image.).
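For context on the claim 11 photometric-loss limitation, the following is a minimal illustrative sketch of a generic per-pixel photometric comparison between a rendered color map and a ground-truth image. It is not a reconstruction of Hu's equations 5-6; the names and the mean-squared-error choice are assumptions.

```python
# Illustrative sketch only: a simple photometric loss between a rendered color map
# and a ground-truth image.
import torch

def photometric_loss(rendered_rgb: torch.Tensor, gt_rgb: torch.Tensor) -> torch.Tensor:
    # Both tensors: (H, W, 3) color values in [0, 1]
    return torch.mean((rendered_rgb - gt_rgb) ** 2)
```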
Regarding claim 12, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1. Hu discloses wherein the one or more processors are configured to further cause the apparatus to:
receive second sensor data comprising a second plurality of frames corresponding to a second environment, wherein the second sensor data is generated from a plurality of sensors (See page 5 column 2 paragraph 2, the method is used on several datasets, two of which are gathered from real world data, i.e. the method is used on second sensor data. See Abstract, the method uses LIDAR and camera data, i.e. both datasets are gathered from a plurality of sensors. This also indicates that the second data set comprises a second plurality of frames, i.e. images from the camera. Images from a camera inherently correspond to an environment. See Figs. 7 and 8, the two datasets correspond to different environments, as can be seen in the corresponding reconstructions.);
and generate from a second implicit surface model, a second HD map comprising second labels created from one or more second characteristics corresponding to the second environment determined based on the second sensor data (See page 3 column 2 paragraph 1, Fig. 3, and page 4 column 1 paragraph 2, the method uses the measurements to generate an implicit surface model, specifically one based on a signed distance function. See page 1 column 1 paragraph 3-column 2 paragraph 1, the semantic mapped 3D models are used to extract semantic HD maps for automated driving vehicles. See page 3 column 2 paragraph 1-page 5 column 1 paragraph 1, the sensor data is used to create the implicit surface representation of the environment, perform texture mapping, and perform semantic mapping, i.e. labeling. These are all characteristics corresponding to the sensor data and therefore the environment.).
Park renders obvious implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model based on a signed distance function disclosed by Hu, Park2, Zhi, and Yokota to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Hu renders obvious from the plurality of sensors (See page 5 column 2 paragraph 2, the method is used on several datasets of sensor data. It would be obvious to use the same set of sensors to gather multiple sets of data from multiple locations of interest.). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu, Park, Park2, Zhi, and Yokota to include gathering sensor data from multiple locations of interest, as suggested by Hu. One of ordinary skill in the art would have been motivated to make this modification in order to gather sensor data from multiple locations of interest without purchasing a new set of sensors for each location, as suggested by Hu at page 5 column 2 paragraph 2.
Regarding claim 15, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1.
Yokota, in the same field of endeavor and solving a related problem, renders obvious wherein the first sensor data is pairwise aligned and synchronized (See Abstract, the invention synchronizes data gathered by vehicle sensors. See [0060], the invention can synchronize vehicle camera and LIDAR data.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2 and Zhi to include synchronizing the sensor data of Yokota. One of ordinary skill in the art would have been motivated to make this modification so that data gathered from multiple sensors, including sensors used for localization and vehicle pose estimation, can have accurate relative timestamps for processing, leading to more accurate results when the data are processed together, as suggested by Yokota at [0010] and [0016]-[0019].
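For context on the claim 15 limitation of pairwise aligned and synchronized sensor data, the following is a minimal illustrative sketch that pairs each camera frame with the nearest-in-time LIDAR sweep by timestamp. This is one generic way to synchronize multi-sensor data, not Yokota's disclosed method; the function name and tolerance are assumptions.

```python
# Illustrative sketch only: pairwise association of camera frames and LIDAR sweeps
# by nearest timestamp, within a tolerance.
def synchronize(camera_stamps, lidar_stamps, max_offset=0.05):
    pairs = []
    for i, t_cam in enumerate(camera_stamps):
        # index of the LIDAR sweep closest in time to this camera frame
        j = min(range(len(lidar_stamps)), key=lambda k: abs(lidar_stamps[k] - t_cam))
        if abs(lidar_stamps[j] - t_cam) <= max_offset:
            pairs.append((i, j))
    return pairs
```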
Claims 6-7 are rejected under 35 U.S.C. 103 as being obvious over Hu, Park, Park2, Zhi, and Yokota in view of NPL document “Vox-Surf: Voxel-based Implicit Surface Representation”, hereinafter “Li”.
Regarding claim 6, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 3. Hu combined with Park, Park2, Zhi, and Yokota does not explicitly disclose wherein the one or more two-dimensional representations comprise a depth map. Li, in the same field of endeavor and solving a related problem, discloses wherein the one or more two-dimensional representations comprise a depth map (See Abstract, the paper is directed toward the creation of implicit surface representation of objects. See page 3 column 2 paragraph 2, the method uses a network to map a point to its signed distance value, i.e. the network is a signed distance network. See page 6 column 1 paragraphs 5-7, training the network comprises use of a depth loss based on corresponding depth data, i.e. a depth map, to supervise training of the network. The comparison of the processed output of the SDF based on the provided depth data indicates that the network is implicitly defining a depth map, which it uses for the supervised depth loss in training. See Fig. 10, the depth map is rendered in two-dimensional form.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu combined with Park, Park2, Zhi, and Yokota to include the use of depth supervision in training of Li. One of ordinary skill in the art would have been motivated to make this modification in order to obtain more accurate surface reconstruction from common data sources such as LIDAR, as suggested by Li at page 6 column 1 paragraph 5.
Regarding claim 7, Hu combined with Park, Park2, Zhi, Yokota, and Li renders obvious the limitations of claim 6. Li further discloses wherein to determine the one or more losses comprises to determine a geometric loss between the first sensor data and the depth map (See page 6 column 1 paragraphs 5-7, training the network comprises use of a depth loss based on corresponding depth data, i.e. a depth map, to supervise training of the network. The comparison of the processed output of the SDF based on the provided depth data indicates that the network is implicitly defining a depth map, which it uses for the supervised depth loss in training. See Fig. 10, the depth map is rendered in two-dimensional form. Further, the depth can come from depth sensors, i.e. first sensors. The depth loss therefore compares first sensor data and the depth map created by the network. See page 3 column 2 paragraph 2, the depth loss uses the signed distance value of points, which is geometric information. The depth loss is therefore a geometric loss.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu combined with Park, Park2, Zhi, and Yokota to include the use of depth supervision in training of Li. One of ordinary skill in the art would have been motivated to make this modification in order to obtain more accurate surface reconstruction from common data sources such as LIDAR, as suggested by Li at page 6 column 1 paragraph 5.
Claim 13 is rejected under 35 U.S.C. 103 as being obvious over Hu, Park, Park2, Zhi, and Yokota in view of US 20190244517 A1, hereinafter “Moustafa”.
Regarding claim 13, Hu combined with Park, Park2, Zhi, and Yokota, renders obvious the limitations of claim 12. Hu combined with Park, Park2, Zhi, and Yokota does not explicitly disclose wherein the one or more processors are configured to further cause the apparatus to stitch a plurality of high definition maps comprising at least the first HD map and the second HD map to generate a global high-definition map. Moustafa, in the same field of endeavor and solving a related problem, discloses wherein the one or more processors are configured to further cause the apparatus to stitch a plurality of high definition maps comprising at least the first HD map and the second HD map to generate a global high-definition map (See [0012], the invention is directed towards the creation of HD maps. See [0026], the FPGA, i.e. a processor, on the vehicle combines sensor data and a pre-existing map tile to create a modified map tile. This is the creation of a new map tile. A map tile is a subset of an HD map and therefore an HD map itself. See [0013], the map tiles are updated by several vehicles in a crowdsourced manner. See [0020]-[0021], a remote entity sends several tiles to a vehicle corresponding to the vehicle’s planned path. See [0001], the maps are navigation maps for autonomous vehicle control, i.e. they are used to control navigation of the vehicle. Examiner asserts that the use of these tiles for navigation comprises stitching the maps together to generate a global map corresponding to the route.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2, Zhi, and Yokota to include sending maps in tile format to the vehicle and stitching them together as necessary for navigation, as suggested by Moustafa. One of ordinary skill in the art would have been motivated to make this modification in order to reduce the resources needed to store and send the map information, as suggested by Moustafa at [0003].
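For context on the claim 13 limitation of stitching HD maps into a global high-definition map, the following is a minimal illustrative sketch that merges locally generated map tiles into a single global map keyed by tile coordinates. The tile structure and names are assumptions and the snippet is not drawn from Moustafa.

```python
# Illustrative sketch only: stitching locally generated HD-map tiles into one global map.
def stitch_tiles(tiles):
    # tiles: iterable of (tile_x, tile_y, tile_data) for adjacent map regions
    global_map = {}
    for tile_x, tile_y, tile_data in tiles:
        global_map[(tile_x, tile_y)] = tile_data  # later tiles overwrite stale ones
    return global_map
```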
Claim 14 is rejected under 35 U.S.C. 103 as being obvious over Hu, Park, Park2, Zhi, and Yokota, in view of US 20200167956 A1, hereinafter “Herman”.
Regarding claim 14, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1. Hu combined with Park, Park2, Zhi, and Yokota does not explicitly disclose wherein the one or more processors are configured to further cause the apparatus to mask out dynamic objects from the first sensor data prior to generating the first HD map.
Herman, in the same field of endeavor and solving a related problem, renders obvious wherein the one or more processors are configured to further cause the apparatus to mask out dynamic objects from the first sensor data prior to generating the first HD map (See [0029]-[0030], the system uses video, i.e. image, and 3D point cloud data, i.e. first sensor data, to identify features in the image data and perform localization. The system filters time-varying, i.e. dynamic, objects from the data before further processing.). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2, Zhi, and Yokota to include filtering moving objects from the sensor data of Herman. One of ordinary skill in the art would have been motivated to make this modification so that only temporally stable features, which are those most relevant to mapping, are used to generate the HD maps, as suggested by Herman at [0049]-[0051].
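For context on the claim 14 limitation of masking out dynamic objects before map generation, the following is a minimal illustrative sketch that removes sensor points falling inside detected dynamic-object boxes. The box format and names are assumptions and the snippet is not drawn from Herman.

```python
# Illustrative sketch only: drop points that fall inside dynamic-object bounding boxes.
import numpy as np

def mask_dynamic_points(points: np.ndarray, dynamic_boxes) -> np.ndarray:
    # points: (N, 3); dynamic_boxes: list of (xmin, ymin, zmin, xmax, ymax, zmax)
    keep = np.ones(len(points), dtype=bool)
    for xmin, ymin, zmin, xmax, ymax, zmax in dynamic_boxes:
        inside = ((points[:, 0] >= xmin) & (points[:, 0] <= xmax) &
                  (points[:, 1] >= ymin) & (points[:, 1] <= ymax) &
                  (points[:, 2] >= zmin) & (points[:, 2] <= zmax))
        keep &= ~inside
    return points[keep]
```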
Claims 16-18 are rejected under 35 U.S.C. 103 as being obvious over Hu combined with Park, Park2, Zhi, and Yokota, in view of Li and US 20220358319 A1, hereinafter “Lee”.
Regarding claim 16, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1. Park discloses extract a mesh model based on the first neural implicit surface network (See page 3 column 2 paragraph 5, a mesh of the implicit surface is obtained with the Marching Cubes algorithm.). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park2, Zhi, and Yokota to include extracting a mesh from Park. One of ordinary skill in the art would have been motivated to make this modification so that the implicit surface could be viewed, as suggested by Park at page 3 column 2 paragraph 5.
Hu combined with Park does not explicitly disclose select a first label from the labels, wherein the first label is associated with a bounding box corresponding to an object in the first environment; generate a candidate bounding box defined by a sub-mesh of the mesh model, wherein coordinates of the sub-mesh indicate the candidate bounding box and the candidate bounding box predicts a location and size of the object in the first environment; calculate a 3D intersection-over-union (IoU) value for the bounding box and the candidate bounding box; or generate an indication based on the 3D IoU value being less than a threshold value.
Li, in the same field of endeavor and solving a related problem, discloses select a first label from the labels, wherein the first label is associated with a bounding box corresponding to an object in the first environment (See [0011], the system acquires a target image corresponding to an environment. See Fig. 2S and [0052], the system creates a bounding box corresponding to an object in the target image data, in this case a window, and assigns a bounding box corresponding to the location of the window and an object type tag identifying the type of the object, i.e. a label.);
generate a candidate bounding box defined by a sub-mesh of the mesh model, wherein coordinates of the sub-mesh indicate the candidate bounding box and the candidate bounding box predicts a location and size of the object in the first environment (See [0063], the system creates a new reprojected image based on the estimate of the room’s pose, which can also comprise projections of the object bounding boxes, i.e. a candidate bounding box defined by a sub-mesh of the mesh model, with the coordinates of the object in the mesh and the bounding box corresponding to the estimated, i.e. predicted, location and size of the object in the environment.);
calculate a 3D intersection-over-union (IoU) value for the bounding box and the candidate bounding box (See [0016], in order to estimate the room shape, the system projects object bounding boxes into the target image space, i.e. creates candidate bounding boxes, and compares the reprojected bounding boxes to the original bounding boxes using intersection-over-union distance. See [0064], the system outputs a confidence score for the estimated room pose based on the IoU values.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2, Zhi, and Yokota to include calculating a bounding box in the sensor data and meshed image space, then comparing their locations, of Li. One of ordinary skill in the art would have been motivated to make this modification to obtain an estimate of the accuracy of the resulting mesh and object locations corresponding to the bounding boxes, as suggested by Li at [0046].
Hu combined with Park and Li does not explicitly disclose generate an indication based on the 3D IoU value being less than a threshold value. Lee, in the same field of endeavor and solving a related problem, renders obvious generate an indication based on the 3D IoU value being less than a threshold value (See [0005], the invention is directed toward identifying text areas within an image. A first text area is identified. The first text area is used as input to a convolutional neural network, which identifies a second text area. The intersection-over-union between the two areas is computed. See [0061], if the intersection-over-union is below a threshold value, the second input area is inputted into the neural network again in an attempt to find a more accurate set of coordinates corresponding to the text area. An indication is necessarily generated in order for the computer to execute the corresponding section of the program.). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model, including creation of a mesh and projection of bounding boxes, disclosed by Hu combined with Park, Park2, Zhi, Yokota, and Li to include comparing the first and second sets of bounding boxes to a threshold of Lee. One of ordinary skill in the art would have been motivated to make this modification in order to reprocess the image locations if the intersection-over-union values do not indicate sufficient accuracy, as suggested by Lee at [0061].
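For context on the claim 16 limitations, the following is a minimal illustrative sketch of a 3D intersection-over-union between an annotated bounding box and a candidate bounding box (axis-aligned for simplicity), with an indication generated when the IoU falls below a threshold. The box format, threshold value, and names are assumptions and the snippet is not drawn from the cited references.

```python
# Illustrative sketch only: 3D IoU between two axis-aligned boxes, plus a threshold check.
def iou_3d(box_a, box_b):
    # boxes: (xmin, ymin, zmin, xmax, ymax, zmax)
    overlap = 1.0
    for d in range(3):
        lo = max(box_a[d], box_b[d])
        hi = min(box_a[d + 3], box_b[d + 3])
        overlap *= max(0.0, hi - lo)  # extent of the intersection along this axis
    vol = lambda b: (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])
    union = vol(box_a) + vol(box_b) - overlap
    return overlap / union if union > 0 else 0.0

def check_box(gt_box, candidate_box, threshold=0.5):
    value = iou_3d(gt_box, candidate_box)
    if value < threshold:
        return {"flag": "low_iou", "iou": value}  # indication when IoU is below the threshold
    return {"flag": "ok", "iou": value}
```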
Regarding claim 17, Hu combined with Park, Park2, Zhi, Yokota, Li, and Lee renders obvious the limitations of claim 16. Hu renders obvious wherein the indication indicates an uncertainty regarding an accuracy of the first label within the first HD map for the object (See page 5 column 1 paragraph 2-column 2 paragraph 1, an estimated probability that each face within the 3D model belongs to a given semantic class is calculated in order to assign each face to its estimated semantic class. The calculation is also an indication indicating uncertainty regarding accuracy of the labels. See page 1 column 1 paragraph 3-column 2 paragraph 1, the semantic map can be used to create an HD map.).
Regarding claim 18, Hu combined with Park, Park2, Zhi, Yokota, Li, and Lee renders obvious the limitations of claim 16. Hu renders obvious the indication comprises a location of the object in the first environment (See Figs. 11-13, 2D renderings of the semantically mapped reconstructions are created. The relative location of each face in the environment was necessarily computed and provided, i.e. indicated, to the imaging routine in order to create the image.).
Li renders obvious the one or more processors are configured to further cause the apparatus to collect additional sensor data corresponding to the location of the object (See [0032], the image data used for estimating room pose can be video frame data. Examiner asserts that video data has sufficient temporal resolution, i.e. frame rate, that additional sensor data was gathered in the location of any identified object. The video is gathered on a mobile device, which comprises a processor. The processor of the acquisition system was necessarily configured to cause this functionality.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2, Zhi, Yokota, and Li to include the use of video cameras to gather data of Lee. One of ordinary skill in the art would have been motivated to make this modification in order to allow for objects to be identified across multiple images, allowing for greater accuracy in environment pose estimation, as suggested by Lee at [0033].
Claims 19-20 are rejected under 35 U.S.C. 103 as being obvious over Hu combined with Park, Park2, Zhi, and Yokota, in view of Li and NPL document “3D Object Detection and Instance Segmentation from 3D Range and 2D Color Images”, hereinafter “Shen”.
Regarding claim 19, Hu combined with Park, Park2, Zhi, and Yokota renders obvious the limitations of claim 1. Hu combined with Park, Park2, Zhi, and Yokota does not explicitly disclose select a first label from the labels, wherein the first label is associated with a bounding box and an annotation corresponding to an object in the first environment;
determine an amount of space occupied within the bounding box; determine that the amount of space is less than a predetermined threshold, the predetermined threshold corresponding to the annotation of the object; or
generate an indication based on the amount of space being less than the predetermined threshold.
Li, in the same field of endeavor and solving a related problem, discloses select a first label from the labels, wherein the first label is associated with a bounding box and an annotation corresponding to an object in the first environment (See [0011], the system acquires a target image corresponding to an environment. See Fig. 2S and [0052], the system creates a bounding box corresponding to the location of an object in the target image data, in this case a window, and assigns an object type tag identifying the type of the object, i.e. a label. The label is also an annotation.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2, Zhi, and Yokota to include calculating a bounding box in the sensor data and labeling the object of Li. One of ordinary skill in the art would have been motivated to make this modification to allow for an estimate of the accuracy of object locations corresponding to the bounding boxes between multiple representations of the same environment, as suggested by Li at [0046].
Hu combined with Park, Park2, Zhi, Yokota, and Li does not explicitly disclose determine an amount of space occupied within the bounding box; determine that the amount of space is less than a predetermined threshold, the predetermined threshold corresponding to the annotation of the object; or generate an indication based on the amount of space being less than the predetermined threshold.
Shen, in the same field of endeavor and solving a related problem, renders obvious determine an amount of space occupied within the bounding box (See page 17 paragraph 2-page 18 paragraph 1, the method determines the size of the bounding box, which is also the amount of space that it occupies.);
determine that the amount of space is less than a predetermined threshold, the predetermined threshold corresponding to the annotation of the object (See Table 1 and page 10 paragraph 2-page 11 paragraph 2, the method has several object type labels, i.e. annotations, e.g. toilet, chair, nightstand, etc. The method assigns each object type label to a size category with maximum or minimum sizes in the x and y dimension, which is the amount of space occupied in each direction by the object. If the bounding box for an object detected is less than the threshold in a given dimension for a specific group of object type labels, the object corresponding to the bounding box is assigned to a different group of object type labels. The thresholds are predetermined.); and
generate an indication based on the amount of space being less than the predetermined threshold (See Table 1 and page 10 paragraph 2-page 11 paragraph 2, depending on which size group the object associated with the bounding box is assigned to, a specific neural network is used for determination of the specific object type, i.e. the annotation. An indication is necessarily created in order for the program to use the appropriate network.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic HD maps from sensor data using an implicit surface model, including bounding boxes associated with specific objects, disclosed by Hu combined with Park, Park2, Zhi, Yokota, and Li to include calculating the dimensions of the bounding box and using this in the process of determining the object type and label of Shen. One of ordinary skill in the art would have been motivated to make this modification to allow for more accurate categorization of the objects by including their estimated sizes in the labeling process, as suggested by Shen at page 10 paragraph 2-page 11 paragraph 2.
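Purely to illustrate the size-based routing that Shen is mapped to above, the short Python sketch below measures a bounding box, compares it against a per-annotation minimum size, and generates an indication when the box occupies less space than the threshold. The class names, threshold values, and function names are assumptions for this sketch and are not taken from Shen.

# Illustrative sketch only; thresholds and class names are assumed,
# not taken from Shen. A bounding box is (xmin, ymin, xmax, ymax).
SIZE_THRESHOLDS = {          # assumed per-annotation minimum extents (meters)
    "chair": (0.3, 0.3),
    "nightstand": (0.3, 0.3),
    "bed": (1.2, 1.8),
}

def box_extent(box):
    return (box[2] - box[0], box[3] - box[1])

def check_box_size(annotation, box):
    min_x, min_y = SIZE_THRESHOLDS[annotation]
    w, h = box_extent(box)
    if w < min_x or h < min_y:
        # Indication that the occupied space is below the threshold for this
        # annotation; e.g. route the detection to a different classifier
        # group or flag the label for review.
        return {"indication": "below_size_threshold", "annotation": annotation}
    return {"indication": None, "annotation": annotation}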
Regarding claim 20, Hu combined with Park, Park2, Zhi, Yokota, Li, and Shen renders obvious the limitations of claim 19. Hu renders obvious the indication comprises a location of the object in the first environment (See Figs. 11-13, 2D renderings of the semantically mapped reconstructions are created. The relative location of each face in the environment was necessarily computed and provided, i.e. indicated, to the imaging routine in order to create the image.); and
the one or more processors are configured to further cause the apparatus to collect additional sensor data corresponding to the location of the object (See [0032], the image data used for estimating room pose can be video frame data. Examiner asserts that video data has sufficient temporal resolution, i.e. frame rate, that additional sensor data was gathered in the location of any identified object. The video is gathered on a mobile device, which comprises a processor. The processor of the acquisition system was necessarily configured to cause this functionality.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu combined with Park, Park2, Zhi, Yokota, and Shen to include the use of video cameras to gather data of Li. One of ordinary skill in the art would have been motivated to make this modification in order to allow for objects to be identified across multiple images, allowing for greater accuracy in environment pose estimation, as suggested by Li at [0033].
Claims 21-24 and 28 are rejected under 35 U.S.C. 103 as being obvious over Hu in view of Moustafa, Park, and Zhi.
Regarding claim 21, Hu discloses An apparatus (See page 5 column 2 paragraph 2, the method is evaluated on several data sets. This indicates that the method was executed on a computer, which is an apparatus.), comprising:
one or more memories (See page 5 column 2 paragraph 2, the method is evaluated on several data sets. This indicates that the method was executed on a computer. Computers inherently comprise a memory.);
and one or more processors, coupled to the one or more memories (See page 5 column 2 paragraph 2, the method is evaluated on several data sets. This indicates that the method was executed on a computer. Computers inherently comprise processors coupled to memories, configured to execute their programming.), configured to cause the apparatus to:
determine a location of a simulated vehicle (See page 7 column 1 paragraph 3-page 8 column 1 paragraph 1, the models can be used as environments in a simulation. Examiner asserts that simulation of a vehicle in an environment comprises determining the location of the simulated vehicle.);
select a first implicit surface model from a plurality of implicit surface models respectively trained to represent a plurality of sub-environments of a global environment, wherein the first implicit surface model is trained to represent a first sub-environment of the plurality of sub-environments, the first sub-environment corresponding to the location of the vehicle (See page 5 column 2 paragraph 2, the system is evaluated on a plurality of datasets. See Figs. 7-8, the reconstructions from the datasets correspond to different locations, i.e. environments, indicating that the datasets correspond to different locations. The trained implicit function models therefore are trained to represent different environments. The set of trained models inherently represent sub-environments of a global environment, namely the union of the real and simulated environments the datasets correspond to. See page 7 column 1 paragraph 3-page 8 column 1 paragraph 1, the models can be used as environments in a simulation. Examiner asserts that simulating a vehicle comprises using the model corresponding to the environment the vehicle is being simulated in.);
generate, from one or more output heads of the implicit surface model, one or more output modalities (See Figs. 7-8, 3D reconstructions are generated from the model. Output from the model inherently comes from the segment of software corresponding to that kind of output, i.e. the head of the model ); and
render, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment (See Figs. 7-8, 2D images of 3D reconstructions are rendered based on the model’s geometric output modality. They correspond to the environment the model represents.).
Hu does not explicitly disclose determine a location of a vehicle based on location data from one or more position sensors of the vehicle or neural implicit surface network.
Moustafa, in the same field of endeavor and solving a related problem, discloses determine a location of a vehicle based on location data from one or more position sensors of the vehicle (See [0002]-[0003], the vehicle receives electronic map tiles corresponding to a given location and uses them for navigation. See [0014], the vehicle receives multiple tiles corresponding to their planned route at once. Examiner asserts that using multiple tiles for navigation requires localizing the vehicle, i.e. determining the location, which is necessarily done with sensor data. Sensor data used for localization is location data.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for using implicit surface models of given environments and rendering a representation based on simulated vehicle locations of Hu to include determining the location of a real vehicle using sensor data and using the location to select the corresponding model of Moustafa. One of ordinary skill in the art would have been motivated to make this modification in order to allow use of the neural surface representations for navigation or awareness of the surroundings based on the vehicle’s location, as suggested by Moustafa at [0002].
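For illustration of the selection step discussed above (choosing, based on a determined vehicle location, the implicit surface model trained for the corresponding sub-environment), a minimal Python sketch follows. The square-tile layout, tile size, and registry structure are assumptions for this sketch and are not taken from Hu or Moustafa.

# Illustrative sketch only: select a per-tile implicit surface model from a
# vehicle position. The tile size and the registry keyed by tile index are
# assumptions for illustration.
TILE_SIZE_M = 100.0   # assumed side length of each square sub-environment

def tile_index(x, y):
    return (int(x // TILE_SIZE_M), int(y // TILE_SIZE_M))

class ModelRegistry:
    def __init__(self):
        self.models = {}  # maps tile index -> trained implicit surface model

    def register(self, index, model):
        self.models[index] = model

    def select(self, x, y):
        # Choose the model trained for the sub-environment containing (x, y).
        return self.models[tile_index(x, y)]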
Hu combined with Moustafa does not explicitly disclose neural implicit surface network or one or more output heads of a plurality of output heads.
Park, in the same field of endeavor and solving a related problem, renders obvious neural implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from implicit surface models based on vehicle location of Hu and Moustafa to include use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Hu combined with Moustafa and Park does not explicitly disclose one or more output heads of a plurality of output heads.
Zhi renders obvious one or more output heads of a plurality of output heads (See Fig. 2 and page 3 column 1 paragraph 3-column 2 paragraph 3, the system comprises several heads, specifically volume density, semantic labeling, and RGB output heads, attached to a backbone, i.e. the shared common layers of the network. See page 2 column 2 paragraph 1-4, the system uses neural radiance fields (NeRF) implemented on a multi-layer perceptron, i.e. a neural network based implicit representation of the geometry of the scene. It would be obvious to try combining a similar architecture comprising a backbone and a plurality of output heads with a neural implicit surface network.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu, Moustafa, and Park to include the multi-headed network of Zhi. One of ordinary skill in the art would have been motivated to make this modification because joint learning of the geometry and semantics can improve labeling, as suggested by Zhi at page 4 column 2 paragraph 1.
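As a rough illustration of the backbone-plus-output-heads pattern discussed above, a minimal PyTorch-style Python sketch is provided below. The layer widths, depth, activation functions, and the particular set of heads are assumptions for this sketch only and do not describe the architecture of Zhi, Park, or any other cited reference.

import torch
import torch.nn as nn

class MultiHeadImplicitNetwork(nn.Module):
    # Illustrative only: a shared MLP backbone with separate output heads.
    # Widths, depths, and the head set are assumed for this sketch.
    def __init__(self, in_dim=3, hidden=256, num_classes=20):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sdf_head = nn.Linear(hidden, 1)                  # signed distance
        self.semantic_head = nn.Linear(hidden, num_classes)   # semantic logits
        self.color_head = nn.Linear(hidden, 3)                # RGB

    def forward(self, xyz):
        feat = self.backbone(xyz)            # shared features from the backbone
        return {
            "sdf": self.sdf_head(feat),
            "semantic_logits": self.semantic_head(feat),
            "rgb": torch.sigmoid(self.color_head(feat)),
        }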
Regarding claim 22, Hu combined with Moustafa, Park, and Zhi renders obvious the limitations of claim 21. Hu further discloses the one or more output heads comprise a signed distance head configured to generate a predicted signed distance to closest surfaces of the first sub-environment (See Fig. 3, a two-dimensional representation based on the truncated signed distance function, which is a predicted signed distance to the closest surface in the environment, is computed.); and the one or more two-dimensional representations comprise a signed distance field based on the predicted signed distance to the closest surfaces of the first sub-environment (See Fig. 3, a two-dimensional representation based on the truncated signed distance function, which is a signed distance field based on the predicted signed distance to the closest surface, is computed and rendered.).
Regarding claim 23, Hu combined with Moustafa, Park, and Zhi renders obvious the limitations of claim 21. Hu further discloses the one or more output heads comprise a semantic head configured to generate semantic class probabilities (See page 5 column 1 paragraph 2-column 2 paragraph 1, a loss function based on the probability that a face belongs to a given class is used to perform semantic labeling. Semantic class probabilities are therefore outputted as part of evaluating the loss function, and corresponding structure, i.e. a semantic head configured to output the predicted output modality, exists in the network. See Figs. 11-12, the semantic map is rendered in a 2D format.); and
one or more two-dimensional representations comprise a semantic segmentation map based on the semantic class probabilities (See page 5 column 1 paragraph 2-column 2 paragraph 1, a loss function based on the probability that a face belongs to a given class is used to perform semantic labeling. Semantic class probabilities are therefore outputted as part of evaluating the loss function, and corresponding structure, i.e. a semantic head configured to output the predicted output modality, exists in the network. See Figs. 11-12, the semantic map is rendered in a 2D format.).
Zhi renders obvious logits (See Fig. 2 and page 3 column 1 paragraph 4-column 2 paragraph 1, the network outputs semantic logits.). It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating semantic reconstructions from sensor data using an implicit surface model based on a signed distance function disclosed by Hu combined with Park and Moustafa to include the semantic logits of Zhi. One of ordinary skill in the art would have been motivated to make this modification in order to assign semantic labels to help mapping, as suggested by Zhi at page 3 column 1 paragraph 4.
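For context on how semantic logits of the kind taught by Zhi relate to the semantic class probabilities discussed above, the conventional conversion is a softmax over the logits; the following Python sketch is illustrative only, and the example values are assumptions.

import math

def softmax(logits):
    # Convert raw semantic logits into class probabilities (illustrative).
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

# Example: three-class logits -> probabilities. The argmax gives the predicted
# semantic label, and the probability values themselves can serve as a measure
# of labeling uncertainty.
probs = softmax([2.0, 0.5, -1.0])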
Regarding claim 24, Hu combined with Moustafa, Park, and Zhi renders obvious the limitations of claim 21. Hu further discloses the one or more processors are configured to further cause the apparatus to receive a viewing direction with respect to the first sub-environment (See Figs. 1-2, 4, and 7-13, the method renders a 2D reconstruction of the environment, i.e. an image. This necessarily comprises obtaining a viewing direction.);
the one or more output heads comprise a color head configured to generate per-pixel color values corresponding to the viewing direction (See page 4 column 1 paragraph 1-page 5 column 1 paragraph 1, the method performs texture mapping, i.e. prediction of what texture corresponds to a given location. The texture defines the color values. Exactly one color value exists for each pixel in the corresponding 2D rendering, i.e. the predicted output is per-pixel color values corresponding to the viewing direction. The section of code that outputs the texture for each face is therefore a color head configured to generate the color values.), and
the one or more two-dimensional representations comprise a color map based on the per-pixel color values (See Figs. 1-2, 4, and 7-13, the method renders a 2D reconstruction of the environment, i.e. an image. The images map each pixel location to a color value, i.e. are color maps.).
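Purely as an illustration of per-pixel color generation conditioned on a viewing direction (this is a generic rendering pattern, not the texture-mapping procedure of Hu), the Python sketch below queries an assumed color function once per pixel ray to produce a two-dimensional color map. The function name, tensor shapes, and input layout are assumptions for this sketch.

import torch

def render_color_map(color_fn, origins, directions):
    # Illustrative sketch only. origins and directions are (H, W, 3) tensors
    # of per-pixel ray origins and viewing directions; color_fn is an assumed
    # callable (e.g. a color head) mapping (N, 6) origin+direction inputs to
    # (N, 3) RGB values.
    h, w, _ = origins.shape
    rays = torch.cat([origins, directions], dim=-1).reshape(-1, 6)
    colors = color_fn(rays)            # per-pixel color values
    return colors.reshape(h, w, 3)     # rendered two-dimensional color map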
Regarding claim 28, Hu discloses A method, comprising:
determining a location of a simulated vehicle (See page 7 column 1 paragraph 3-page 8 column 1 paragraph 1, the models can be used as environments in a simulation. Examiner asserts that simulation of a vehicle in an environment comprises determining the location of the simulated vehicle.);
selecting a first implicit surface model from a plurality of implicit surface models respectively trained to represent a plurality of sub-environments of a global environment, wherein the first implicit surface model is trained to represent a first sub-environment of the plurality of sub-environments, the first sub-environment corresponding to the location of the vehicle (See page 5 column 2 paragraph 2, the system is evaluated on a plurality of datasets. See Figs. 7-8, the reconstructions from the datasets correspond to different locations, i.e. environments, indicating that the datasets correspond to different locations. The trained implicit function models therefore are trained to represent different environments. The set of trained models inherently represent sub-environments of a global environment, namely the union of the real and simulated environments the datasets correspond to. See page 7 column 1 paragraph 3-page 8 column 1 paragraph 1, the models can be used as environments in a simulation. Examiner asserts that simulating a vehicle comprises using the model corresponding to the environment the vehicle is being simulated in.);
generating, from one or more output heads of the first neural implicit surface network, one or more output modalities (See Figs. 7-8, 3D reconstructions are generated from the model. Output from the model inherently comes from the segment of software corresponding to that kind of output, i.e. the head of the model.); and
rendering, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment (See Figs. 7-8, 2D images of 3D reconstructions are rendered based on the model’s geometric output modality. They correspond to the environment the model represents.).
Hu does not explicitly disclose determining a location of a vehicle based on location data from one or more position sensors of the vehicle or a first neural implicit surface network.
Moustafa, in the same field of endeavor and solving a related problem, discloses determining a location of a vehicle based on location data from one or more position sensors of the vehicle (See [0002]-[0003], the vehicle receives electronic map tiles corresponding to a given location and uses them for navigation. See [0014], the vehicle receives multiple tiles corresponding to their planned route at once. Examiner asserts that using multiple tiles for navigation requires localizing the vehicle, i.e. determining the location, which is necessarily done with sensor data. Sensor data used for localization is location data.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for using implicit surface models of given environments and rendering a representation based on simulated vehicle locations of Hu to include determining the location of a real vehicle using sensor data and using the location to select the corresponding model of Moustafa. One of ordinary skill in the art would have been motivated to make this modification in order to allow use of the neural surface representations for navigation or awareness of the surroundings based on the vehicle’s location, as suggested by Moustafa at [0002].
Hu combined with Moustafa does not explicitly disclose a first neural implicit surface network.
Park, in the same field of endeavor and solving a related problem, renders obvious a first neural implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from implicit surface models based on vehicle location of Hu and Moustafa to include use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Hu combined with Moustafa and Park does not explicitly disclose one or more output heads of a plurality of output heads.
Zhi renders obvious the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone (See Fig. 2 and page 3 column 1 paragraph 3-column 2 paragraph 3, the system comprises several heads, specifically volume density, semantic labeling, and RGB output heads, attached to a backbone, i.e. the shared common layers of the network. See page 2 column 2 paragraph 1-4, the system uses neural radiance fields (NeRF) implemented on a multi-layer perceptron, i.e. a neural network based implicit representation of the geometry of the scene. It would be obvious to try combining a similar architecture comprising a backbone and a plurality of output heads with a neural implicit surface network.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu, Moustafa, and Park to include the multi-headed network of Zhi. One of ordinary skill in the art would have been motivated to make this modification because joint learning of the geometry and semantics can improve labeling, as suggested by Zhi at page 4 column 2 paragraph 1.
Claim 25 is rejected under 35 U.S.C. 103 as being obvious over Hu, Moustafa, Park, and Zhi in view of Li.
Regarding claim 25, Hu combined with Moustafa, Park, and Zhi renders obvious the limitations of claim 21. Hu combined with Moustafa, Park, and Zhi does not explicitly disclose wherein the one or more two-dimensional representations comprise a depth map based on the one or more output modalities. Li, in the same field of endeavor and solving a related problem, discloses wherein the one or more two-dimensional representations comprise a depth map based on the one or more output modalities (See Abstract, the paper is directed toward the creation of implicit surface representation of objects. See page 3 column 2 paragraph 2, the method uses a network to map a point to its signed distance value, i.e. the network is a signed distance network. See page 6 column 1 paragraphs 5-7, training the network comprises use of a depth loss based on corresponding depth data, i.e. a depth map, to supervise training of the network. The comparison of the processed output of the SDF based on the provided depth data indicates that the network is implicitly defining a depth map, which it uses for the supervised depth loss in training. This is a depth map based on the output modality of the network. See Fig. 10, the depth map is rendered in two-dimensional form.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu combined with Moustafa, Park, and Zhi to include the use of depth supervision in training of Li. One of ordinary skill in the art would have been motivated to make this modification in order to obtain more accurate surface reconstruction from common data sources such as LIDAR, as suggested by Li at page 6 column 1 paragraph 5.
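To illustrate the kind of depth supervision that Li is mapped to above, the following Python sketch compares depth values derived from an implicit surface model against a sensor depth map. The loss form (a masked L1 loss) and the tensor layout are assumptions for this sketch and are not Li's actual formulation.

import torch

def depth_loss(predicted_depth, sensor_depth, valid_mask):
    # Illustrative masked L1 depth-supervision loss: compare depths rendered
    # from the implicit surface model against a sensor depth map (e.g. LIDAR
    # projected to the image). valid_mask is a float tensor of ones and zeros
    # marking pixels with valid measurements.
    diff = torch.abs(predicted_depth - sensor_depth)
    return (diff * valid_mask).sum() / valid_mask.sum().clamp(min=1)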
Claim 26 is rejected under 35 U.S.C. 103 as being obvious over Hu, Moustafa, Park, and Zhi in view of Park2.
Regarding claim 26, Hu combined with Moustafa, Park, and Zhi renders obvious the limitations of claim 21. Hu further discloses wherein the one or more processors are configured to further cause the apparatus to execute at least one of a perception, vehicle localization, or vehicle route planning operation based on the one or more representations of the first sub-environment (See page 1 column 1 paragraph 3, the 3D environment representation is used for localization, i.e. vehicle localization, and trajectory planning, i.e. a vehicle route planning operation.).
Hu combined with Moustafa, Park, and Zhi does not explicitly disclose based on the one or more two-dimensional representations.
Park2, in the same field of endeavor and solving a related problem, discloses based on the one or more two-dimensional representations (See [0006], the method converts 3D feature point information from HD map data to a 2D format, i.e. generates a two-dimensional representation, and compares the converted data to 2D feature point information captured by an image capturing device in order to perform localization.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu combined with Moustafa, Park, and Zhi to include converting 3D model information to 2D in order to perform localization based on acquired 2D sensor data of Park2. One of ordinary skill in the art would have been motivated to make this modification in order to improve accuracy of localization, as suggested by Park2 at [0003]-[0004].
Claim 27 is rejected under 35 U.S.C. 103 as being obvious over Hu in view of Park, Zhi, and Yokota.
Regarding claim 27, Hu discloses A method, comprising:
receiving first sensor data comprising a plurality of data corresponding to a first environment, wherein the first sensor data is generated from a plurality of sensors (See page 5 column 2 paragraph 1, the system receives data from LIDAR sensors and cameras, i.e. a plurality of sensors. LIDAR sensors and cameras inherently gather data from their environment.);
generating, from a first implicit surface model, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data (See page 3 column 2 paragraph 1, Fig. 3, and page 4 column 1 paragraph 2, the method uses the measurements to generate an implicit surface model, specifically one based on a signed distance function. See page 1 column 1 paragraph 3-column 2 paragraph 1, the semantic mapped 3D models are used to extract semantic HD maps for automated driving vehicles. See page 3 column 2 paragraph 1-page 5 column 1 paragraph 1, the sensor data is used to create the implicit surface representation of the environment, perform texture mapping, and perform semantic mapping, i.e. labeling. These are all characteristics corresponding to the sensor data and therefore the environment.); and
executing a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task (See page 1 column 1 paragraph 3, the 3D environment representation is used for localization, i.e. a localization task, and trajectory planning, i.e. a planning task.).
Hu does not explicitly disclose a first neural implicit surface network.
Park, in the same field of endeavor and solving a related problem, renders obvious a first neural implicit surface network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model based on a signed distance function disclosed by Hu to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Hu combined with Park does not explicitly disclose the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone.
Zhi renders obvious the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone (See Fig. 2 and page 3 column 1 paragraph 3-column 2 paragraph 3, the system comprises several heads, specifically volume density, semantic labeling, and RGB output heads, attached to a backbone, i.e. the shared common layers of the network. See page 2 column 2 paragraph 1-4, the system uses neural radiance fields (NeRF) implemented on a multi-layer perceptron, i.e. a neural network based implicit representation of the geometry of the scene. It would be obvious to try combining a similar architecture comprising a backbone and a plurality of output heads with a neural implicit surface network.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for rendering representations of the environment from neural implicit surface models based on vehicle location of Hu and Park to include the multi-headed network of Zhi. One of ordinary skill in the art would have been motivated to make this modification because joint learning of the geometry and semantics can improve labeling, as suggested by Zhi at page 4 column 2 paragraph 1.
Hu combined with Park and Zhi does not explicitly disclose frames.
Yokota renders obvious frames (See Abstract, the invention synchronizes data gathered by vehicle sensors. See [0060], the invention can synchronize vehicle camera and LIDAR data.).
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model disclosed by Hu, Park and Zhi to include synchronizing the sensor data of Yokota. One of ordinary skill in the art would have been motivated to make this modification so that data gathered from multiple sensors, including sensors used for localization and vehicle pose estimation, can have accurate relative timestamps for processing, leading to more accurate results when the data are processed together, as suggested by Yokota at [0010] and [0016]-[0019].
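As an illustration of what grouping synchronized multi-sensor data into frames can look like in practice, the Python sketch below pairs camera images and LIDAR sweeps by nearest timestamp. The tolerance value, data layout, and function name are assumptions for this sketch and do not describe Yokota's synchronization method.

# Illustrative sketch only: pair camera images and LIDAR sweeps into "frames"
# by nearest timestamp within an assumed tolerance.
SYNC_TOLERANCE_S = 0.05  # assumed maximum allowed timestamp difference

def build_frames(camera_samples, lidar_samples):
    # Each sample is a (timestamp, data) tuple.
    frames = []
    for cam_ts, image in camera_samples:
        nearest = min(lidar_samples, key=lambda s: abs(s[0] - cam_ts))
        if abs(nearest[0] - cam_ts) <= SYNC_TOLERANCE_S:
            frames.append({"time": cam_ts, "image": image, "lidar": nearest[1]})
    return frames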
Response to Arguments
(A) Applicant argues “Discussion of Claim Rejections under 35 U.S.C. § 112
Claims 1, 12, and 27 stand rejected under Section 112 as allegedly being indefinite. Office Action, pgs. 2-3. Applicant respectfully traverses this rejection.
The Office asserts that the recitation of the term "frames" in claims 1, 12, and 27 is allegedly indefinite because the specification allegedly contains an explicit definition for the term "frames" in paragraph [0030] but allegedly conflicts other uses of the term "frame" in paragraphs [0030], [0033], and [0010]. Paragraph [0030] recites "[a]s used herein, the term "frames" refers to groups of pairwise aligned sensor data for a given time and/or location within an environment." Each of the other indicated uses indicated by the Examiner in paragraphs [0030], [0033], and [0010] are clear in that they are discussing either "a first frame of image" ( [0030]), "FIG. 2 depicts an illustrative frame of image data 200" ( [0030]), and "FIG. 2 depicts an illustrative frame of image data" ( [0010]). Applicant respectfully draws the Examiner's attention to the fact that the explicit definition recited in paragraph [0030] is discussing "frames" plural and the other instances highlighted by the Examiner in paragraphs [0010], [0030], and [0033], when read in totality are clearly discussing one illustrative example of a frame and not a contradiction to the explicit definition provided in paragraph [0033].
Accordingly, Applicant respectfully submits that the claims are not indefinite under 35 U.S.C. § 112(b) for at least the reasons presented herein. Applicant respectfully requests reconsideration and withdrawal of the rejections asserted under 35 U.S.C. §112(b).”
As to (A), in light of Applicant's clarification that the explicit definition for frames applies exclusively to the plural frames and excludes the singular frame, Examiner withdraws the rejections under 35 USC 112(b).
(B) Applicant argues “Discussion of Claim Rejections under 35 U.S.C. § 101
Claims 1-28 stand rejected under 35 U.S.C. § 101 as allegedly being directed to unpatentable subject matter. Office Action, pgs. 3-8. The rejections are respectfully traversed in view of the amendments to the claims and remarks presented herein.
Eligibility Analysis Arguments
Applicant respectfully submits that amended independent claims 1, 21, 27, and 28 are not an abstract idea that falls within the enumerated grouping of mental processes. The claims recite processes that cannot practically be performed in the human mind. USPTO's Deputy Commissioner for Patents, Charles Kim, issued a Memorandum dated August 4, 2025 (referred to herein as the "August 2025 Memorandum") that provides several reminders on evaluating subject matter eligibility of claims under 35 U.S.C. § 101, and in particular, those that allegedly fall within the enumerated grouping of mental processes. The August 2025 Memorandum provides a discussion of when a subject matter eligibility rejection should be made. Regarding Step 2A Prong One, the August 2025 Memorandum provides a reminder that:
The courts consider a mental process (thinking) that "can be performed in the human mind, or by a human using a pen and paper," to be an abstract idea. The USPTO subject matter eligibility analysis follows this precedent and instructs examiners to determine that a claim recites a mental process when it contains limitation(s) that can practically be performed in the human mind, including, for example, observations, evaluations, judgments, and opinions. On the other hand, a claim does not recite a mental process when it contains limitation(s) that cannot practically be performed in the human mind, for instance when the human mind is not equipped to perform the claim limitation(s).
The mental process grouping is not without limits. Examiners are reminded not to expand this grouping in a manner that encompasses claim limitations that cannot practically be performed in the human mind. Claim limitations that encompass AI in a way that cannot be practically performed in the human mind do not fall within this grouping.
(Emphasis added).
As a non-limiting explanation of the claims, the claims recite aspects for implementing a direct approach to HD map generation that encodes one or more characteristics of the environment (e.g., the geometry, appearance, and semantic information associated with the environment) with a neural implicit surface network to (e.g., implicitly) represent the first environment perceived by the sensor data. The neural implicit surface network includes a unique architecture that includes a backbone of a neural network and a plurality of output heads to implicitly represent an environment with the neural implicit surface network. The neural implicit surface network may provide a unified solution to encoding richer information in fixed, efficient and manageable memory footprints while providing an alternative solution to dense HD map production, labeling, and rendering that current processes fall short in delivering.
Accordingly, for at least these reasons claims 1-28, as a whole, are directed to subject matter that is significantly more than an abstract idea. For at least these reasons, Applicant respectfully submits that claims 1-20 relate to patent eligible subject matter under 35 U.S.C. § 101. Reconsideration and withdrawal of the rejection is respectfully requested.
A more detailed analysis and remarks regarding the eligibility of independent claims 1, 21, 27, and 28 under 35 U.S.C. § 101 will now be discussed.
The Supreme Court reiterated the framework set forth previously in Mayo Collaborative Servs. v. Prometheus Labs., Inc., 132 S. Ct. 1289, 1293 (2012), for distinguishing patents that claim laws of nature, natural phenomena, and abstract ideas from those that claim patent-eligible applications of these concepts. See Alice Corp. v. CLS Bank Int'l., 573 U.S. 208 (2014). The Court's framework has been adopted and explained by the Office, see "2014 Interim Guidance on Patent Subject Matter Eligibility, Federal Register Vol. 79, No. 241, pp. 74618 -74633 (December 16, 2014)," ("Interim Guidance"). As set forth in recent case law and recent interim guidance of patent subject matter eligibility, the current accepted process for determining subject matter
eligibility is to first determine whether a claim is directed to a process, machine, manufacture, or composition of matter (Step 1). If so, a determination is to be made regarding whether the claim is directed to a judicial exception, such as an abstract idea (Step 2A). If the claim is directed to a judicial exception, a determination should be made regarding whether the claim recites additional elements that amount to "significantly more" than the judicial exception (Step 2B).
Step 1
Claims 1-28 are not rejected under Step 1 as they correspond to a statutory category of invention. Office Action pgs. 3-4.
Step 2A
Applicant respectfully submits that Step 2A is satisfied because claims 1-28 are either not directed to a judicial exception, such as an abstract idea (Prong One), and/or incorporate the
judicial exception into a practical application (Prong Two). The Supreme Court has defined the boundaries of "abstract ideas" using examples and previous case law to include the following as "abstract ideas": fundamental economic practices; certain methods of organizing human activities; an idea of itself; and mathematical relationships/formulas. However, the Supreme Court has also cautioned that even patent-eligible subject matter includes some form of abstract ideas. (See Alice,
134 S. Ct. at 2354).
The USPTO 2019 Revised Patent Subject Matter Eligibility Guidance ("Guidance") issued on January 4, 2019 requires a two prong test under Step 2A.
Prong One: Evaluate Whether the Claim Recites a Judicial Exception
The Guidance extracts and synthesizes key concepts identified by the courts as abstract ideas to explain that the abstract idea exception includes the following groupings of subject matter,
when recited as such in a claim limitation(s) (that is, when recited on their own or per se):
(a) Mathematical concepts - mathematical relationships, mathematical formulas or
equations, mathematical calculations;
(b) Certain methods of organizing human activity - fundamental economic principles or practices (including hedging, insurance, mitigating risk); commercial or legal interactions (including agreements in the form of contracts; legal obligations; advertising, marketing or sales activities or behaviors; business relations); managing personal behavior or relationships or interactions between
people (including social activities, teaching, and following rules or instructions);
and
(c) Mental processes - concepts performed in the human mind (including an
observation, evaluation, judgment, opinion).
The Guidance states that claims that do not recite matter that falls within these enumerated groupings of abstract ideas should not be treated as reciting abstract ideas, except as follows: In the rare circumstance in which a USPTO employee believes a claim limitation that does not fall within the enumerated groupings of abstract ideas should nonetheless be treated as reciting an
abstract idea. The claimed subject matter does not fall within a) mathematical concepts, b) certain methods of organizing human activity, or c) Mental processes which are enumerated in 2019
Revised Patent Subject Matter Eligibility Guidance of January 4, 2019.
The claimed subject matter is directed to three-dimensional (3D) implicit surface reconstruction for dense high-definition (HD) maps with neural representations, and in certain aspects, to techniques for training and utilizing neural implicit surface networks to implicitly represent a plurality of sub-environments of a global environment.
First, independent claims 1 and 27 will be discussed, then independent claims 21 and 28 will be discussed.
As an initial matter, as discussed during the Examiner interview, amending the claims to recite utilization of the generated HD map would integrate the claim into a practical application. Independent claims 1 and 27 have been amended to recite the utilization of the HD map through the execution of a task comprising at least one of a perception, localization, or planning task.
Discussion of Independent claims 1 and 27 under Step 2A Prong One
Independent claim 1 is amended to recite:
1. An apparatus, comprising:
one or more memories; and
one or more processors, coupled to the one or more memories, configured
to cause the apparatus to:
receive first sensor data comprising a plurality of frames corresponding to a first environment, wherein the first sensor data is
generated from a plurality of sensors;
generate, from a first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output
heads appended to the backbone, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding
to the first environment determined based on the first sensor data; and
execute a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task. Independent claim 27 is similarly amended.
More specifically, amended independent claims 1 and 27 are not directed to mental processes such as concepts performed in the human mind as alleged on page 4 of the Office Action. Applicant respectfully submits that at least the aforementioned recitations of amended independent claims 1 and 27 cannot be practically performed in the human mind.
The "mental processes" abstract idea grouping is defined as concepts performed in the human mind, and examples of mental processes include observations, evaluations, judgments, and opinions. (M.P.E.P. § 2106.04(a)(2)(III). The aforementioned recitations of independent claims 1 and 27 recite a specific architecture for a first neural implicit surface network that is used to generate a first HD map, which is a structure and operation thereof that cannot be practically performed in the human mind. As discussed in the 2024 Guidance Update on Patent Subject Matter Eligibility, Including on Artificial Intelligence published on July 17, 2024 (89 FR 58128) (AI- SME Update), which provides further explanation on Step 2A of the USPTO's subject matter eligibility analysis, we are reminded of claim limitations that cannot be practically performed in the human mind. For example, the AI-SME Update discusses an example where a claim does not recite a mental process because it cannot practically be performed in the human mind is a claim to "a specific, hardware-based RFID serial number data structure" (i.e., an RFID transponder), where the data structure is uniquely encoded (i.e., there is "a unique correspondence between the data physically encoded on the [RFID transponder] with preauthorized blocks of serial numbers"). The recited architecture of the "first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone," is utilized to implicitly represent an environment with the neural implicit surface network which can provide a unified solution to encoding richer information in fixed, efficient and manageable memory footprints while providing an alternative solution to dense HD map production, labeling, and rendering that current processes fall short in delivering, as discussed in at least paragraph [0025] of the specification. At least the14
aspect of implicitly representing an environment with a first neural implicit surface network is not a mental process that can be practically performed in the human mind.
Discussion of Independent claims 21 and 28 under Step 2A Prong One
Similar to independent claims 1 and 27, independent claims 21 and 28 incorporate a "first neural implicit surface network" to "generate, with one or more of a plurality of output heads of the first neural implicit surface network" and "render, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment," as recited in amended independent claim 21 and as similarly recited in amended independent claim 28. The
aforementioned recitations are not mental processes that can practically be performed in the human mind. In particular, the recitations recite architecture that performs the operations, which are not features of the human mind. Furthermore, as discussed above, the AI-SME Update discusses an example where a claim does not recite a mental process because it cannot practically be performed in the human mind is a claim to "a specific, hardware-based RFID serial number data structure" (i.e., an RFID transponder), where the data structure is uniquely encoded (i.e., there is "a unique correspondence between the data physically encoded on the [RFID transponder] with
preauthorized blocks of serial numbers"). The recited architecture of a first neural implicit surface network and the one or more of a plurality of output heads of the first neural implicit surface network used to generate one or more output modalities; and render, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment, are not mental processes as they utilize specific neural implicit surface network architecture.
For at least these reasons, Applicant respectfully submits that the claimed subject matter does not recite matter that falls within these enumerated groupings of abstract ideas, and thus, should not be treated as reciting abstract ideas.
Prong Two: If the Claim Recites a Judicial Exception, Evaluate Whether the Judicial Exception Is Integrated Into a Practical Application
However, assuming arguendo, if the instant claims recites a judicial exception under prong one of Step 2A, (which Applicant refutes) the amended claims nevertheless are integrated into a practical application.
The Supreme Court has explained that the judicial exceptions reflect the Court's view that abstract ideas, laws of nature, and natural phenomena are "the basic tools of scientific and technological work", and are thus excluded from patentability because "monopolization of those tools through the grant of a patent might tend to impede innovation more than it would tend to promote it." Alice Corp., 573 U.S. at 216, 110 USPQ2d at 1980. (M.P.E.P § 2106.04(I)).
The MPEP's guidance states that in Prong Two, Examiners should evaluate whether the claim as a whole integrates the recited judicial exception into a practical application of the exception. (MPEP § 2106.04(II)(A)(2)). (Emphasis added). In the context of Step 2A Prong Two, the MPEP provides that exemplary considerations, such as an improvement to technology or technical field, or applying or using the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, are indicative that an additional element (or combination of elements) may have integrated the exception into a practical application. (MPEP § 2106.04(d)(I)). The MPEP further states that the evaluation of whether claim elements integrate the exception into a practical application includes
(1) identifying whether there are any additional elements recited in the claim beyond the judicial exception(s); and (2) evaluating those additional elements individually and in combination to determine whether they integrate the exception into a practical application.
Regarding Step 2A Prong Two, the August 2025 Memorandum provides a reminder that:
The analysis in Step 2A Prong Two considers the claim as a whole. The
way in which the additional elements use or interact with the exception may integrate the judicial exception into a practical application. Accordingly, the additional limitations should not be evaluated in a vacuum, completely separate from the recited judicial exception. Instead, the analysis should take into consideration all the claim limitations and how these limitations interact and impact each other when evaluating whether the exception is integrated into a practical
application.
In computer-related technologies, examiners can conclude that claims are
eligible in Step 2A Prong Two by finding that a claim reflects an improvement to the functioning of a computer or to another technology or technical field, integrating a recited judicial exception into a practical application of the exception.
This consideration has also been referred to as the search for a technological solution to a technological problem. An important consideration in determining whether a claim improves technology or a technical field is the extent to which the claim covers a particular solution to a problem or a particular way to achieve a
desired outcome, as opposed to merely claiming the idea of a solution or outcome.
The examiner is reminded to consult the specification to determine whether the disclosed invention improves technology or a technical field, and evaluate the claim to ensure it reflects the disclosed improvement. The specification does not need to explicitly set forth the improvement, but it must describe the invention such that the improvement would be apparent to one of ordinary skill in the art. The claim itself does not need to explicitly recite the improvement described in the
specification.
(Emphasis added).
The Guidance states that in Prong Two, Examiners should evaluate whether the claim as a whole integrates the recited judicial exception into a practical application of the exception. (Emphasis added). In the context of revised Step 2A, the Guidance provides exemplary considerations are indicative that an additional element (or combination of elements) may have integrated the exception into a practical application. The claimed subject matter includes additional elements that apply or use the judicial exception in some other meaningful way beyond generally linking the use of the judicial exception to a particular technological environment, such that the claim as a whole is more than a drafting effort designed to monopolize the exception.
In particular, independent claim 1 is amended to recite "generate, from a first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data; and execute a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task." Independent claim 27 is similarly amended. The aforementioned recitations of the independent claims 1 and 27 recite the practical application of utilizing the generated HD maps for a task comprising at least one of a perception, localization, or planning task.
Regarding independent claims 21 and 28, the claims are integrated into a practical application of "rendering, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment." Implicitly representing an environment with the neural implicit surface network may provide a unified solution to encoding richer information in fixed, efficient and manageable memory footprints while providing an alternative solution to dense HD map production, labeling, and rendering that current processes fall short in
delivering. The recited features of independent claims 21 and 28 provide a technological solution to a technological problem.
More specifically, the practical application of "rendering, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment" provides a technical benefit of a dense representation of the sub-environment, as discussed in at least paragraph [0029] of the specification. Moreover, in certain aspects, the implicit representation can be queried by one of a plurality of different output layers (e.g., heads) to generate various two-dimensional representations of the sub-environment. Accordingly, certain aspects of techniques discussed herein may provide another technical benefit of flexibility and/or generalization of information that can be obtained from the trained neural implicit surface network. Flexibility refers to the technical benefit of adapting new modalities to the trained neural implicit surface network without modifying the network architecture. Generalization refers to the technical benefit of obtaining information about the sub-environment that may not have been directly perceived or measured by a sensor.
Furthermore, similar to EXAMPLE 40 of the 2019 PEG that found a judicial exception integrated into a practical application through recitation of a meaningful limitation of at least collecting additional Netflow protocol data relating to traffic when the collected network delay, packet loss, or jitter is greater than the predefined threshold, the independent claims integrate any alleged judicial exception into a practical application of "generate, from a first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data; and execute a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task," as recited in amended independent claims 1 and 27 and "generate, from one or more of a plurality of output heads of the first neural implicit surface network, one or more output modalities; and render, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment," as recited in amended independent claims 21 and 28.
Using a computer as a tool to perform an allegedly abstract idea does not render a concept unpatentable. "For example, [regarding EXAMPLE 40,] but for the 'by the network appliance'
language, the claim encompasses a user simply comparing the collected packet loss data to a predetermined acceptable quality percentage in his/her mind." (2019 PEG, page 11). What is important under Step 2A Prong 2 is whether "the claim as a whole is directed to a particular improvement"..."although each of the collecting steps analyzed individually may be viewed as mere pre- or post-solution activity." (Id.). Here, the claims as a whole are similarly directed to a particular improvement as discussed hereinabove.
Accordingly, Applicant asserts that amended independent claims 1, 21, 27, and 28 and the claims depending therefrom, either do not recite a judicial exception or in the alternative, incorporate the judicial exception into a practical application. When a claim is eligible under step 2A, step 2B need not be analyzed. Therefore, Applicant requests that the rejection of the claims as patent ineligible be withdrawn.
Step 2B
Furthermore, even if the claims are directed to an abstract idea of a grouped alleged judicial exception and are not held to integrate the alleged judicial exception into a practical application in respective Prongs 1 and 2 of Step 2A, the claims include "something more" than the abstract idea (thus, Step 2B: Yes). Here, the claims are patent eligible under § 101 and recite "significantly more" than an abstract idea as they effect an improvement in the technology and/or technical field of neural implicit surface networks, generation of HD maps, and utilization of the HD maps. In particular, at least the additional elements in the recitations "generate, from a first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data; and execute a task based on the first HD map, the task comprising at least one of a perception, localization, or planning task," as recited in amended independent claims 1 and 27 and "generate, from one or more of a plurality of output heads of the first neural implicit surface network, one or more output modalities; and render, based on the one or more output modalities, one or more two-dimensional representations of the first sub-environment," as recited in amended independent claims 21 and 28 provide significantly more than the alleged judicial exception.
Accordingly, claims 1-28, as a whole, are directed to subject matter that is significantly more than an abstract idea. For at least these reasons, Applicant respectfully submits that claims 1-28 relate to patent eligible subject matter under 35 U.S.C. § 101. Reconsideration and withdrawal of the rejection is respectfully requested.”
As to (B), Examiner does not find the argument persuasive.
Regarding Step 1, Examiner notes that independent claims 1 and 21 and their corresponding dependent claims are not directed to statutory categories of invention. See the final paragraph of the 35 USC 101 section.
Regarding Step 2A Prong One, Examiner does not find the argument persuasive. Examiner agrees that a human could not practically represent or determine how to represent an environment as an implicit distance function or implicit distance function network from sensor data. However, a human with access to sufficient quantities of sensor data could, using observation and judgment, create a highly detailed map, i.e. an HD map.
Regarding Step 2A Prong Two, Examiner does not find the argument persuasive. The specific architecture of the network is not specified in any claim. An implicit function network comprising a backbone could be implemented in many specific ways; as claimed, such a network requires only shared layers leading to multiple output heads and at least some segment of the network predicting a signed distance function (SDF). Examiner believes that the claims do not include the components or steps of the invention that provide any improvement described in the specification.
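For illustration only, and not as a characterization of Applicant's disclosure or of any cited reference, the following is a minimal sketch of one of the many possible implementations consistent with the claimed recitation, assuming a PyTorch-style multilayer perceptron backbone; the class name, head names, and layer sizes are hypothetical:

```python
import torch
import torch.nn as nn

class ImplicitSurfaceNetwork(nn.Module):
    """Minimal sketch: shared layers ("backbone") with multiple output heads,
    one of which predicts a signed distance function (SDF) value.
    Layer sizes and head choices are illustrative assumptions only."""
    def __init__(self, hidden_dim: int = 256):
        super().__init__()
        # Shared backbone operating on a 3D query point.
        self.backbone = nn.Sequential(
            nn.Linear(3, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Output heads appended to the backbone (hypothetical modalities).
        self.sdf_head = nn.Linear(hidden_dim, 1)        # signed distance
        self.semantic_head = nn.Linear(hidden_dim, 10)  # e.g., 10 label classes
        self.color_head = nn.Linear(hidden_dim, 3)      # e.g., RGB texture

    def forward(self, xyz: torch.Tensor):
        features = self.backbone(xyz)
        return {
            "sdf": self.sdf_head(features),
            "semantics": self.semantic_head(features),
            "color": self.color_head(features),
        }

# Usage: query the network at a batch of 3D points.
points = torch.rand(4, 3)
outputs = ImplicitSurfaceNetwork()(points)
```

Any such sketch reflects only one possible arrangement; the claims themselves do not constrain the number of layers, the particular heads, or the training procedure.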
Regarding Step 2B, Examiner does not find the argument persuasive. In this case, the additional limitation of one or more memories; and one or more processors… is well-understood, routine, and conventional activity, because the specification does not provide any indication that the one or more memories; and one or more processors… is/are anything more than conventional computer(s). Additionally, the remaining element(s) has/have been deemed insignificant extra-solution activity by one or more courts; see at least MPEP 2106.05(d) and MPEP 2106.05(g):
receive first sensor data comprising a plurality of frames corresponding to a first environment, wherein the first sensor data is generated from a plurality of sensors… is considered well-understood, routine, and conventional activity under CyberSource v. Retail Decisions, Inc., 654 F.3d 1366, 1375, 99 USPQ2d 1690, 1694 (Fed. Cir. 2011) (mere data gathering in conjunction with a law of nature or abstract idea such as a step of obtaining information so that the information can be analyzed by an abstract mental process.).
render, based on the one or more output modalities, one or more two-dimensional representations… is considered well-understood, routine, and conventional activity under TLI Communications, 823 F.3d at 612-13, 118 USPQ2d at 1747-48 (Gathering and analyzing information using conventional techniques and displaying the result.).
(C) Applicant argues “Discussion of Claim Rejections under 35 U.S.C. § 103
Claims 1-5, 10-12, and 27 are rejected under 35 U.S.C. 103 as allegedly being obvious over NPL document "Large-Scale 3D Semantic Reconstruction for Automated Driving Vehicles with Adaptive Truncated Signed Distance Function", hereinafter "Hu", and NPL document "DeepSDF: Learning Continuous Signed Distance Functions", hereinafter "Park." Claims 6-7 are rejected under 35 U.S.C. 103 as being obvious over Hu and Park in view of NPL document "Vox-Surf: Voxel-based Implicit Surface Representation", hereinafter "Li." Claim 8 is rejected under 35 U.S.C. 103 as being obvious over Hu and Park in view of NPL document "Multinomial logistic regression", hereinafter "Wikipedia." Claim 9 is rejected under 35 U.S.C. 103 as being obvious over Hu, Park, and Wikipedia in view of NPL document "Semantic Implicit Neural Scene Representations With Semi-Supervised Training", hereinafter "Kohli." Claims 13, 21-22, 24, and 28 are rejected under 35 U.S.C. 103 as being obvious over Hu and Park in view of US 20190244517 A1, hereinafter "Moustafa." Claim 23 is rejected under 35 U.S.C. 103 as being obvious over Hu, Park, and Moustafa, in view of Wikipedia. Claim 25 is rejected under 35 U.S.C. 103 as being obvious over Hu, Park, and Moustafa, in view of Li. Claim 26 is rejected under 35 U.S.C. 103 as being obvious over Hu, Park, and Moustafa, in view of US 20230136492 A1, hereinafter "Park2." Claim 14 is rejected under 35 U.S.C. 103 as being obvious over Hu and Park, in view of US 20200167956 A1, hereinafter "Herman." Claim 15 is rejected under 35 U.S.C. 103 as being obvious over Hu and Park, in view of US 20220254165 A1, hereinafter "Yokota." Claims 16-18 are rejected under 35 U.S.C. 103 as being obvious over Hu and Park, in view of Li and US 20220358319 A1, hereinafter "Lee." Claims 19-20 are rejected under 35 U.S.C. 103 as being obvious over Hu and Park, in view of Li and NPL document "3D Object Detection and Instance Segmentation from 3D Range and 2D Color Images", hereinafter "Shen."
Applicant respectfully traverses these rejections in view of the amendments to the claims presented herein.
A claim is not obvious under 35 U.S.C. § 103 when a prior art reference or combination of references fails to teach or suggest each and every element recited in the claim. Rejections on obviousness cannot be sustained by mere conclusory statements; instead, there must be some articulated reasoning with some rational underpinning to support the legal conclusion of obviousness. KSR Int'l Co. v. Teleflex Inc., 550 U.S. 398, 418, 127 S. Ct. 1727, 1740 (2007); see also Adapt Pharma Operations Ltd. v. Teva Pharms. USA, Inc., 25 F.4th 1354, 1365, 2022 USPQ2d 144 (Fed. Cir. 2022). KSR stated that a determination of obviousness "requires 'identify[ing] a reason that would have prompted a person of ordinary skill in the relevant field to combine the elements in the way the claimed new invention does.'"
Discussion of Independent Claims 1 and 27
Applicant respectfully submits that the cited references, alone or in combination, do not teach or suggest each and every element recited in amended independent claims 1 and 27. In particular, the cited references do not teach or suggest the following recitation of amended independent claim 1, which is similarly recited in amended independent claim 27: "generate, from a first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone, a first high-definition (HD) map comprising labels created from one or more characteristics corresponding to the first environment determined based on the first sensor data."
The Office on pages 9-11 alleges that independent claim 1 and similarly claim 27 are obvious in view of Hu and Park. In particular, the Office asserts that Hu does not explicitly disclose "generate, from a first neural implicit surface network," but Park cures the deficiency of Hu. That is, Park's teaching of modeling SDFs with Neural Networks in section 3 of Park allegedly teaches an implicit surface network to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF. However, Applicant respectfully submits that amended independent claims 1 and 27 recite a "first neural implicit surface network, the first neural implicit surface network comprising a backbone and a plurality of output heads appended to the backbone." Neither Park nor Hu discusses a neural implicit surface network architecture as recited in the claims.
Furthermore, the first neural implicit surface network recited in the claims is utilized to generate an HD map, which is not the same as an SDF discussed in Park.
For at least the foregoing reasons, Applicant respectfully submits that independent claims 1 and 27, and all claims that depend therefrom, are patentable over the cited references, in particular Park and Hu. Reconsideration and withdrawal of the rejections under 35 U.S.C. § 103 is respectfully requested.
Discussion of Independent Claims 21 and 28
Applicant respectfully submits that the cited references, alone or in combination, do not teach or suggest each and every element recited in amended independent claims 21 and 28. In particular, the cited references do not teach or suggest the following recitation of independent claim 21, which is similarly recited in independent claim 28: "select a first neural implicit surface network from a plurality of neural implicit surface networks respectively trained to represent a plurality of sub-environments of a global environment, wherein the first neural implicit surface network is trained to represent a first sub-environment of the plurality of sub-environments, the first sub-environment corresponding to the location of the vehicle."
The Office on pages 27-30 alleges that independent claim 21 and similarly claim 28 are obvious in view of the combination of the teachings of Hu, Park, and Moustafa. In particular, the Office suggests that Hu teaches "select a first neural implicit surface network from a plurality of neural implicit surface networks respectively trained to represent a plurality of sub-environments of a global environment, wherein the first neural implicit surface network is trained to represent a first sub-environment of the plurality of sub-environments, the first sub-environment corresponding to the location of the vehicle," as recited in independent claim 21 and as similarly recited in independent claim 28. More specifically, the Office states that since FIGs. 7-8 and page 5, column 2, paragraph 2 depict and discuss reconstructions from a plurality of datasets corresponding to different locations, that Hu teaches the aforementioned recitation of independent claims 21 and 28. However, Hu does not contemplate or discuss having a plurality of neural implicit surface networks respectively trained to represent a plurality of sub-environments of a global environment. Furthermore, Hu does not teach or suggest making a selection between the plurality of neural implicit surface networks, in part because Hu does not teach the plurality of neural implicit surface networks, but also because there is no selection process discussed in Hu. Hu discusses a single model that is trained on datasets that correspond to different locations so that the model can reconstruct views such as those depicted in FIGs. 7-8. Accordingly, Hu fails to teach
or suggest "select a first neural implicit surface network from a plurality of neural implicit surface networks respectively trained to represent a plurality of sub-environments of a global environment, wherein the first neural implicit surface network is trained to represent a first sub-environment of the plurality of sub-environments, the first sub-environment corresponding to the location of the vehicle," as recited in independent claim 21 and as similarly recited in independent claim 28.
The other cited references, Park and Moustafa, are not asserted as teaching the aforementioned recitations of independent claims 21 and 28. Additionally, it does not appear that they cure the deficiencies of Hu.
For at least the foregoing reasons, Applicant respectfully submits that independent claims 21 and 28, and all claims that depend therefrom, are patentable over the cited references, in particular Hu, Park, and Moustafa. Reconsideration and withdrawal of the rejections under 35 U.S.C. § 103 is respectfully requested.
Discussion of Dependent Claims
Each of the remaining dependent claims depends from one of the independent claims referenced in the arguments presented above. Each of these claims includes features recited in its base claim as well as any intervening claims. Accordingly, these claims are allowable for substantially similar reasons as discussed above and for their additional novel features recited therein. See M.P.E.P. § 2143.03 ("If an independent claim is nonobvious under 35 U.S.C. 103, then any claim depending therefrom is nonobvious. In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988)").
No Disclaimers or Disavowals
This response includes alterations to the claims and/or characterizations of claim scope or cited references. Applicant is not conceding in this application that previously pending claims are not patentable over the cited references. Rather, the alterations or characterizations are made for the purpose of facilitating expeditious prosecution of this application. Applicant reserves the right to pursue at a later date any previously pending claim, whether narrower or broader, that captures subject matter supported by this application's disclosure. This reservation of rights includes subject matter specifically disclaimed in this application or any prior prosecution. As a result, reviewers
of this or any prosecution history of a related application should not infer that Applicant has made any disclaimers or disavowals of any subject matter supported by the present application.”
Regarding independent claims 1 and 27, Examiner does not find the argument persuasive.
In response to applicant's arguments against the references individually, one cannot show nonobviousness by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Hu recites creation of an HD map from voxel data corresponding to an implicit representation of an environment using a signed distance function (See page 3 column 2 paragraph 1, Fig. 3, and page 4 column 1 paragraph 2, the method uses the measurements to generate an implicit surface model, specifically one based on a signed distance function. See page 1 column 1 paragraph 3-column 2 paragraph 1, the semantic mapped 3D models are used to extract semantic HD maps for automated driving vehicles. See page 3 column 2 paragraph 1-page 5 column 1 paragraph 1, the sensor data is used to create the implicit surface representation of the environment, perform texture mapping, and perform semantic mapping, i.e. labeling. These are all characteristics corresponding to the sensor data and therefore the environment.).
Park recites approximating the signed distance function using a neural network, i.e. a signed distance network (See Section 3 “3. Modeling SDFs with Neural Networks”, the authors train a neural network, i.e. an implicit surface network, to approximate the signed distance function (SDF) and thereby the implicit surface described by the SDF.).
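By way of a hedged illustration only (this is not Park's actual code; the layer sizes, loss, and training samples below are assumptions), a neural network approximating an SDF can be fit by regressing predicted signed distances toward sampled distance values, with the implicit surface given by the network's zero level set:

```python
import torch
import torch.nn as nn

# Minimal sketch: regress a small MLP toward signed-distance samples so that
# its zero level set approximates the implicit surface.
sdf_net = nn.Sequential(
    nn.Linear(3, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),
)
optimizer = torch.optim.Adam(sdf_net.parameters(), lr=1e-4)

# Hypothetical training pairs: 3D points and their measured signed distances.
points = torch.rand(1024, 3)
sdf_targets = torch.rand(1024, 1) - 0.5

for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.l1_loss(sdf_net(points), sdf_targets)
    loss.backward()
    optimizer.step()
# The approximated surface is the set of points where sdf_net(x) is near zero.
```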
It would have been obvious for one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system for creating HD maps from sensor data using an implicit surface model based on a signed distance function disclosed by Hu to include the use of an implicit surface network of Park. One of ordinary skill in the art would have been motivated to make this modification in order to preserve accuracy while reducing the model size and preserve the fine details of the environment, as suggested by Park at Abstract and page 2 column 2 paragraph 5-page 3 column 1 paragraph 1.
Regarding independent claims 21 and 28, Examiner does not find the argument persuasive.
In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
Independent claims 21 and 28 recite select a first neural implicit surface network from a plurality of neural implicit surface networks … corresponding to the location of the vehicle and selecting a first neural implicit surface network from a plurality of neural implicit surface networks respectively trained to represent a plurality of sub-environments of a global environment … corresponding to the location of the vehicle.
The invention as claimed does not recite choosing the environment based on or using the location of the vehicle. Selecting between multiple networks corresponding to multiple maps is rendered obvious by Hu (See page 5 column 2 paragraph 2, the system is evaluated on a plurality of datasets. See Figs. 7-8, the depicted reconstructions correspond to different locations, i.e. environments, indicating that the underlying datasets correspond to different locations. The trained implicit function models are therefore trained to represent different environments. The set of trained models inherently represents sub-environments of a global environment, namely the union of the real and simulated environments to which the datasets correspond. See page 7 column 1 paragraph 3-page 8 column 1 paragraph 1, the models can be used as environments in a simulation. Examiner asserts that simulating a vehicle comprises using the model corresponding to the environment the vehicle is being simulated in.).
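As an illustrative sketch only (a hypothetical structure, not taken from Hu, Park, or the claims; the dataset names and model definition are assumed), maintaining one trained implicit surface model per dataset/environment and using the model corresponding to the environment in which the vehicle is simulated could look like the following:

```python
import torch.nn as nn

def make_network() -> nn.Module:
    # Placeholder implicit surface model; the actual architecture is immaterial here.
    return nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))

# Hypothetical set of trained models, one per sub-environment of a global environment.
trained_models = {
    "dataset_a": make_network(),
    "dataset_b": make_network(),
}

def select_model(simulated_environment: str) -> nn.Module:
    """Use the model trained on the environment the vehicle is being simulated in."""
    return trained_models[simulated_environment]

model_in_use = select_model("dataset_a")
```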
Conclusion
The prior art made of record and not relied upon is considered pertinent to the applicant’s disclosure and may be found on the accompanying PTO-892 Notice of References Cited:
NPL document “Rethinking Reprojection: Closing the Loop for Pose-aware Shape Reconstruction from a Single Image” which relates to reprojecting a learned 3D shape back onto the training image and comparing the results as a way of supervising the learning process.
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to AUSTIN ROBERT CHENNAULT whose telephone number is (571)272-4606. The examiner can normally be reached Monday - Friday 9:00am - 5:00pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Hitesh Patel can be reached at (571) 270-5442. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/AUSTIN ROBERT CHENNAULT/Examiner, Art Unit 3667
/Hitesh Patel/Supervisory Patent Examiner, Art Unit 3667
3/24/26