Prosecution Insights
Last updated: April 19, 2026
Application No. 18/140,445

CHIP BASED LIDAR 3D OBJECT DETECTION SYSTEM AND METHOD

Status: Non-Final OA, §103
Filed: Apr 27, 2023
Examiner: THOMAS, SOUMYA
Art Unit: 2664
Tech Center: 2600 — Communications
Assignee: Black Sesame Technologies Inc.
OA Round: 1 (Non-Final)

Grant Probability: 100% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 9m
Grant Probability With Interview: 99%

Examiner Intelligence

Career Allow Rate: 100% (2 granted / 2 resolved); +38.0% vs TC avg (above average)
Interview Lift: +0.0% (minimal), based on resolved cases with interview
Typical Timeline: 2y 9m average prosecution; 17 applications currently pending
Career History: 19 total applications across all art units

Statute-Specific Performance

§101: 6.8% (-33.2% vs TC avg)
§103: 64.4% (+24.4% vs TC avg)
§102: 13.6% (-26.4% vs TC avg)
§112: 11.9% (-28.1% vs TC avg)

Tech Center averages are estimates; figures are based on career data from 2 resolved cases.

Office Action (§103)

DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Drawings

The drawings are objected to because Figures 3, 4, 5, and 6 are blurry, and the text is difficult to read. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

The drawings are objected to as failing to comply with 37 CFR 1.84(p)(5) because they include the following reference character(s) not mentioned in the description: 300 (a flow map of the 3D lidar detection flow). Corrected drawing sheets in compliance with 37 CFR 1.121(d), or amendment to the specification to add the reference character(s) in the description in compliance with 37 CFR 1.121(b), are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification

The disclosure is objected to because of the following informalities: On page 3, line 27, “memory interconnect 130” should read “memory interconnect 180,” based on the drawing shown in Fig. 1. On page 4, lines 22-23, “peripherals interface 230” should read “peripherals interface 280,” based on the drawing shown in Fig. 2. On page 5, lines 8-12 appear to be a run-on sentence. Additionally, the specification fails to mention reference number 300. The examiner suggests rewriting the sentence to read “Fig. 3 depicts an example of the 3D lidar detection flow 300. 3D lidar detection system 310 ….” Appropriate correction is required.

Claim Rejections - 35 U.S.C. § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.

This application currently names joint inventors. In considering patentability of the claims, the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1, 6-7, 19, and 20 are rejected under 35 U.S.C. 103 as being unpatentable over Vora et al. (US Pub. No. 20210146952), hereinafter Vora, in view of Hu et al. (US Pub. No. 20220404500), hereinafter Hu.

As to Claim 1, Vora teaches a method of detecting three-dimensional objects using lidar points (see paragraph [0032], “The disclosed techniques are implemented using a sequential fusion architecture for 3D object detection that accepts LiDAR point clouds”) comprising: receiving a point cloud having a set of irregular lidar points (see paragraph [0092], “In the scenario shown in this figure, the AV 100 receives both camera system output 504 c in the form of an image 702 and LiDAR system output 504 a in the form of LiDAR data points,” and see paragraph [0003], “LiDAR data, however, is sparse,” where sparse means irregular and unstructured); assigning the set of irregular lidar points to either a three-dimensional or two-dimensional grid, thereby resulting in a set of assigned points (see paragraph [0129], “As a first step the point cloud is discretized into an evenly spaced grid in the x-y plane”); determining a pseudo image based on the set of assigned points, thereby resulting in a set of regular pseudo image points (see paragraph [0129], “In an embodiment, to apply a 2D convolutional architecture, the pillar feature network 1502 converts the point cloud to a pseudo-image”); decorating at least one point (see paragraph [0129], “The points in each pillar are then augmented with x_c, y_c, z_c, x_p and y_p where the c subscript denotes distance to the arithmetic mean of all points in the pillar and the p subscript denotes the offset from the pillar x, y center,” where calculating the offset is interpreted as ‘point decoration’; the instant application defines ‘point decoration’ at page 6, line 4, which states, “Points decoration adds an offset to detected points”), and encoding the at least one point as a set of high dimension regular features (see paragraph [0129], “The pseudo-image is input into deep learning backbone 1405. In an embodiment, the backbone is a 2D CNN, as described in reference to FIG. 15. The output of the backbone 1405 are features”); and predicting at least one three-dimensional object utilizing the set of high dimension regular features (see paragraph [0127], “The output of the backbone 1405 are features that input into detection 1406, which estimates (predictions) of oriented 3D bounding boxes,” and see paragraph [0128], “A pillar feature network 1502 included in the point pillar network 1500 is configured to accept decorated point clouds as input and estimate/predict oriented 3D boxes for various classes, including but not limited to cars, pedestrians and cyclists.”).

Vora fails to teach encoding the set of regular pseudo image points via normalizing a reflection channel. However, Hu teaches a Lidar system for object detection in vehicles (see abstract) that teaches encoding points via normalizing a reflection channel (see paragraph [0012], “Accordingly, the method includes determining an intensity normalization multiplier for each channel of the Lidar unit. The intensity normalization multiplier for a channel is determined based on a median intensity value determined based on the raw intensity values included in the collected set of data points that correspond to the channel. The determination of the intensity normalization multipliers assumes a predefined value for reflectivity of the type of surface (e.g., a ground truth reflectivity) to which the data points correspond”). Hu is combinable with Vora as both are from the analogous field of 3D object detection and Lidar.

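For orientation, here is a minimal sketch, assuming NumPy, of the two operations the rejection maps onto claim 1: Vora's pillar-style point decoration ([0129]) and Hu's median-based reflection-channel normalization ([0012]). Function names, the grid size, and the target median are illustrative assumptions, not code from either reference.

```python
# Illustrative sketch only; not the applicant's or the references' code.
import numpy as np

def decorate_points(points, pillar_size=0.16):
    """points: (N, 4) array of (x, y, z, intensity).
    Returns (N, 9) decorated points (x, y, z, i, x_c, y_c, z_c, x_p, y_p)."""
    out = np.zeros((points.shape[0], 9), dtype=points.dtype)
    out[:, :4] = points
    # Discretize into an evenly spaced grid of pillars in the x-y plane.
    pillar_idx = np.floor(points[:, :2] / pillar_size).astype(np.int64)
    for idx in np.unique(pillar_idx, axis=0):
        mask = np.all(pillar_idx == idx, axis=1)
        # c subscript: distance to the arithmetic mean of the pillar's points.
        out[mask, 4:7] = points[mask, :3] - points[mask, :3].mean(axis=0)
        # p subscript: offset from the pillar's x-y center.
        out[mask, 7:9] = points[mask, :2] - (idx + 0.5) * pillar_size
    return out

def normalize_reflection_channel(intensities, target_median=0.5):
    """Hu-style per-channel normalization: scale raw intensities so their
    median matches an assumed ground-truth reflectivity value."""
    multiplier = target_median / max(np.median(intensities), 1e-6)
    return intensities * multiplier
```

The design point the rejection leans on: decoration turns an irregular point set into fixed-width feature rows, which is what lets a regular 2D backbone consume lidar data as a pseudo-image.
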
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the reflection channel normalization taught by Hu with the 3D object detection system taught by Vora. The motivation for doing so would be to increase the accuracy of the values of the reflective channel. Hu teaches in paragraph [0014], “By normalizing raw intensity values in the manner, the method aligns Lidar intensity returns more closely with the reflectivity of reflected surfaces and objects thereby providing downstream processing systems with more accurate and detailed information regarding the reflected surfaces and objects.” Thus, it would have been obvious to combine the reflective channel normalization taught by Hu with the 3D object detection system taught by Vora in order to obtain the invention as claimed in Claim 1.

As to Claim 6, Vora in view of Hu teaches further comprising transforming the set of irregular lidar points from a lidar coordinate frame to a camera coordinate frame (see Vora, paragraph [0114], “The LiDAR points are transformed by fusion module 1305 from a LiDAR ego-vehicle coordinate frame to a camera coordinate frame”).

As to Claim 7, Vora in view of Hu teaches further comprising determining two-dimensional point coordinates on an image plane by projecting three-dimensional coordinates onto a two-dimensional plane (see Vora, paragraph [0114], “The LiDAR points are transformed by fusion module 1305 from a LiDAR ego-vehicle coordinate frame to a camera coordinate frame and a segmentation score vector is obtained for each pixel where a point is projected in the camera coordinate frame,” where the LiDAR points are 3D data, and the camera coordinate frame is 2D).

As to Claim 19, Vora in view of Hu teaches a computing apparatus (see Vora, Fig. 3, computer system 300) comprising: one or more non-transitory computer readable storage media (see Vora, Fig. 3, storage device 310, and see paragraph [0075], “The term ‘storage media’ as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion”); a processing system operatively coupled to the one or more non-transitory computer readable storage media (see Vora, Fig. 3, processor 304, coupled by bus 304 to storage device 310); and program instructions stored on the one or more non-transitory computer readable storage media (see Vora, paragraph [0080], “In an embodiment, the computer system 300 receives code for processing. The received code is executed by the processor 304 as it is received, and/or stored in storage device 310, or other non-volatile storage for later execution.”) that, when executed by the processing system, direct the processing system to perform the same method disclosed in Claim 1. Thus, the rejection and rationale are analogous to that of Claim 1.

As to Claim 20, Vora in view of Hu teaches a non-transitory computer readable storage media (see Vora, Fig. 3, storage device 310, and see paragraph [0075], “The term ‘storage media’ as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion”) comprising program instructions that, when executed by a processing system (see Vora, paragraph [0080], “In an embodiment, the computer system 300 receives code for processing. The received code is executed by the processor 304 as it is received, and/or stored in storage device 310, or other non-volatile storage for later execution.”), direct the processing system to perform the same method disclosed in Claim 1. Thus, the rejection and rationale are analogous to that of Claim 1.

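A hedged sketch of the claims 6-7 operations as the examiner reads them onto Vora [0114]: transform lidar-frame points into the camera frame, then project the 3D coordinates onto the 2D image plane. T_cam_lidar and K are hypothetical calibration inputs, not values from Vora.

```python
# Illustrative sketch only, assuming a standard pinhole camera model.
import numpy as np

def project_lidar_to_image(points_lidar, T_cam_lidar, K):
    """points_lidar: (N, 3); T_cam_lidar: (4, 4) extrinsic transform;
    K: (3, 3) pinhole intrinsics. Returns (M, 2) pixel coordinates."""
    # Lidar frame -> camera frame via a homogeneous transform.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0]   # keep points in front of the camera
    # Perspective projection: apply intrinsics, then divide by depth.
    uv = (K @ pts_cam.T).T
    return uv[:, :2] / uv[:, 2:3]
```
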
Claims 2 and 5 are rejected under 35 U.S.C. 103 as being unpatentable over Vora et al. (US Pub. No. 20210146952), hereinafter Vora, in view of Hu et al. (US Pub. No. 20220404500), hereinafter Hu, and further in view of Ma et al. (US Pub. No. 20130080045), hereinafter Ma.

As to Claim 2, Vora in view of Hu fails to explicitly teach preprocessing the set of irregular lidar points to remove redundant operators. However, Ma teaches a method of extracting features from 3D range data (see abstract) that includes preprocessing which removes unnecessary points (see paragraph [0033], “Pre-processing can also include filtering the set of range data to suppress noise and to remove outliers”). Ma is combinable with Vora and Hu since all three are from the analogous field of 3D point clouds. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the preprocessing taught by Ma with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to correct points in case of broken equipment. Ma teaches in paragraph [0033], “This can also include correction of points in case of a broken or malfunctioning range finder. Pre-processing can also include filtering the set of range data to suppress noise and to remove outliers. FIG. 4B is a graphical example of the set of range data in FIG. 4A after pre-processing. As can be seen, many of the points with no value have been filled in via interpolation and outliers have been removed.” Thus, it would have been obvious to combine the preprocessing taught by Ma with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 2.

As to Claim 5, Vora in view of Hu fails to teach suppressing redundant points in the set of irregular lidar points. However, Ma teaches that points associated with noise can be suppressed (see paragraph [0033], “Pre-processing can also include filtering the set of range data to suppress noise and to remove outliers”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the preprocessing taught by Ma with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to correct points in case of broken equipment, as taught by Ma in paragraph [0033]. Thus, it would have been obvious to combine the preprocessing taught by Ma with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 5.

Claims 3-4, 8-11, 14, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Vora et al. (US Pub. No. 20210146952), hereinafter Vora, in view of Hu et al. (US Pub. No. 20220404500), hereinafter Hu, and further in view of Kim et al. (US Pub. No. 20230071437), hereinafter Kim.

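An illustrative-only sketch of the sort of preprocessing Ma describes in paragraph [0033] (noise suppression, outlier removal) that the rejection maps to claims 2 and 5. The median/standard-deviation rule and the k parameter are assumptions for the example, not Ma's actual filter.

```python
# Toy outlier filter standing in for Ma's range-data preprocessing.
import numpy as np

def suppress_outliers(ranges, k=2.0):
    """ranges: (N,) raw range readings. Drop readings more than k standard
    deviations from the median value."""
    keep = np.abs(ranges - np.median(ranges)) <= k * ranges.std()
    return ranges[keep]
```
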
As to Claim 3, Vora in view of Hu fails to teach postprocessing the at least one three-dimensional object to remove redundant operators. However, Kim teaches post-processing to remove redundant points (see paragraph [0101], “Then, the present disclosure only keeps the center points by two criteria: The center point value is higher than the predefined threshold, and the confidence score filters out the detected center point number in priority order of a redefined object number in the detection range.”). Kim is combinable with Vora and Hu as all three are from the art of 3D object detection. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the filtering taught by Kim with the teachings of Vora and Hu. The motivation for doing so would be to increase the accuracy of the 3D object detection. Kim teaches in paragraphs [0100] and [0101], “For accurately localizing 3D bounding boxes, after extracting fine-grained feature maps, the present disclosure first checks the presence of center keypoints.” Thus, it would have been obvious to combine the filtering taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 3.

As to Claim 4, Vora in view of Hu fails to teach filtering the set of irregular lidar points to a predefined detection range. However, Kim teaches a 3D object detection system (see abstract) that teaches filtering points to a detection range (see paragraph [0101], “Then, the present disclosure only keeps the center points by two criteria: The center point value is higher than the predefined threshold, and the confidence score filters out the detected center point number in priority order of a redefined object number in the detection range”). Kim is combinable with Vora and Hu as all three are from the art of 3D object detection. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the filtering taught by Kim with the teachings of Vora and Hu. The motivation for doing so would be to increase the accuracy of the 3D object detection, as taught by Kim in paragraphs [0100] and [0101]. Thus, it would have been obvious to combine the filtering taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 4.

As to Claim 8, Vora in view of Hu teaches iterating the set of irregular lidar points to the set of regular pseudo image points (see paragraph [0129], “As a first step the point cloud is discretized into an evenly spaced grid in the x-y plane,” and see paragraph [0129], “In an embodiment, to apply a 2D convolutional architecture, the pillar feature network 1502 converts the point cloud to a pseudo-image”). However, Vora in view of Hu does not explicitly teach iterating points within a predefined detection range. Kim teaches that points can be kept within a detection range (see paragraph [0101], “Then, the present disclosure only keeps the center points by two criteria: The center point value is higher than the predefined threshold, and the confidence score filters out the detected center point number in priority order of a redefined object number in the detection range.”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the detection range taught by Kim with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to increase the accuracy of the 3D object detection, as taught by Kim in paragraphs [0100] and [0101]. Thus, it would have been obvious to combine the detection range taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 8.

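A small sketch, under assumed parameter values, of the Kim-style filtering ([0101]) cited against claims 3, 4, and 8: keep detected centers whose score clears a predefined threshold and which fall inside a predefined detection range. max_range and score_thresh are illustrative placeholders.

```python
# Illustrative threshold-and-range filter; not Kim's actual criteria.
import numpy as np

def filter_detections(centers, scores, max_range=50.0, score_thresh=0.3):
    """centers: (N, 2) x-y positions; scores: (N,) confidence values."""
    keep = (np.linalg.norm(centers, axis=1) <= max_range) & (scores >= score_thresh)
    return centers[keep], scores[keep]
```
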
As to Claim 9, Vora in view of Hu teaches point decoration (see paragraph [0115], “In an embodiment, the point pillars encoding for a point is (x, y, z, i, x_c, y_c, z_c, x_p, y_p), where (x_c, y_c, z_c) are the offsets of the point to an arithmetic mean of all points in the pillar”), but fails to teach that the point decoration utilizes the set of assigned points subtracted by a set of grid center coordinates. However, Kim teaches that point decoration can utilize points subtracted by a set of grid center coordinates (see paragraphs [0064] and [0065], “The present disclosure presents this feature map for complementing the distance information and reinforcing the BEV representation. This feature map is useful for the learning task and further helps the model to learn the point clouds distribution by range. The normalized distance feature Di_norm in each cell is calculated by Equation 1. Here, D_O→Pi is the distance between the LiDAR origin (0, 0, 1.73 m) and the current point Pi,” where the lidar origin is the grid origin, and calculating the distance between each point and the origin would involve subtracting the origin from the point). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the distance normalization taught by Kim with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to normalize the points to help the model learn the point cloud distribution, as taught by Kim in paragraph [0064]. Thus, it would have been obvious to combine the distance normalization taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 9.

As to Claim 10, Vora in view of Hu teaches that point decoration utilizes the set of points subtracted by a centroid of the points on the grid (see paragraph [0115], “In an embodiment, the point pillars encoding for a point is (x, y, z, i, x_c, y_c, z_c, x_p, y_p), where (x_c, y_c, z_c) are the offsets of the point to an arithmetic mean of all points in the pillar, i is the intensity and (x_p, y_p) are the offsets of the point to the pillar x and y center,” where the arithmetic mean of all points in the pillar is the centroid, and calculating an offset implies the centroid is subtracted from the points), but fails to teach that the point decoration utilizes the set of assigned points subtracted by a set of grid center coordinates and a centroid of the points on the grid. However, Kim teaches that point decoration can utilize points subtracted by a set of grid center coordinates (see paragraph [0065], “Here, D_O→Pi is the distance between the LiDAR origin (0, 0, 1.73 m) and the current point Pi.”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the distance normalization taught by Kim with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to normalize the points to help the model learn the point cloud distribution, as taught by Kim in paragraph [0064]. Thus, it would have been obvious to combine the distance normalization taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 10.

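A sketch of the distance decoration discussed for claims 9-10. Kim's Equation 1 is not reproduced in the Office Action, so the normalization by a maximum detection range below is an assumption; only the lidar origin (0, 0, 1.73 m) is taken from Kim's paragraph [0065].

```python
# Assumed form of Kim's Di_norm; Equation 1 itself is not in the record.
import numpy as np

LIDAR_ORIGIN = np.array([0.0, 0.0, 1.73])  # per Kim, paragraph [0065]

def normalized_distance_feature(points, max_range=75.0):
    """points: (N, 3). Distance of each point from the lidar origin,
    scaled to [0, 1] (assumed normalization)."""
    return np.linalg.norm(points - LIDAR_ORIGIN, axis=1) / max_range
```
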
As to Claim 11, Vora in view of Hu teaches wherein the point feature encoding utilizes matrix multiplication (see Vora, paragraph [0131], “This is followed by a max operation over the channels to create an output tensor of size (C, P). Note that the linear layer can be formulated as a 1×1 convolution across the tensor resulting in very efficient computation,” where convolution is a matrix multiplication operation). However, Vora in view of Hu fails to teach max pooling. Kim teaches that a max pooling operation may be applied (see paragraph [0106], “The present disclosure handles these tasks as an embedded system-friendly approach. Therefore, the present disclosure can find the object centers by using a lightweight max-pooling operation, way faster than involving the conventional NMS process”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the max pooling operation taught by Kim with the teachings of Vora and Hu. The motivation for doing so would be to reduce the time needed for feature encoding, as taught by Kim in paragraph [0106]. Thus, it would have been obvious to combine the max pooling operation taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 11.

As to Claim 14, Vora in view of Hu teaches that the prediction comprises a three-dimensional object classification (see Vora, paragraph [0083], “The perception module 402 identifies nearby physical objects using one or more sensors 121, e.g., as also shown in FIG. 1. The objects are classified (e.g., grouped into types such as pedestrian, bicycle, automobile, traffic sign, etc.) and a scene description including the classified objects 416 is provided to the planning module”). Vora, however, fails to teach that the prediction comprises an object size and an object bearing angle. Kim teaches that a size and bearing angle can be predicted (see paragraph [0069], “Second, the header network, which has inputs, is composed of the final blocks of the backbone network, and it is designed to learn task-specific predictions. The header network contains five subtasks including the object center point (x,y), the offset information (Δx, Δy), the extending Z coordinate (z), the object size (l,w,h), and the object rotating angle (yaw),” where the rotating angle is the ‘bearing angle’). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the header network taught by Kim with the teachings of Vora and Hu. The motivation for doing so would be to provide more information important for motion planning. Kim teaches in paragraphs [0003] and [0006], “For an autonomous driving of an unmanned vehicle, a driving route needs to be generated upon detecting a moving object in front and estimating its dynamic motion…Understanding dynamic properties of coexisting entities in the environment is crucial for enhancing the overall automation since the knowledge directly impacts the quality of localization, mapping, and motion planning.” Thus, it would have been obvious to combine the teachings of Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 14.

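A minimal sketch, assuming NumPy, of the claim 11 mapping: a linear layer applied as a matrix multiplication (equivalent to Vora's 1×1 convolution, [0131]) followed by a max over each pillar, plus Kim's lightweight max-pooling center test ([0106]) as a stand-in for NMS. Shapes, weights, and the 3×3 neighborhood are illustrative assumptions.

```python
# Illustrative encoding and center-picking; not the references' code.
import numpy as np

def encode_pillars(features, weights):
    """features: (P, N, D) pillars x points x input dims; weights: (D, C).
    The linear layer lifts each point to C channels; the max over the
    points in each pillar yields a (P, C) pillar-feature tensor."""
    lifted = features @ weights   # matrix multiplication, i.e. 1x1 conv
    return lifted.max(axis=1)

def is_center(heatmap, i, j):
    """Kim-style lightweight alternative to NMS: a cell is a center
    keypoint if it equals the max of its 3x3 neighborhood."""
    patch = heatmap[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
    return heatmap[i, j] == patch.max()
```
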
As to Claim 16, Vora in view of Hu teaches that the prediction is generated by concatenated features (see paragraph [0133], “The final output features are a concatenation of all features that originated from different strides,” and see paragraph [0117], “The deep learning backbone computes and outputs features which are input into a detection head. The detection head outputs oriented 3D bounding boxes,” where the bounding boxes are predictions). However, Vora in view of Hu fails to explicitly teach feature maps. Kim teaches that feature maps can be generated (see paragraph [0013], “The BEV image generating module is configured to generate four feature map images based on a height, a density, an intensity, and a distance of the raw 3D point cloud data, by encoding the raw 3D point cloud data.”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the feature maps taught by Kim with the concatenation of features taught by Vora. The motivation for doing so would be to extract more information from the pseudo image (see paragraph [0064], “First, the backbone network is used to reclaim general information from the raw BEV representation in the form of convolutional feature maps, and it is compact and has a high representation capability to learn and exploit robust feature representation,” where the BEV representation is the pseudo image). Thus, it would have been obvious to combine the feature maps taught by Kim with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 16.

Claim 12 is rejected under 35 U.S.C. 103 as being unpatentable over Vora et al. (US Pub. No. 20210146952), hereinafter Vora, in view of Hu et al. (US Pub. No. 20220404500), hereinafter Hu, and further in view of Wang et al. (Wang, J., et al., “CLRCNet: Cascaded Low-Rank Convolutions for Semantic Segmentation in Real Time,” 2019), hereinafter Wang.

As to Claim 12, Vora in view of Hu fails to explicitly teach utilizing two cascade convolutional layers in an inverted bottleneck. However, Wang teaches a cascaded convolutional network that utilizes inverted bottlenecks (see page 287, Section I, “Our network is based on MobileNetV2 [9] which is the state of the art mobile network. The inverted residual with linear bottleneck is used in MobileNetV2 [9] to construct fewer parameters and computational efficient module. We find that cascaded low-rank convolution kernels require fewer parameters in image convolution operations than traditional convolution kernels,” and see page 288, Section III, Part B, “We want to decompose the second convolutional layer in the MobileNetV2 unit into two layers and redesign MobileNetV2 in a more optimized way. As shown in Fig. 1 (b), we replace the second 3 × 3 depth convolution layer with a 3 × 1 depth convolution layer and a 1 × 3 depth convolution,” and see the corresponding network architecture in Fig. 1, ‘Network Architecture’). The examiner has interpreted the ‘linear bottleneck’ as the ‘inverted bottleneck’ claimed in Claim 12. Wang is combinable with Vora and Hu since all three are from the analogous field of image analysis. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the cascaded convolution and inverted bottleneck taught by Wang with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to create a more efficient model, as taught by Wang on page 287, Section I. Thus, it would have been obvious to combine the network architecture taught by Wang with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 12.

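As a rough PyTorch sketch of the structure the rejection cites (Wang, Section III.B): a MobileNetV2-style inverted residual whose 3×3 depthwise convolution is decomposed into cascaded 3×1 and 1×3 depthwise convolutions. Layer widths and the expansion factor are placeholders; this illustrates the cited structure, not Wang's released code.

```python
# Illustrative cascaded low-rank inverted bottleneck; parameters assumed.
import torch.nn as nn

def cascaded_inverted_bottleneck(c_in, c_out, expand=6):
    c_mid = c_in * expand
    return nn.Sequential(
        nn.Conv2d(c_in, c_mid, 1, bias=False),   # expand (pointwise)
        nn.BatchNorm2d(c_mid), nn.ReLU6(inplace=True),
        # Cascaded low-rank depthwise pair replacing one 3x3 depthwise conv.
        nn.Conv2d(c_mid, c_mid, (3, 1), padding=(1, 0), groups=c_mid, bias=False),
        nn.Conv2d(c_mid, c_mid, (1, 3), padding=(0, 1), groups=c_mid, bias=False),
        nn.BatchNorm2d(c_mid), nn.ReLU6(inplace=True),
        nn.Conv2d(c_mid, c_out, 1, bias=False),  # linear (projection) bottleneck
        nn.BatchNorm2d(c_out),
    )
```

The point of the decomposition, per Wang, is parameter efficiency: two rank-1 depthwise kernels approximate a full 3×3 depthwise kernel with fewer weights.
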
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Vora et al. (US Pub. No. 20210146952), hereinafter Vora, in view of Hu et al. (US Pub. No. 20220404500), hereinafter Hu, and further in view of Zhang et al. (US Pub. No. 20230206603), hereinafter Zhang.

As to Claim 13, Vora in view of Hu fails to explicitly teach that point feature encoding utilizes a spatial attention branch. However, Zhang teaches a method for correcting point clouds (see abstract) which includes a ‘spatial attention mechanism’ that can be used to determine relationships among points in a point cloud (see paragraph [0046], “The present disclosure adds a spatial attention mechanism to a feature fusion module, so that the decoder better learns the relationship of among various features and improves the precision of point cloud completion”). Zhang is combinable with Vora and Hu as all three are from the analogous field of 3D imaging. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the spatial attention mechanism taught by Zhang with the teachings of Vora and Hu. The motivation for doing so would be to increase precision, as taught by Zhang in paragraph [0046]. Thus, it would have been obvious to combine the spatial attention mechanism taught by Zhang with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 13.

Claims 15 and 17-18 are rejected under 35 U.S.C. 103 as being unpatentable over Vora et al. (US Pub. No. 20210146952), hereinafter Vora, in view of Hu et al. (US Pub. No. 20220404500), hereinafter Hu, and further in view of Sriram et al. (US Pub. No. 20220044114), hereinafter Sriram.

As to Claim 15, Vora in view of Hu fails to explicitly teach that the prediction is based on an integer model. However, Sriram teaches a method of quantizing a neural network (see abstract) which can create an integer model from a floating model (see paragraph [0057], “Techniques described herein provide a way to transform (e.g., quantize) a model to have weights represented by lower-precision values (e.g., low-bit integers) instead of using higher-precision values (e.g., values with full floating point-precision) to conserve memory usage and reduces computation when the trained model is deployed”). Sriram further teaches that the integer model can be used for object detection (see paragraph [0017], “The second trained model may be deployed to one or more computing devices to perform inferencing (e.g., object detection, image classification) over a network (such as one or more communication networks described in FIG. 1)”). Sriram is combinable with Vora and Hu because all three are from the analogous field of neural networks for image analysis. Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the quantization method taught by Sriram with the teachings of Vora and Hu. The motivation for doing so would be to reduce memory usage, as taught by Sriram in paragraph [0057]. Thus, it would have been obvious to combine the quantization method taught by Sriram with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 15.

As to Claim 17, Vora in view of Hu fails to explicitly teach that prediction training is based on a floating model. However, Sriram teaches that training may be based on a floating model (see paragraph [0060], “Typically, DNN training and inference have relied on the Institute of Electrical and Electronics Engineers (IEEE) single-precision floating-point format, using 32 bits to represent the floating-point model weights and activation tensors. This compute budget may be acceptable, at training, as most DNNs are trained in data centers or in the cloud with GPUs that have significantly large compute capability and much larger power budgets. However, during deployment, these models are most often required to run on edge devices with much smaller computing resources and lower power budgets.”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the quantization method taught by Sriram with the teachings of Vora and Hu. The motivation for doing so would be to reduce memory consumption, as taught by Sriram in paragraph [0057].

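A hedged sketch of the float-to-integer step at the level of generality Sriram describes ([0057]): train with floating-point weights, deploy with low-bit integers. The per-tensor affine scheme and int8 target below are generic assumptions, not Sriram's disclosed method.

```python
# Illustrative post-training quantization; scheme assumed for the example.
import numpy as np

def quantize_int8(weights):
    """Affine-quantize float32 weights to int8 with a per-tensor scale."""
    scale = np.abs(weights).max() / 127.0
    scale = scale if scale > 0 else 1.0   # guard against all-zero tensors
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    # Dequantize at inference with q.astype(np.float32) * scale.
    return q, scale
```
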
As to Claim 18, Vora in view of Hu fails to teach wherein the method is performed on a system on a chip. However, Sriram teaches an embodiment where computing can be done on a system on chip (see paragraph [0352], “In at least embodiment, components of computing system 1900 may be integrated with one or more other system elements on a single integrated circuit. For example, in at least one embodiment, parallel processor(s) 1912, memory hub 1905, processor(s) 1902, and I/O hub 1907 can be integrated into a system on chip (SoC) integrated circuit”). Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the system on chip taught by Sriram with the 3D object detection method taught by Vora and Hu. The motivation for doing so would be to optimize processing. Sriram teaches in paragraph [0352], “In at least one embodiment, parallel processor(s) 1912 incorporate circuitry optimized for general purpose processing.” Thus, it would have been obvious to one of ordinary skill to combine the system on chip taught by Sriram with the teachings of Vora and Hu in order to obtain the invention as claimed in Claim 18.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure. Sandler et al. (Sandler, M., et al., “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” 2019) introduces the concept of an ‘inverted bottleneck.’ See page 1, paragraph 3 under Section I, “Our main contribution is a novel layer module: the inverted residual with linear bottleneck. This module takes as an input a low-dimensional compressed representation which is first expanded to high dimension and filtered with a lightweight depth wise convolution. Features are subsequently projected back to a low-dimensional representation with a linear convolution.” This structure is the opposite of a typical bottleneck, which first receives a feature with a high-dimensional representation, then compresses the feature to a low-dimensional representation, performs convolution, and finally expands the feature back to a higher dimension. This art was referenced by Wang et al., which was cited in the rejection of Claim 12. Wang et al. uses the same ‘linear bottleneck’ disclosed by Sandler et al., but modifies it to include cascading convolutional kernels.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SOUMYA THOMAS, whose telephone number is (571) 272-8639.

The examiner can normally be reached M-F 8:30-5:00. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Mehmood, can be reached at (571) 272-2976. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/S.T./
Examiner, Art Unit 2664

/JENNIFER MEHMOOD/
Supervisory Patent Examiner, Art Unit 2664

Prosecution Timeline

Apr 27, 2023: Application Filed
Feb 20, 2026: Non-Final Rejection, §103 (current)


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 100%
With Interview: 99% (+0.0%)
Median Time to Grant: 2y 9m
PTA Risk: Low

Based on 2 resolved cases by this examiner; grant probability derived from career allow rate.
