Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This is a non-final Office Action on the merits in response to communications filed by Applicant on December 04, 2024. Claims 1-20 are currently pending and examined below.
Priority
Receipt is acknowledged of papers submitted under 35 U.S.C. 119(a)-(d), which papers have been placed of record in the file.
Information Disclosure Statement
The information disclosure statement(s) (IDS) submitted is/are being considered by the examiner.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 9-14, and 19-20 are rejected under 35 U.S.C. 103 as being unpatentable over Liong et al., US 2022/0122363 (“Liong”), in view of Wu et al., US 2020/0293064 (“Wu”).
Regarding claims 1 and 11: Liong discloses an apparatus for controlling autonomous driving of a vehicle, the apparatus comprising: a
first sensor configured to obtain an image (para. 54, monocular or stereo video cameras 122 in the visible light, infrared or thermal (or both) spectra); a second sensor configured to obtain a cluster of points; a memory storing a plurality of neural network models ([0093] FIG. 9 shows a flowchart of a process 900 of classifying LiDAR points. As noted above with respect to FIG. 6, a vehicle 100 detects physical objects based on characteristics of data points 704 in the form of a point cloud detected by a LiDAR system 602. In some embodiments, the data points are processed by one or more neural networks to identify objects represented by the data points. For example, a point cloud is processed by neural networks to generate semantic labels for clusters of points included in the point cloud.); and
a processor configured to: obtain, based on inputting the image to a first neural network model among the plurality of neural network models and based on the cluster of points ([0109] FIG. 11 illustrates a view network 1100, which is representative of both the RV network 1004 and the BeV network 1006. The view network 1100 takes as the input view data 1120, which is representative of RV data 1020 or BeV data 1030, and output a set of class scores 1110. In an embodiment, the view data 1120 is passed through consecutive convolutional layers 1102. A convolutional layer is a layer in a neural network that performs convolution on the input to the layer. Convolution is an operation where a convolutional kernel, e.g., a 5×5 matrix, is convolved with the input tensor to produce a new tensor.),
a first value, wherein the first value indicates a first score for a type of a point associated with the second sensor, and wherein the point corresponds to at least one pixel included in the image ([0097] The different sets of class scores of a point in the point cloud data are obtained and compared. Based on the result of the comparison, the point is then determined 910 as either an uncertain point 950 or a classified point 960. Details regarding the comparison process are found below in accordance with FIG. 10.);
obtain, based on inputting the cluster of points to a second neural network model among the plurality of neural network models, a second value, wherein the second value indicates a second score for the type of the point ([0103] The RV data 1020 is provided as an input to a Range View network 1004, or an RV network and BeV data 1030 to a Bird-eye View network 1006, or a BeV network. In an embodiment, both the RV network 1004 and the BeV network 1006 are view networks. Detailed architecture of a view network representative of The RV network 1004 and BeV network 1006 is illustrated in FIG. 11. The RV network 1004 calculates, for each point in the point cloud, a first set of class scores 1040. Likewise, the BeV network 1006 calculates, for each point in the point cloud, a second set of class scores 1050. FIG. 9 shows a flowchart of a process 900 of classifying LiDAR points. As noted above with respect to FIG. 6, a vehicle 100 detects physical objects based on characteristics of data points 704 in the form of a point cloud detected by a LiDAR system 602. In some embodiments, the data points are processed by one or more neural networks to identify objects represented by the data points. For example, a point cloud is processed by neural networks to generate semantic labels for clusters of points included in the point cloud. The semantic labels are used to differentiate objects in the point cloud. Further, in some examples, multiple views of a point cloud are processed in parallel and fused to generate a finalized set of labels), and
control, based on the signal, autonomous driving of the vehicle ([0060] Computer processors 146 located on the vehicle 100 algorithmically generate control actions based on both real-time sensor data and prior information, allowing the AV system 120 to execute its autonomous driving capabilities.).
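For illustration of the two-branch, per-point scoring mechanism mapped above, a minimal Python sketch follows; the feature sizes, layer widths, and class count are the editor's assumptions and do not reflect the actual architecture disclosed by Liong.

    # Minimal sketch, assuming small per-point feature vectors and an
    # arbitrary four-class taxonomy; not the architecture disclosed by Liong.
    import torch
    import torch.nn as nn

    NUM_CLASSES = 4  # hypothetical: e.g., car, pedestrian, cyclist, background

    image_branch = nn.Sequential(   # stands in for the "first neural network model"
        nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, NUM_CLASSES))
    points_branch = nn.Sequential(  # stands in for the "second neural network model"
        nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, NUM_CLASSES))

    image_feat = torch.randn(1, 64)   # features for pixels covering the point
    point_feat = torch.randn(1, 16)   # features for the point's cluster

    first_value = image_branch(image_feat)    # first score for the point's type
    second_value = points_branch(point_feat)  # second score for the same point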
Liong does not explicitly disclose: wherein the point is included in the cluster of points; output, based on obtaining a similarity value among a plurality of points included in the cluster of points using the first value and the second value, at least one of: the first value, the second value, or a third value obtained by the image and the cluster of points; and generate a signal associated with the similarity value among the plurality of points.
Wu teaches another autonomous vehicle system and method that obtains a similarity value among a plurality of points included in the cluster of points using the first value and the second value; outputs at least one of: the first value, the second value, or a third value obtained by the image and the cluster of points; and generates a signal associated with the similarity value among the plurality of points (para. 29-30, para. 82-83, para. 92-93: the training sensor data 704 corresponding to the objects for which there are inconsistencies may be ignored when generating the ground truth, such that only consistent associations between the images and LIDAR data (e.g., LIDAR point projections into image space) and/or the images and RADAR data are used for ground truth generation with respect to any one object. As such, if the values associated with an object in image space determined from LIDAR data are not within a threshold similarity to the values associated with the object in image space determined from RADAR data, at least one of the LIDAR data or the RADAR data may be ignored.).
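For illustration of the threshold-similarity check quoted from Wu, a minimal sketch follows; the function name, threshold value, and array layout are assumed for this example, not taken from the reference.

    # Sketch of the consistency test Wu describes: per-object values derived
    # from LiDAR are compared against values derived from RADAR, and the
    # object is ignored for ground-truth generation if they disagree beyond
    # a threshold.
    import numpy as np

    def consistent(lidar_vals: np.ndarray, radar_vals: np.ndarray,
                   threshold: float = 0.5) -> bool:
        return bool(np.all(np.abs(lidar_vals - radar_vals) <= threshold))

    lidar = np.array([1.2, 0.8])
    radar = np.array([1.3, 0.7])
    if not consistent(lidar, radar):
        pass  # drop this object's training sensor data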
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the system and method of Liong by incorporating the applied teaching of Wu in order to improve the accuracy of object prediction in autonomous vehicle control, and one of ordinary skill in the art would have recognized that the results of the combination would have been predictable.
Regarding claims 2 and 12: Liong in view of Wu further teaches based on inputting the first value and the second value to a third neural network model among the plurality of neural network models, the third value (Liong: [0131] The processor generates 1312, using a fusion neural network, a third set of class scores based on the at least one of the first set of class scores of the at least one uncertain point and the second set of class scores of the at least one uncertain point, wherein the third set of class scores is based on characteristics of neighboring points of the at least one uncertain point. In an embodiment, the third set of class scores of the at least one uncertain point is the new set of class scores 1220 shown in FIG. 12. In an implementation, at least one class score in the third set of class scores of the at least one uncertain point corresponds to a pre-defined class of object.).
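As an illustration of the fusion step quoted above, a minimal sketch under assumed sizes and layer widths; this is not the fusion network disclosed in Liong's FIG. 12.

    # Sketch in the spirit of Liong's [0131]: the first and second sets of
    # class scores for an uncertain point are concatenated and mapped to a
    # third set of class scores.
    import torch
    import torch.nn as nn

    NUM_CLASSES = 4  # assumed
    fusion_net = nn.Sequential(
        nn.Linear(2 * NUM_CLASSES, 16), nn.ReLU(), nn.Linear(16, NUM_CLASSES))

    first_scores = torch.randn(1, NUM_CLASSES)   # e.g., from an RV-style network
    second_scores = torch.randn(1, NUM_CLASSES)  # e.g., from a BeV-style network
    third_scores = fusion_net(torch.cat([first_scores, second_scores], dim=1))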
Regarding claims 3 and 13: Liong in view of Wu further teaches obtain, based on projecting the cluster of points onto a two-dimensional (2D) coordinate system to compare the at least one pixel with at least one point included in the cluster of points, the first value (Wu: [0029] Deep neural network(s) (DNN(s)) of the present disclosure may be trained using image data correlated with sensor data (e.g., LIDAR data, RADAR data, etc.), such as by using cross-sensor fusion techniques. The DNN(s) may include a recurrent neural network (RNN), which may have one or more long short term memory (LSTM) layers and/or gated recurrent unit (GRU) layers. As a result, the DNN(s) may be trained—using a combination of image data and sensor data—to predict temporal information such as time-to-collision (TTC), two-dimensional (2D) motion, and/or three-dimensional (3D) motion corresponding to objects in an environment. In contrast to conventional systems, and as a result of the training methods described herein, the DNN(s) may be configured to generate accurate predictions of temporal information in deployment using only image data (e.g., an RNN in deployment may learn to generate temporal information with the accuracy of supplemental sensor data without requiring the supplemental sensor data).).
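For illustration of projecting a cluster of points onto a 2D coordinate system so that points can be compared with pixels, a minimal pinhole-camera sketch; the intrinsic values are the editor's assumptions, not taken from either reference.

    # Sketch: project 3D LiDAR points into pixel coordinates with an assumed
    # intrinsic matrix, so each point can be associated with image pixels.
    import numpy as np

    K = np.array([[700.0,   0.0, 320.0],   # fx,  0, cx (assumed)
                  [  0.0, 700.0, 240.0],   #  0, fy, cy (assumed)
                  [  0.0,   0.0,   1.0]])

    points_3d = np.array([[ 2.0, 1.0, 10.0],   # (x, y, z) in the camera frame
                          [-1.0, 0.5,  8.0]])

    uvw = (K @ points_3d.T).T           # homogeneous pixel coordinates
    pixels = uvw[:, :2] / uvw[:, 2:3]   # (u, v) for each point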
Regarding claims 4 and 14: Liong in view of Wu further teaches obtain, based on inputting the first value and the second value to a first algorithm, the similarity value (Liong: para. 121, para. 129-131: A multi-layer perceptron 1212 is a neural network where every node of the neural network is a perceptron. A perceptron is an algorithm for learning a binary classifier. In an embodiment, the multi-layer perceptron 1212 is replaced by a convolutional neural network. In an embodiment, the multi-layer perceptron 1212 is replaced by a transformer. A transformer is a type of neural network that transforms an input sequence to an output sequence.).
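For illustration of a "first algorithm" mapping the first and second values to a similarity value, a minimal sketch; cosine similarity is used here only as a simple stand-in, since the cited passage describes perceptron-based networks rather than this formula.

    # Sketch: one simple way to reduce two per-point score vectors to a
    # single similarity value. Cosine similarity is a stand-in, not the
    # multi-layer-perceptron approach quoted from Liong.
    import numpy as np

    def similarity(first_value: np.ndarray, second_value: np.ndarray) -> float:
        den = float(np.linalg.norm(first_value) * np.linalg.norm(second_value))
        return float(first_value @ second_value) / den if den else 0.0

    sim = similarity(np.array([0.9, 0.1]), np.array([0.8, 0.2]))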
Regarding claims 9 and 19: Liong in view of Wu further teaches a training dataset for training the plurality of neural network models; or a validation dataset for validating the plurality of neural network models (Liong: para. 129-131, [0130] The processor determines 1310 at least one uncertain point in the point cloud, wherein the determining is based on the first set of class scores of the at least one uncertain point and the second set of class scores of the at least one uncertain point. In an implementation, the first and second set of class scores are compared in the score comparer 1008 shown in FIG. 10. In an implementation, the at least one uncertain point is the uncertain point 950 shown in FIG. 9. In an implementation, the uncertain point is determined with respect to a threshold difference in class scores, wherein the threshold difference is determined based on at least one of a probability function or a filtering function or both, as described with respect to FIG. 10. In an implementation, a point determined as not an uncertain point is a classified point 960.).
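For illustration of the threshold-based uncertain-point determination quoted above, a minimal sketch; the threshold value and score layout are assumptions.

    # Sketch of Liong's [0130]: a point is treated as uncertain when its two
    # sets of class scores disagree, either on the winning class or by more
    # than a threshold difference in confidence.
    import numpy as np

    def is_uncertain(scores_a: np.ndarray, scores_b: np.ndarray,
                     threshold: float = 0.3) -> bool:
        if int(scores_a.argmax()) != int(scores_b.argmax()):
            return True
        return abs(float(scores_a.max()) - float(scores_b.max())) > threshold

    flag = is_uncertain(np.array([0.7, 0.3]), np.array([0.4, 0.6]))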
Regarding claims 10 and 20: Liong in view of Wu further teaches train, based on the training dataset, the first neural network model and the second neural network model; perform validation, based on the validation dataset, for the trained first neural network model and the trained second neural network model; and train, based on the validation, at least one of: the first neural network model among the plurality of neural network models, the second neural network model among the plurality of neural network models, or a third neural network model among the plurality of neural network models (Wu: [0040] The sequential DNN(s) 104 may include a recurrent neural network (RNN), a gated recurrent unit (GRU) DNN, a long short term memory (LSTM) DNN, and/or another type of sequential DNN 104 (e.g., a DNN that uses sequential data, such as sequences of images, as inputs). Although examples are described herein with respect to using sequential DNNs, and specifically RNNs, LSTMs, and/or GRUs, as the sequential DNN(s) 104, this is not intended to be limiting. For example, and without limitation, the sequential DNN(s) 104 may more broadly include any type of machine learning model, such as a machine learning model(s) using linear regression, logistic regression, decision trees, support vector machines (SVM), Naïve Bayes, k-nearest neighbor (Knn), K means clustering, random forest, dimensionality reduction algorithms, gradient boosting algorithms, neural networks (e.g., auto-encoders, convolutional, recurrent, perceptrons, LSTM, GRU, Hopfield, Boltzmann, deep belief, deconvolutional, generative adversarial, liquid state machine, etc.), and/or other types of machine learning models.).
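For illustration of the claimed train/validate/retrain flow only, a minimal sketch; the model, optimizer, loss function, toy data, and retraining criterion are placeholders, not the methods of Liong or Wu.

    # Sketch: train on a training dataset, measure loss on a validation
    # dataset, and continue training based on the validation result.
    import torch
    import torch.nn as nn

    def train_epoch(model, loader, opt, loss_fn):
        model.train()
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    @torch.no_grad()
    def validate(model, loader, loss_fn):
        model.eval()
        return sum(loss_fn(model(x), y).item() for x, y in loader)

    model = nn.Linear(8, 4)                             # placeholder model
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    data = [(torch.randn(2, 8), torch.tensor([0, 1]))]  # stand-in loaders
    train_epoch(model, data, opt, loss_fn)              # train on training data
    if validate(model, data, loss_fn) > 1.0:            # assumed criterion
        train_epoch(model, data, opt, loss_fn)          # retrain per validation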
Allowable Subject Matter
Claims 5-8 and 15-18 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
Inquiry
Any inquiry concerning this communication or earlier communications from the examiner should be directed to TRUC M DO, whose telephone number is (571) 270-5962. The examiner can normally be reached 9AM-6PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ramón Mercado, Ph.D. can be reached on (571) 270-5744. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/TRUC M DO/Primary Examiner, Art Unit 3658