DETAILED ACTION
Status of the Claims
Original claims 1-20 are pending.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on February 1, 2024, has been considered by the examiner.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 8-10 and 19-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 8 depends from claim 1. Claim 1 recites a system comprising a vehicle, which comprises a controller. The controller is configured to send video to an edge server, but the edge server itself is not part of the controller, vehicle, or system of claim 1.
Claim 8 recites “The system according to claim 1, wherein the edge server is configured to:” followed by actions performed by the edge server. It is unclear to what extent the features recited in claim 8 limit the scope of the system of claim 1. This ambiguity makes the scope of claim 8 unclear and renders the claim indefinite.
On the one hand, by reciting additional limitations, claim 8 is clearly attempting to further limit the scope of claim 1. Claim 8 also recites the edge server transmitting a retrained specialized model to the vehicle, which may suggest that the vehicle is configured to receive the retrained specialized model. These factors suggest that claim 8 may limit the scope of claim 1.
On the other hand, the retraining and transmitting functions of claim 8 are performed by the edge server, not by any of the components of the system of claim 1, and it is unclear how the structure of any of the components of the system of claim 1 would be affected by the functions performed by the edge server. For example, claim 1 merely requires inputting obtained video data to a specialized model. This inputting to (and receipt of output from) the model is performed in the same manner regardless of how that specialized model was trained or retrained, or where that specialized model was received from. These factors suggest that claim 8 may not limit the scope of claim 1.
This ambiguity regarding whether (and to what extent) the claim elements recited in claim 8 limit the scope of claim 1 makes the scope of claim 8 unclear and renders the claim indefinite.
Claims 9 and 10 similarly recite actions performed by the edge server and are also indefinite for substantially the same reasons as claim 8. Claims 19 and 20 are also indefinite for substantially the same reasons; i.e., they recite actions that the edge server is configured to perform, but it is unclear whether these actions are required to be performed as part of the method.
Applicant may wish to consider amending claim 1 to include the edge server in the system and amending claims 19 and 20 to recite the functions of the edge server as further steps in the method.
Claim 8 recites the limitation "the specialized model received from the vehicle" in the first line on page 3. There is insufficient antecedent basis for this limitation in the claim. Claim 1 recites a specialized model at a vehicle, but does not recite sending the specialized model to the edge server. Instead, claim 1 only recites sending obtained video data to the edge server. Therefore, it is unclear what is meant by the limitation noted above.
Claim 19 is also indefinite for substantially the same reason as claim 8. Claims 9 and 20 are also indefinite because they depend from claims 8 and 19, respectively.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-3, 6-9, 12-14, and 17-20 are rejected under 35 U.S.C. 103 as being unpatentable over ‘Kim’ (“Design and Implementation of the Vehicular Camera System using Deep Neural Network Compression,” 2017) in view of ‘Tan’ (“Deep Learning on Mobile Devices With Neural Processing Units,” 2023).
Regarding claim 1, Kim teaches a system for obtaining video analytic output (see mapping below) comprising:
a vehicle comprising a controller (e.g., Figure 1, Section 3.1, Sec. 4.1) configured to:
obtain video data (e.g., Sec. 4.1, video is obtained from vehicle cameras; Fig. 5 shows examples);
obtain output of a specialized model by inputting the obtained video data to the specialized model (e.g., Secs. 3.2-3.3 describe how the Faster R-CNN model is specialized through pruning and quantization model compression; e.g., Sec. 4.2 discusses analysis of outputs obtained by the specialized model); and
operate the vehicle using an output of the specialized model (e.g., Secs. 2.1 and 4.1, detection is implemented as part of ADAS, which operates vehicle using an output of the specialized object detection model).
Kim does not explicitly teach the controller further:
determining a confidence of the specialized model by inputting the obtained video data to the specialized model;
determining whether the confidence is greater than a predetermined value;
sending the obtained video data to an edge server in response to determining that the confidence is less than or equal to the predetermined value; and
operating the vehicle using the output of the specialized model in response to determining that the confidence is greater than the predetermined value.
Kim recognizes that deep learning models have very high performance, but are difficult to implement in embedded environments such as vehicles because their size and amount of computation are too large (e.g., Abstract). Kim proposes addressing this problem by applying model compression, where a model on a server is modified so that it is smaller and requires fewer computations (e.g., Abstract, 2nd par.; Secs. 3.2-3.3) and sent to a vehicle for local use (e.g., Fig. 2). While model compression is successful in reducing size and computation requirements (e.g., Secs. 4.2.1-4.2.2), Kim acknowledges that it also causes a reduction in performance (e.g., Sec. 4.2.5 and Fig. 9). Reduction in model performance is disadvantageous in general, but especially so in a safety system such as an ADAS.
Tan teaches a different approach for addressing this problem. Instead of analyzing video data only with a less-accurate local model, Tan also adaptively offloads some computation to a server (e.g., Page 54, COMPUTATION OFFLOADING). “Since the server has more computation capacity, more advanced deep learning models with high accuracy can be executed quickly” (Page 54, COMPUTATION OFFLOADING, 2nd paragraph), so offloading can advantageously increase accuracy. Tan’s offloading technique includes:
determining a confidence of a specialized model by inputting obtained video data to the specialized model (e.g., Page 54, COMPUTATION OFFLOADING, 3rd par., “In our offloading framework, video frames are first processed on an NPU”; Note that the NPU model is a model that has been compressed/specialized to run on the NPU);
determining whether the confidence is greater than a predetermined value (e.g., Page 54, COMPUTATION OFFLOADING, 3rd par.);
sending the obtained video data to an edge server in response to determining that the confidence is less than or equal to the predetermined value (e.g., Page 54, COMPUTATION OFFLOADING, 3rd par., “otherwise, the data should be offloaded for further processing to improve accuracy”); and
using the output of the specialized model in response to determining that the confidence is greater than the predetermined value (e.g., Page 54, COMPUTATION OFFLOADING, 3rd par., “If the confidence score is higher than a threshold, the classification on an NPU is most likely correct and can be directly used”).
Tan demonstrates that offloading can produce higher accuracy than local computation with a specialized model alone (e.g., Fig. 9; Note that “CBO” refers to the offloading).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the system of Kim with the offloading of Tan in order to improve the system with the reasonable expectation that this would result in a system that addressed the issues of model size and computation raised by Kim, but did so in a manner that provided advantageously higher model performance. This technique for improving the system of Kim was within the ordinary ability of one of ordinary skill in the art based on the teachings of Tan.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim and Tan to obtain the invention as specified in claim 1.
Regarding claim 2, Kim in view of Tan teaches the system according to claim 1, and Kim further teaches that the specialized model is included in the vehicle (e.g., Sec. 3.1, Fig. 2).
Regarding claim 3, Kim in view of Tan teaches the system according to claim 1, and Kim further teaches that a general model in the edge server is compressed to the specialized model (e.g., Sec. 3.1 and Fig. 2; Secs. 3.2-3.3, general Faster R-CNN in the edge server is compressed through pruning and quantization).
Regarding claim 6, Kim in view of Tan teaches the system according to claim 3, and Kim further teaches that parameters of the general model are compressed to obtain parameters of the specialized model (e.g., Sec. 3.3, parameters are compressed through conversion from 32-bit real numbers to 8-bit integers).
Regarding claim 7, Kim in view of Tan teaches the system according to claim 3, and Kim further teaches that the general model comprises a machine learning model for processing frames in the video data (e.g., Sec. 2.3, Faster R-CNN).
Regarding claim 8, Kim in view of Tan teaches the system according to claim 1, and Kim further teaches that the edge server is configured to:
retrain the specialized model received from the vehicle (e.g., Sec. 3.1, Fig. 2, update phase at server); and
transmit the retrained specialized model to the vehicle (e.g., Sec. 3.1, Fig. 2, deployment of updated model to vehicle).
Regarding claim 9, Kim in view of Tan teaches the system according to claim 8, and Kim further teaches that the edge server retrains the specialized model with a general model in the edge server (e.g., Sec. 3.1, Fig. 2, general model is learned/updated, then compressed to obtain specialized model).
Regarding claim 12, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 1. Kim in view of Tan teaches the system of claim 1 (see above). Accordingly, claim 12 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 1.
Regarding claim 13, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 2. Kim in view of Tan teaches the system of claim 2 (see above). Accordingly, claim 13 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 2.
Regarding claim 14, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 3. Kim in view of Tan teaches the system of claim 3 (see above). Accordingly, claim 14 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 3.
Regarding claim 17, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 6. Kim in view of Tan teaches the system of claim 6 (see above). Accordingly, claim 17 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 6.
Regarding claim 18, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 7. Kim in view of Tan teaches the system of claim 7 (see above). Accordingly, claim 18 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 7.
Regarding claim 19, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 8. Kim in view of Tan teaches the system of claim 8 (see above). Accordingly, claim 19 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 8.
Regarding claim 20, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 9. Kim in view of Tan teaches the system of claim 9 (see above). Accordingly, claim 20 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan for substantially the same reasons as claim 9.
Claims 4 and 15 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan as applied above, and further in view of ‘Ren’ (“Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” 2016).
Regarding claim 4, Kim in view of Tan teaches the system according to claim 3.
Kim’s specialized model recognizes two classes: vehicle sides and rears (Sec. 4.1). Kim teaches using Faster R-CNN as a general model (e.g., Sec. 2.3), but does not explicitly teach a number of classes recognized by Faster R-CNN.
However, Ren does teach details of Faster R-CNN, including that it can recognize 20 (Sec. 4.1, 1st par.) or 80 (Sec. 4.2, 1st par.) different object classes. Accordingly, the number of objects or classes recognized by the specialized model of Kim (two) is less than the number of objects or classes recognized by the Faster R-CNN general model (20 or 80), as required by the claimed invention.
Regarding claim 15, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 4. Kim in view of Tan and Ren teaches the system of claim 4 (see above). Accordingly, claim 15 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan and Ren for substantially the same reasons as claim 4.
Claims 5 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan as applied above, and further in view of ‘Choi’ (US 2018/0268292 A1).
Regarding claim 5, Kim in view of Tan teaches the system according to claim 3.
Kim uses a Faster R-CNN object detector as a general model (e.g., Sec. 2.3) and applies model compression including pruning and quantization to obtain a smaller and faster specialized model (Secs. 3.2-3.3). Kim does not explicitly teach that its compression results in a number of hidden layers of the specialized model being less than a number of hidden layers of the general model. Tan also does not explicitly teach this feature.
However, Choi does teach an additional technique for compressing a Faster R-CNN object detection model based on knowledge distillation (e.g., Figs. 1 and 4; [0036]). A larger and slower teacher/general model is used to train a smaller and faster compressed/specialized/student model (e.g., [0028]-[0029]). The compressed/specialized/student model has fewer hidden layers than the general/teacher model ([0028], “the teacher model 110 may include a larger number of hidden layers … compared to the student model 120”).
Choi teaches that its techniques “solve the problem of achieving object detection at an accuracy comparable to complex deep learning models, while maintaining speeds similar to a simpler deep learning model” ([0070]). Choi also teaches that “Distillation tends to solve the problem of generalization, in other words, the over-fitting problem” ([0060]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the system of Kim in view of Tan with the knowledge distillation of Choi in order to improve the system with the reasonable expectation that this would result in a system that could solve the problem of achieving object detection at an accuracy comparable to complex deep learning models, while maintaining speeds similar to a simpler deep learning model, and/or that could solve the problem of generalization. This technique for improving the system of Kim in view of Tan was within the ordinary ability of one of ordinary skill in the art based on the teachings of Choi.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim, Tan, and Choi to obtain the invention as specified in claim 5.
Regarding claim 16, Examiner notes that the claim recites a method that is substantially the same as the method performed by the system of claim 5. Kim in view of Tan and Choi teaches the system of claim 5 (see above). Accordingly, claim 16 is also rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan and Choi for substantially the same reasons as claim 5.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan as applied above, and further in view of ‘Suri’ (US 2024/0046456 A1).
Regarding claim 10, Kim in view of Tan teaches the system of claim 1.
Kim and Tan teach servers (see mappings in the rejections above), but do not explicitly address server availability or otherwise teach the limitations of claim 10.
However, Suri does teach a server system for analyzing received data including a queue manager that:
determines whether an edge server is available for processing received data (e.g., [0039], “if a model server, for example the tooth identification module, is not responding”);
stores the received data in a frame buffer in response to determining that the edge server is not available (e.g., [0039], “the messages stay in the queue and are read later when the model server becomes available again”); and
inputs the video data to a general model in the edge server to obtain analytic output in response to determining that the edge server is available (e.g., [0039], messages are read once server is available).
Suri teaches that its queue manager has several advantages, such as allowing asynchronous processing, which is particularly useful for time-consuming image processing tasks, providing load balancing, fault tolerance, and scalability ([0039]).
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the system of Kim in view of Tan with the queue manager of Suri in order to improve the system with the reasonable expectation that this would result in a system that enjoyed at least one of the advantages identified by Suri. This technique for improving the system of Kim in view of Tan was within the ordinary ability of one of ordinary skill in the art based on the teachings of Suri.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim, Tan, and Suri to obtain the invention as specified in claim 10.
Claim 11 is rejected under 35 U.S.C. 103 as being unpatentable over Kim in view of Tan as applied above, and further in view of ‘Burger’ (US 2020/0265301 A1).
Regarding claim 11, Kim in view of Tan teaches the system according to claim 1.
Kim teaches periodically transmitting obtained video data to a server for use in re-training (Sec. 3.1, Fig. 2). Tan teaches transmitting video data to a server for additional processing in response to determining that the confidence is less than or equal to the predetermined value (see rejection of claim 1).
Neither Kim nor Tan teaches sending the specialized model in response to determining that the confidence is less than or equal to the predetermined value.
However, Burger does teach performing incremental training of a model in response to determining that the model’s confidence is less than or equal to a predetermined threshold value (e.g., Fig. 6, steps 640-650), and sending the model to a server (e.g., [0094], “the application executing on the client device can update operational parameters for the server computer and all client devices communicating with the server, such as by performing the incremental training of the neural network model and distributing the updated operational parameters to local server computer memory”).
Like Kim, Burger teaches that example input/video data can be uploaded to a server for retraining ([0083]). However, Burger also recognizes that users may prefer for their input/video data to be processed locally instead of being stored in a remote training data set (e.g., [0084]). Performing incremental training locally and sending the model itself, rather than input/video data, advantageously enhances privacy by avoiding the remote storage of the input/video data in a server’s training dataset.
Before the effective filing date of the claimed invention, it would have been obvious to one of ordinary skill in the art to modify the system of Kim in view of Tan with the model sending of Burger in order to improve the system with the reasonable expectation that this would result in a system that enhanced privacy by avoiding the remote storage of input/video data in a server’s training dataset. This technique for improving the system of Kim in view of Tan was within the ordinary ability of one of ordinary skill in the art based on the teachings of Burger.
Therefore, it would have been obvious to one of ordinary skill in the art to combine the teachings of Kim, Tan, and Burger to obtain the invention as specified in claim 11.
Conclusion
The following prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
‘Dronen’ (US 2019/0295003 A1)
Vehicle system that classifies video data and, if confidence of the classification is lower than a threshold, transmits the video data to a server for use as training data – e.g., Figs. 2 and 3
‘Gazzetti’ (US 2021/0065063 A1)
Analyzes data with a local model and, if confidence is below an upper bound threshold, transmits the data to a cloud server for classification – e.g., Figs. 8 and 14 and associated description
‘Cao’ (“Edge-Cloud Collaborated Object Detection via Difficult-Case Discriminator,” 2023)
Processes input video data locally with a small model. Output of the small model is passed to a “discriminator” that applies a series of threshold tests to determine whether the input data is an easy or difficult case. If difficult, the input data is sent to a big model on a server. If easy, the output of the local/small model is used. See, e.g., Figs. 2 and 4.
‘Malawade’ (“SAGE: A Split-Architecture Methodology for Efficient End-to-End Autonomous Vehicle Control,” 2021)
Designs a split neural network with a bottleneck in the middle. All inputs are processed through the bottleneck, then a decision is made whether to execute the rest of the network locally or remotely on a server. In either case, outputs are used to control a vehicle. See, e.g., Fig. 2.
‘Pacheco’ (“On the impact of deep neural network calibration on adaptive edge offloading for image classification,” 2023)
Describes an early-exit neural network that is executed locally, where confidence metrics are calculated after each stage of computation and used to determine whether to terminate or offload to a server for further computation – e.g., Fig. 1
‘Tan-21’ (“Deep Learning on Mobile Devices Through Neural Processing Units and Edge Computing,” 4 Dec. 2021)
Provides a more detailed implementation of the CBO technique described by ‘Tan’.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to GEOFFREY E SUMMERS whose telephone number is (571)272-9915. The examiner can normally be reached Monday-Friday, 7:00 AM to 3:30 PM ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Chan Park can be reached at (571) 272-7409. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/GEOFFREY E SUMMERS/Examiner, Art Unit 2669