DETAILED ACTION
This non-final Office action is responsive to the amendment filed on December 1, 2025. Claims 1-20 are pending. Claims 1, 8, and 14 are independent.
The objection to the specification is withdrawn in light of applicant’s amendment to paragraph [0003] of the specification.
Claim rejections under 35 USC §103 are withdrawn in light of applicant’s arguments. However, a new ground of rejection has been made. See the sections Claim Rejections – 35 USC §103 and Response to Arguments below.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 1 is rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. (PPT: A Privacy-preserving global model training protocol for federated learning in P2P networks), hereinafter Chen, in view of Zeng et al. (Federated Learning on the Road: Autonomous Controller Design for Connected and Autonomous Vehicles), hereinafter Zeng, and further in view of Wu et al. (Survey of Knowledge Distillation in Federated Edge Learning), hereinafter Wu.
Regarding claim 1, Chen teaches:
a processor; and a memory storing machine-readable instructions that, when executed by the processor, cause the processor to: (Chen, page 9, column 1, section 6, paragraph 1: “All simulations are implemented on the same computing environment (Linux Ubuntu 16.04, Intel i7-6950X CPU, 62 GB RAM, and 3.6TB SSD) with Tensorflow, Keras, and PyCryptodome.”)
train a teacher model… (Chen, page 3, column 2, section 3.1, paragraph 1: “In detail, every target client u_i first trains a local model m_i^R using a central global model M^R and D_i, where R is the index of the aggregation round.” – This implies that the central global model, which is analogous to the teacher model, is already trained.)
perform the following repeatedly until one or more predetermined convergence criteria have been satisfied: (Chen, page 6, column 1, section 4.4: “Afterward, the updated global model will be distributed to all target clients to train and aggregate over and over again until the global model is converged.”)
distribute … a set of teacher-model parameters associated with the teacher model; (Chen, page 5, column 2, section 4.2, paragraph 1: “When the coordinator starts a training process for a global model, all target clients download the global model M following the inherent P2P transmission protocol.” – The global model being downloaded to all target clients is analogous to the teacher model parameters being distributed.)
receive … a set of student-model parameters associated with a student model… (Chen, page 5, column 2, section 4.2, paragraph 2: “Subsequently, every target client u_i re-trains M using D_i locally. After local training, u_i obtains local personalized model m_i and prepares to upload the weighted local model update parameters w_i x_i and weights w_i.” – Uploading the local model update parameters and weights is analogous to receiving the set of student-model parameters.)
update the teacher model … in which a combined model based on the sets of student-model parameters is used as a quasi-teacher model, the teacher model being treated as a quasi-student model; (Chen, page 1, column 2, paragraph 1: “Specifically, the central server comprises a coordinator and an aggregator. The aggregator aggregates the local training results and updates the global model under the control of the coordinator.” – The aggregator aggregating the local training results is analogous to the combined model based on the sets of student-model parameters. This is used to update the global model and is thus being used as a quasi-teacher model while the global model being updated is being treated as a quasi-student model.)
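For illustration only, the aggregation step Chen describes (each client uploading weighted update parameters w_i x_i together with its weight w_i, combined by the aggregator into the parameters of the combined model) can be sketched as follows. The function and variable names are the examiner's illustration, not code appearing in Chen:

```python
# Illustrative sketch (examiner's illustration, not Chen's code) of the
# aggregation Chen describes: each client i uploads weighted update
# parameters w_i * x_i and its weight w_i, and the aggregator combines
# them into the parameters of the combined (quasi-teacher) model.
def aggregate(weighted_updates, weights):
    """weighted_updates: per-client parameter lists, each already scaled by
    that client's weight w_i.  weights: the scalars w_i themselves."""
    total_weight = sum(weights)
    combined = [0.0] * len(weighted_updates[0])
    for update in weighted_updates:
        for j, value in enumerate(update):
            combined[j] += value
    # Dividing the summed weighted updates by the total weight yields the
    # weighted average used to update the global model.
    return [value / total_weight for value in combined]

# Two clients of equal weight with parameters [1, 2] and [3, 4]:
print(aggregate([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))  # [2.0, 3.0]
```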
Chen does not explicitly teach:
…wherein the teacher model is a machine-learning-based model pertaining to a vehicular application; and
that the set of teacher-model parameters is distributed to a plurality of connected vehicles;
that the student-model parameters that are received are from each connected vehicle in the plurality, a set of student-model parameters associated with a student model trained at that connected vehicle using local vehicle input data and first knowledge distillation to teach the student model to mimic the teacher model based on the set of teacher-model parameters; and
wherein, after the one or more predetermined convergence criteria have been satisfied, the vehicular application controls operation of at least one connected vehicle in the plurality based, at least in part, on the student model in the at least one connected vehicle.
However, Zeng teaches:
…wherein the teacher model is a machine-learning-based model pertaining to a vehicular application; and (Zeng, page 2, column 2, paragraph 2: “The main contribution of this paper is a novel FL framework that enables CAVs to collaboratively learn and optimize their autonomous controller design in presence of wireless link uncertainties and environmental dynamics.” And page 4, column 1, paragraph 2: “In particular, a wireless BS, operating as a parameter server, will first generate an initial global ANN model parameter w0 for the auto-tuning unit and send it to all CAVs over a downlink broadcast channel.” – The learning and optimization of the autonomous controller of a CAV is analogous to a machine-learning-based model pertaining to a vehicular application. The initial global model is analogous to the teacher model.)
that the set of teacher-model parameters is distributed to a plurality of connected vehicles (Zeng, page 4, column 1, paragraph 2: “In particular, a wireless BS, operating as a parameter server, will first generate an initial global ANN model parameter w0 for the auto-tuning unit and send it to all CAVs over a downlink broadcast channel.” – The global model parameter being sent to all CAVs is analogous to distributing the set of teacher-model parameters. The CAVs are the connected vehicles.)
that the student-model parameters that are received are from each connected vehicle in the plurality, a set of student-model parameters associated with a student model trained at that connected vehicle using local vehicle input data and first knowledge distillation to teach the student model to mimic the teacher model based on the set of teacher-model parameters; and (Zeng, page 4, column 1, paragraph 3: “Then, in the first communication round, CAVs will use the received model parameters w0 to independently train their own model based on their local data for I iterations… In the uplink, the CAVs transmit their trained model parameters to the BS.” – The CAVs transmitting the parameters of their local model to the BS is indicative of receiving the student-model parameters, the CAV is analogous to the connected vehicles where the student models are being trained, and the local data is on each CAV and, therefore, is analogous to the local vehicle input data.)
wherein, after the one or more predetermined convergence criteria have been satisfied, the vehicular application controls operation of at least one connected vehicle in the plurality based, at least in part, on the student model in the at least one connected vehicle. (Zeng, page 2, column 1, paragraphs 1-2: “In other words, a cooperative learning framework among multiple CAVs will be needed for properly designing the autonomous controller of a CAV. To this end, one can leverage the wireless connectivity in CAVs and use federated learning (FL) to enable a network of CAVs to collaboratively train the learning models used by their controllers [10]. In FL, the CAVs can train the controller models based on their local data available at their local memory and, then, a parameter server, such as a base station (BS), can aggregate the trained controller models from CAVs. These processes will be repeated among the CAVs and parameter server iteratively until all controllers converge to the optimal learning model. In this way, the learning model can be collaboratively trained among multiple CAVs, and such a trained model can enable a particular CAV’s controller to adapt to new traffic scenarios unknown to the CAV but already experienced by other CAVs in the past.” – The autonomous controller of a CAV is controlling operation of each CAV. The cooperative learning framework to collaboratively train the models used by their controllers is analogous to the steps taken above, which is done until all controllers converge to the optimal learning model and is then deployed on the CAV’s controller to control the CAV.)
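For illustration only, the iterative procedure Zeng describes (broadcast the global parameters, train locally at each CAV, aggregate the uploaded parameters, and repeat until convergence) can be sketched as follows. All names, the simple averaging, and the toy local trainer are the examiner's illustration, not Zeng's code:

```python
# Illustrative sketch (examiner's illustration, not Zeng's code) of the
# iterative loop Zeng describes: broadcast global parameters to the CAVs,
# train locally, aggregate the uploads, and repeat until convergence.
def federated_rounds(global_params, clients, local_train,
                     tolerance=1e-3, max_rounds=100):
    """clients: opaque per-vehicle local data handed to local_train.
    local_train(client, params) returns that vehicle's updated parameters."""
    for _ in range(max_rounds):
        # Downlink: each connected vehicle trains on its own local data.
        local_params = [local_train(c, list(global_params)) for c in clients]
        # Uplink: a simple average stands in for the aggregation step.
        new_params = [sum(vals) / len(vals) for vals in zip(*local_params)]
        # Predetermined convergence criterion: parameters stop changing.
        if max(abs(a - b) for a, b in zip(new_params, global_params)) < tolerance:
            return new_params
        global_params = new_params
    return global_params

# Toy local trainer that nudges parameters toward each client's data value.
result = federated_rounds([0.0], [1.0, 3.0],
                          lambda c, p: [x + 0.5 * (c - x) for x in p])
print(result)  # converges near the mean of the client values, 2.0
```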
Zeng is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Chen, which already teaches training student models using local training and using the student-model parameters in a combined model to update the teacher model but does not explicitly teach that the local models are trained on connected vehicles using their local vehicle data and that the model is then used to control operation of the connected vehicles, to include the teachings of Zeng, which does teach that the local models are trained on connected vehicles using their local vehicle data and that the model is then used to control operation of the connected vehicles, in order to improve the convergence speed of the training and “execute near real-time control decisions” in the autonomous controller (Zeng, abstract).
Chen and Zeng do not teach that the student model is trained using first knowledge distillation or that the teacher model is updated using second knowledge distillation.
However, Wu teaches training a model using knowledge distillation. (Wu, page 3, section 2.2, paragraph 1: “Knowledge Distillation (KD) is a machine learning technique for constructive model training through knowledge transfer [35,9]. In common KD frameworks, the transferred model output (typically logits [12,29] or features [22,11,36]) is referred to as knowledge. Fig. 1 shows the general framework of KD. The core of KD is that given the same input, one model (student network) simulates the adjusted outputs of another model (teacher network) to learn from the representation of the latter model” – Knowledge distillation is being used to transfer the knowledge from one model to another during training, the teacher and student models are analogous to the methods described within the claim language. This being a technique for model training is indicative that it can be used in the training taught by Chen and Zeng.)
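For illustration only, the knowledge-distillation principle Wu describes (the student network simulating the teacher network's outputs for the same input) can be sketched as the following loss computation. The function names and the temperature value are the examiner's illustration, not code appearing in Wu:

```python
import math

# Illustrative sketch (examiner's illustration, not Wu's code) of the
# knowledge-distillation principle Wu describes: given the same input, the
# student network is trained to simulate the teacher network's outputs.
def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution, optionally
    softened by a temperature greater than 1."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the softened teacher and student output
    distributions; minimizing it makes the student mimic the teacher."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that matches the teacher incurs zero loss; one that diverges
# incurs a positive loss that training would drive back toward zero.
```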
Wu is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Chen and Zeng, which already teach training student models at a plurality of connected vehicles and then using the student-model parameters to update the teacher model but do not explicitly teach that the training includes knowledge distillation, to include the teachings of Wu, which does teach that the training includes knowledge distillation, in order to keep “private data on devices and only [transmit] model parameters or encrypted data information to the edge server, which is a promising solution for privacy-preserving edge intelligence.”
Regarding claim 2, Chen, Zeng, and Wu teach the system of claim 1, as cited above.
Chen does not explicitly teach:
wherein the set of teacher-model parameters includes a complete set of parameters defining the teacher model.
However, Zeng further teaches:
wherein the set of teacher-model parameters includes a complete set of parameters defining the teacher model. (Zeng, page 4, column 1, paragraph 2: “In particular, a wireless BS, operating as a parameter server, will generate an initial global ANN model parameter w0 for the auto-tuning unit and send it to all CAVs over a downlink broadcast channel. Then, in the first communication round, CAVs will use the received model parameters w0 to independently train their own model based on their local data for I iterations.” – The initial model parameters w0 being sent to the CAVs is indicative that the complete set of model parameters is sent.)
Regarding claim 4, Chen, Zeng, and Wu teach the system of claim 1, as cited above.
Chen does not explicitly teach:
wherein the local vehicle input data includes one or more of images, Light Detection and Ranging (LIDAR) data, radar data, sonar data, driver-monitoring data, Controller-Area-Network (CAN) bus data, Inertial-Measurement-Unit (IMU) data, dead-reckoning data, and Global-Positioning-System (GPS) data.
However, Zeng further teaches:
wherein the local vehicle input data includes one or more of images, Light Detection and Ranging (LIDAR) data, radar data, sonar data, driver-monitoring data, Controller-Area-Network (CAN) bus data, Inertial-Measurement-Unit (IMU) data, dead-reckoning data, and Global-Positioning-System (GPS) data. (Zeng, page 4, column 1, paragraph 2: “However, the CAV’s local training data (e.g. camera data containing the longitudinal movement) is constrained by the onboard memory of the CAV, and, thus, the information that can be stored will be limited to a few traffic scenarios.” – The local training data being camera data indicates that the local vehicle input data includes images.)
Regarding claim 5, Chen, Zeng, and Wu teach the system of claim 1, as cited above.
Chen does not explicitly teach:
wherein the student models in the plurality of connected vehicles have a same underlying architecture as the teacher model.
However, Zeng further teaches:
wherein the student models in the plurality of connected vehicles have a same underlying architecture as the teacher model. (Zeng, page 6, algorithm 1 – The parameters of the global model are sent to each CAV and each CAV performs training that updates the global parameters to local parameters. Thus, this is indicative that the local models have the same underlying architecture as the global model as the structure of the parameters is the same between the local and global models.)
Regarding claim 7, Chen, Zeng, and Wu teach the system of claim 1, as cited above.
Chen does not explicitly teach:
wherein the vehicular application is one of computer vision, a range-estimation service, a distracted-driver-detection application, an impaired-driver-detection application, and an application that automatically customizes vehicle settings for a particular driver.
However, Zeng further teaches:
wherein the vehicular application is one of computer vision, a range-estimation service, a distracted-driver-detection application, an impaired-driver-detection application, and an application that automatically customizes vehicle settings for a particular driver. (Zeng, page 3, column 1, paragraph 1: “Using real vehicular data traces, i.e., the Berkeley deep drive (BDD) data [20] and the dataset of annotated car trajectories (DACT) [21], we show that the controller trained by our proposed algorithm can track the target speed over time and under different traffic scenarios (e.g., traffic accidents, traffic congestion, and roadwork zones).” – The experimental results being trained on vehicular data that includes video and image data (BDD and DACT) with the ability to identify different traffic scenarios for autonomously controlling the speed of the vehicle indicates that the vehicular application is computer vision.)
Regarding claim 8, claim 8 recites all of the same limitations as claim 1, which are taught by Chen, Zeng, and Wu, and represents the instructions for customized machine-learning-based model simplification for connected vehicles – see claim 1 above.
Chen further teaches:
A non-transitory computer-readable medium for customized machine-learning-based model simplification for connected vehicles and storing instructions that, when executed by a processor, cause the processor to: (Chen, page 9, column 1, section 6, paragraph 1: “All simulations are implemented on the same computing environment (Linux Ubuntu 16.04, Intel i7-6950X CPU, 62 GB RAM, and 3.6TB SSD) with Tensorflow, Keras, and PyCryptodome.”)
Regarding claim 9, Chen, Zeng, and Wu teach the non-transitory computer-readable medium of claim 8, as cited above.
Claim 9 additionally recites the same limitations as claim 2, which are taught by Chen, Zeng, and Wu – see claim 2 above.
Regarding claim 11, Chen, Zeng, and Wu teach the non-transitory computer-readable medium of claim 8, as cited above.
Claim 11 additionally recites the same limitations as claim 4, which are taught by Chen, Zeng, and Wu – see claim 4 above.
Regarding claim 12, Chen, Zeng, and Wu teach the non-transitory computer-readable medium of claim 8, as cited above.
Claim 12 additionally recites the same limitations as claim 5, which are taught by Chen, Zeng, and Wu – see claim 5 above.
Regarding claim 14, claim 14 recites all of the limitations of claim 1, which are taught by Chen, Zeng, and Wu, and represents the corresponding method – see claim 1 above.
Regarding claim 15, Chen, Zeng, and Wu teach the method of claim 14, as cited above.
Claim 15 additionally recites the same limitations as claim 2, which are taught by Chen, Zeng, and Wu – see claim 2 above.
Regarding claim 17, Chen, Zeng, and Wu teach the method of claim 14, as cited above.
Claim 17 additionally recites the same limitations as claim 4, which are taught by Chen, Zeng, and Wu – see claim 4 above.
Regarding claim 18, Chen, Zeng, and Wu teach the method of claim 14, as cited above.
Claim 18 additionally recites the same limitations as claim 5, which are taught by Chen, Zeng, and Wu – see claim 5 above.
Regarding claim 20, Chen, Zeng, and Wu teach the method of claim 14, as cited above.
Claim 20 additionally recites the same limitations as claim 7, which are taught by Chen, Zeng, and Wu – see claim 7 above.
Claims 3, 10, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Zeng, further in view of Wu, and further in view of Woo et al. (US 20230214719), hereinafter Woo.
Regarding claim 3, Chen, Zeng, and Wu teach the system of claim 1, as cited above.
Chen, Zeng, and Wu do not explicitly teach:
wherein the set of teacher-model parameters includes a subset of a complete set of parameters defining the teacher model, the subset including parameters identified as being particularly important for defining the teacher model.
However, Woo teaches:
wherein the set of teacher-model parameters includes a subset of a complete set of parameters defining the teacher model, the subset including parameters identified as being particularly important for defining the teacher model. (Woo, paragraph 0035: “A constraint-based approach called elastic weight consolidation (EWC) may be utilized to alleviate catastrophic forgetting in neural networks by selectively restraining the plasticity of weights depending on the importance of weights to previous tasks.” And paragraph 0072: “Therefore, when the student model is being trained, feature representations of T and S for training data are stored in the representation memory (Rmem.). In the embodiments of the present disclosure, only unique characteristics are selectively stored to minimize a memory space, instead of entire i+1 data characteristics being stored, unlike previous schemes in which a large number of samples are stored.” – Only unique characteristics being stored instead of entire data characteristics is analogous to a subset of the parameters. These are identified using elastic weight consolidation which identifies the importance of the weights (parameters) in the teacher model.)
Woo is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Chen, Zeng, and Wu, which already teach a set of teacher-model parameters being provided to the student models but do not explicitly teach that the set of teacher-model parameters includes a subset including parameters identified as being particularly important for defining the teacher model, to include the teachings of Woo, which does teach that the set of teacher-model parameters includes a subset including parameters identified as being particularly important for defining the teacher model, in order to “improve catastrophic forgetting by utilizing the knowledge distillation during transfer learning.”
Regarding claim 10, Chen, Zeng, and Wu teach the non-transitory computer-readable medium of claim 8, as cited above.
Claim 10 additionally recites the same limitations as claim 3, which are taught by Chen, Zeng, Wu, and Woo – see claim 3 above.
Regarding claim 16, Chen, Zeng, and Wu teach the method of claim 14, as cited above.
Claim 16 additionally recites the same limitations as claim 3, which are taught by Chen, Zeng, Wu, and Woo – see claim 3 above.
Claims 6, 13, and 19 are rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Zeng, further in view of Wu, and further in view of Zhang et al. (DENSE: Data-Free One-Shot Federated Learning), hereinafter Zhang.
Regarding claim 6, Chen, Zeng, and Wu teach the system of claim 1, as cited above.
Chen, Zeng, and Wu do not explicitly teach:
wherein the student models in the plurality of connected vehicles have a different underlying architecture from an underlying architecture of the teacher model.
However, Zhang teaches:
wherein the student models in the plurality of connected vehicles have a different underlying architecture from an underlying architecture of the teacher model. (Zhang, abstract: “(3) DENSE considers model heterogeneity in FL, i.e., different clients can have different model architectures.” – Zeng already teaches that the student models are in the plurality of connected vehicles. The different clients, e.g., vehicles, being able to have different model architectures implies that the student models can have a different architecture than the teacher model.)
Zhang is considered analogous to the claimed invention as it is in the same field of endeavor, machine learning. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date to have modified Chen, Zeng, and Wu, which already teach the student models in the plurality of connected vehicles but do not explicitly teach that the student models have a different underlying architecture from the teacher model, to include the teachings of Zhang, which does teach that the models can have different model architectures since, in real-world applications, it is very common for different clients to have different model architectures; thus, a method that takes this into account can improve the accuracy of the global (teacher) model (Zhang, page 2, paragraphs 1-2).
Regarding claim 13, Chen, Zeng, and Wu teach the non-transitory computer-readable medium of claim 8, as cited above.
Claim 13 additionally recites the same limitations as claim 6, which are taught by Chen, Zeng, Wu, and Zhang – see claim 6 above.
Regarding claim 19, Chen, Zeng, and Wu teach the method of claim 14, as cited above.
Claim 19 additionally recites the same limitations as claim 6, which are taught by Chen, Zeng, Wu, and Zhang – see claim 6 above.
Response to Arguments
Applicant’s arguments (on page 12, that Wu does not disclose the “role reversal” between the teacher and student models, and on page 13, that the “items” of Kweon are not analogous to the model parameters) with respect to the rejections of claims 1, 8, and 14 and of claims 3 and 10, respectively, under 35 USC §103 have been fully considered and are persuasive. Therefore, the rejections have been withdrawn. However, upon further consideration, a new ground of rejection is made in view of Chen, Zeng, and Wu for claims 1, 8, and 14 and in view of Chen, Zeng, Wu, and Woo for claims 3, 10, and 16. See the section Claim Rejections – 35 USC §103 above.
Examiner’s note: Examiner previously agreed that Wu does not “combine the student models themselves into a new, singular quasi-teacher model, as recited in the claims.” However, upon further consideration, this combining of the student models themselves is not recited in the claims. The independent claim recites “a combined model based on the sets of student-model parameters is used.” Thus, though Chen does not teach combining the student models themselves, it does teach combining the student-model parameters in the aggregator, which the examiner is interpreting as analogous to the combined model.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Zhu et al. (Data-Free Knowledge Distillation for Heterogenous Federated Learning)
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACQUELINE MEYER whose telephone number is (703)756-5676. The examiner can normally be reached M-F 8:00 am - 4:30 pm EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Tamara Kyle can be reached at 571-272-4241. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.C.M./Examiner, Art Unit 2144
/TAMARA T KYLE/Supervisory Patent Examiner, Art Unit 2144