Prosecution Insights
Last updated: April 19, 2026
Application No. 18/225,371

DYNAMIC COMPRESSION AND SPECIALIZATION OF A MACHINE LEARNING MODEL

Status: Non-Final OA (§101, §103)
Filed: Jul 24, 2023
Examiner: RYLANDER, BART I
Art Unit: 2124
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Cisco Technology Inc.
OA Round: 1 (Non-Final)
Grant Probability: 62% (Moderate)
Estimated OA Rounds: 1-2
Estimated Time to Grant: 3y 10m
Grant Probability with Interview: 77%

Examiner Intelligence

Career Allow Rate: 62% (grants 62% of resolved cases: 68 granted / 109 resolved; +7.4% vs Tech Center average)
Interview Lift: +15.0% (moderate lift, based on resolved cases with interview)
Typical Timeline: 3y 10m average prosecution; 29 applications currently pending
Career History: 138 total applications across all art units

Statute-Specific Performance

§101: 19.8% (-20.2% vs TC avg)
§103: 62.8% (+22.8% vs TC avg)
§102: 7.4% (-32.6% vs TC avg)
§112: 7.1% (-32.9% vs TC avg)

Tech Center averages are estimates. Based on career data from 109 resolved cases.

Office Action

Rejections: §101, §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office action is in response to the submission of the application on 7/24/2023. Claims 1-20 are presented for examination.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. § 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Is the claim to a process, machine, manufacture, or composition of matter? Claims 1-10 are directed to a method (i.e., a process); claims 11-19 are directed to an apparatus (i.e., a machine/apparatus); and claim 20 is directed to a non-transitory, computer-readable medium storing instructions (i.e., a product). Therefore, all pending claims are directed to one of the four categories of invention.

Step 2A, Prong 1: Does the claim recite an abstract idea, law of nature, or natural phenomenon? Claim 1 recites the limitations of:

identifying, by a device, a plurality of tasks that a base machine learning model is able to perform - a mental process (observation, evaluation, judgment), as a human mind can identify a plurality of tasks that a machine learning model can perform; and

causing, by the device, the specialized model to be deployed to the target deployment environment - a mental process (observation, evaluation, judgment), as a human mind can cause a specialized model to be deployed to a target environment.

These limitations are abstract ideas.

Step 2A, Prong 2: Does the claim recite additional elements that integrate the judicial exception into a practical application?
The claim recites the additional elements of:

receiving, at the device and via a user interface, a request to generate a specialized model to perform a particular task for deployment to a target deployment environment - inputting data is insignificant, extra-solution activity. See MPEP 2106.05(g).

using, by the device, knowledge distillation on the base machine learning model to train the specialized model to perform the particular task based on at least one of the plurality of tasks - using knowledge distillation without any explanation of the knowledge distillation is mere instructions to apply. See MPEP 2106.05(f)(3).

The additional elements do not integrate the abstract idea into a practical application.

Step 2B: Does the claim recite additional elements that amount to significantly more? The additional elements:

receiving, at the device and via a user interface, a request to generate a specialized model to perform a particular task for deployment to a target deployment environment - inputting data is insignificant, extra-solution activity (see MPEP 2106.05(g)), and transmitting data is well-understood, routine, and conventional (see MPEP 2106.05(d)(II)(i)).

using, by the device, knowledge distillation on the base machine learning model to train the specialized model to perform the particular task based on at least one of the plurality of tasks - using knowledge distillation without any explanation of the knowledge distillation is mere instructions to apply. See MPEP 2106.05(f)(3).

The additional elements do not amount to significantly more than the abstract idea. Therefore, the claim is not patent eligible. Independent claims 11 and 20 recite the same relevant limitations, and a similar analysis applies.
Claim 11 recites the additional elements of “An apparatus, comprising: a network interface to communicate with a computer network; a processor coupled to the network interface and configured to execute one or more processes; and a memory configured to store a process that is executed by the processor” - computer components recited at a high level are construed as generic. See MPEP 2106.05(f)(2). As such, the additional elements do not integrate the abstract idea into a practical application, nor do they amount to significantly more.

Claim 20 recites the additional elements of “A tangible, non-transitory, computer-readable medium storing program instructions” - computer components recited at a high level are construed as generic. See MPEP 2106.05(f)(2). As such, the additional elements do not integrate the abstract idea into a practical application, nor do they amount to significantly more. Therefore, the independent claims are not patent eligible.

A similar analysis applies to the dependent claims.

Claims 2 and 12 recite the additional elements of “the specialized model is a compressed form of the base machine learning model” - mere description of the result of an abstract idea. See MPEP 2106.05(f)(3).

Claims 3 and 13 recite the additional elements of “the plurality of tasks comprises one or more of: object detection, image classification, semantic segmentation, or activity recognition” - mere description of the purpose of the machine learning model without details of the model. See MPEP 2106.05(f)(3).

Claims 4 and 14 recite the additional elements of “the particular task comprises identifying a particular type of object or activity” - mere description of the purpose of the machine learning model without details of the model. See MPEP 2106.05(f)(3).
Claims 5 and 15 recite the additional elements of “update the specialized model in response to a change in sensor data captured at the target deployment environment” - updating the model in response to new data is merely inputting data. As such, this is insignificant, extra-solution activity (see MPEP 2106.05(g)), and transmitting data is well-understood, routine, and conventional (see MPEP 2106.05(d)(II)(i)).

Claims 6 and 16 recite the additional elements of “receive, via the user interface, a selection of a training dataset associated with the target deployment environment” - inputting data is insignificant, extra-solution activity (see MPEP 2106.05(g)), and transmitting data is well-understood, routine, and conventional (see MPEP 2106.05(d)(II)(i)); and “the apparatus trains the specialized model based in part on the training dataset” - training a model without a description of the model or the training (such as weights or activation function) is mere instructions to apply. See MPEP 2106.05(f)(3).

Claims 7 and 17 recite the additional elements of “receive, via the user interface, a selected type of machine learning model, wherein the apparatus trains the specialized model as the selected type of machine learning model” - inputting data is insignificant, extra-solution activity (see MPEP 2106.05(g)), and transmitting data is well-understood, routine, and conventional (see MPEP 2106.05(d)(II)(i)).

Claims 8 and 18 recite the additional elements of “the selected type of machine learning model differs from that of the base machine learning model” - machine learning models recited at a high level are construed as generic. See MPEP 2106.05(f)(1).
Claims 9 and 19 recite the additional elements of “the apparatus causes the specialized model to be deployed to the target deployment environment by:” - a mental process (observation, evaluation, judgment), as a human mind can cause a model to be deployed to a target environment; and “sending the specialized model to an execution node associated with the target deployment environment” - transmitting data is well-understood, routine, and conventional. See MPEP 2106.05(d)(II)(i).

The dependent claims do not integrate the abstract idea into a practical application, nor do they amount to significantly more. Therefore, claims 1-20 are not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1, 3-6, 11, 13-16, and 20 are rejected under 35 U.S.C.
§ 103 as being unpatentable over Mullapudi, et al. (Online Model Distillation for Efficient Video Inference, herein Mullapudi) and Tandon, et al. (Knowlywood: Mining Activity Knowledge From Hollywood Narratives, herein Tandon).

Regarding claim 1, Mullapudi teaches a method (Mullapudi, abstract, line 8: “In this paper, we employ the technique of model distillation (supervising a low-cost student model using the output of a high-cost teacher) to specialize accurate, low-cost semantic segmentation models to a target video stream.” In other words, the technique of model distillation is a method.) comprising:

identifying, by a device, a plurality of tasks that a base machine learning model is able to perform (Mullapudi, Figure 1, and page 3576, column 2, paragraph 4, line 6: “Unlike other datasets for efficient inference, which consist of streams from fixed-viewpoint cameras such as traffic cameras [19], we capture a diverse array of challenges: from fixed-viewpoint cameras, to constantly moving and zooming television cameras, and hand-held and egocentric video.” In other words, MRCNN is the base machine learning model, and generating high-resolution, per-frame semantic segmentation for video streams (a sequence of frames, where each frame is a task), including fixed-viewpoint cameras and moving and zooming cameras, is a plurality of tasks that a base machine learning model is able to perform.);

[receiving, at the device and via a user interface, a request] to generate a specialized model to perform a particular task for deployment to a target deployment environment (Mullapudi, Figure 1. In other words, the live video stream is the target deployment environment, and the student model “JITNet,” designed to be specialized for future frames, is a specialized model to perform a particular task in a target deployment environment.);

using, by the device, knowledge distillation on the base machine learning model to train the specialized model to perform the particular task based on at least one of the plurality of tasks (Mullapudi, Figure 1, and page 3573, column 2, paragraph 2, line 1: “We employ the technique of model distillation [2, 16], training a lightweight “student” model to output the predictions of a larger, reliable high-capacity “teacher”, but do so in an online fashion, intermittently running the teacher on a live stream to provide a target for student learning.” In other words, the lightweight student model is the specialized model, the “teacher” model is the base model, knowledge distillation is knowledge distillation, output prediction is the particular task, and training is train.); and

causing, by the device, the specialized model to be deployed to the target deployment environment (Mullapudi, abstract, line 11: “Rather than learn a specialized student model on offline data from the video stream, we train the student in an online fashion on the live video, intermittently running the teacher to provide a target for learning.” In other words, training the student in an online fashion is causing the specialized model to be deployed to the target environment.).

Thus far, Mullapudi does not explicitly teach receiving, at the device and via a user interface, a request. Tandon teaches receiving, at the device and via a user interface, a request (Tandon, page 230, column 2, paragraph 2, line 5: “For this, we relied on a user interface as in Table 8, asking two people (one outsider and one of the authors) to enter arbitrary queries of their choice, as long as it fit the template.” In other words, the user interface is a user interface, and asking two people to enter arbitrary queries is receiving, at a device and via a user interface, a request.)

Both Mullapudi and Tandon are directed to knowledge distillation, among other things.
Mullapudi teaches a method comprising identifying, by a device, a plurality of tasks that a base machine learning model is able to perform; generating a specialized model to perform a particular task for deployment to a target deployment environment; using, by the device, knowledge distillation on the base machine learning model to train the specialized model to perform the particular task based on at least one of the plurality of tasks; and causing, by the device, the specialized model to be deployed to the target deployment environment. Mullapudi does not explicitly teach receiving, at the device and via a user interface, a request. Tandon teaches receiving, at the device and via a user interface, a request.

In view of the teaching of Mullapudi, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Tandon into Mullapudi. This would result in a method comprising identifying, by a device, a plurality of tasks that a base machine learning model is able to perform; receiving, at the device and via a user interface, a request to generate a specialized model to perform a particular task for deployment to a target deployment environment; using, by the device, knowledge distillation on the base machine learning model to train the specialized model to perform the particular task based on at least one of the plurality of tasks; and causing, by the device, the specialized model to be deployed to the target deployment environment. One of ordinary skill in the art would be motivated to do this because a user interface makes the method easier to use, thus saving the time and money required by human labor.
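The knowledge-distillation technique at the center of the claim 1 mapping (a student trained to mimic a teacher's outputs rather than ground-truth labels) can be illustrated with a minimal sketch. This is a hypothetical toy example, not code from any cited reference: the "teacher" is a fixed function standing in for an expensive base model, and the "student" is a one-parameter model fitted to the teacher's predictions by gradient descent.

```python
import random

def teacher(x):
    # Stand-in for an expensive, high-capacity base model.
    return 3.0 * x

def train_student(steps=2000, lr=0.01, seed=0):
    """Distill the teacher into a one-weight linear student.

    The student never sees ground-truth labels; its training target
    is whatever the teacher predicts (the essence of distillation).
    """
    rng = random.Random(seed)
    w = 0.0  # the student's single weight
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        target = teacher(x)              # teacher output is the training signal
        pred = w * x                     # student prediction
        grad = 2.0 * (pred - target) * x # d/dw of squared error (pred - target)^2
        w -= lr * grad
    return w

w = train_student()
print(round(w, 2))  # converges toward the teacher's coefficient, 3.0
```

Mullapudi's contribution is to run this loop online, invoking the teacher only intermittently on a live stream; the toy above uses an offline loop purely to show the student-mimics-teacher objective.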
Regarding claim 3, the combination of Mullapudi and Tandon teaches the method as in claim 1, wherein the plurality of tasks comprises one or more of: object detection, image classification, semantic segmentation, or activity recognition (Mullapudi, page 3577, column 1, paragraph 1: “We use the highest-quality MRCNN [7] without test-time data augmentation, and provide its output for all dataset frames to aid evaluation of classification, detection, and segmentation (semantic and instance level) methods.” In other words, frames are images, and classification is image classification, which is one or more of object detection, image classification, etc.).

Regarding claim 4, the combination of Mullapudi and Tandon teaches the method as in claim 1, wherein the particular task comprises identifying a particular type of object or activity (Mullapudi, page 3573, column 2, paragraph 1, line 4: “Specifically, we apply this methodology to the task of realizing high-accuracy and low-cost semantic segmentation models that continuously adapt to the contents of a video stream.” Examiner notes that it is known in the art that a semantic segmentation model classifies pixels in an image into a category (e.g., road, person, car, sky, etc.). In other words, semantic segmentation is identifying and then classifying an object.)

Regarding claim 5, the combination of Mullapudi and Tandon teaches the method as in claim 1, further comprising: updating, by the device, the specialized model in response to a change in sensor data captured at the target deployment environment (Mullapudi, page 3573, column 2, paragraph 1, line 1: “In this paper, we embrace this reality and move away from attempting to pre-train a model on camera-specific datasets curated in advance, and instead train models online on a live video stream as new video frames arrive.” And page 3575, column 2, paragraph 2, line 1: “Online training presents many challenges: training samples (frames) from the video stream are highly correlated, there is continuous distribution shift in content (the past may not be representative of the future), and teacher predictions used as a proxy for “ground truth” at training can exhibit temporal instability or errors. The method for updating JITNet parameters must account for these challenges.” And page 3573, column 2, paragraph 3, line 14: “With these weighted labels, we compute the gradients for updating the model parameters using weighted cross-entropy loss and gradient descent. Since training JITNet on a video from a random initialization would require significant training to adapt to the stream, we pretrain JITNet on the COCO dataset, then adapt the pretrained model to each stream.” In other words, JITNet is the specialized model, the live video stream is sensor data captured at the target environment, and updating the model parameters is updating.)

Regarding claim 6, the combination of Mullapudi and Tandon teaches the method of claim 1, further comprising: receiving, at the device and via the user interface, a selection of a training dataset associated with the target deployment environment, wherein the device trains the specialized model based in part on the training dataset (Mullapudi, page 3575, column 2, paragraph 3, line 16: “Since training JITNet on a video from a random initialization would require significant training to adapt to the stream, we pretrain JITNet on the COCO dataset, then adapt the pretrained model to each stream.” See above mapping. In other words, pretraining JITNet on the COCO dataset and then adapting it is training the specialized model based in part on the training dataset. Examiner notes that the user interface is previously mapped to Tandon.).
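The online update quoted in the claim 5 discussion (weighted cross-entropy loss plus gradient descent as new frames arrive) can be sketched in miniature. This is an assumption-laden toy, not JITNet: a one-feature logistic "student" is updated one sample at a time against labels produced by a teacher, with a per-sample weight standing in for the paper's weighted labels.

```python
import math

def weighted_ce_update(w, b, x, label, weight, lr=0.5):
    """One online gradient step on weighted binary cross-entropy.

    For logistic regression, the gradient of cross-entropy wrt (w, b)
    is (p - label) * x and (p - label); the weight scales both.
    """
    z = w * x + b
    p = 1.0 / (1.0 + math.exp(-z))  # student's predicted probability
    g = weight * (p - label)
    return w - lr * g * x, b - lr * g

# Adapt the student to a short synthetic "stream" where the teacher
# labels positive inputs as class 1 and negative inputs as class 0.
w, b = 0.0, 0.0
stream = [(-1.0, 0.0), (1.0, 1.0), (-0.5, 0.0), (0.5, 1.0)] * 50
for x, teacher_label in stream:
    w, b = weighted_ce_update(w, b, x, teacher_label, weight=1.0)

print(w > 0)  # True: the student has adapted to the teacher's decision rule
```

In the paper's setting the weights come from the teacher's predictions and the samples are video frames; the mechanics of the per-step update are the same.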
Claims 11 and 13-16 are apparatus claims comprising a network interface to communicate with a computer network, a processor coupled to the network interface and configured to execute one or more processes, and a memory configured to store a process that is executed by the processor, corresponding to method claims 1 and 3-6, respectively. Otherwise, they are not patentably distinct. The combination of Mullapudi and Tandon teaches an apparatus (Mullapudi, page 3578, column 2, paragraph 2, line 1: “All evaluated methods generate pixel-level predictions for each class in the video. We use mean intersection over union (mean IoU) over the classes in each video as the accuracy metric. All results are reported on the first 30,000 frames of each video (~16-20 minutes due to varying fps) unless otherwise specified. Timing measurements for JITNet, MRCNN (see Table 1), and other baseline methods are performed using TensorFlow 1.10.1 (CUDA 9.2/cuDNN 7.3) and PyTorch 0.4.1 for MRCNN on an NVIDIA V100 GPU.” Examiner notes that an NVIDIA V100 GPU is an apparatus with a network interface containing a processor and a non-transitory, computer-readable medium storing instructions.) Therefore, claims 11 and 13-16 are rejected for the same reasons as claims 1 and 3-6, respectively.

Claim 20 is a tangible, non-transitory, computer-readable medium claim corresponding to method claim 1. Otherwise, they are not patentably distinct. The combination of Mullapudi and Tandon teaches a tangible, non-transitory, computer-readable medium. See above mapping. Therefore, claim 20 is rejected for the same reasons as claim 1.

Claims 2 and 12 are rejected under 35 U.S.C. § 103 as being unpatentable over Mullapudi, Tandon, and Choudhary, et al. (A comprehensive survey on model compression and acceleration, herein Choudhary).
Regarding claim 2, the combination of Mullapudi and Tandon teaches the method as in claim 1. Thus far, the combination of Mullapudi and Tandon does not explicitly teach wherein the specialized model is a compressed form of the base machine learning model. Choudhary teaches the specialized model is a compressed form of the base machine learning model (Choudhary, Figs. 1 and 2, and page 5116, paragraph 2, line 1: “Pruning is a powerful technique to reduce the number of parameters of DNN (LeCun et al. 1990; Han et al. 2015; Li et al. 2017). In DNN, many parameters are redundant that do not contribute much during training to lower the error and generalize the network. So, after training, such parameters can be removed from the network, and the removal of these parameters will have the least effect on the accuracy of the network. The primary motive of pruning was to reduce the storage requirement of the DL model and make it storage-friendly. Pruning parameters from the dense layer helps in making the model smaller (Srinivas and Babu 2015; Ardakani et al. 2016). Pruning is also used to reduce the computation and speed-up the inference process by pruning parameters/filters from the convolutional layer. Pruning the filter reduces the number of MAC operations in the convolutional layer, and reduction in the MAC operations improves the inference time (Li et al. 2017).” In other words, compression by pruning is the specialized model is a compressed form of the base machine learning model.).

Both Choudhary and the combination of Mullapudi and Tandon are directed to knowledge distillation and compression, among other things. The combination of Mullapudi and Tandon teaches the method of claim 1 but does not explicitly teach the specialized model is a compressed form of the base machine learning model. Choudhary teaches the specialized model is a compressed form of the base machine learning model.
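The pruning technique quoted from Choudhary (removing redundant, low-contribution parameters after training) is commonly realized as magnitude pruning. The sketch below is an illustrative assumption, not code from the survey: it zeros out the fraction of weights with the smallest absolute values, the parameters whose removal typically affects accuracy least.

```python
def prune_by_magnitude(weights, fraction):
    """Zero out the `fraction` of weights with the smallest absolute value."""
    k = int(len(weights) * fraction)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

weights = [0.01, -2.5, 0.003, 1.7, -0.02, 0.9]
pruned = prune_by_magnitude(weights, fraction=0.5)
print(pruned)  # [0.0, -2.5, 0.0, 1.7, 0.0, 0.9]
```

Zeroed weights can then be stored sparsely or their filters removed outright, which is where the storage and MAC-operation savings the survey describes come from.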
In view of the teaching of the combination of Mullapudi and Tandon, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Choudhary into the combination of Mullapudi and Tandon. This would result in the method of claim 1 where the specialized model is a compressed form of the base learning model. One of ordinary skill in the art would be motivated to do this in order to reduce space and speed up execution. (Choudhary, abstract, line 7: “For real-time applications, the trained models should be deployed on resource-constrained devices. Popular convolutional neural network models have millions of parameters that leads to increase in the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying on resource-constrained devices while making the least compromise with the model accuracy.”)

Claim 12 is an apparatus claim corresponding to method claim 2. Otherwise, they are not patentably distinct. Therefore, claim 12 is rejected for the same reasons as claim 2.

Claims 7-8 and 17-18 are rejected under 35 U.S.C. § 103 as being unpatentable over Mullapudi, Tandon, and Li, et al. (FedMD: Heterogenous Federated Learning via Model Distillation, herein Li).

Regarding claim 7, the combination of Mullapudi and Tandon teaches the method as in claim 1, further comprising: receiving, at the device and via the user interface, a [selected type of machine learning model], wherein the device trains the specialized model as the [selected type of machine learning model] (Mullapudi, see mapping of claim 1). Thus far, the combination of Mullapudi and Tandon does not explicitly teach a selected type of machine learning model. Li teaches a selected type of machine learning model (Li, page 2, paragraph 6, line 1: “There are m participants in the federated learning process. Each owns a very small labeled dataset D_k := {(x_i^k; y_i)}_{i=1}^{N_k} that may or may not be drawn from the same distribution. There is also a large public dataset D_0 := {(x_i^0; y_i^0)}_{i=1}^{N_0} that everyone can access. Each participant independently designs its own model f_k to perform a classification task. The models f_k can have different architectures.” Examiner notes that the specification of the instant application recites: “In addition, inputs such as an indication of a model selection may be obtained by the model specialization process 248. An indication of a model selection may include an indication of a type of neural network structure to be used by the input ML model 402 and/or by targeted ML model 412. Some examples of a model selection may include a long short-term memory (LSTM) neural network, a residual network with 34 layers (ResNet34) neural network, a residual network with 18 layers (ResNet18) neural network, a mobile network (MobileNet) neural network, and a wide residual network (WideResNet) neural network, among others.” (Specification, page 16, line 13.) Therefore, the examiner is interpreting that the specialized models may have different neural network architectures than the base model and that these architectures are determined (selected) as part of the specialization process. In other words, models f_k can have different architectures is a selected type of machine learning model.)

Both Li and the combination of Mullapudi and Tandon are directed to knowledge distillation, among other things. The combination of Mullapudi and Tandon teaches the method of claim 1 but does not explicitly teach a selected type of machine learning model. Li teaches a selected type of model. In view of the teaching of Mullapudi and Tandon, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Li into the combination of Mullapudi and Tandon.
This would result in the method of claim 1 and a selected type of machine learning model. One of ordinary skill in the art would be motivated to do this in order to make federated learning more universal, and thus useful. (Li, abstract, line 1: “Federated learning enables the creation of a powerful centralized model without compromising the data privacy of multiple participants. While successful, it does not incorporate the case where each participant independently designs its own model. Due to intellectual property concerns and heterogeneous nature of tasks and data, this is a widespread requirement in applications of federated learning to areas such as health care and AI as a service. In this work, we use transfer learning and knowledge distillation to develop a universal framework that enables federated learning when each agent owns not only their private data, but also uniquely designed models.”)

Regarding claim 8, the combination of Mullapudi, Tandon, and Li teaches the method as in claim 7, wherein the selected type of machine learning model differs from that of the base machine learning model (Li, see mapping of claim 7. In other words, models can have different types of architectures is the selected type of machine learning model differs from that of the base machine learning model.)

Claims 17-18 are apparatus claims that correspond to method claims 7-8, respectively. Otherwise, they are not patentably distinct. Therefore, claims 17-18 are rejected for the same reasons as claims 7-8, respectively.

Claims 9-10 and 19 are rejected under 35 U.S.C. § 103 as being unpatentable over Mullapudi, Tandon, and Canady, et al. (Applying DDDAS Principles for Realizing Optimized and Robust Deep Learning Models at the Edge, herein Canady).
Regarding claim 9, the combination of Mullapudi and Tandon teaches the method as in claim 1, wherein causing the specialized model to be deployed to the target deployment environment comprises the following. Thus far, the combination of Mullapudi and Tandon does not explicitly teach sending the specialized model to an execution node associated with the target deployment environment. Canady teaches sending the specialized model to an execution node associated with the target deployment environment (Canady, abstract, line 13: “DDDAS is used to dynamically instrument the edge-deployed, quantized DL models for data on the effectiveness of their quantization and robustness abilities, which in turn is used to drive an automated, cloud-based process that uses tools, such as Apache TVM, to generate quantized, optimized and robust DL models suitable for the edge. These models subsequently are automatically deployed at the edge using orchestration tools. Preliminary studies using this approach have shown its effectiveness in image classification and object detection applications.” In other words, automatic deployment is sending the specialized model to an execution node in the target deployment environment.)

Both Canady and the combination of Mullapudi and Tandon are directed to knowledge distillation and specialized models, among other things. The combination of Mullapudi and Tandon teaches the method of claim 1 but does not explicitly teach sending the specialized model to an execution node associated with the target deployment environment. Canady teaches sending the specialized model to an execution node associated with the target deployment environment. In view of the teaching of the combination of Mullapudi and Tandon, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the teaching of Canady into the combination of Mullapudi and Tandon.
This would result in the method of claim 1 and sending the specialized model to an execution node associated with the target deployment environment. One of ordinary skill in the art would be motivated to do this because of the growth and costs associated with edge computing. (Canady, page 327, paragraph 1, line 1: “Edge Computing is the idea of moving the computation from the cloud closer to the sensors or Internet of Things (IoT). One of the benefits of this approach is that it eliminates the need to send data back to the cloud, which can be very costly depending on the type of data.”)

Regarding claim 10, the combination of Mullapudi, Tandon, and Canady teaches the method as in claim 1, wherein the specialized model takes video data as input to perform the particular task (Mullapudi, page 3573, column 2, paragraph 1, line 1: “In this paper, we embrace this reality and move away from attempting to pre-train a model on camera-specific datasets curated in advance, and instead train models online on a live video stream as new video frames arrive.” In other words, live video stream data is video data.).

Claim 19 is an apparatus claim corresponding to method claim 9. Otherwise, they are not patentably distinct. Therefore, claim 19 is rejected for the same reasons as claim 9.

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:

Dey, et al. (US 2023/0115700 A1), “Automated Generation of Machine Learning Models,” discloses a hardware processing unit to perform an iterative model-growing process that involves modifying parent models to obtain child models.

Li, et al. (US 2021/0407090 A1), “Visual Object Instance Segmentation Using Foreground-Specialized Model Imitation,” discloses a specialized teacher model to perform visual object instance segmentation in order to segment and classify objects in first training images, where the first training images contain foreground objects without backgrounds.
The method also includes training, using the at least one processor, a student model to perform visual object instance segmentation in order to segment and classify objects in second training images.

Ross et al. (US 11,288,595 B2), "Minimizing Memory and Processor Consumption in Creating Machine Learning Models," discloses a system that can create a new machine learning model by improving and combining existing machine learning models in a modular way. By combining existing machine learning models, the system can avoid the step of training a new machine model.

Yan et al. (US 11,200,497 B1), "System and Method for Knowledge-Preserving Neural Network Pruning," discloses a method that includes obtaining a pre-trained machine learning model trained based on a plurality of general-purpose training data; training a task-specific machine learning model by tuning the pre-trained machine learning model based on a plurality of task-specific training data corresponding to a task; constructing a student network based on the task-specific machine learning model; simultaneously performing (1) knowledge distillation from the trained task-specific machine learning model as a teacher network to the student network and (2) network pruning on the student network; and obtaining the trained student network for serving the task.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to BART RYLANDER, whose telephone number is (571) 272-8359. The examiner can normally be reached Monday through Thursday, 8:00 to 5:30.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
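The Yan reference above walks through a concrete recipe: fine-tune a pre-trained model for a task, then simultaneously distill it into a student network while pruning that student. A loose illustration of the two ingredients, a Hinton-style temperature-softened distillation loss and magnitude pruning; this is a generic sketch of those well-known techniques, not the patented method:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperature softens the distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student
    output distributions: the knowledge-distillation objective."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def magnitude_prune(weights, sparsity=0.5):
    """Network pruning by magnitude: zero out the smallest-magnitude
    fraction of weights."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k] if k else 0.0
    return [0.0 if abs(w) < threshold else w for w in weights]
```

In practice both steps run inside one training loop: prune, then keep minimizing the distillation loss so the surviving weights compensate, which is the "simultaneously performing" language in the reference.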
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Miranda Huang, can be reached at 571-270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Bart I Rylander/
Examiner, Art Unit 2124

Prosecution Timeline

Jul 24, 2023: Application Filed
Mar 06, 2026: Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12555002: RULE GENERATION FOR MACHINE-LEARNING MODEL DISCRIMINATORY REGIONS (granted Feb 17, 2026; 2y 5m to grant)
Patent 12530572: Method for Configuring a Neural Network Model (granted Jan 20, 2026; 2y 5m to grant)
Patent 12530622: GENERATING NEW DATA BASED ON CLASS-SPECIFIC UNCERTAINTY INFORMATION USING MACHINE LEARNING (granted Jan 20, 2026; 2y 5m to grant)
Patent 12493826: AUTOMATIC MACHINE LEARNING FEATURE BACKWARD STRIPPING (granted Dec 09, 2025; 2y 5m to grant)
Patent 12488318: EARNING CODE CLASSIFICATION (granted Dec 02, 2025; 2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 62%
With Interview: 77% (+15.0%)
Median Time to Grant: 3y 10m
PTA Risk: Low
Based on 109 resolved cases by this examiner. Grant probability derived from career allow rate.
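These projections combine by simple percentage-point arithmetic: the 62% base figure is the career allow rate (68 granted of 109 resolved rounds to 62%), and adding the +15.0-point interview lift gives the 77% with-interview figure. A sketch, assuming the dashboard adds the lift in percentage points and caps the result at 100%:

```python
def with_interview(base_rate_pct, lift_pts):
    """Add an interview lift, expressed in percentage points, to a base
    grant probability, capping at 100%."""
    return min(base_rate_pct + lift_pts, 100.0)

# Career allow rate: 68 granted of 109 resolved cases -> 62 (rounded)
base = round(68 / 109 * 100)
adjusted = with_interview(base, 15.0)  # 62 + 15.0 points -> 77.0
```

Note the lift is additive in points, not multiplicative; a 15% relative improvement on 62% would give roughly 71%, not the 77% shown.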
