Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Election/Restrictions
Claims 15, 18, 20, and 22-24 are withdrawn from further consideration pursuant to 37 CFR 1.142(b), as being drawn to a nonelected invention, there being no allowable generic or linking claim. Applicant timely traversed the restriction (election) requirement in the reply filed on 01/26/2026.
Applicant's election with traverse of claims 1-14 in the reply filed on 01/26/2026 is acknowledged. The traversal is on the ground(s) that Groups I-IV share a special technical feature. This is not found persuasive because, as shown herein, no special technical feature in claims 1-14 makes a contribution over the prior art, and therefore a special technical feature cannot be shared by Groups I-IV.
The requirement is still deemed proper and is therefore made FINAL.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 6-8 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Regarding claims 6-8, “the local ML model” lacks antecedent basis in the claims.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-10 and 13 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Lu (Lu, Yan, et al., "Collaborative learning between cloud and end devices: an empirical study on location prediction," Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019).
Regarding claim 1, Lu teaches a method performed by a client computing device, the method comprising:
obtaining a first machine learning (ML) model (Figure 2: M2);
training a second ML model using at least an input data set and the first ML model, wherein training the second ML model using at least the input data set and the first ML model comprises training the second ML model based at least on the input data set and a first output data set generated by the first ML model based on the input data set (Figure 2: m2^i is trained by distilling M2 with m1^i);
obtaining, as a result of training the second ML model, a third ML model (Figure 2: m2^i can be the third model once distillation is complete); and
deploying the third ML model (Figure 2: m2^i → inference).
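For context, the distillation flow mapped above can be sketched in code. The following is a minimal illustrative sketch of knowledge distillation (the technique relied upon), not Lu's implementation; all function and parameter names are hypothetical, and a one-parameter least-squares fit stands in for actual model training:

```python
# Illustrative sketch only (hypothetical names, not Lu's code):
# a "first ML model" (teacher, cf. Figure 2: M2) labels the input
# data set, and a "second ML model" (student) is trained to match
# those labels, yielding a trained "third ML model".

def first_model(x):
    """Stand-in teacher: a fixed function of the input."""
    return [2.0 * v for v in x]

def train_by_distillation(teacher, inputs, student_params):
    # First output data set: teacher outputs on the input data set.
    targets = [teacher(x) for x in inputs]
    # Toy "training": fit a single scale parameter by least squares
    # so student(x) = scale * x best matches the teacher outputs.
    num = sum(t[0] * x[0] for t, x in zip(targets, inputs))
    den = sum(x[0] * x[0] for x in inputs)
    student_params["scale"] = num / den
    return student_params  # the "third ML model" obtained from training

inputs = [[1.0], [2.0], [3.0]]
third_model = train_by_distillation(first_model, inputs, {"scale": 0.0})
print(third_model["scale"])  # → 2.0 (student recovers the teacher's scale)
```

In this toy setting the student exactly recovers the teacher; in practice the student is a smaller network trained by gradient descent on the teacher's soft outputs.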
Regarding claim 2, Lu teaches all of the limitations of claim 1, the method further comprising:
obtaining a fourth ML model, wherein the fourth ML model is configured to receive the input data set and to generate a second output data set based on the input data set (Figure 2: m1^i), wherein the training of the second ML model comprises training the second ML model based at least on the input data set, the first output data set, and the second output data set (Figure 2: m1^i is used in distilling m2^i because the output from m1^i (the second output data set) is distilled against the output of M2 (the first output data set) to train m2^i, all based on the input data set).
Regarding claim 3, Lu teaches all of the limitations of the method of claim 2, the method further comprising:
calculating an output average or a weighted output average of (i) data included in the first output data set and (ii) data included in the second output data set (see §3.3, where the distillation loss function can be cross-entropy loss or KL-divergence: cross-entropy loss involves an expected value, i.e. an average, of the outputs of each model, and KL-divergence measures the expectation of the logarithmic difference between the two output distributions, which likewise involves an expected value, i.e. an average, of the outputs), wherein the training of the second ML model comprises:
providing to the second ML model the input data set (Figure 2);
providing to the second ML model the calculated output average or the calculated weighted output average; and changing one or more parameters of the second ML model based at least on (i) the input data set and (ii) the calculated output average or the calculated weighted output average (see Figure 2 and §3: the distillation process provides, via the loss function, the output average of the client and cloud models, and because the second ML model is trained based on this loss, its parameters are changed based at least on the input data set and the output average).
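The averaging discussed above can be made concrete. The following is an illustrative sketch only, not Lu's code: it forms a weighted output average of two hypothetical teacher output distributions (standing in for the first and second output data sets) and scores a student distribution against that average with KL-divergence; all names and numeric values are invented for illustration:

```python
import math

def softmax(z):
    """Convert logits to a probability distribution."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def kl_divergence(p, q):
    """KL(p || q): the expectation, under p, of the log-ratio of
    the two distributions -- the 'average' discussed above."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical outputs of a cloud model (first output data set) and a
# client model (second output data set) on the same input.
first_output = softmax([2.0, 1.0, 0.1])
second_output = softmax([1.8, 1.2, 0.0])

# Weighted output average of the two teachers' distributions.
w = 0.5
avg_target = [w * a + (1 - w) * b for a, b in zip(first_output, second_output)]

# Distillation loss of a (here uniform) student distribution against
# the averaged target; training would adjust the student to reduce it.
student = softmax([1.0, 1.0, 1.0])
loss = kl_divergence(avg_target, student)
print(round(loss, 4))
```

Because the averaged target is itself a valid probability distribution, either cross-entropy or KL-divergence can be minimized against it during student training.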
Regarding claim 4, Lu teaches all of the limitations of claim 3 and thereby teaches all of the limitations of claim 4 because the limitations of claim 4 are not required under the broadest reasonable interpretation of the claim.
Per MPEP 2111.04, optional method steps are not required under the broadest reasonable interpretation of a method claim. The weighted output average is an optional step in claim 3 and therefore the definition thereof in claim 4 is optional.
Regarding claim 5, Lu teaches the method of claim 1, the method comprising:
receiving from a control entity global model information identifying a global ML model (Figure 2: M2);
training, based at least on the input data set or different input data set, the global ML model (§3.1, Stage 1);
as a result of training the global ML model, obtaining a local ML model (Figure 2: m2^i); and
transmitting toward the control entity local ML model information identifying the local ML model, wherein the local ML model is the first ML model or the second ML model (Figure 2: cloud distillation).
Regarding claim 6, Lu teaches all of the limitations of claim 1, wherein
the first ML model is one of the local ML model or a specific use-case model (Figure 2: M2 can be considered a specific use-case model), and the second ML model is a currently deployed ML model that is currently deployed at the client computing device (Figure 2).
Regarding claim 7, Lu teaches all of the limitations of claim 2, wherein
the first ML model is one of the local ML model or a specific use-case model (Figure 2: M2 can be considered a specific use-case model),
the second ML model is a currently deployed ML model that is currently deployed at the client computing device, and the fourth ML model is another one of (i) the local ML model and (ii) the specific use-case model (Figure 2: m1^i is a local model on a client device and m2^i is currently deployed on the client device).
Regarding claim 8, Lu teaches all of the limitations of claim 1, wherein
the first ML model is a specific use-case model or a currently deployed ML model that is currently deployed at the client computing device (Figure 2: M2 can be considered a specific use-case model), and
the second ML model is the local ML model (Figure 2: m2^i is currently deployed on the client device so can be considered local).
Regarding claim 9, Lu teaches all of the limitations of claim 6, the method further comprising:
receiving from a shared storage specific use-case model information identifying the specific use-case model (Figure 2: M2 is obtained from the cloud, i.e. shared storage, such that specific use-case model information identifying the model is received from the shared storage), wherein
the specific use-case model is shared among two or more client computing devices including the client computing device; and the shared storage is configured to be accessible by said two or more client computing devices (Figure 2: each client device accesses the cloud storage for cloud distillation and iterations of the specific use case model are shared to each client device).
Regarding claim 10, Lu teaches all of the limitations of claim 6, wherein
the deploying of the second ML model comprises replacing the currently deployed ML model with the second ML model as the model that is currently deployed at the client computing device (Figure 2: m2^i replaces m1^i).
Regarding claim 13, Lu teaches all of the limitations of claim 6, wherein
the specific use-case ML model is associated with any one or a combination of a particular season of a year, a particular time period within a year, a particular public event, and a particular value of the temperature of the area in which the client computing device is located (Figure 2: The use case model can be considered associated with a particular time period within a year in which it is deployed).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 11-12 are rejected under 35 U.S.C. 103 as being unpatentable over Lu (Lu, Yan, et al., "Collaborative learning between cloud and end devices: an empirical study on location prediction," Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019) in view of Agrawal (US20180341706A1).
Regarding claim 11, Lu teaches all of the limitations of claim 1, but does not teach wherein the input data set is stored in a local storage element; the local storage element is included in the client computing device; and the method further comprises, after deploying the third ML model, removing the input data set from the local storage element.
Agrawal teaches storing a dataset in a local storage element, the local storage element included in the client computing device, and deleting the dataset after a threshold period of time (¶36-38) in order to clear up memory (¶36-38).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Lu such that the input data set is stored in a local storage element; the local storage element is included in the client computing device; and the method further comprises, after deploying the third ML model, removing the input data set from the local storage element in order to clear space on the client device when necessary.
Regarding claim 12, Lu as modified teaches all of the limitations of claim 11, wherein
the input data set is removed from the local storage element in response to an occurrence of a triggering condition; and the occurrence of the triggering condition is any one or a combination of (i) that a predefined time has passed from the timing of storing the input data set at the local storage element, (ii) receiving a removing command signal from the control entity, and (iii) that the amount of storage space available at the local storage element is less than a threshold value (see ¶36-38 of Agrawal, where length of time and memory limitations are described).
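The three triggering conditions recited in claim 12 amount to a simple disjunction. The following is an illustrative sketch only, with hypothetical names; it is not anything disclosed by Lu or Agrawal:

```python
# Illustrative sketch of claim 12's triggering conditions
# (hypothetical names; times in seconds, sizes in bytes).

def should_remove(stored_at, now, max_age_s, remove_cmd, free_bytes, threshold):
    """True if any one (or a combination) of the triggers has occurred."""
    aged_out = (now - stored_at) >= max_age_s  # (i) predefined time has passed
    commanded = remove_cmd                     # (ii) removal command received
    low_space = free_bytes < threshold         # (iii) storage below threshold
    return aged_out or commanded or low_space

# Example: data stored at t=0, checked at t=100 with a 60 s limit
# is removed because trigger (i) has occurred.
print(should_remove(0, 100, 60, False, 10**9, 10**6))  # → True
```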
Claim 14 is rejected under 35 U.S.C. 103 as being unpatentable over Lu (Lu, Yan, et al., "Collaborative learning between cloud and end devices: an empirical study on location prediction," Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019) in view of Le (US20180234453A1).
Regarding claim 14, Lu teaches all of the limitations of claim 1, wherein the client computing device is a base station (the client computing device of Lu can be considered a base station).
Lu does not, however, teach that the third ML model is an ML model for predicting traffic load in a region associated with the base station.
Le teaches the application of a machine learning model for predicting traffic in a region associated with the base station (¶53).
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to apply the third machine learning model to traffic prediction for the base station in order to accurately predict traffic load for the base station.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Hinton, G., Vinyals, O., and Dean, J. (2015), "Distilling the knowledge in a neural network," arXiv preprint arXiv:1503.02531. – This paper on knowledge distillation discloses the general features of Applicant’s invention as presented in claim 1.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SCHYLER S SANKS whose telephone number is (571)272-6125. The examiner can normally be reached 06:30 - 15:30 Central Time, M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SCHYLER S SANKS/Primary Examiner, Art Unit 2129