Last updated: May 29, 2026
Application No. 18/318,616
MODEL DISTILLATION TRAINING METHOD, RELATED APPARATUS AND DEVICE, AND READABLE STORAGE MEDIUM

Non-Final OA §103 §112
Filed
May 16, 2023
Priority
Nov 17, 2020 (continuation of PCT/CN2020/129478)
Examiner
CHEN, KUANG FU
Art Unit
2143
Tech Center
2100 — Computer Architecture & Software
Assignee
Huawei Technologies Co., Ltd.
OA Round
1 (Non-Final)
Interview Optional

Examiner has a relatively high allowance rate (81%) and a +65.1% interview lift. A written response may suffice.
Based on 258 resolved cases, 2023–2026
Examiner Intelligence

CHEN, KUANG FU
Grants 81% — above average
Career Allowance Rate
209 granted / 258 resolved
+26.0% vs TC avg
Strong interview lift: +65.1% (resolved cases with an interview vs. without)
Typical timeline
2y 11m
Avg Prosecution
26 currently pending
Career history
292
Total Applications
across all art units
Statute-Specific Performance

§101: 8.4% (-31.6% vs TC avg)
§103: 82.7% (+42.7% vs TC avg)
§102: 5.2% (-34.8% vs TC avg)
§112: 3.2% (-36.8% vs TC avg)
Tech Center averages are estimates. Based on career data from 258 resolved cases.
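The dashboard figures above can be cross-checked with a little arithmetic. The sketch below assumes that each "vs TC avg" delta is expressed in percentage points (the page does not say so explicitly); the variable names are illustrative only.

```python
# Cross-check of the dashboard figures quoted above.
# Assumption: "vs TC avg" deltas are percentage points, not relative changes.

granted, resolved = 209, 258
allowance_rate = round(100 * granted / resolved, 1)  # career allowance rate in %

# (examiner rate in %, reported delta vs TC avg) per statute, copied from the page
statute_rows = {
    "101": (8.4, -31.6),
    "103": (82.7, 42.7),
    "102": (5.2, -34.8),
    "112": (3.2, -36.8),
}

# Implied Tech Center baseline = examiner rate minus the reported delta
implied_tc_avg = {
    statute: round(rate - delta, 1) for statute, (rate, delta) in statute_rows.items()
}

print(allowance_rate)
print(implied_tc_avg)
```

Under that assumption, 209 of 258 rounds to the 81% shown on the page, and all four statute rows imply the same Tech Center baseline, which suggests the site applies a single estimated average rather than a per-statute one.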
Office Action

§103 §112
DETAILED ACTION

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This action is responsive to the application filed and claims elected 3/12/2026.
Claims 1-10 are presented for examination.

Election/Restrictions
Claims 1 and 7-10 are withdrawn from consideration as they are directed to nonelected inventions, there being no allowable generic or linking claim. Election of claims 2-6 was made without traverse in the reply filed on 3/12/2026.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 8/23/2023 and 12/3/2023 have been considered by the examiner.

Drawings
Figure 1 should be designated by a legend such as --Prior Art-- because only that which is admitted to be conventional/old technology (the central network element directly delivering a large neural network to the edge network element, and/or distilling the same small neural network and delivering it to the edge network element) is illustrated. See MPEP § 608.02(g) and 37 CFR 1.84(p)(1). Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to this Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either "Replacement Sheet" or "New Sheet" pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Specification
The disclosure is objected to because of the following informalities: (i) the specification [0050] contains the misspelling "a first devie design the first neural network model" — the word "devie" should read "device"; and (ii) the specification [0098] states "This helps a second device design the second neural network model based on the structure information of the second reference neural network model." This statement is inconsistent with the remainder of the disclosure, which uniformly attributes the design of the small (second) neural network model to the first device (edge network element). The passage should be conformed to read "This helps the first device design the second neural network model." Appropriate correction is required.

Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 2-6 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.

Independent claim 2 (lines 11-12) recites in part "the distillation result notification" in the third sending step. There is insufficient antecedent basis for this limitation in the claim. Only "a distillation notification" is previously recited; no antecedent "a distillation result notification" appears in claim 2 or in any claim from which claim 2 depends. It is unclear whether "the distillation result notification" is intended to refer to the just-recited "distillation notification" or to a separate notification. Thus, claim 2 is indefinite. For the purposes of examination, said limitation is interpreted as “the distillation notification”.

Dependent claim 3 (lines 4-5) recites in part "the second data information" within the second sending step. There is insufficient antecedent basis for this limitation in the claim. Neither parent claim 2 nor the earlier portion of claim 3 introduces "a second data information." It is further unclear whether "the second data information" is intended to be a separate transmission step (omitted) or a typographical substitute for "the second configuration information" (in which case the limitation is internally redundant). Thus, claim 3 is indefinite. For the purposes of examination, said limitation is interpreted as “a second data information”.

Regarding dependent claims 4-6, these claims depend from claims 2-3 and do not cure their deficiencies; thus claims 4-6 are also rejected under 35 U.S.C. 112(b) for at least being dependent on a rejected parent claim.
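The antecedent-basis reasoning above can be illustrated with a rough mechanical check: a term introduced with "the" should have been introduced earlier with "a" or "an". The sketch below is a naive text heuristic, not how an examiner or any official tool evaluates claims; the function name is hypothetical, and the excerpt is the claim 2 fragment as quoted elsewhere in this action.

```python
import re


def lacks_antecedent(claim_text: str, term: str) -> bool:
    """Return True if 'the <term>' appears with no earlier 'a/an <term>'."""
    text = claim_text.lower()
    escaped = re.escape(term.lower())
    definite = re.search(rf"\bthe\s+{escaped}\b", text)
    if definite is None:
        return False  # the term is never referenced with "the"
    earlier = text[: definite.start()]
    # Look for an indefinite introduction before the first definite reference
    return re.search(rf"\b(?:a|an)\s+{escaped}\b", earlier) is None


# Claim 2 fragment as quoted in the Office Action
excerpt = (
    "sending, by the first device, a distillation notification to the third "
    "device, the distillation result notification indicating whether the "
    "first device successfully matches the second device"
)

flagged = lacks_antecedent(excerpt, "distillation result notification")
not_flagged = lacks_antecedent(excerpt, "distillation notification")
print(flagged, not_flagged)
```

The check flags "distillation result notification" because only "a distillation notification" precedes it, which mirrors the examiner's observation for claim 2.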

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 2-4 and 6 are rejected under 35 U.S.C. 103 as being unpatentable over 3GPP TR 23.700-91 V1.0.0 "Study on Enablers for Network Automation for the 5G System (5GS); Phase 2 (Release 17)" (September 2020) (hereinafter 3GPP TR 23.700-91), in view of Hinton et al. "Distilling the Knowledge in a Neural Network" (2015) (hereinafter Hinton), and further in view of 3GPP TS 29.510 V16.4.0 (2020-07), “5G; 5G System; Network function repository services; Stage 3 (Release 16)” (July 2020) (hereinafter 3GPP TS 29.510).

Regarding independent claim 2, 3GPP TR 23.700-91 teaches sending, by a first device, a second training request to a third device (Section 6.56.1.1 on p. 231 "A provider NWDAF [network data analytics function] instance provides the trained data model to consumer NWDAF instances via data model provision service. It registers its capability to expose a trained data model in the NRF [network repository function]. A consumer NWDAF instance discovers the address of the provider NWDAF instance by inquiring the NRF", Section 6.56.1.2 step 4 on p. 232 "Consumer NWDAF sends discovery request of 'dataModelProvision service' with a list of service parameters (e.g. model type, analytics ID, feature sets/input event IDs, etc.) to NRF"; the Consumer NWDAF construed as the first device, the NRF construed as the third device, the discovered Provider NWDAF instance construed as the second device, and the dataModelProvision-service discovery request from the Consumer NWDAF to the NRF construed as the second training request), the second training request comprising a fourth training type ID, second distillation query information, and second distillation capability information (Section 6.56.1.1 on p. 231 “Input: Model type(algorithm), model inputs(e.g. event ID/feature set) and outputs(e.g. analytics ID)", Section 6.56.1.2 step 4 on p. 232 "…a list of service parameters (e.g. model type, analytics ID, feature sets/input event IDs, etc.)"; the model type / algorithm element of the service-parameter list construed as the fourth training type ID carried by the dataModelProvision-service discovery request as the second training request, the model inputs (event ID / feature set) element corresponds to the second distillation query information, and the model outputs (analytics ID) element identifying the analytics task the model is capable of producing construed as the second distillation capability information); receiving, by the first device, a third response sent by the third device when the fourth training type ID is consistent with a third training type ID (Section 6.56.1.2 step 1 on p. 232 "Provider NWDAF registers its trained model provision capability (i.e. 'dataModelProvision service' with a list of supported model) as part of its profile in the NRF", Section 6.56.1.2 step 5 on p. 232 "NRF response with the NWDAF instance which provides the requested 'dataModelProvision service'"; the model type registered in the Provider NWDAF's NRF profile corresponds to the third training type ID, the NRF's matching of the discovery-request model type against the registered profile model type reads on when the fourth training type ID is consistent with a third training type ID, and the NRF's response identifying the matching Provider NWDAF instance corresponds to the receiving by the first device of a third response sent by the third device), the third response comprising training response information, a third neural network model ID, second storage information, and a second category list (Section 6.56.1.1 on p. 231 “Output: Model description or requested model parameters", Section 6.56.1.2 step 5 on p. 232 NRF response identifying the discovered NWDAF instance, Section 6.56.1.2 step 1 on p. 232 Provider NWDAF profile includes "a list of supported model"; the NRF's response identifying the discovered Provider NWDAF instance corresponds to the third response comprising training response information, the model description portion of the Output identifying the trained data model held by the discovered Provider NWDAF corresponds to a third neural network model ID, the requested-model-parameters portion of the Output comprising data describing what the consumer is to store regarding the trained model and construed as second storage information, and the list of supported models registered as part of the Provider NWDAF’s profile in the NRF, including the analytics IDs the model can produce, construed as a second category list).
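The register-then-discover exchange that the examiner maps onto claim 2 can be sketched as a toy registry. This is a minimal illustration under stated assumptions: `ToyNRF`, its methods, and the provider and model-type strings are hypothetical names invented here, and nothing below reproduces the 3GPP-defined service API or its message formats.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNRF:
    """Hypothetical in-memory stand-in for the NRF; not the 3GPP-defined API."""

    profiles: dict = field(default_factory=dict)  # provider id -> set of model types

    def register(self, provider_id: str, supported_model_types: set) -> None:
        # Step 1 in the cited flow: a provider registers its supported model list
        self.profiles[provider_id] = set(supported_model_types)

    def discover(self, requested_model_type: str) -> list:
        # Steps 4-5: answer with providers whose registered type matches the request
        return sorted(
            provider_id
            for provider_id, types in self.profiles.items()
            if requested_model_type in types
        )


nrf = ToyNRF()
nrf.register("provider-nwdaf-1", {"classification"})
nrf.register("provider-nwdaf-2", {"regression"})

matches = nrf.discover("classification")  # consumer's requested type
no_matches = nrf.discover("forecasting")  # no registered provider supports this
print(matches, no_matches)
```

In the examiner's reading, the requested type plays the role of the fourth training type ID, the registered type plays the role of the third training type ID, and a non-empty discovery result corresponds to the third response.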
3GPP TR 23.700-91 does not expressly teach a model distillation training method, comprising: the fourth training type ID indicating a function type of a neural network model on which the first device is to perform distillation training; the third training type ID indicating a function type of a neural network model on which a second device supports distillation training. The Section 6.56.1.2 NOTE on p. 232 states that “the data model itself is provided to the consumer NWDAF instances in a file/transparent container,” which characterizes the model exchange as opaque-container model sharing but is silent as to the underlying knowledge-transfer mechanism.
However, Hinton teaches a model distillation training method (Abstract "compress the knowledge in an ensemble into a single model which is much easier to deploy", Section 1 "Once the cumbersome model has been trained, we can then use a different kind of training, which we call 'distillation' to transfer the knowledge from the cumbersome model to a small model that is more suitable for deployment"; the disclosed cumbersome-to-small-model knowledge transfer reads on a model distillation training method), comprising: the fourth training type ID indicating a function type of a neural network model on which the first device is to perform distillation training; the third training type ID indicating a function type of a neural network model on which a second device supports distillation training (Section 1 "For tasks like speech and object recognition, training must extract structure from very large, highly redundant datasets", Section 2 Eq. 1 defining the softmax function over class logits and "In the simplest form of distillation, knowledge is transferred to the distilled model by training it on a transfer set"; Hinton supplies the substantive mechanism by which the model type identifier of 3GPP TR 23.700-91's dataModelProvision service is qualified as a function type identifier specifically for distillation training (the third training type ID indicating a function type of a neural network model), with the cumbersome / generalist model on the second device supporting the distillation training (on which a second device supports distillation training) and the small / distilled model on the first device being trained therefrom, such that the model type identifier (comprising: the fourth training type ID) in the discovery exchange identifies the function the small model is to perform via distillation from the cumbersome model (indicating a function type of a neural network model on which the first device is to perform distillation training)).
Because 3GPP TR 23.700-91 and Hinton both address efficient deployment of neural-network-based analytics services in resource-constrained environments, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate Hinton’s cumbersome-to-small-model distillation training into 3GPP TR 23.700-91’s method of dataModelProvision-service discovery and consumption, with a reasonable expectation of success, such that the dataModelProvision service’s model type identifier is qualified as a function-type identifier for distillation training and the registered Provider NWDAF is qualified as the second device that supports distillation training on the model identified by that type, thereby teaching a model distillation training method, comprising: the fourth training type ID indicating a function type of a neural network model on which the first device is to perform distillation training; the third training type ID indicating a function type of a neural network model on which a second device supports distillation training. This modification would have been motivated by the desire to enable a 5G NF consumer constrained by deployment-side compute resources to discover, from the NRF, a peer NWDAF that holds a cumbersome reference model suitable for distillation, since the 3GPP TR 23.700-91 Section 6.56.1.2 NOTE’s opaque-container exchange leaves open the consumer-side training mechanism and Hinton Section 1 expressly teaches that distillation enables deployment of a model “that is more suitable for deployment”.
3GPP TR 23.700-91 and Hinton do not expressly teach sending, by the first device, a distillation notification to the third device, the distillation result notification (interpreted as the distillation notification per the 35 U.S.C. 112(b) rejection above) indicating whether the first device successfully matches the second device.
However, 3GPP TS 29.510 teaches sending, by the first device, a distillation notification to the third device, the distillation notification indicating whether the first device successfully matches the second device (Clause 5.2.1 on p. 14 “The Nnrf_NFManagement service allows an NF or an SCP Instance in the serving PLMN to register, update or deregister its profile in the NRF”, Clause 5.2.2.1 on p. 15 “NFUpdate: It allows an NF or SCP Instance to replace, or update partially, the parameters of its profile (including the parameters of the associated services, if any) in the NRF; it also allows to add or delete individual services offered by the NF Instance”, Clause 5.2.2.3 on p. 17 disclosing the NFUpdate service operation as a profile-update transaction issued by the NF Service Consumer to the NRF, Clause 5.2.2.3.2 on p. 19 “Each NF that has previously registered in NRF shall contact the NRF periodically (heart-beat), by invoking the NFUpdate service operation, in order to show that the NF is still operative”, Clause 5.2.1 on p. 14 “It also allows an NF or an SCP to subscribe to be notified of registration, deregistration and profile changes of NF Instances, along with their potential NF services”; the Nnrf_NFManagement service exposed by the NRF to consumer NFs is construed as the NF→NRF signaling channel by which the first device (Consumer NWDAF) reports state to the third device (NRF), the Nnrf_NFManagement_NFUpdate service operation issued by the Consumer NWDAF upon completing the post-discovery request/subscribe step of 3GPP TR 23.700-91 Section 6.56.1.2 step 6 corresponds to the sending by the first device of a distillation notification to the third device, and the post-binding state reflected in the NFUpdate-carried profile/status indication of the Consumer NWDAF with respect to the discovered Provider NWDAF corresponds to indicating whether the first device successfully matches the second device).
Because 3GPP TR 23.700-91, in view of Hinton, and 3GPP TS 29.510 address the issue of efficient discovery, deployment, and operation of neural-network-based analytics services in the resource-constrained environments of the 5G Service-Based Architecture, accordingly, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to incorporate the teachings of Nnrf_NFManagement_NFUpdate NF→NRF status reporting as suggested by 3GPP TS 29.510 into 3GPP TR 23.700-91 and Hinton’s method of dataModelProvision-service discovery and consumption, with a reasonable expectation of success, such that the Consumer NWDAF, upon completing the post-discovery request/subscribe step of 3GPP TR 23.700-91 Section 6.56.1.2 step 6, invokes Nnrf_NFManagement_NFUpdate to inform the NRF of the binding outcome to the discovered Provider NWDAF, thereby teaching sending, by the first device, a distillation notification to the third device, the distillation notification indicating whether the first device successfully matches the second device. This modification would have been motivated by the desire to allow the NRF to release or maintain the discovery context based on the reported binding outcome, propagate Provider NWDAF utilization/binding state to other NFs via the NRF’s existing subscribe/notify fabric of TS 29.510 clause 5.2.1, and enable Consumer-NWDAF-side retry/fallback to an alternative Provider NWDAF where matching fails, as 3GPP TR 23.700-91 expressly contemplates such NF–NRF interaction by cross-referencing companion 3GPP specifications (e.g., TS 23.501 clause 6.2.6, TS 23.288 clause 5.1) for foundational NF behavior, and Hinton expressly teaches at Section 1 the deployment concern of “stringent requirements on latency and computational resources” that the NF→NRF status-reporting primitive of TS 29.510 directly addresses by enabling efficient resource management for the resource-constrained Consumer NWDAF.
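The retry/fallback behavior named in the motivation statement above can be sketched as a simple loop that records a match outcome for each candidate provider. This is an illustration only: `bind_with_fallback`, the notification dictionary shape, and the provider names are hypothetical, and the sketch does not model the NFUpdate operation or any real NRF message.

```python
def bind_with_fallback(candidates, try_bind):
    """Try each discovered provider in order and record the match outcome.

    candidates: provider identifiers returned by discovery (hypothetical strings).
    try_bind: callable returning True when binding to a provider succeeds.
    Returns (chosen provider or None, list of outcome notifications).
    """
    notifications = []
    for provider_id in candidates:
        matched = bool(try_bind(provider_id))
        # Each entry stands in for a report to the registry of the match outcome
        notifications.append({"provider": provider_id, "matched": matched})
        if matched:
            return provider_id, notifications
    return None, notifications


# Illustrative run: the first provider fails, the second succeeds
chosen, notes = bind_with_fallback(
    ["provider-nwdaf-1", "provider-nwdaf-2"],
    lambda provider_id: provider_id == "provider-nwdaf-2",
)
print(chosen, notes)
```

Each recorded outcome corresponds loosely to the claimed notification "indicating whether the first device successfully matches the second device".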

Regarding dependent claim 3, 3GPP TR 23.700-91, in view of Hinton and 3GPP TS 29.510, teach the method according to claim 2, the method further comprising: designing, by the first device, a second neural network model (see Hinton Abstract "compress the knowledge in an ensemble into a single model which is much easier to deploy", Section 1 "a small model ... that is more suitable for deployment"; the small / distilled model designed by the consumer side for deployment construed as a second neural network model designed by the first device, wherein the combination of 3GPP TR 23.700-91 and Hinton provides for the first device as the consumer NWDAF [network data analytics function] see 3GPP TR 23.700-91 Section 6.56.1.1 on p. 231); sending, by the first device, second configuration information to the second device (see 3GPP TR 23.700-91 Section 6.56.1.2 step 6 on p. 232 "Consumer NWDAF requests/subscribes to the 'dataModelProvision service' of the discovered provider NWDAF instance", Section 6.56.1.2 step 7 on p. 232 "The discovered provider NWDAF instance response with the requested trained data model/model parameters", Section 6.56.1.1 on p. 231 “Input: Model type(algorithm), model inputs(e.g. event ID/feature set) and outputs(e.g. analytics ID), Optional: area of interests, UE types, application ID, NSSAI, time, Optional: requested model parameters Output: Model description or requested model parameters"; the Consumer NWDAF's request / subscribe message to the discovered Provider NWDAF carrying the dataModelProvision-service Input parameter list (sending, by the first device, second configuration information to the second device)), the second configuration information to configure a second reference neural network model (see Hinton Section 1 "to raise the temperature of the final softmax until the cumbersome model produces a suitably soft set of targets", Section 2 Eq. 1 "q_i = exp(z_i / T) / Σ_j exp(z_j / T)" with the description "where T is a temperature that is normally set to 1. Using a higher value for T produces a softer probability distribution over classes"; the temperature parameter T applied to the cumbersome model's softmax to shape its output distribution (the second configuration information to configure a second reference neural network model), where the cumbersome model corresponds to the second reference neural network model), the second data information (interpreted as a second data information per the 35 U.S.C. 112(b) rejection set forth above) comprising second sample data for distillation training by the second reference neural network model (see Hinton Section 2 "In the simplest form of distillation, knowledge is transferred to the distilled model by training it on a transfer set and using a soft target distribution for each case in the transfer set that is produced by using the cumbersome model with a high temperature in its softmax"; the transfer set used to elicit soft targets from the cumbersome model (a second data information comprising the second sample data for distillation training by the second reference neural network model)); and receiving, by the first device, second indication information returned by the second device (see 3GPP TR 23.700-91 Section 6.56.1.2 step 7 on p. 232 "The discovered provider NWDAF instance response with the requested trained data model/model parameters", Section 6.56.1.1 on p. 231 “Output: Model description or requested model parameters"; the consumer NWDAF receiving the provider NWDAF's response carrying the requested trained data model / model parameters); and training the second neural network model with the second indication information (see Hinton Section 2 "We then use the same high temperature when training the small model to match these soft targets"; training the small / distilled model to match the soft targets produced by the cumbersome model), the second indication information being obtained by processing the second sample data by the second reference neural network model (see Hinton Section 2 "a soft target distribution for each case in the transfer set that is produced by using the cumbersome model with a high temperature in its softmax"; the soft target distribution produced by the cumbersome model running on the transfer set corresponds to the second indication information being obtained by processing the second sample data by the second reference neural network model).
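The temperature softmax quoted from Hinton Eq. 1 and the soft-target matching step can be sketched in a few lines. This is a minimal illustration, not the applicant's or Hinton's implementation: the function names and the example logits are assumptions, and the loss shown is only the soft-target cross-entropy term (it omits the hard-label term and any rescaling that a full training recipe might use).

```python
import math


def softmax_with_temperature(logits, T=1.0):
    """q_i = exp(z_i / T) / sum_j exp(z_j / T), computed in a numerically stable way."""
    scaled = [z / T for z in logits]
    peak = max(scaled)  # subtracting the max leaves the result unchanged
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_cross_entropy(student_logits, teacher_logits, T):
    """Cross-entropy between teacher soft targets and student outputs at temperature T."""
    targets = softmax_with_temperature(teacher_logits, T)
    outputs = softmax_with_temperature(student_logits, T)
    return -sum(p * math.log(q) for p, q in zip(targets, outputs))


teacher_logits = [5.0, 2.0, 0.5]  # hypothetical logits from the cumbersome model
hard = softmax_with_temperature(teacher_logits, T=1.0)
soft = softmax_with_temperature(teacher_logits, T=20.0)  # softer distribution

loss_matched = soft_target_cross_entropy(teacher_logits, teacher_logits, T=20.0)
loss_mismatched = soft_target_cross_entropy([0.5, 2.0, 5.0], teacher_logits, T=20.0)
print(hard, soft, loss_matched, loss_mismatched)
```

Raising T flattens the teacher's distribution, which is the sense in which the examiner treats the temperature as configuration of the reference model; the loss is smallest when the student's outputs at the same temperature match the teacher's soft targets.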

Regarding dependent claim 4, 3GPP TR 23.700-91, in view of Hinton and 3GPP TS 29.510, teach the method according to claim 3, further comprising: sending, by the first device, a second category of interest list to the second device, the second category of interest list comprising a set of categories in which the first device is configured for distillation training, the set of categories a subset of a category set in a second category list, the second category list comprising a set of preset categories of the second reference neural network model (see 3GPP TR 23.700-91 Section 6.56.1.2 step 6 on p. 232 "Consumer NWDAF requests/subscribes to the 'dataModelProvision service' of the discovered provider NWDAF instance", Section 6.56.1.1 on p. 231 Input "Optional: area of interests, UE types, application ID, NSSAI, time"; see Hinton Abstract "specialist models which learn to distinguish fine-grained classes that the full models confuse", Section 5.2 "each of which is trained on data that is highly enriched in examples from a very confusable subset of the classes", Section 5.2 "each specialist model is initialized with the weights of the generalist model. These weights are then slightly modified by training the specialist with half its examples coming from its special subset and half sampled at random from the remainder of the training set", Section 5.3 "a set of classes Sm that are often predicted together will be used as targets for one of our specialist models, m"; when the consumer NWDAF is functioning as such a specialist with respect to the Provider NWDAF’s generalist model, the consumer must communicate the special-class subset to the provider so that distillation can be conditioned on those classes. 
The 3GPP TR 23.700-91 Input list already accommodates optional consumer-specified scope parameters (“area of interests, UE types, application ID, NSSAI, time”), and adding an analogous category-of-interest parameter would teach sending, by the first device, a second category of interest list to the second device).
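The subset relationship recited in claim 4 (a category-of-interest list drawn from the reference model's preset category list) can be sketched as a small validation step. The function name and the category labels are hypothetical and chosen only for illustration.

```python
def select_categories_of_interest(category_list, requested):
    """Return the requested categories in the reference model's own order.

    Raises ValueError if any requested category is not in the preset list,
    since the claim requires the categories of interest to be a subset.
    """
    available = set(category_list)
    unknown = [c for c in requested if c not in available]
    if unknown:
        raise ValueError(f"not in the reference model's category list: {unknown}")
    wanted = set(requested)
    return [c for c in category_list if c in wanted]


# Hypothetical preset categories of the reference (teacher) model
second_category_list = ["cat", "dog", "car", "truck", "bicycle"]

categories_of_interest = select_categories_of_interest(
    second_category_list, ["truck", "car"]
)
print(categories_of_interest)
```

A provider receiving such a list could then restrict its soft targets to those classes, which is the conditioning the examiner describes by analogy to Hinton's specialist models.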

Regarding dependent claim 6, 3GPP TR 23.700-91, in view of Hinton and 3GPP TS 29.510, teach the method according to claim 3, wherein the designing, by the first device, a second neural network model comprises (see Hinton Abstract "compress the knowledge in an ensemble into a single model which is much easier to deploy", Section 1 "a small model ... that is more suitable for deployment"; the small / distilled model designed by the consumer side for deployment construed as a second neural network model designed by the first device, wherein the combination of 3GPP TR 23.700-91 and Hinton provides for the first device as the consumer NWDAF [network data analytics function] see 3GPP TR 23.700-91 Section 6.56.1.1 on p. 231): sending, by the first device, a second network structure request to the second device to obtain structure information of the second reference neural network model from the second device; receiving, by the first device, a second structure request response sent by the second device, the second structure request response comprising the structure information of the second reference neural network model (see 3GPP TR 23.700-91 Section 6.56.1.2 step 6 on p. 232 "Consumer NWDAF requests/subscribes to the 'dataModelProvision service' of the discovered provider NWDAF instance", Section 6.56.1.2 step 7 on p. 232 "The discovered provider NWDAF instance response with the requested trained data model/model parameters", Section 6.56.1.1 on p. 231 Output "Model description or requested model parameters"; the Consumer NWDAF's request to the discovered Provider NWDAF (sending by the first device a second network structure request to the second device) for the trained data model and its parameters wherein the Provider NWDAF's response carrying the trained data model / model parameters and the model description / requested model parameters returned in that response (to obtain structure information of the second reference neural network from the second device; receiving, by the first device, a second structure request response sent by the second device, the second structure request response comprising the structure information of the second reference neural network model)); and designing, by the first device, the second neural network model based on the structure information of the second reference neural network model (see Hinton Section 3 "we trained a single large neural net with two hidden layers of 1200 rectified linear hidden units on all 60,000 training cases", Section 3 "net with two hidden layers of 800 rectified linear hidden units and no regularization achieved 146 errors. But if the smaller net was regularized solely by adding the additional task of matching the soft targets produced by the large net at a temperature of 20, it achieved 74 test errors", Section 5.2 "each specialist model is initialized with the weights of the generalist model"; designing the small / distilled model with the same overall architecture form (two hidden layers) as the cumbersome model, sharing layer count while reducing units per layer (1200 → 800), and initializing the specialist's parameters from the cumbersome model's weights (and designing, by the first device, the second neural network model based on the structure information of the second reference neural network model)).
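The examiner's reading of "designing based on structure information" (same layer count, fewer units per layer, as in the 1200 to 800 example quoted from Hinton) can be sketched as a width-scaling helper. The function name and the `width_ratio` parameter are assumptions made for illustration; neither the application nor Hinton prescribes this particular rule.

```python
def design_student_layers(teacher_hidden_units, width_ratio):
    """Keep the teacher's layer count and scale each hidden layer's width.

    teacher_hidden_units: units per hidden layer of the reference model.
    width_ratio: hypothetical scaling factor between 0 (exclusive) and 1.
    """
    if not 0 < width_ratio <= 1:
        raise ValueError("width_ratio must be in (0, 1]")
    return [max(1, round(units * width_ratio)) for units in teacher_hidden_units]


# Structure information of the reference model, per the Hinton Section 3 quote
teacher_layers = [1200, 1200]

student_layers = design_student_layers(teacher_layers, width_ratio=2 / 3)
print(student_layers)
```

With a ratio of two thirds, the two 1200-unit layers become two 800-unit layers, matching the smaller network in the quoted passage.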

Allowable Subject Matter
Claim 5 is objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims and to overcome the rejection under 35 U.S.C. 112 set forth in this Office Action.
The following is the statement of reasons for the indication of allowable subject matter: The prior art of record, taken individually or in combination, does not expressly teach or render obvious the limitations recited in claim 5 when taken in the context of the claim as a whole. At best, the prior art of record, specifically 3GPP TR 23.700-91, teaches performing calculation processing on the second sample data based on the second reference neural network model, wherein the cumbersome reference model runs on the transfer set on the provider side (Section 6.56.1.2 step 6 on p. 232 Consumer-to-Provider request, Section 6.56.1.2 step 7 on p. 232 “The discovered provider NWDAF instance response with the requested trained data model/model parameters”, Section 6.56.1.1 on p. 231 Output “Model description or requested model parameters”); Hinton contemplates teacher-side production of soft targets shaped by the specialist’s class subset: “The specialists that we used in our experiments on the JFT dataset collapsed all of their non-specialist classes into a single dustbin class” (Section 6.1), and “if a specialist is initialized with the weights of the generalist, we can make it retain nearly all of its knowledge about the non-special classes by training it with soft targets for the non-special classes in addition to training it with hard targets. The soft targets can be provided by the generalist” (Section 6.1); and 3GPP TS 29.510 discloses the stage 3 protocol and data model for the Nnrf Service Based Interface, provides stage 3 protocol definitions and message flows, and specifies the API for each service offered by the NRF (Section 1 Scope).

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
ZHANG, US 2022/0046101 (Feb. 10, 2022) (Abstract: Aspects of the disclosure provide methods and apparatuses for network data analytics. In some examples, an apparatus includes processing circuitry. The processing circuitry transmits a network data analytics function (NWDAF) service discovery request to a network repository function (NRF) network element. The NWDAF service discovery request indicates a requested network data analysis service. The processing circuitry receives an NWDAF service discovery response in response to the NWDAF service discovery request. The NWDAF service discovery response includes performance parameter information of one or more NWDAF network elements for the requested network data analysis service. Further, the processing circuitry selects, according to the performance parameter information of the one or more NWDAF network elements for the requested network data analysis service, a target NWDAF network element used for providing the requested network data analysis service, and transmits an NWDAF service request to the target NWDAF network element).
Li et al. “FedMD: Heterogeneous Federated Learning via Model Distillation” (2019) (Abstract: Federated learning enables the creation of a powerful centralized model without compromising the data privacy of multiple participants. While successful, it does not incorporate the case where each participant independently designs its own model. Due to intellectual property concerns and heterogeneous nature of tasks and data, this is a widespread requirement in applications of federated learning to areas such as health care and AI as a service. In this work, we use transfer learning and knowledge distillation to develop a universal framework that enables federated learning when each agent owns not only their private data, but also uniquely designed models. We test our framework on the MNIST/FEMNIST dataset and the CIFAR10/CIFAR100 dataset and observe fast improvement across all participating models. With 10 distinct participants, the final test accuracy of each model on average receives a 20% gain on top of what’s possible without collaboration and is only a few percent lower than the performance each model would have obtained if all private datasets were pooled and made directly available for all participants).
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KUANG FU CHEN, whose telephone number is (571) 272-1393. The examiner can normally be reached Monday through Friday, 9:00 am to 5:30 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jennifer Welch, can be reached at (571) 272-7212. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/KC CHEN/
Primary Patent Examiner, Art Unit 2143
Prosecution Timeline

May 16, 2023: Application Filed
May 13, 2026: Non-Final Rejection mailed (§103, §112; current)
Precedent Cases

Applications granted by this same examiner with similar technology

17/708,834, Patent 12639636: INTEGRATED MACHINE LEARNING PREDICTION AND OPTIMIZATION FOR DECISION-MAKING (4y 1m to grant; granted May 26, 2026)
18/277,824, Patent 12639631: SPECTRAL CLUSTERING METHOD AND SYSTEM BASED ON UNIFIED ANCHOR AND SUBSPACE LEARNING (2y 9m to grant; granted May 26, 2026)
17/250,926, Patent 12626142: System and Method for Automated Design Space Determination for Deep Neural Networks (5y 1m to grant; granted May 12, 2026)
17/556,954, Patent 12626126: RECOMMENDER SYSTEM AND METHOD USING SHARED NEURAL ITEM REPRESENTATIONS FOR COLD-START RECOMMENDATIONS (4y 4m to grant; granted May 12, 2026)
17/521,204, Patent 12619835: ADAPTERS FOR ZERO-SHOT MULTILINGUAL NEURAL MACHINE TRANSLATION (4y 5m to grant; granted May 05, 2026)
Based on the 5 most recent grants.
Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 81%
With Interview (+65.1%): 99%
Median Time to Grant: 2y 11m (~0m remaining)
PTA Risk: Low
Based on 258 resolved cases by this examiner. Grant probability derived from career allowance rate.