Prosecution Insights
Last updated: April 19, 2026
Application No. 17/851,712

On-The-Fly Feeding of Personalized or Domain-Specific Submodels

Status: Final Rejection (§102, §103)
Filed: Jun 28, 2022
Examiner: SUSSMAN MOSS, JACOB ZACHARY
Art Unit: 2122
Tech Center: 2100 — Computer Architecture & Software
Assignee: Google LLC
OA Round: 2 (Final)

Grant Probability: 14% (At Risk)
Estimated OA Rounds: 3-4
Estimated Time to Grant: 3y 3m
With Interview: -6%

Examiner Intelligence

Career Allow Rate: 14% (grants only 14% of cases; 1 granted / 7 resolved; -40.7% vs TC avg)
Interview Lift: -20.0% (minimal), measured over resolved cases with interview
Typical Timeline: 3y 3m average prosecution; 26 applications currently pending
Career History: 33 total applications across all art units

Statute-Specific Performance

§101: 37.3% (-2.7% vs TC avg)
§103: 35.2% (-4.8% vs TC avg)
§102: 11.9% (-28.1% vs TC avg)
§112: 15.5% (-24.5% vs TC avg)

Based on career data from 7 resolved cases; Tech Center average used as the baseline estimate.
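As a quick sanity check on the headline figure, the 14% career allow rate follows directly from the 1 granted / 7 resolved count (an illustrative computation, not part of the record):

```python
granted, resolved = 1, 7
allow_rate = granted / resolved     # career allow rate as a fraction
percent = round(allow_rate * 100)   # -> 14, matching the dashboard
```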

Office Action

§102 §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

This action is in response to amendments filed November 26, 2025, in which claims 1-2, 5-6, and 13-14 have been amended, claims 21-26 have been added, and claims 15-20 have been cancelled. The amendments have been entered, and claims 1-14 and 21-26 are currently pending in the case. Claims 1, 21, and 26 are independent claims.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-14 and 21-26 are rejected under 35 U.S.C. 103 as being unpatentable over Sandler et al. (US 20200104706 A1) (hereinafter “Sandler”) in view of Hu et al. ("LoRA: Low-Rank Adaptation of Large Language Models", 17 Jun 2021) (as cited in the IDS, hereinafter "Hu").

Regarding claim 1:

Sandler teaches [a] computing system for more efficient use of computational resources to deploy machine learning across different users, domains, context, or tasks, the computing system comprising: one or more processors (Sandler, ¶32 “The computer system can include one or more processors;”); and one or more non-transitory memories that store instructions that when executed by the one or more processors cause the computing system to perform operations, the operations comprising (Sandler, ¶32 “one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computer system to perform any of the methods described herein.”): loading a machine-learned base model into a volatile memory of the computing system (Sandler, ¶98 “The memory 114 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof.”), wherein the machine-learned base model comprises a first set of learned parameter values (Sandler, ¶100 “In some implementations, the one or more machine-learned models 120 can be received from the server computing system 130 over network 180, stored in the user computing device memory 114, and then used or otherwise implemented by the one or more processors 112.” In light of the specification, loading a model can be considered initiating an execution session. Specification, ¶6 “The operations may include initiating an execution session of a machine learning library, where initiation of the execution session may include loading a machine-learned base model into at least a first memory of the one or more non-transitory memories, and where the machine-learned base model may include a first set of learned parameter values.”); and while the machine-learned base model is loaded into the volatile memory (Sandler, ¶101 “Additionally or alternatively, one or more machine-learned models 140 can be included in or otherwise stored and implemented by the server computing system 130 that communicates with the user computing device 102 according to a client-server relationship.” Here, the model on the server computing system communicating with the user computing device can be considered an ongoing execution session): responsive to receiving a model input associated with a particular user, domain, context, or task (Sandler, ¶24 “receiving, by the one or more computing devices, new input data; and when the new input data is structured according to the first domain, employing, by the one or more computing devices, the machine-learned model excluding the model patch to process the new input data to generate a first prediction; and when the new input data is structured according to the second domain, employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate a second prediction.” It is noted the claim recites alternative language, and Sandler teaches at least one of the alternatives.); dynamically adapting the machine-learned base model based on the second set of learned parameter values to generate a combined machine-learned model (Sandler, ¶68 “As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn. This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in. Thus, a patched model M′={W1, ..., Wi1, W′i1, ..., Win, W′in, ..., Wn}” Here, the patched model can be considered the combined model); and processing a model input associated with the request with the combined machine-learned model to generate a model output (Sandler, ¶25 “In some implementations, when the new input data is converted to the second domain, the method includes said employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate the second prediction.”).

Sandler does not teach "accessing, from storage on a disk device, a least-recently-used (LRU) cache, or a flash memory device a machine-learned submodel associated with the particular user, domain, context, or task, wherein the machine-learned submodel comprises a second set of learned parameter values that have been learned from training data associated with the particular user, domain, context, or task."

However, Hu teaches accessing, by the computing system, from storage on a disk device, a least-recently-used (LRU) cache, or a flash memory device a machine-learned submodel associated with the particular user, domain, context, or task, wherein the machine-learned submodel comprises a second set of learned parameter values that have been learned from training data associated with the particular user, domain, context, or task (Hu, page 4, ¶1 “This allows for the creation of many customized models that can be activated and deactivated on the fly on machines that store the pre-trained weights.” Here, the activation of the customized model can be considered accessing from storage on a disk device or a flash memory.)

Sandler and Hu are analogous art because both references concern methods for on-the-fly transfer learning.
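The flow the rejection maps onto Hu, a base model resident in memory plus per-user submodel weights pulled through a least-recently-used cache, can be sketched as follows (every name and value here is a hypothetical illustration of the claimed flow, not code from either reference):

```python
from functools import lru_cache

# Hypothetical per-user store of submodel deltas ("second set of learned
# parameter values"); a real system would read these from disk or flash.
SUBMODEL_STORE = {
    "user_a": (0.5, -0.5),
    "user_b": (0.0, 0.25),
}

@lru_cache(maxsize=8)  # the least-recently-used (LRU) cache named in the claim
def load_submodel(user_id):
    return SUBMODEL_STORE[user_id]

def combine(base, delta):
    # Dynamically adapt the base model by adding the submodel's deltas.
    return [w + d for w, d in zip(base, delta)]

base_weights = [1.0, 2.0]  # "first set of learned parameter values"
combined = combine(base_weights, load_submodel("user_a"))
# combined == [1.5, 1.5]
```

The `lru_cache` decorator stands in for the claimed LRU cache: repeated requests for the same user hit the cache, while cold users fall through to the (here simulated) backing store.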
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Sandler’s learning system to incorporate the disk/cache taught by Hu. The motivation for doing so would have been to switch between tasks at a much lower cost, as stated in Hu, page 4, ¶1: “Another benefit is that during deployment, we can switch between tasks at a much lower cost by only swapping the LoRA weights, often measured in megabytes, as opposed to all the weights (350GB).”

Regarding claim 2:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein dynamically adapting the machine-learned base model based on the second set of learned parameter values to generate a combined machine-learned model comprises: dynamically combining the second set of learned parameter values into the machine-learned base model according to an existing execution graph associated with the machine-learned base model (Sandler, ¶68 “As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn. This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in. Thus, a patched model M′={W1, ..., Wi1, W′i1, ..., Win, W′in, ..., Wn}” Here, the patched model can be considered the combined model), the execution graph describing a set of dataflow computations to execute the combined machine-learned model (Sandler, ¶68 “In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in.” The positions i1, ..., in can be considered the execution graph in light of the specification, ¶7 “Dynamically combining the second set of learned parameter values into the machine-learned base model according to the existing execution graph may include: inserting the second set of learned parameter values into the machine-learned base model at one or more locations within the machine-learned base model specified by the existing execution graph to generate the combined machine-learned model.”).

Regarding claim 3:

Sandler in view of Hu teaches [t]he computing system of claim 2, wherein dynamically combining the second set of learned parameter values into the machine-learned base model according to the existing execution graph comprises: inserting the second set of learned parameter values into the machine-learned base model at one or more locations within the machine-learned base model specified by the existing execution graph to generate the combined machine-learned model (Sandler, ¶68 “As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn. This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in. Thus, a patched model M′={W1, ..., Wi1, W′i1, ..., Win, W′in, ..., Wn}” Here, the patched model can be considered the combined model and the positions i1, ..., in can be considered the execution graph in light of the specification, ¶7 “Dynamically combining the second set of learned parameter values into the machine-learned base model according to the existing execution graph may include: inserting the second set of learned parameter values into the machine-learned base model at one or more locations within the machine-learned base model specified by the existing execution graph to generate the combined machine-learned model.”).

Regarding claim 4:

Sandler in view of Hu teaches [t]he computing system of claim 3, wherein the one or more locations within the machine-learned base model specified by the existing execution graph comprises one or more hidden layers of the machine-learned base model that are distinct from and follow an initial input layer of the machine-learned base model (Sandler, ¶47 “In particular, in some implementations, at least a portion of the model patch is positioned structurally prior to a final output portion (e.g., a final layer) of the machine-learned model. For example, in some implementations, at least a portion of the model patch is included in an intermediate layer of the machine-learned model.” Here, the intermediate layer is a hidden layer and therefore follows an initial input layer.).
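Sandler's ¶68 formulation, inserting patch parameters W′ at positions i1, ..., ik within the layer sequence, reduces to a simple list interleaving. A minimal sketch with placeholder layer names (not code from the reference):

```python
def apply_patch(layers, patch):
    """Build the patched model M' = {W1, ..., Wi1, W'i1, ..., Wn}.

    layers: base parameters [W1, ..., Wn]
    patch:  mapping {position i: W'i} of patch parameters to insert
    """
    patched = []
    for i, w in enumerate(layers, start=1):
        patched.append(w)
        if i in patch:            # insert the patch layer after position i
            patched.append(patch[i])
    return patched

base = ["W1", "W2", "W3"]
patched = apply_patch(base, {1: "W'1", 3: "W'3"})
# patched == ["W1", "W'1", "W2", "W3", "W'3"]
```

The `patch` mapping plays the role the rejection assigns to the "execution graph": it names the locations within the base model at which the second set of parameter values is inserted.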
Regarding claim 5:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein dynamically adapting the machine-learned base model based on the second set of learned parameter values to generate a combined machine-learned model comprises: replacing one or more first learned parameter values included the first set of learned parameter values with one or more second learned parameter values included in the second set of learned parameter values (Sandler, ¶52 “Modifying the machine-learned model to include the model patch can include replacing at least one of the convolutional filters with a reduced-parameter version of the convolutional filter.” Here, the filter replaced can be considered the first learned parameter values and the reduced-parameter version of the filter can be considered the second learned parameter values.).

Regarding claim 6:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein dynamically adapting the machine-learned base model based on the second set of learned parameter values to generate a combined machine-learned model comprises: adding the second set of learned parameter values to the first set of learned parameter values (Sandler, ¶68 “As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn. This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in. Thus, a patched model M′={W1, ..., Wi1, W′i1, ..., Win, W′in, ..., Wn}” Here, M′ contains the second set of parameter values added to the first.).

Regarding claim 7:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein the computing system consists of a server computing system (Sandler, ¶96 “FIG. 2A depicts a block diagram of an example computing system 100 according to example embodiments of the present disclosure. The system 100 includes a user computing device 102, a server computing system 130, and a training computing system 150 that are communicatively coupled over a network 180.”).

Regarding claim 8:

Sandler in view of Hu teaches [t]he computing system of claim 7, wherein: the machine-learned submodel is associated with the particular user (Sandler, ¶63 “In such fashion, the personalized patch parameters remain private while updates to the shared public parameters are communicated to a central server for aggregation with other updates from other updates to improve a global model.” Here, the personalized patch parameters can be considered a submodel associated with a particular user); and accessing the machine-learned submodel associated with the particular user comprises receiving the machine-learned submodel from a user device associated with the particular user (Sandler, ¶101 “For example, the machine-learned models 140 can be implemented by the server computing system 140 as a portion of a web service. Thus, one or more models 120 can be stored and implemented at the user computing device 102 and/or one or more models 140 can be stored and implemented at the server computing system 130.”).

Regarding claim 9:

Sandler in view of Hu teaches [t]he computing system of claim 7, wherein: the machine-learned submodel is associated with the particular user (Sandler, ¶63 “In such fashion, the personalized patch parameters remain private while updates to the shared public parameters are communicated to a central server for aggregation with other updates from other updates to improve a global model.” Here, the personalized patch parameters can be considered a submodel associated with a particular user); and accessing the machine-learned submodel associated with the particular user comprises confirming that one or more authentication protocols have been satisfied as a condition of accessing the machine-learned submodel (Sandler, ¶62 “In some implementations, the central device data layer can communicate with each device component using an API (e.g., a private API).” Here, the private API can be considered the authentication protocol, and data accessed from the central device data layer includes the patches in light of the specification, ¶91 “The central device data layer can be a centralized repository of data for the computing device 50.”).

Regarding claim 10:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein the computing system consists of a mobile device or an embedded device (Sandler, ¶97 “The user computing device 102 can be any type of computing device, such as, for example, a personal computing device (e.g., laptop or desktop), a mobile computing device (e.g., smartphone or tablet), a gaming console or controller, a wearable computing device, an embedded computing device, an edge computing device, or any other type of computing device.”).
Regarding claim 11:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein accessing the machine-learned submodel associated with the particular user, domain, context, or task comprises: receiving an identifier associated with the particular user, domain, context, or task (Sandler, ¶25 “In some implementations, the new input data can be structured according to the first domain, the second domain can include a smaller number of dimensions than the first domain” It is noted the claim recites alternative language, and Sandler in view of Hu teaches at least one of the alternatives.); and accessing a data repository storing a plurality of machine-learned submodels associated with a plurality of different users, domains, contexts, or tasks to identify and retrieve the machine-learned submodel associated with the particular user, domain, context, or task (Sandler, ¶119 “The central intelligence layer includes a number of machine-learned models. For example, as illustrated in FIG. 2C, a respective machine-learned model (e.g., a model) can be provided for each application and managed by the central intelligence layer.” Further, ¶115 “For example, each application can include a machine-learned model. Example applications include a text messaging application, an email application, a dictation application, a virtual keyboard application, a browser application, etc.” Here, the central intelligence layer can be considered the data repository, and the machine-learned models for applications perform various tasks), wherein the machine-learned submodel associated with the particular user, domain, context, or task is logically associated with the identifier within the data repository (Sandler, ¶93 “This enables power-constrained operation, where an application can switch to a lower resolution to save on latency/power, without needing to ship separate models and having to make that trade-off decision at the application design time. Thus, a single model can be shipped to the device and then multiple different variants can be made (e.g., using different patches) and trained at the device to provide multiple models for different (but potentially related) tasks.” Here, the lower-resolution model can be considered to be logically associated with the base model).

Regarding claim 12:

Sandler in view of Hu teaches [t]he computing system of claim 11, wherein the data repository is stored on a hard disk (Sandler, ¶101 “Thus, one or more models 120 can be stored and implemented at the user computing device 102 and/or one or more models 140 can be stored and implemented at the server computing system 130.” Further, figure 2A shows the server computing system having access to the memory, which can be a hard disk: ¶103 “The memory 134 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof.”).

Regarding claim 13:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein the operations further comprise: responsive to receiving a second request associated with a second particular user, domain, context, or task (Sandler, ¶24 “when the new input data is structured according to the first domain, employing, by the one or more computing devices, the machine-learned model excluding the model patch to process the new input data to generate a first prediction; and when the new input data is structured according to the second domain, employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate a second prediction.” It is noted the claim recites alternative language, and Sandler in view of Hu teaches at least one of the alternatives.), accessing a second machine-learned submodel associated with the second particular user, domain, context, or task, wherein the second machine-learned submodel comprises a third set of learned parameter values (Sandler, ¶59 “As one example, a computing system can obtain a model that includes an existing set of learnable parameters and can generate a first model patch that includes a first set of learnable parameters and a second model patch that includes a second set of learnable parameters.”) that have been learned from training data associated with the second particular user, domain, context, or task (Sandler, ¶26 “In some implementations, the first domain can include a first image resolution, the first task can include processing imagery of the first input resolution, the second domain can include a second image resolution that is smaller than the first image resolution, and the second task can include processing imagery of the second input resolution.”); dynamically adapting the machine-learned base model based on the second set of learned parameter values to generate a second combined machine-learned model, wherein the second combined machine-learned model comprises both the first set of learned parameter values and the third set of learned parameter values, and wherein the second combined machine-learned model excludes the second set of learned parameter values (Sandler, ¶31 “The method can include simultaneously: training, by the one or more computing devices, the model including the first model patch but not the second model patch on a first set of training data to perform a first task; and training, by the one or more computing devices, the model including the second model patch but not the first model patch on a second set of training data to perform a second task that is different than the first task.”); and processing a second model input associated with the second request with the second combined machine-learned model to generate a second model output (Sandler, ¶24 “when the new input data is structured according to the second domain, employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate a second prediction.”).

Regarding claim 14:

Sandler in view of Hu teaches [t]he computing system of claim 1, wherein the operations further comprise: receiving a third model input unassociated with any particular user, …, context, or task (Sandler, ¶24 “when the new input data is structured according to the first domain, employing, by the one or more computing devices, the machine-learned model excluding the model patch to process the new input data to generate a first prediction; and when the new input data is structured according to the second domain, employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate a second prediction.” It is noted the claim recites alternative language, and Sandler in view of Hu teaches at least one of the alternatives.); and processing the third model input with the machine-learned base model to generate a third model output, wherein processing the third model input with the machine-learned base model comprises applying an identity operation (Sandler, ¶1 “More particularly, the present disclosure relates to systems and methods to perform parameter-efficient multi-task learning, transfer learning, and/or model personalization via the use of model patches.” Here, adding personalization through the use of the model patch can be considered an identity operation) at one or more locations within the machine-learned base model specified by the existing execution graph as added submodel locations (Sandler, ¶68 “As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn. This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in. Thus, a patched model M′={W1, ..., Wi1, W′i1, ..., Win, W′in, ..., Wn}”).
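Claim 14's identity operation can be illustrated with a zero-delta adapter: when no submodel is attached, the added-submodel location must leave the base computation unchanged. A hypothetical scalar sketch (not from either reference):

```python
def adapt(x, weight, delta=None):
    # At an added-submodel location, a missing submodel behaves as an
    # identity operation: the base parameter is applied unchanged.
    effective = weight if delta is None else weight + delta
    return x * effective

base_out = adapt(3.0, 2.0)            # base model only, no submodel
identity_out = adapt(3.0, 2.0, 0.0)   # explicit zero-delta submodel
# base_out == identity_out == 6.0
```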
Regarding claim 21:

Sandler teaches [a] computer-implemented method comprising: loading, by a computing system comprising one or more processors (Sandler, ¶32 “The computer system can include one or more processors;”), a machine-learned base model into a volatile memory of the computing system (Sandler, ¶98 “The memory 114 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof.”), wherein the machine-learned base model comprises a first set of learned parameter values (Sandler, ¶100 “In some implementations, the one or more machine-learned models 120 can be received from the server computing system 130 over network 180, stored in the user computing device memory 114, and then used or otherwise implemented by the one or more processors 112.” In light of the specification, loading a model can be considered initiating an execution session. Specification, ¶6 “The operations may include initiating an execution session of a machine learning library, where initiation of the execution session may include loading a machine-learned base model into at least a first memory of the one or more non-transitory memories, and where the machine-learned base model may include a first set of learned parameter values.”); and while the machine-learned base model is loaded into the volatile memory (Sandler, ¶101 “Additionally or alternatively, one or more machine-learned models 140 can be included in or otherwise stored and implemented by the server computing system 130 that communicates with the user computing device 102 according to a client-server relationship.” Here, the model on the server computing system communicating with the user computing device can be considered an ongoing execution session): responsive to receiving a model input associated with a particular user, domain, context, or task (Sandler, ¶24 “receiving, by the one or more computing devices, new input data; and when the new input data is structured according to the first domain, employing, by the one or more computing devices, the machine-learned model excluding the model patch to process the new input data to generate a first prediction; and when the new input data is structured according to the second domain, employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate a second prediction.” It is noted the claim recites alternative language, and Sandler teaches at least one of the alternatives.); dynamically adapting, by the computing system, the machine-learned base model based on the second set of learned parameter values to generate a combined machine-learned model (Sandler, ¶68 “As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn. This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., in. Thus, a patched model M′={W1, ..., Wi1, W′i1, ..., Win, W′in, ..., Wn}” Here, the patched model can be considered the combined model); and processing, by the computing system, a model input associated with the request with the combined machine-learned model to generate a model output (Sandler, ¶25 “In some implementations, when the new input data is converted to the second domain, the method includes said employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate the second prediction.”).

Sandler does not teach "accessing, by the computing system, from storage on a disk device, a least-recently-used (LRU) cache, or a flash memory device a machine-learned submodel associated with the particular user, domain, context, or task, wherein the machine-learned submodel comprises a second set of learned parameter values that have been learned from training data associated with the particular user, domain, context, or task."

However, Hu teaches accessing, from storage on a disk device, a least-recently-used (LRU) cache, or a flash memory device a machine-learned submodel associated with the particular user, domain, context, or task, wherein the machine-learned submodel comprises a second set of learned parameter values that have been learned from training data associated with the particular user, domain, context, or task (Hu, page 4, ¶1 “This allows for the creation of many customized models that can be activated and deactivated on the fly on machines that store the pre-trained weights.” Here, the activation of the customized model can be considered accessing from storage on a disk device or a flash memory.)

Sandler and Hu are analogous art because both references concern methods for on-the-fly transfer learning. Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Sandler’s learning system to incorporate the disk/cache taught by Hu. The motivation for doing so would have been to switch between tasks at a much lower cost, as stated in Hu, page 4, ¶1: “Another benefit is that during deployment, we can switch between tasks at a much lower cost by only swapping the LoRA weights, often measured in megabytes, as opposed to all the weights (350GB).”

Regarding claims 22-25:

Claims 22-25 are rejected under the same rationale as claims 5-6 and 8-9, respectively.
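Hu's swap-cost rationale quoted above rests on the low-rank factorization: the task-specific delta is stored as two thin factors B and A rather than a full weight matrix, so a task swap moves only 2·d·r values instead of d·d. An illustrative toy example with made-up numbers (not from the reference):

```python
d, r = 4, 1                                # full dimension vs adapter rank
A = [[1.0, 0.0, 1.0, 0.0]]                 # r x d factor
B = [[0.5], [0.0], [0.0], [0.5]]           # d x r factor
# Reconstruct the d x d update delta = B @ A only when it is applied.
delta = [[B[i][0] * A[0][j] for j in range(d)] for i in range(d)]

adapter_values = 2 * d * r                 # what a task swap must move
full_values = d * d                        # what swapping full weights moves
# adapter_values == 8, full_values == 16; the gap widens rapidly as d grows
```

At the scale Hu describes, with d in the tens of thousands and r small, this is the difference between shipping megabytes of LoRA weights and hundreds of gigabytes of full weights.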
Regarding claim 26: Sandler teaches [o]ne or more non-transitory memories that store instructions that when executed by the one or more processors cause a computing system to perform operations, the operations comprising (Sandler, ¶32 "one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computer system to perform any of the methods described herein."): loading a machine-learned base model into a volatile memory of the computing system (Sandler, ¶98 "The memory 114 can include one or more non-transitory computer-readable storage mediums, such as RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof."), wherein the machine-learned base model comprises a first set of learned parameter values (Sandler, ¶100 "In some implementations, the one or more machine-learned models 120 can be received from the server computing system 130 over network 180, stored in the user computing device memory 114, and then used or otherwise implemented by the one or more processors 112." In light of the specification, loading a model can be considered initiating an execution session. Specification, ¶6 "The operations may include initiating an execution session of a machine learning library, where initiation of the execution session may include loading a machine-learned base model into at least a first memory of the one or more non-transitory memories, and where the machine-learned base model may include a first set of learned parameter values."); and

while the machine-learned base model is loaded into the volatile memory (Sandler, ¶101 "Additionally or alternatively, one or more machine-learned models 140 can be included in or otherwise stored and implemented by the server computing system 130 that communicates with the user computing device 102 according to a client-server relationship." Here, the model on the server computing system communicating with the user computing device can be considered an ongoing execution session): responsive to receiving a model input associated with a particular user, domain, context, or task (Sandler, ¶24 "receiving, by the one or more computing devices, new input data; and when the new input data is structured according to the first domain, employing, by the one or more computing devices, the machine-learned model excluding the model patch to process the new input data to generate a first prediction; and when the new input data is structured according to the second domain, employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate a second prediction." It is noted the claim recites alternative language, and Sandler teaches at least one of the alternatives.); dynamically adapting the machine-learned base model based on the second set of learned parameter values to generate a combined machine-learned model (Sandler, ¶68 "As one example, suppose a deep network M is a sequence of layers represented by their parameters (weights), W1, ..., Wn.
This formulation ignores non-trainable layers (e.g., some kinds of activations). In this example, a model patch P is a set of parameters W′i1, ..., W′ik, that adds layers at positions i1, ..., ik. Thus, a patched model M′ = {W1, ..., Wi1, W′i1, ..., Wik, W′ik, ..., Wn}" Here, the patched model can be considered the combined model); and processing a model input associated with the request with the combined machine-learned model to generate a model output (Sandler, ¶25 "In some implementations, when the new input data is converted to the second domain, the method includes said employing, by the one or more computing devices, the machine-learned model including the model patch to process the new input data to generate the second prediction.").

Sandler does not teach "accessing, from storage on a disk device, a least-recently-used (LRU) cache, or a flash memory device a machine-learned submodel associated with the particular user, domain, context, or task, wherein the machine-learned submodel comprises a second set of learned parameter values that have been learned from training data associated with the particular user, domain, context, or task."

However, Hu teaches accessing, by the computing system, from storage on a disk device, a least-recently-used (LRU) cache, or a flash memory device, a machine-learned submodel associated with the particular user, domain, context, or task, wherein the machine-learned submodel comprises a second set of learned parameter values that have been learned from training data associated with the particular user, domain, context, or task (Hu, page 4, ¶1 "This allows for the creation of many customized models that can be activated and deactivated on the fly on machines that store the pre-trained weights." Here, the activation of the customized model can be considered accessing from storage on a disk device or a flash memory device).

Sandler and Hu are analogous art because both references concern methods for on-the-fly transfer learning.
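The patched-model formulation quoted from Sandler ¶68 can be illustrated with a short sketch: the base model is a list of layer weights W1..Wn, and a patch supplies extra parameters W′i inserted at chosen positions to form the combined model M′. The function name and list-of-weights representation here are hypothetical simplifications, assuming the patch adds a layer immediately after each targeted position:

```python
def apply_patch(base_layers, patch):
    """Build the combined ("patched") model M' from base layers and a patch.

    base_layers: list of layer weights [W1, ..., Wn]
    patch: dict mapping a position i to patch weights W'_i
    """
    combined = []
    for i, weights in enumerate(base_layers):
        combined.append(weights)       # keep the base layer Wi
        if i in patch:
            combined.append(patch[i])  # insert the patch layer W'_i
    return combined
```

On this reading, the base parameters are shared across all tasks, and only the small per-task patch changes between deployments.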
Accordingly, it would have been obvious to a person of ordinary skill in the art, before the effective filing date of the claimed invention, to modify Sandler's learning system to incorporate the disk/cache storage taught by Hu. The motivation for doing so would have been to switch between tasks at a much lower cost, as stated in Hu, page 4, ¶1 "Another benefit is that during deployment, we can switch between tasks at a much lower cost by only swapping the LoRA weights, often measured in megabytes, as opposed to all the weights (350GB)."

Response to Arguments

Applicant's arguments filed November 26, 2025, have been fully considered but they are not persuasive. Regarding the objections to the drawings, Applicant's amended drawings have overcome the objections, which are withdrawn. Applicant's arguments with respect to the rejections under 35 U.S.C. § 102 have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action.
In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to JACOB Z SUSSMAN MOSS whose telephone number is (571) 272-1579. The examiner can normally be reached Monday - Friday, 9 a.m. - 5 p.m. ET.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Kakali Chaki, can be reached on (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/J.S.M./
Examiner, Art Unit 2122

/KAKALI CHAKI/
Supervisory Patent Examiner, Art Unit 2122

Prosecution Timeline

Jun 28, 2022
Application Filed
Aug 25, 2025
Non-Final Rejection — §102, §103
Nov 24, 2025
Examiner Interview Summary
Nov 24, 2025
Applicant Interview (Telephonic)
Nov 26, 2025
Response Filed
Mar 04, 2026
Final Rejection — §102, §103 (current)


Prosecution Projections

3-4
Expected OA Rounds
14%
Grant Probability
-6%
With Interview (-20.0%)
3y 3m
Median Time to Grant
Moderate
PTA Risk
Based on 7 resolved cases by this examiner. Grant probability derived from career allow rate.
