Prosecution Insights
Last updated: April 19, 2026
Application No. 18/232,465

Reducing Size of a Machine-Trained Model to Facilitate Storage and Transfer

Non-Final OA • §101, §103
Filed
Aug 10, 2023
Examiner
GURMU, MULUEMEBET
Art Unit
2163
Tech Center
2100 — Computer Architecture & Software
Assignee
Microsoft Technology Licensing, LLC
OA Round
1 (Non-Final)
79%
Grant Probability
Favorable
1-2
OA Rounds
3y 2m
To Grant
98%
With Interview

Examiner Intelligence

Grants 79% — above average
79%
Career Allow Rate
377 granted / 475 resolved
+24.4% vs TC avg
Strong +18% interview lift
+18.1%
Interview Lift
resolved cases with interview
Typical timeline
3y 2m
Avg Prosecution
30 currently pending
Career history
505
Total Applications
across all art units
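As a rough sanity check (not part of the Office Action itself), the headline figures above follow from the raw career counts. The variable names below are illustrative, and the "with interview" estimate simply assumes the reported +18.1-point lift adds to the career allow rate:

```python
# Reproduce the dashboard's headline examiner statistics from the
# career counts shown above. Names are illustrative only.

granted = 377          # career allowances
resolved = 475         # career resolved cases
interview_lift = 18.1  # reported percentage-point lift from interviews

career_allow_rate = 100 * granted / resolved
print(f"Career allow rate: {career_allow_rate:.1f}%")  # ≈ 79.4%, displayed as 79%

# Assumption: the "98% with interview" figure is approximately the
# career rate plus the reported interview lift.
with_interview = career_allow_rate + interview_lift
print(f"With interview (approx.): {with_interview:.1f}%")  # ≈ 97.5%, displayed as 98%
```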

Statute-Specific Performance

§101
18.8%
-21.2% vs TC avg
§103
61.2%
+21.2% vs TC avg
§102
3.3%
-36.7% vs TC avg
§112
1.6%
-38.4% vs TC avg
Black line = Tech Center average estimate • Based on career data from 475 resolved cases

Office Action

§101 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION

Claims 1-20 are pending in this office action. This office action is NON-FINAL.

Drawings

The Drawings filed on 08/10/23 are acceptable for examination purposes.

Specification

The Specification filed on 08/10/23 is acceptable for examination purposes.

Information Disclosure Statement

The information disclosure statements (IDS) filed on 8/10/23, 8/13/23, 10/12/24, 10/30/24, 12/16/24, 9/22/25, 11/19/25, 12/03/25, and 2/26/26 have been considered by the Examiner and made of record in the application file.

Claim Interpretation

In claims 1-3, "a computer-readable storage medium" is interpreted in light of Specification [00107], which states, "…the specific term 'computer-readable storage medium' or 'storage device' expressly excludes propagated signals per se in transit, while including all other forms of computer-readable media; a computer-readable storage medium or storage device is 'non-transitory' in this regard."

Claim Rejections - 35 U.S.C. § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 4-20 are rejected under 35 U.S.C. 101 as directed to non-statutory subject matter of software, per se. The claims lack the necessary physical articles or objects to constitute a machine or manufacture within the meaning of 35 U.S.C. 101. In this case, Applicant has claimed a "system" in the preamble to these claims without reciting any hardware element in the bodies of these claims; this implies that Applicant is claiming a system of software, per se, lacking the hardware necessary to realize any of the underlying functionality.
Therefore, claims 4-20 are directed to non-statutory subject matter as computer programs, per se.

Claim Rejections - 35 U.S.C. § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over London (US Patent No. 11,200,511 B1) in view of Wesolowski et al. (US 2019/0114537 A1).

Regarding claim 1, London teaches a computer-readable storage medium for storing computer-readable instructions (See London Col. 16 lines 66-67, System memory 9020 may be configured to store instructions and data accessible by processor(s) 9010); a processing system executing the computer-readable instructions to perform operations, the operations comprising (See London Col. 16 lines 55-56, Processors 9010 may be any suitable processors capable of executing instructions): storing a data structure that represents a machine-trained model in a data store (See London Col.
14 lines 1-2, store, access and update sampling weights associated with training examples used for a machine learning model, according to at least some embodiments. As shown in element 501, a tree data structure), the data structure having a plurality of nodes associated with a plurality of respective portions of model-part information that are used to implement the machine-trained model (See London Col. 16 lines 38-41, a portion or all of one or more of the technologies described herein, including the adaptive sampling algorithms and other aspects of training and executing machine learning models), the nodes including a root node and a plurality of leaf nodes (See London Col. 13 lines 34-35, a path 422 may be traversed from the root node R to a leaf node L), the data structure having a main root-to-leaf (RTL) path through the data structure that includes a set of main-path nodes (See London Col. 14 lines 2-10, a tree data structure … The first n leaf nodes may be set to the initial sampling weights selected for the n training examples (e.g., 1/n), and the remaining leaf nodes may be set to zero … a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left), the set of main-path nodes starting with the root node and ending with a particular leaf node (See London Col. 14 lines 7-11, a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left and right child nodes at each level), the main-path nodes being associated with respective portions of base model weights (See London Col.
14 lines 15-23, The parameters of the model being trained may be adjusted based on the results obtained using the selected training example (element 507) … and all the nodes along the root-to-leaf path may be updated by adding the delta (element 516)), the data structure having a plurality of non-main RTL paths between the root node (See London Col. 13 lines 61-63, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path), and respective leaf nodes other than the main RTL path, the non-main RTL paths (See London Col. 14 lines 2-10, a tree data structure … a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left), including non-main-path nodes that are associated with respective portions of model-variance information (See London Col. 13 lines 61-63, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path).

London does not explicitly disclose the plurality of instances of model-part information including the portions of base model weights and the portions of model-variance information, and the portions of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model, and the portions of model-variance information being defined in the training operation to produce variations of the prescribed behavior with less information compared to associated portions of base model weights.
However, Wesolowski teaches the plurality of instances of model-part information including the portions of base model weights and the portions of model-variance information (See Wesolowski paragraph [0035], sampling weights associated with training examples used for a machine learning model), and the portions of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model (See Wesolowski paragraph [0035], operation of the ML model, or behavior of the neural net, is dependent upon weight values, which may be learned so that the neural network provides a desired output for a given input), and the portions of model-variance information being defined in the training operation (See Wesolowski paragraph [0035], the number of check-points may be increased for this particular ML model or portion of the ML model), to produce variations of the prescribed behavior with less information compared to associated portions of base model weights (See Wesolowski paragraph [0035], A basic approach of Asynchronous SGD may include dividing a full training data set into a number of training subsets, and using each training subset to train a separate copy (or portion) of an ML model. The multiple ML models communicate their respective parameter (weight) updates through a centralized parameter server).
It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include the plurality of instances of model-part information including the portions of base model weights and the portions of model-variance information, and the portions of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model, and the portions of model-variance information being defined in the training operation to produce variations of the prescribed behavior with less information compared to associated portions of base model weights, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Regarding claim 2, London taught the computer-readable storage medium of claim 1, as described above. London does not explicitly disclose wherein the portions of model-variance information include respective portions of model-variance weights. However, Wesolowski teaches wherein the portions of model-variance information include respective portions of model-variance weights (See Wesolowski paragraph [0035], using each training subset to train a separate copy (or portion) of an ML model. The multiple ML models communicate their respective parameter (weight) updates through a centralized parameter server). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include wherein the portions of model-variance information include respective portions of model-variance weights, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Claim 5 recites the same limitations as claim 2 above. Therefore, claim 5 is rejected based on the same reasoning.

Regarding claim 3, London taught the computer-readable storage medium of claim 1, as described above.
London does not explicitly disclose wherein the portions of model-variance information include respective instances of machine-trained input information. However, Wesolowski teaches wherein the portions of model-variance information include respective instances of machine-trained input information (See Wesolowski paragraph [0055], each trained ML model may consider a user input (or request) and one (or more) available candidate items as an information pair (more specifically, as a user/request-and-candidate item pair)). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include wherein the portions of model-variance information include respective instances of machine-trained input information, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Claim 6 recites the same limitations as claim 3 above. Therefore, claim 6 is rejected based on the same reasoning.

Regarding claim 4, London teaches a method for executing a machine-trained model in a local system, comprising: requesting a portion of model weights of the machine-trained model from a source system (See London, each trained ML model may consider a user input (or request) and one (or more) available candidate items as an information pair (more specifically, as a user/request-and-candidate item pair)); receiving, in response to the requesting, a portion of model-variance information from the source system over a communication path (See London, receive corresponding responses from the service. In at least some embodiments, a model training request received from a client may trigger the adaptive sampling based training of a model); storing the portion of model-variance information (See London Col. 13 lines 66-67, Col.
14 lines 1-2, store, access and update sampling weights associated with training examples used for a machine learning model, according to at least some embodiments. As shown in element 501, a tree data structure); and executing a model part of the machine-trained model (See London Col. 16 lines 38-41, a portion or all of one or more of the technologies described herein, including the adaptive sampling algorithms and other aspects of training and executing machine learning models), by using the portion of model-variance information in conjunction with a portion of base model weights associated with the model part of the machine-trained model (See London Col. 10 lines 22-28, adaptive sampling is made (which may be based on the constraints/preferences, the model type, and/or an examination of at least a portion of the training data), an algorithm similar to Algorithm 1 discussed above may be implemented for the training iterations of the model), the portion of model-variance information having a smaller size than the portion of base model weights (See London Col. 14 lines 60-64, The total size of the training data may also play a role in the decision as to whether to use adaptive sampling … if the training data set is smaller than some pre-determined threshold, the benefits expected from adaptive sampling may be insufficient to justify the additional work involved).

London does not explicitly disclose the portion of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model, and the portion of model-variance information being defined in the training operation to produce a variation of the prescribed behavior.
However, Wesolowski teaches the portion of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model (See Wesolowski paragraph [0035], operation of the ML model, or behavior of the neural net, is dependent upon weight values, which may be learned so that the neural network provides a desired output for a given input), and the portion of model-variance information being defined in the training operation to produce a variation of the prescribed behavior (See Wesolowski paragraph [0035], A basic approach of Asynchronous SGD may include dividing a full training data set into a number of training subsets, and using each training subset to train a separate copy (or portion) of an ML model. The multiple ML models communicate their respective parameter (weight) updates through a centralized parameter server). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include the portion of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model, and the portion of model-variance information being defined in the training operation to produce a variation of the prescribed behavior, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Regarding claim 7, London taught the method of claim 4, as described above. London further teaches wherein the source system stores a data structure in a data store that represents the machine-trained model (See London Col. 13 lines 66-67, Col. 14 lines 1-2, store, access and update sampling weights associated with training examples used for a machine learning model, according to at least some embodiments.
As shown in element 501, a tree data structure), the data structure having a plurality of nodes associated with a plurality of respective portions of model-part information that are used to implement the machine-trained model (See London Col. 16 lines 38-41, a portion or all of one or more of the technologies described herein, including the adaptive sampling algorithms and other aspects of training and executing machine learning models), the nodes including a root node and a plurality of leaf nodes (See London Col. 13 lines 34-35, a path 422 may be traversed from the root node R to a leaf node L), the data structure having a main root-to-leaf (RTL) path through the data structure that includes a set of main-path nodes (See London Col. 14 lines 2-10, a tree data structure … The first n leaf nodes may be set to the initial sampling weights selected for the n training examples (e.g., 1/n), and the remaining leaf nodes may be set to zero … a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left), the set of main-path nodes starting with the root node and ending with a particular leaf node (See London Col. 14 lines 7-11, a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left and right child nodes at each level), the main-path nodes being associated with respective portions of base model weights (See London Col. 14 lines 15-23, The parameters of the model being trained may be adjusted based on the results obtained using the selected training example (element 507) … and all the nodes along the root-to-leaf path may be updated by adding the delta (element 516)), and the data structure having a plurality of non-main RTL paths between the root node (See London Col.
13 lines 61-63, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path), and respective leaf nodes other than the main RTL path, the non-main RTL paths (See London Col. 14 lines 2-10, a tree data structure … a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left), including non-main-path nodes that are associated with respective portions of model-variance information (See London Col. 13 lines 61-63, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path).

Regarding claim 8, London taught the method of claim 7, as described above. London does not explicitly disclose wherein the local system is initialized to store all the portions of base model weights. However, Wesolowski teaches wherein the local system is initialized to store all the portions of base model weights (See Wesolowski paragraph [0035], a master parameter list/set (e.g., the weights of the ML model); in how the training set is distributed among multiple instances (or portions) of the ML model; in how often the master parameter list is updated with interim local parameter values). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include wherein the local system is initialized to store all the portions of base model weights, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Regarding claim 9, London taught the method of claim 4, as described above. London further teaches wherein the executing involves identifying a next portion of model-variance information to request from the source system (See London Col.
14 lines 22-25, all the nodes along the root-to-leaf path may be updated by adding the delta (element 516). The random probabilistic traversal may then be performed again for the next training iteration).

Regarding claim 10, London taught the method of claim 4, as described above. London further teaches wherein the executing involves performing a transformer-based operation (See London Col. 17 lines 31-34, data transformations to convert data signals from one component (e.g., system memory 9020) into a format suitable for use by another).

Regarding claim 11, London taught the method of claim 4, as described above. London further teaches retrieving the local-part portion from the data store upon a subsequent need for the local-part portion (See London Col. 4 lines 1-2, a model training request received from a client may trigger the adaptive sampling based training of a model). London does not explicitly disclose further comprising locally retaining the portion of model-variance information in the data store as a local-part portion. However, Wesolowski teaches further comprising locally retaining the portion of model-variance information in the data store as a local-part portion (See Wesolowski paragraph [0035], a master parameter list/set (e.g., the weights of the ML model); in how the training set is distributed among multiple instances (or portions) of the ML model; in how often the master parameter list is updated with interim local parameter values, etc.). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include locally retaining the portion of model-variance information in the data store as a local-part portion, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Regarding claim 12, London taught the method of claim 1, as described above.
London further teaches wherein the local system includes a local computing device that stores all local-part portions of the local system (See London Col. 13 lines 66-67, Col. 14 lines 1-2, store, access and update sampling weights associated with training examples used for a machine learning model, according to at least some embodiments. As shown in element 501, a tree data structure).

Regarding claim 13, London taught the method of claim 11, as described above. London further teaches wherein the local system includes a first local computing device that stores first local-part portions of the local system (See London Col. 4 lines 31-34, a master parameter list/set (e.g., the weights of the ML model); in how the training set is distributed among multiple instances (or portions) of the ML model; in how often the master parameter list is updated with interim local parameter values, etc.), and a second local computing device that stores second local-part portions of the local system (See London Col. 4 lines 31-34, a master parameter list/set (e.g., the weights of the ML model); in how the training set is distributed among multiple instances (or portions) of the ML model; in how often the master parameter list is updated with interim local parameter values, etc.), the second local-part portions being different than the first local-part portions, at least in part (See London Col. 13 lines 57-64, A new weight 444 for the leaf node L may be computed based on the training example corresponding to that leaf node in various embodiments in accordance with Algorithm 1. To update weights in the tree 402, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path).

Regarding claim 14, London teaches a local system for executing a machine-trained model, comprising: a data store for storing computer-readable instructions (See London Col.
16 lines 66-67, System memory 9020 may be configured to store instructions and data accessible by processor(s) 9010); a processing system for executing the computer-readable instructions in the data store, to perform operations including (See London Col. 16 lines 55-56, Processors 9010 may be any suitable processors capable of executing instructions): successively receiving portions of model-variance information from a source system over a communication path (See London, receive corresponding responses from the service. In at least some embodiments, a model training request received from a client may trigger the adaptive sampling based training of a model); and successively executing model parts of the machine-trained model (See London Col. 16 lines 38-41, a portion or all of one or more of the technologies described herein, including the adaptive sampling algorithms and other aspects of training and executing machine learning models), associated with the portions of model-variance information to provide an output result (See London Col. 14 lines 15-23, The parameters of the model being trained may be adjusted based on the results obtained using the selected training example (element 507) … and all the nodes along the root-to-leaf path may be updated by adding the delta (element 516)), the source system storing a data structure that represents the machine-trained model (See London Col. 13 lines 66-67, Col. 14 lines 1-2, store, access and update sampling weights associated with training examples used for a machine learning model, according to at least some embodiments. As shown in element 501, a tree data structure), the data structure having a plurality of nodes associated with a plurality of respective portions of model-part information that are used to implement the machine-trained model (See London Col.
16 lines 38-41, a portion or all of one or more of the technologies described herein, including the adaptive sampling algorithms and other aspects of training and executing machine learning models), the nodes including a root node and a plurality of leaf nodes (See London Col. 13 lines 34-35, a path 422 may be traversed from the root node R to a leaf node L), the data structure having a main root-to-leaf (RTL) path through the data structure that includes a set of main-path nodes (See London Col. 14 lines 2-10, a tree data structure … The first n leaf nodes may be set to the initial sampling weights selected for the n training examples (e.g., 1/n), and the remaining leaf nodes may be set to zero … a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left), the set of main-path nodes starting with the root node and ending with a particular leaf node (See London Col. 14 lines 7-11, a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left and right child nodes at each level), the main-path nodes being associated with respective portions of base model weights (See London Col. 14 lines 15-23, The parameters of the model being trained may be adjusted based on the results obtained using the selected training example (element 507) … and all the nodes along the root-to-leaf path may be updated by adding the delta (element 516)), the data structure having a plurality of non-main RTL paths between the root node (See London Col. 13 lines 61-63, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path), and respective leaf nodes other than the main RTL path, the non-main RTL paths (See London Col.
14 lines 2-10, a tree data structure … a given training iteration a traversal may be started at the root node of the tree, and a path to a leaf node may be determined probabilistically, using the labels assigned to the respective left), including non-main-path nodes that are associated with respective portions of model-variance information (See London Col. 13 lines 61-63, the difference between the new weight and the old weight is added to the leaf node L and to each internal node 424 traversed along the root-to-leaf path), and the portions of model-variance information that are successively retrieved from the source system being associated with one of a plurality of paths represented by the data structure (See London, receive corresponding responses from the service. In at least some embodiments, a model training request received from a client may trigger the adaptive sampling based training of a model).

London does not explicitly disclose the portions of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model, and the portions of model-variance information being defined in the training operation to produce variations of the prescribed behavior with less information compared to associated portions of base model weights.
However, Wesolowski teaches the portions of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model (See Wesolowski paragraph [0035], operation of the ML model, or behavior of the neural net, is dependent upon weight values, which may be learned so that the neural network provides a desired output for a given input), and the portions of model-variance information being defined in the training operation (See Wesolowski paragraph [0035], the number of check-points may be increased for this particular ML model or portion of the ML model), to produce variations of the prescribed behavior with less information compared to associated portions of base model weights (See Wesolowski paragraph [0035], A basic approach of Asynchronous SGD may include dividing a full training data set into a number of training subsets, and using each training subset to train a separate copy (or portion) of an ML model. The multiple ML models communicate their respective parameter (weight) updates through a centralized parameter server). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify London to include the portions of base model weights being defined in a training operation to produce prescribed behavior of the machine-trained model, and the portions of model-variance information being defined in the training operation to produce variations of the prescribed behavior with less information compared to associated portions of base model weights, as taught by Wesolowski, for accessing a user profile of the user and other data within the social-networking system.

Regarding claim 15, London taught the local system of claim 14, as described above. London does not explicitly disclose wherein the portions of model-variance information expressed by the data structure include respective portions of model-variance weights.
However, Wesolowski teaches wherein the portions of model-variance information expressed by the data structure include respective portions of model-variance weights, (See Wesolowski paragraph [0035], using each training subset to train a separate copy (or portion) of an ML model. The multiple ML models communicate their respective parameter (weight) updates through a centralized parameter server). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify wherein the portions of model-variance information expressed by the data structure include respective portions of model-variance weights of Wesolowski for accessing a user profile of the user and other data within the social-networking system. Regarding claim 16, London taught the local system of claim 14, as described above. London does not explicitly disclose wherein the portions of model-variance information expressed by the data structure include respective instances of machine-trained input information. However, Wesolowski teaches wherein the portions of model-variance information expressed by the data structure include respective instances of machine-trained input information, (See Wesolowski paragraph [0055], each trained ML model may consider a user input (or request) and one (or more) available candidate items as an information pair (more specifically, as a user/request-and-candidate item pair)). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify wherein the portions of model-variance information expressed by the data structure include respective instances of machine-trained input information of Wesolowski for accessing a user profile of the user and other data within the social-networking system. Regarding claim 17, London taught the local system of claim 14, as described above.
London further teaches wherein executing a particular model part of the machine-trained model involves identifying a next model part of the machine-trained model to execute, (See London Col. 14 lines 22-25, all the nodes along the root-to-leaf path may be updated by adding the delta (element 516). The random probabilistic traversal may then be performed again for the next training iteration). Regarding claim 18, London taught the local system of claim 14, as described above. London further teaches wherein executing a particular model part of the machine-trained model, (See London Col. 16 lines 38-41, a portion or all of one or more of the technologies described herein, including the adaptive sampling algorithms and other aspects of training and executing machine learning models), produces a result that depends on a particular portion of base model weights and a corresponding portion of model-variance information, (See London Col. 14 lines 15-23, The parameters of the model being trained may be adjusted based on the results obtained using the selected training example (element 507)…and all the nodes along the root-to-leaf path may be updated by adding the delta (element 516)). Regarding claim 19, London taught the local system of claim 14, as described above. London further teaches reusing the local-part portion upon a subsequent need for the local-part portion, (See London Col. 14 lines 25-29, performed again for the next training iteration. If mini-batches are used, multiple traversals may be performed during a single training iteration in various embodiments. After the training is complete, results obtained from the trained model may be used in an application-dependent manner (element 519)). London does not explicitly disclose further comprising locally retaining a particular portion of model-variance information that is received as a local-part portion.
However, Wesolowski teaches further comprising locally retaining a particular portion of model-variance information that is received as a local-part portion, (See Wesolowski paragraph [0035], a master parameter list/set (e.g., the weights of the ML model); in how the training set is distributed among multiple instances (or portions) of the ML model; in how often the master parameter list is updated with interim local parameter values, etc.). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify further comprising locally retaining a particular portion of model-variance information that is received as a local-part portion of Wesolowski for accessing a user profile of the user and other data within the social-networking system. Regarding claim 20, London taught the local system of claim 14, as described above. London does not explicitly disclose wherein the local system is initialized to store all the portions of base model weights represented by the data structure. However, Wesolowski teaches wherein the local system is initialized to store all the portions of base model weights, (See Wesolowski paragraph [0024], a large amount of information to be stored … if the training of a particular ML model (or portion of a ML model, e.g., graph-segment) is very process-intensive or time consuming, then the number of check-points may be increased for this particular ML model or portion of the ML model). It would have been obvious to one with ordinary skill in the art before the effective filing date of the claimed invention, to modify wherein the local system is initialized to store all the portions of base model weights represented by the data structure of Wesolowski for accessing a user profile of the user and other data within the social-networking system.
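The Wesolowski passages the examiner relies on describe asynchronous SGD: the training set is split into subsets, each subset trains a separate model copy, and the copies push their weight updates to a centralized parameter server holding the master parameter list. A rough sketch of that arrangement follows; the `ParameterServer` class, the least-squares loss, and the sequential simulation of two workers are illustrative assumptions (real async SGD runs the workers concurrently and tolerates stale reads):

```python
import numpy as np

class ParameterServer:
    """Centralized server holding the master weight set (hypothetical
    minimal sketch of the scheme described in Wesolowski [0035])."""
    def __init__(self, dim):
        self.weights = np.zeros(dim)

    def apply_update(self, delta):
        # Workers push weight deltas; the server folds each one into
        # the master parameter list as it arrives.
        self.weights += delta

def worker_step(server, x, y, lr=0.1):
    """One step on a worker's local training subset: pull the current
    weights, compute a local least-squares gradient, push the delta."""
    w = server.weights.copy()
    grad = 2 * x.T @ (x @ w - y) / len(y)
    server.apply_update(-lr * grad)

server = ParameterServer(dim=2)
# Two workers, each holding its own disjoint training subset.
x1, y1 = np.array([[1.0, 0.0]]), np.array([1.0])
x2, y2 = np.array([[0.0, 1.0]]), np.array([2.0])
for _ in range(200):
    worker_step(server, x1, y1)
    worker_step(server, x2, y2)
```

With these toy subsets each worker only touches one coordinate of the master weights, so the shared parameter list converges toward the per-subset solutions [1.0, 2.0], illustrating how the copies "communicate their respective parameter (weight) updates through a centralized parameter server".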
Conclusion/Points of Contact. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. See form PTO-892. KROON et al. (US 2022/0345756 A1): The processing comprises down-sampling one or more of the source views. Each source view typically comprises a two-dimensional array of pixels. Down-sampling means that the size of this array is reduced (in one or both dimensions) by reducing the resolution of the view (in the respective dimensions). The processing may further comprise filtering in addition to down-sampling. REYES et al. (US 2024/0127114 A1): The present technology enables determining the relative contribution of each of the local datasets to the aggregated training model by taking into consideration and penalizing the uncertainty of each client in the aggregated result. For example, this enables determining, via the contribution of a training dataset, how the training data provided by a given entity has impacted the final model. Any inquiry concerning this communication or earlier communications from the examiner should be directed to MULUEMEBET GURMU, whose telephone number is (571) 270-7095. The examiner can normally be reached M-F, 9am-5pm. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Tony Mahmoudi, can be reached at (571) 272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. /MULUEMEBET GURMU/ Primary Examiner, Art Unit 2163

Prosecution Timeline

Aug 10, 2023
Application Filed
Mar 11, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591601
SYSTEM AND METHOD FOR HYBRID MULTILINGUAL SEARCH INDEXING
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12591621
GENERATIVE ARTIFICIAL INTELLIGENCE AND PREFERENCE AWARE HASHTAG GENERATION
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12591591
DISTRIBUTING LARGE AMOUNTS OF GLOBAL METADATA USING OBJECT FILES
Granted Mar 31, 2026 (2y 5m to grant)
Patent 12585652
AUTOMATIC QUERY PERFORMANCE REGRESSION MANAGEMENT
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12585671
SYSTEM AND METHOD FOR CLOUD-BASED REPLICATION OF DATA
Granted Mar 24, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
79%
Grant Probability
98%
With Interview (+18.1%)
3y 2m
Median Time to Grant
Low
PTA Risk
Based on 475 resolved cases by this examiner. Grant probability derived from career allow rate.
