Prosecution Insights
Last updated: April 19, 2026
Application No. 18/323,197

SCALABLE WEIGHT REPARAMETERIZATION FOR EFFICIENT TRANSFER LEARNING

Non-Final OA (§102, §103)
Filed
May 24, 2023
Examiner
SCHNEE, HAL W
Art Unit
2129
Tech Center
2100 — Computer Architecture & Software
Assignee
Qualcomm Incorporated
OA Round
1 (Non-Final)
84%
Grant Probability
Favorable
1-2
OA Rounds
2y 11m
To Grant
99%
With Interview

Examiner Intelligence

Grants 84% — above average
84%
Career Allow Rate
503 granted / 595 resolved
+29.5% vs TC avg
Strong +22% interview lift
+22.1%
Interview Lift
across resolved cases with vs. without an interview
Typical timeline
2y 11m
Avg Prosecution
16 currently pending
Career history
611
Total Applications
across all art units
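The headline figures in this panel are simple ratios over the examiner's resolved cases. A minimal sketch of the arithmetic, assuming the displayed 99% "with interview" figure is the lifted rate capped at the panel's ceiling (an assumption; the tool's exact model is not shown):

```python
# Career allow rate: granted / resolved, taken from the panel above.
granted, resolved = 503, 595
allow_rate = granted / resolved              # ~0.845, displayed as 84%

# The interview lift is reported as an absolute bump over the baseline.
# Capping at 0.99 mirrors the panel's displayed 99% (an assumption).
interview_lift = 0.221                       # +22.1% from the panel
with_interview = min(allow_rate + interview_lift, 0.99)

print(f"allow rate: {allow_rate:.1%}")           # allow rate: 84.5%
print(f"with interview: {with_interview:.1%}")   # with interview: 99.0%
```

Note that the raw sum (84.5% + 22.1%) exceeds 100%, which is why some cap or discounting must be applied before display.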

Statute-Specific Performance

§101: 9.7% (-30.3% vs TC avg)
§103: 40.8% (+0.8% vs TC avg)
§102: 17.3% (-22.7% vs TC avg)
§112: 26.3% (-13.7% vs TC avg)
Tech Center averages are estimates • Based on career data from 595 resolved cases

Office Action

§102 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 102

The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:

A person shall be entitled to a patent unless – (a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-6, 9-12, 15, 18-23, 26, and 29 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Guo, Yunhui, et al. ("SpotTune: Transfer Learning Through Adaptive Fine-Tuning," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; hereinafter "Guo").

Regarding Claim 1, Guo teaches a processor-implemented method (section 4.1—the experiments describe computer-implemented neural networks and operations), comprising: training a first neural network to perform a task based on weights defined for a machine learning model trained to perform a different task and learned reparameterizing weights for each of a plurality of layers in the machine learning model (section 2 describes transfer learning, which trains a neural network to perform a task based on weights learned for a different task; section 3.2 and fig. 2 describe learned reparameterizing weights, i.e., freezing or fine-tuning weights within layers of the neural network; section 1 further describes making fine-tuning decisions individually for each layer); training a second neural network to generate a plurality of gating parameters based on a cost factor and the trained first neural network, each respective gating parameter of the plurality of gating parameters corresponding to the learned reparameterizing weights in a respective layer of the plurality of layers in the machine learning model (sections 3 and 3.2, and fig. 2—the first neural network is trained using an adaptive fine-tuning strategy to decide which layers of the network should be fine-tuned and which layers should have their parameters frozen; the policy network is a second neural network that generates the gating parameters which select weights in the first neural network to be fine-tuned or frozen); and updating the machine learning model based on the weights defined for the machine learning model, each gating parameter for each layer of the plurality of layers in the machine learning model, and the learned reparameterizing weights for each layer of the plurality of layers in the machine learning model (section 3.2—weights are updated using a backpropagation process).

Regarding Claim 9, Guo teaches a processor-implemented method (section 3.1), comprising: extracting features from an input for which an inference is to be generated (sections 2 and 3.1 describe extracting features); generating the inference based on the extracted features from the input and a machine learning model having weights defined for each respective layer in the machine learning model based on base weights defined for each respective layer in the machine learning model, a gating parameter for each respective layer in the machine learning model, and reparameterizing weights for each respective layer in the machine learning model (sections 3 and 3.2, and fig. 2—the first neural network is trained using an adaptive fine-tuning strategy to decide which layers of the network should be fine-tuned and which layers should have their parameters frozen; the policy network is a second neural network that generates the gating parameters which select weights in the first neural network to be fine-tuned or frozen; the experiments in section 4 describe using the model for inference); and taking one or more actions based on the generated inference (sections 3.1 and 4.1—classifying an image comprises an action taken based on the inference).

Regarding Claim 18, Guo teaches a system (section 3.2 and fig. 2), comprising: a memory having executable instructions stored thereon; and a processor configured to execute the executable instructions (section 4.1—the experiments execute neural networks to analyze datasets, which inherently requires a memory having executable instructions stored thereon and a processor configured to execute the executable instructions) in order to cause the system to perform the operations of the present claim in the same manner as for claim 1, above.

Regarding Claim 26, Guo teaches a system (section 3.2 and fig. 2), comprising: a memory having executable instructions stored thereon; and a processor configured to execute the executable instructions (section 4.1—the experiments execute neural networks to analyze datasets, which inherently requires a memory having executable instructions stored thereon and a processor configured to execute the executable instructions) in order to cause the system to perform the operations of the present claim in the same manner as for claim 9, above.

Regarding Claims 2 and 19, Guo teaches wherein training the first neural network comprises training the first neural network with a target task loss value (section 3.2, classification loss Lc for the target task).
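As a technical aside (not part of the Office Action text): the SpotTune-style routing the examiner maps to claim 1 selects, for each layer, between the frozen pre-trained weights and a fine-tuned copy according to a binary gate produced by the policy network. A toy sketch with illustrative scalar values standing in for whole layers:

```python
# Toy per-layer gating between frozen and fine-tuned weights
# (SpotTune-style routing, simplified to one scalar per "layer").
frozen = [1.0, 2.0, 3.0]      # pre-trained (frozen) layer weights
finetuned = [1.1, 2.5, 2.7]   # task-adapted (fine-tuned) copies
gates = [0, 1, 1]             # policy-network decision per layer

# Effective weight per layer: the gate selects the fine-tuned copy,
# otherwise the frozen original passes through unchanged.
effective = [g * ft + (1 - g) * fr
             for g, ft, fr in zip(gates, finetuned, frozen)]
print(effective)              # [1.0, 2.5, 2.7]
```

In the actual method the gates are produced per input by the policy network rather than fixed, but the mixing rule has this form.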
Regarding Claims 3 and 20, Guo teaches wherein training the second neural network comprises training the second neural network based on a target cost input and a policy loss value (section 3.3—the value k can be considered a target cost input because it directly controls the computation cost; section 3.2 describes training the policy network {second neural network} with backpropagation using a Gumbel-Softmax distribution, indicating a policy loss value).

Regarding Claims 4, 11, and 21, Guo teaches wherein the policy loss value is calculated based on an average Gumbel loss over the plurality of layers in the machine learning model and the target cost input (section 3.2).

Regarding Claims 5 and 22, Guo teaches wherein the average Gumbel loss comprises an average of a product of a layer-wise weighting for a respective layer in the machine learning model based on a number of parameters in the weights defined for the machine learning model and a Gumbel loss for the respective layer (section 3.2 and equation 4).

Regarding Claims 6, 12, and 23, Guo teaches wherein a total loss used in training the second neural network is based on a target task loss value and a product of the policy loss value and a loss weighting hyperparameter (section 3.2—the specific calculation of a product of the policy loss value and a loss weighting hyperparameter is a matter of design choice as is known in the art).

Regarding Claim 10, Guo teaches wherein the machine learning model comprises a model having been generated based on a first neural network trained based on a target loss value and a second neural network trained based on a target cost input and a policy loss value (section 3.3—the value k can be considered a target cost input because it directly controls the computation cost; section 3.2 describes training the first neural network based on classification loss Lc for the target task and training the policy network {second neural network} with backpropagation using a Gumbel-Softmax distribution, indicating a policy loss value).

Regarding Claims 15 and 29, Guo teaches wherein the one or more actions comprise an action different from an action for which the machine learning model was originally trained (section 4.1—transfer learning comprises a machine learning model performing a different action than the one for which the model was originally trained; the model may be originally trained on a dataset for classifying images, for example {the ImageNet dataset}, and the action may be recognizing German traffic signs).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 7-8, 13-14, 24-25, and 27-28 are rejected under 35 U.S.C. 103 as being unpatentable over Guo, as applied to claims 1, 9, 18, and 26, above, in view of Wallingford, Matthew, et al. ("Task Adaptive Parameter Sharing for Multi-Task Learning," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022; hereinafter "Wallingford").
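As another technical aside: the Gumbel-Softmax training cited above for claims 3-6 and 10 relaxes the discrete freeze/fine-tune choice into a differentiable sample, and the claimed total loss combines a task loss with a weighted policy loss. A generic sketch of the relaxation and the loss shape (the function, symbols, and values are illustrative, not Guo's implementation):

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def gumbel_softmax(logits, tau=1.0):
    """Differentiable relaxation of a discrete gate choice: perturb the
    logits with Gumbel noise, then apply a softmax with temperature tau."""
    noise = [-math.log(-math.log(random.random())) for _ in logits]
    z = [(l + n) / tau for l, n in zip(logits, noise)]
    m = max(z)                            # stabilize the softmax
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

# Per-layer soft gate probabilities over [freeze, fine-tune] (toy logits).
probs = gumbel_softmax([0.2, 1.3], tau=0.5)

# Claimed total-loss shape: task loss plus a hyperparameter-weighted
# policy loss (toy values for illustration only).
task_loss, policy_loss, lam = 0.42, 0.10, 0.5
total_loss = round(task_loss + lam * policy_loss, 2)
print(total_loss)   # 0.47
```

Lower temperatures push the soft gates toward the binary decisions used at inference time.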
Regarding Claims 7, 13, 24, and 27, Guo does not specifically teach wherein: updating the machine learning model comprises binarizing the generated plurality of gating parameters, a gating parameter for a respective layer that is less than a threshold value does not modify weights for the respective layer, and a gating parameter for the respective layer that is greater than the threshold value modifies the weights for the respective layer by a respective learned reparameterizing weight.

However, Wallingford teaches updating a machine learning model comprises binarizing a generated plurality of gating parameters, a gating parameter for a respective layer that is less than a threshold value does not modify weights for the respective layer, and a gating parameter for the respective layer that is greater than the threshold value modifies the weights for the respective layer by a respective learned reparameterizing weight (section 3—the gating parameter Iτ is binarized based on a scoring parameter si being compared to the threshold τ; weights are only modified when the gating parameter is greater than the threshold τ).

All of the claimed elements were known in Guo and Wallingford and could have been combined by known methods with no change in their respective functions. It therefore would have been obvious to a person of ordinary skill in the art at the time of filing of the applicant's invention to combine the binarization of Wallingford with the gating parameters of Guo to yield the predictable result of wherein: updating the machine learning model comprises binarizing the generated plurality of gating parameters, a gating parameter for a respective layer that is less than a threshold value does not modify weights for the respective layer, and a gating parameter for the respective layer that is greater than the threshold value modifies the weights for the respective layer by a respective learned reparameterizing weight. One would be motivated to make this combination for the purpose of selecting the minimal necessary subset of layers that needs to be tuned to achieve the best performance (Wallingford, section 3).

Claims 8, 14, 25, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Guo.

Regarding Claims 8, 14, 25, and 28, Guo does not explicitly teach wherein the second neural network comprises a network including a plurality of linear layers and a non-linear layer comprising a rectified linear unit (ReLU). However, the examiner takes official notice that a non-linear layer comprising a rectified linear unit (ReLU) is well-known in the art as a layer that follows linear intermediate layers of a neural network such as the policy network (second neural network) of Guo.

Claims 16-17 and 30 are rejected under 35 U.S.C. 103 as being unpatentable over Guo, as applied to claims 9 and 26, above, in view of Wu, Zhengmin, et al. ("Fast location and classification of small targets using region segmentation and a convolutional neural network," Computers and Electronics in Agriculture 169 (2020): 105207; hereinafter "Wu").

Regarding Claims 16 and 30, Guo does not specifically teach wherein the one or more actions comprise segmenting visual content into one or more segments, each segment of the one or more segments corresponding to different classes of objects present in a scene captured in the visual content. However, Wu teaches wherein one or more actions comprise segmenting visual content into one or more segments, each segment of the one or more segments corresponding to different classes of objects present in a scene captured in the visual content (section 2.3 and fig. 5—an image is segmented and different targets in each segment are labeled by a neural network as different classes; section 1 further details the different classes {parts of a broken hickory nut}).

All of the claimed elements were known in Guo and Wu and could have been combined by known methods with no change in their respective functions. It therefore would have been obvious to a person of ordinary skill in the art at the time of filing of the applicant's invention to combine the image segmentation and classification of Wu with the classifying and actions of Guo to yield the predictable result of wherein the one or more actions comprise segmenting visual content into one or more segments, each segment of the one or more segments corresponding to different classes of objects present in a scene captured in the visual content. One would be motivated to make this combination for the purpose of improving the recognition rate and allowing effective sorting of objects (Wu, section 1, last paragraph).

Regarding Claim 17, Guo/Wu teaches wherein the one or more actions further comprise controlling one or more physical devices based on the segmented visual content (Wu, section 2.5 and fig. 7—the action comprises controlling a machine to sort objects).

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to HAL W SCHNEE whose telephone number is (571) 270-1918. The examiner can normally be reached M-F 7:30 a.m. - 6:00 p.m.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Michael Huntley, can be reached at 303-297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/HAL SCHNEE/
Primary Examiner, Art Unit 2129
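Stepping outside the Office Action text: the thresholded binarization the examiner attributes to Wallingford for claims 7, 13, 24, and 27 reduces to a simple per-layer rule: only layers whose gating score clears the threshold receive the learned reparameterizing delta. A hedged sketch with illustrative names and toy values (not Wallingford's code):

```python
# Toy thresholded gating (TAPS-style, names illustrative): layers whose
# score clears the threshold get the learned reparameterizing delta.
scores = [0.2, 0.7, 0.9]      # per-layer gating scores
base = [1.0, 1.0, 1.0]        # base (pre-trained) weights
delta = [0.3, 0.3, 0.3]       # learned reparameterizing weights
threshold = 0.5

gates = [1 if s > threshold else 0 for s in scores]           # binarize
updated = [b + g * d for b, g, d in zip(base, gates, delta)]  # apply delta
print(gates)    # [0, 1, 1]
print(updated)  # [1.0, 1.3, 1.3]
```

Layer 0 stays at its base weight because its score falls below the threshold, which is exactly the claim limitation the rejection maps to this mechanism.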

Prosecution Timeline

May 24, 2023
Application Filed
Mar 11, 2026
Non-Final Rejection — §102, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12596927
INFORMATION PROCESSING SYSTEM HAVING AN INFORMATION PROCESSING DEVICE AND MULTIPLE PARTIAL RESERVOIRS THAT ARE TRAINED, INFORMATION PROCESSING DEVICE THAT TRAINS MULTIPLE PARTIAL RESERVOIRS, AND NON-TRANSITORY COMPUTER READABLE MEMORY MEDIUM THAT STORES INFORMATION PROCESSING PROGRAM FOR TRAINING MULTIPLE PARTIAL RESERVOIRS
2y 5m to grant Granted Apr 07, 2026
Patent 12572785
METHODS AND HARDWARE FOR INTER-LAYER DATA FORMAT CONVERSION IN NEURAL NETWORKS
2y 5m to grant Granted Mar 10, 2026
Patent 12547886
ARTIFICIAL INTELLIGENCE-BASED INFORMATION MANAGEMENT SYSTEM PERFORMANCE METRIC PREDICTION
2y 5m to grant Granted Feb 10, 2026
Patent 12536414
NOTIFICATION MANAGEMENT AND CHANNEL SELECTION
2y 5m to grant Granted Jan 27, 2026
Patent 12524654
METACOGNITIVE SEDENION-VALUED NEURAL NETWORKS
2y 5m to grant Granted Jan 13, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2
Expected OA Rounds
84%
Grant Probability
99%
With Interview (+22.1%)
2y 11m
Median Time to Grant
Low
PTA Risk
Based on 595 resolved cases by this examiner. Grant probability derived from career allow rate.
