Prosecution Insights
Last updated: April 19, 2026
Application No. 17/534,340

ELECTRONIC DEVICE AND METHOD FOR TRAINING NEURAL NETWORK MODEL

Non-Final OA §103
Filed
Nov 23, 2021
Examiner
HICKS, AUSTIN JAMES
Art Unit
2142
Tech Center
2100 — Computer Architecture & Software
Assignee
Industrial Technology Research Institute
OA Round
3 (Non-Final)
76%
Grant Probability
Favorable
3-4
OA Rounds
3y 4m
To Grant
99%
With Interview

Examiner Intelligence

Grants 76% — above average
76%
Career Allow Rate
308 granted / 403 resolved
+21.4% vs TC avg
Strong +25% interview lift
+25.1%
Interview Lift
(based on resolved cases with interview)
Typical timeline
3y 4m
Avg Prosecution
54 currently pending
Career history
457
Total Applications
across all art units
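Taken at face value, the headline figures above are simple ratios. A minimal sketch of the arithmetic; the 99% cap on the "with interview" figure is an assumption about how the dashboard combines the numbers, not something the source states:

```python
# Quick check of the dashboard's headline arithmetic.
# Assumption (not stated by the source): the "with interview" figure is
# the career allow rate plus the interview lift, capped at 99%.
granted, resolved = 308, 403
allow_rate = 100 * granted / resolved        # career allow rate, in percent
interview_lift = 25.1                        # percentage points

with_interview = min(99.0, allow_rate + interview_lift)

print(f"allow rate: {allow_rate:.1f}%")      # ~76.4%, shown as 76%
print(f"with interview: {with_interview}%")  # capped at 99.0%
```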

Statute-Specific Performance

§101: 13.9% (-26.1% vs TC avg)
§103: 46.3% (+6.3% vs TC avg)
§102: 17.3% (-22.7% vs TC avg)
§112: 19.2% (-20.8% vs TC avg)
Deltas are versus the estimated Tech Center average • Based on career data from 403 resolved cases
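The per-statute deltas shown above all back out to the same baseline, which suggests the Tech Center average is a single flat estimate. A quick check (the flat-baseline reading is an inference from the numbers, not something the source states):

```python
# Reconstruct the Tech Center baseline from each statute's rate and delta.
# If the dashboard used per-statute baselines these would differ; they don't.
rates  = {"101": 13.9, "103": 46.3, "102": 17.3, "112": 19.2}
deltas = {"101": -26.1, "103": 6.3, "102": -22.7, "112": -20.8}

tc_avg = {s: round(rates[s] - deltas[s], 1) for s in rates}
print(tc_avg)   # every statute reconstructs to 40.0
```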

Office Action

§103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 8/22/2025 have been fully considered but they are not persuasive. Applicant argues, "[t]here is nowhere in the teaching of Sohn that mentions the concept of a first neural network model includes a first sub-neural network model and a second sub-neural network model, and determine whether the second pseudo-label matches the first pseudo-label according to the probability vector." Remarks 15. The submodels are shown in figure 1 below as the upper model and the lower model. The "according to" language pushes the interpretation of the claim to include any matching determination that includes a probability vector. In fig. 1 below, the matching happens at H(p,q), and it is done according to the two probability/prediction vectors output from the two models.

[Image: Sohn fig. 1]

Allowable Subject Matter

Claims 5-7 and 17-19 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims.
The prior art of record does not teach or make obvious:

"calculate an average probability of a first maximum probability in the first probability vector and a second maximum probability in the second probability vector; and in response to the average probability being greater than a probability threshold, determine whether the second pseudo-label matches the first pseudo-label…" (claims 5 and 17);

"in response to a first maximum probability in the first probability vector being greater than a probability threshold and a second maximum probability in the second probability vector being greater than the probability threshold, determine whether the second pseudo-label matches the first pseudo-label…" (claims 6 and 18); or

"the second pseudo-labeled data comprises a first probability vector corresponding to the first sub-neural network model and a second probability vector corresponding to the second sub-neural network model, wherein the processor is further configured to: calculate an average probability vector of the first probability vector and the second probability vector; and determine the second pseudo-label according to the average probability vector." (claims 7 and 19).

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-4, 8-16 and 20-24 are rejected under 35 U.S.C. 103 as being unpatentable over "Self-training with Noisy Student improves ImageNet classification" by Xie et al., US 20150095017 A1 to Mnih et al., and "FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence" by Sohn et al.

Xie teaches claims 1 and 13.

An… (Xie fig. 1: the student model is the first NN, see below.)

[Image: Xie fig. 1]

obtain a first pseudo-labeled data with a first pseudo-label; (Xie fig. 1 shows the pseudo-labeled dataset used by the student model.)

input the first pseudo-labeled data into the first neural network model to obtain a second pseudo-labeled data, (Xie abs.: "we first train an EfficientNet model on labeled images and use it as a teacher to generate pseudo labels for 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo labeled images. We iterate this process by putting back the student as the teacher." The second dataset is the pseudo labels generated by the "new teacher," which was the student model that trained on the last round of pseudo-labeled data.)

train the first neural network model according to the pseudo-labeled dataset; and (Xie abs.: "we first train an EfficientNet model on labeled images and use it as a teacher to generate pseudo labels for 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo labeled images. We iterate this process by putting back the student as the teacher.")

Xie doesn't teach an electronic device.
However, Mnih teaches: an electronic device adaptable for training a neural network model, comprising: a storage medium, storing… a processor, coupled to the storage medium, wherein the processor is configured to… (Mnih para. 15: "a system and method of predicting a word association between words in a word dictionary, comprising processor implemented steps of storing data defining a word association matrix including a plurality of vectors…" Mnih para. 74: "As will be appreciated, removable storage unit 1018 includes a computer usable storage medium having stored therein computer software and/or data.")

The claims, Mnih and Xie are all machine learning algorithms. It would have been obvious to a person having ordinary skill in the art, at the time of filing, to execute Xie on a computer because computers are the machine in the phrase "machine learning."

Xie doesn't teach matching the pseudo-labeled data. However, Sohn teaches: wherein the second pseudo-label data includes a probability vector;… determine whether a second pseudo-label corresponding to the second pseudo-labeled data matches the first pseudo-label corresponding to the first pseudo-labeled data, wherein the first neural network model includes a first sub-neural network model and a second sub-neural network model, and determine whether the second pseudo-label matches the first pseudo-label according to the probability vector; (Sohn pg. 3, above eq. 4: "To obtain an artificial label, we first compute the model's predicted class distribution given a weakly-augmented version of a given unlabeled image… Then, we use [highest probability prediction] as a pseudo-label…" The second pseudo-labeled data "matches" a first pseudo-label when one of the predictions is over a prediction threshold; see Sohn fig. 1 below.)
[Image: Sohn fig. 1]

in response to that the second pseudo-label matches the first pseudo-label, add the second pseudo-labeled data to a pseudo-labeled dataset; and (Sohn abs.: "FixMatch first generates pseudo-labels using the model's predictions on weakly augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed a strongly-augmented version of the same image.")

train a final neural network model according to a third pseudo-labeled data generated by the trained first neural network model. (Sohn fig. 1: "The model is trained to make its prediction on the strongly-augmented version match the pseudo-label via a cross-entropy loss." Sohn fig. 1's training method is done at least "300" times, according to the 300 training epochs taught on page 18. When a pseudo-label is "retained" (Sohn abs.) then it becomes a fourth pseudo-label data.)

The claims, Sohn and Xie are all machine learning algorithms. It would have been obvious to a person having ordinary skill in the art, at the time of filing, to use FixMatch to inject noise into Xie's "Noisy Student" because of "its simplicity, [and] state-of-the-art performance across a variety of standard semi-supervised learning benchmarks…" Sohn abs.

Sohn teaches claims 2 and 14.

The electronic device according to claim 1, wherein the processor is further configured to: in response to a maximum probability in the probability vector being greater than a probability threshold, determine whether the second pseudo-label matches the first pseudo-label. (Sohn pg. 3, above eq. 4: "To obtain an artificial label, we first compute the model's predicted class distribution given a weakly-augmented version of a given unlabeled image… Then, we use [highest probability prediction] as a pseudo-label…" The second pseudo-labeled data "matches" a first pseudo-label when one of the predictions is over a prediction threshold; see Sohn fig. 1 below. Specifically, fig. 1 states: "When the model assigns a probability to any class which is above a threshold (dotted line), the prediction is converted to a one-hot pseudo-label. Then, we compute the model's prediction for a strong augmentation of the same image (bottom). The model is trained to make its prediction on the strongly-augmented version match the pseudo-label via a cross-entropy loss.")

[Image: Sohn fig. 1]

Sohn teaches claims 3 and 15.

The electronic device according to claim 1, wherein the processor is further configured to: in response to the second pseudo-label matching the first pseudo-label, calculate a first cross-entropy loss between the first pseudo-labeled data and the second pseudo-labeled data; and (Sohn fig. 1 states: "When the model assigns a probability to any class which is above a threshold (dotted line), the prediction is converted to a one-hot pseudo-label. Then, we compute the model's prediction for a strong augmentation of the same image (bottom). The model is trained to make its prediction on the strongly-augmented version match the pseudo-label via a cross-entropy loss.")

train the first neural network model according to a loss function associated with the first cross-entropy loss. (Sohn pg. 3, above eq. 4: "To obtain an artificial label, we first compute the model's predicted class distribution given a weakly-augmented version of a given unlabeled image… Then, we use [highest probability prediction] as a pseudo-label…" The second pseudo-labeled data "matches" a first pseudo-label when one of the predictions is over a prediction threshold; see Sohn fig. 1 above.)

Xie teaches claims 4 and 16.

The electronic device according to claim 3, wherein the processor is further configured to: obtain a first labeled data; (labeled data in fig. 1 of Xie.)

input the first labeled data to the first neural network model to obtain a second labeled data; (Xie fig. 1 shows a teacher model, which is the first iteration of the student model, and it is fed labeled data, see below. The second labeled data in Xie is Xie's first set of pseudo-labels.)

[Image: Xie fig. 1]

calculate a second cross-entropy loss between the first labeled data and the second labeled data; and (Xie sec. 2: "We then train a student model which minimizes the combined cross entropy loss on both labeled images and unlabeled images.")

train the first neural network model according to the loss function associated with the second cross-entropy loss. (Xie sec. 2: "We then train a student model which minimizes the combined cross entropy loss on both labeled images and unlabeled images." The first student model is "equal-or-larger," which means that the first student model may just be the teacher model.)

Sohn teaches claims 8 and 20.
The electronic device according to claim 1, the second pseudo-label comprises a first sub-pseudo-label corresponding to the first sub-neural network model and a second sub-pseudo-label corresponding to the second sub-neural network model, (Sohn's first weakly-augmented image is the first sub-pseudo-label. The second strongly-augmented image comprises the first weakly-augmented image and the augments, and the strongly-augmented image is the second pseudo-label.)

wherein the processor is further configured to: in response to the first sub-pseudo-label matching the first pseudo-label and the second sub-pseudo-label matching the first pseudo-label, determine that the second pseudo-label matches the first pseudo-label. (Sohn fig. 1 states: "When the model assigns a probability to any class which is above a threshold (dotted line), the prediction is converted to a one-hot pseudo-label. Then, we compute the model's prediction for a strong augmentation of the same image (bottom). The model is trained to make its prediction on the strongly-augmented version match the pseudo-label via a cross-entropy loss." The branching nature of the different inputs makes the model with the weakly-augmented image a first subnetwork, and the heavily augmented image goes into the second subnetwork. Sohn sec. 2.1; Sohn fig. 1 matches the strongly and weakly augmented images.)

[Image: Sohn fig. 1]

Sohn teaches claims 9 and 21.

The electronic device according to claim 1, wherein the processor is further configured to: train a second neural network model according to a labeled dataset; (Sohn sec. 2.2: "The loss function for FixMatch consists of two cross-entropy loss terms: a supervised loss ls applied to labeled data and an unsupervised loss lu. Specifically, ls is just the standard cross-entropy loss on weakly augmented labeled examples…" The cross-entropy loss is used to train the model.)
input an unlabeled dataset into the second neural network model to obtain a highly trusted pseudo-labeled dataset and a partially trusted pseudo-labeled dataset; and (The augmented images in Sohn fig. 1 are the unlabeled dataset. The pseudo-label from the weakly-augmented images is the highly trusted pseudo-labeled dataset. The prediction from the strongly-augmented image is the partially trusted pseudo-labeled dataset.)

train the first neural network model according to the partially trusted pseudo-labeled dataset, wherein the partially trusted pseudo-labeled dataset comprises the first pseudo-labeled data. (Sohn fig. 1: "The model is trained to make its prediction on the strongly-augmented version match the pseudo-label via a cross-entropy loss.")

Sohn teaches claims 10 and 22.

The electronic device according to claim 9, wherein the processor is further configured to: train the final neural network model according to the labeled dataset, the highly trusted pseudo-labeled dataset, and the partially trusted pseudo-labeled dataset. (Sohn fig. 1: "The model is trained to make its prediction on the strongly-augmented version match the pseudo-label via a cross-entropy loss." The partially trusted dataset is the predictions from the strongly-augmented images. The highly trusted and labeled dataset are the pseudo-label from the weakly-augmented images, which includes labeled data; Sohn sec. 2.2: "The loss function for FixMatch consists of two cross-entropy loss terms: a supervised loss ls applied to labeled data and an unsupervised loss lu. Specifically, ls is just the standard cross-entropy loss on weakly augmented labeled examples…")

Sohn teaches claims 11 and 23.

The electronic device according to claim 10, wherein the processor is further configured to: input the third pseudo-labeled data in the partially trusted pseudo-labeled dataset into the first neural network model to obtain a fourth pseudo-labeled data; and (Sohn fig. 1's training method is done at least "300" times, according to the 300 training epochs taught on page 18. When a pseudo-label is "retained" (Sohn abs.) then it becomes a third pseudo-label data.)

in response to a fourth pseudo-label of the fourth pseudo-labeled data matching a third pseudo-label of the third pseudo-labeled data, update the partially trusted pseudo-labeled dataset according to the fourth pseudo-labeled data. (Sohn pg. 5, first paragraph below the table 1 description: "Pseudo-labeling refers to a specific variant where model predictions are converted to hard labels [25], which is often used along with a confidence-based thresholding that retains unlabeled examples only when the classifier is sufficiently confident…")

Sohn teaches claims 12 and 24.

The electronic device according to claim 10, wherein the processor is further configured to: input the third pseudo-labeled data in the partially trusted pseudo-labeled dataset into the first neural network model to obtain a fourth pseudo-labeled data; (Sohn fig. 1's training method is done at least "300" times, according to the 300 training epochs taught on page 18. When a pseudo-label is "retained" (Sohn abs.) then it becomes a fourth pseudo-label data.)

in response to a fourth pseudo-label of the fourth pseudo-labeled data not matching a third pseudo-label of the third pseudo-labeled data, output the fourth pseudo-labeled data and receive a fourth labeled data corresponding to the fourth pseudo-labeled data; and (Sohn pg. 5, first paragraph below the table 1 description: "Pseudo-labeling refers to a specific variant where model predictions are converted to hard labels [25], which is often used along with a confidence-based thresholding that retains unlabeled examples only when the classifier is sufficiently confident…")

update the labeled dataset according to the fourth labeled data. (Sohn pg. 5, first paragraph below the table 1 description: "Pseudo-labeling refers to a specific variant where model predictions are converted to hard labels [25], which is often used along with a confidence-based thresholding that retains unlabeled examples only when the classifier is sufficiently confident…" If they don't match, the hard label is not retained; this is the update.)

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Austin Hicks, whose telephone number is (571) 270-3377. The examiner can normally be reached Monday - Thursday, 8-4 PST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Miranda Huang, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AUSTIN HICKS/
Primary Examiner, Art Unit 2124
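Stepping back from the claim mapping: the confidence-gating logic that separates the claims indicated as allowable (5-7 and 17-19) from the rejected ones can be sketched in plain Python. This is a minimal sketch; the function names are illustrative, the probability vectors stand in for the two sub-models' outputs, and 0.95 is a typical FixMatch-style threshold, none of which is claim language or code from the cited references.

```python
# Illustrative sketches of the confidence checks at issue; names and the
# 0.95 threshold are assumptions, not language from Sohn or the claims.

def fixmatch_pseudo_label(p_weak, threshold=0.95):
    """Sohn-style gate: keep the pseudo-label (argmax class index)
    only if the maximum predicted probability exceeds the threshold."""
    conf = max(p_weak)
    return p_weak.index(conf) if conf > threshold else None

def matches_avg_confidence(p1, p2, threshold=0.95):
    """Claims 5/17 flavor: gate matching on the average of the two maxima."""
    return (max(p1) + max(p2)) / 2 > threshold

def matches_both_confident(p1, p2, threshold=0.95):
    """Claims 6/18 flavor: each maximum must individually exceed the threshold."""
    return max(p1) > threshold and max(p2) > threshold

def label_from_avg_vector(p1, p2):
    """Claims 7/19 flavor: argmax of the element-wise average vector."""
    avg = [(a + b) / 2 for a, b in zip(p1, p2)]
    return avg.index(max(avg))
```

Note that the claims 6/18 gate is strictly stronger than the claims 5/17 gate: two vectors with maxima 0.99 and 0.93 pass the averaged check (0.96 > 0.95) but fail the per-vector check.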

Prosecution Timeline

Nov 23, 2021
Application Filed
Dec 14, 2024
Non-Final Rejection — §103
Mar 19, 2025
Response Filed
Apr 18, 2025
Final Rejection — §103
Aug 22, 2025
Request for Continued Examination
Aug 31, 2025
Response after Non-Final Action
Sep 04, 2025
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12591767
NEURAL NETWORK ACCELERATION CIRCUIT AND METHOD
2y 5m to grant Granted Mar 31, 2026
Patent 12554795
REDUCING CLASS IMBALANCE IN MACHINE-LEARNING TRAINING DATASET
2y 5m to grant Granted Feb 17, 2026
Patent 12530630
Hierarchical Gradient Averaging For Enforcing Subject Level Privacy
2y 5m to grant Granted Jan 20, 2026
Patent 12524694
OPTIMIZING ROUTE MODIFICATION USING QUANTUM GENERATED ROUTE REPOSITORY
2y 5m to grant Granted Jan 13, 2026
Patent 12524646
VARIABLE CURVATURE BENDING ARC CONTROL METHOD FOR ROLL BENDING MACHINE
2y 5m to grant Granted Jan 13, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

3-4
Expected OA Rounds
76%
Grant Probability
99%
With Interview (+25.1%)
3y 4m
Median Time to Grant
High
PTA Risk
Based on 403 resolved cases by this examiner. Grant probability derived from career allow rate.
