Prosecution Insights
Last updated: April 19, 2026
Application No. 18/074,915

SYSTEM AND METHOD FOR UNSUPERVISED MULTI-MODEL JOINT REASONING

Non-Final OA (§101, §103)
Filed: Dec 05, 2022
Examiner: FIGUEROA, KEVIN W
Art Unit: 2124
Tech Center: 2100 — Computer Architecture & Software
Assignee: Huawei Cloud Computing Technologies Co. Ltd.
OA Round: 1 (Non-Final)
Grant Probability: 70% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 4y 0m
With Interview: 91%

Examiner Intelligence

Career Allow Rate: 70% — above average (252 granted / 362 resolved; +14.6% vs TC avg)
Interview Lift: +21.0% on resolved cases with interview
Avg Prosecution: 4y 0m (20 applications currently pending)
Total Applications: 382 across all art units

Statute-Specific Performance

§101: 24.4% (-15.6% vs TC avg)
§103: 52.0% (+12.0% vs TC avg)
§102: 9.1% (-30.9% vs TC avg)
§112: 7.1% (-32.9% vs TC avg)
Baseline: Tech Center average estimate. Based on career data from 362 resolved cases.

Office Action

Rejections: §101, §103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Objections

Claims 17-19 are objected to because of the following informalities: the claims depend on "the computer system of claim 1". Claim 1 is a method claim. It is assumed that these claims are intended to recite "the computer system of claim 13". Appropriate correction is required.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Regarding claim 1:

Step 1: Is the claim to a process, machine, manufacture or composition of matter? Yes, the claim is directed to a method.

Step 2A Prong One: Does the claim recite an abstract idea, law of nature, or natural phenomenon?
The limitations of: predicting a first label for the input sample using a first machine learning (ML) model that has been trained to map samples to a first set of labels (an abstract mental process; a human can, through the aid of a generic computer/learning model or mentally, look at a sample and determine a label for it); determining if the first label satisfies prediction accuracy criteria (a mental process; a human can, based on arbitrary criteria, determine if a label is satisfactory or not); when the first label does not satisfy the prediction accuracy criteria, predicting a second label for the input sample using a second ML model that has been trained to map samples to a second set of labels that includes the first set of labels and a set of additional labels (a mental process; when the human determines that the label is not satisfactory, they can use another, different process to determine a different label).

Step 2A Prong Two: Does the claim recite additional elements that integrate the judicial exception into a practical application?

The limitations of: when the first label satisfies the prediction accuracy criteria, outputting the first label as the predicted label for the input sample (outputting data, insignificant extra-solution activity, MPEP 2106.05(g)) […] and outputting the second label as the predicted label for the input sample (outputting data, insignificant extra-solution activity, MPEP 2106.05(g)).

Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
The limitations of: when the first label satisfies the prediction accuracy criteria, outputting the first label as the predicted label for the input sample (outputting data, insignificant extra-solution activity, MPEP 2106.05(g); transmitting data is well-understood, routine, and conventional in the art, MPEP 2106.05(d)(II)(i)) […] and outputting the second label as the predicted label for the input sample (outputting data, insignificant extra-solution activity, MPEP 2106.05(g); transmitting data is well-understood, routine, and conventional in the art, MPEP 2106.05(d)(II)(i)).

Dependent claim 2 recites determining if the label satisfies a criterion: mental evaluation. Dependent claim 3 recites predicting a probability for the label (a mental process), determining a free energy value (mathematical concepts), and comparing the value to a threshold (mental evaluation). Dependent claim 4 recites determining an entropy value and comparing it: mathematical concepts and mental evaluations. Dependent claim 5 recites determining accuracy and labels: mental processes. Dependent claim 6 recites one model being smaller than another: instructions to apply the abstract idea, MPEP 2106.05(f). Dependent claim 7 recites executing on a computing system: instructions to apply the abstract idea, MPEP 2106.05(f). Dependent claim 8 recites executing on different devices: instructions to apply the abstract idea, MPEP 2106.05(f). Dependent claim 9 recites predicting labels, determining a subset, and training the model: mental processes along with instructions to apply the abstract idea, MPEP 2106.05(f). Dependent claim 10 recites training the model: instructions to apply the abstract idea, MPEP 2106.05(f). Dependent claim 11 recites the models being neural network models with different layers: instructions to apply the abstract idea, MPEP 2106.05(f). Independent claim 12 is a combination of independent claim 1 and dependent claim 3, and therefore is subject to the analysis of those claims.
Independent claim 13 recites the same substantial subject matter as independent claim 1, only differing in embodiment. The change in embodiment does not meaningfully change the above analysis and therefore the claim is subject to the same rejection. Dependent claims 14-19 correspond to dependent claims 2-6 and 9. Independent claim 20 recites the same substantial subject matter as independent claim 1, only differing in embodiment. The change in embodiment does not meaningfully change the above analysis and therefore the claim is subject to the same rejection.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-2, 5-8, 11, 13-14, 17-18, and 20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen et al. US 2020/0151578 in view of Lee et al. US 2020/0311546.
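For orientation before the element-by-element mapping, the two-model fallback recited in independent claim 1 can be sketched in code. This is a hypothetical illustration, not code from the application: the model callables, the softmax-confidence criterion, and the 0.9 threshold are assumptions standing in for the claim's generic "prediction accuracy criteria".

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of raw model scores.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_with_fallback(sample, first_model, second_model, threshold=0.9):
    """Return (predicted_label, used_fallback) for one input sample.

    first_model maps a sample to logits over the first (smaller) label set;
    second_model maps it to logits over the superset of labels.
    """
    probs = softmax(first_model(sample))
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= threshold:
        # Criterion satisfied: output the first label directly.
        return best, False
    # Otherwise fall back to the larger model trained on the superset.
    probs2 = softmax(second_model(sample))
    best2 = max(range(len(probs2)), key=probs2.__getitem__)
    return best2, True
```

The design point, which is also the efficiency rationale the examiner invokes for combining Chen with Lee, is that confident samples never pay the cost of the larger model.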
Regarding claims 1, 13, and 20, Chen teaches "a method for predicting a label for an input sample, comprising predicting a first label for the input sample using a first machine learning (ML) model that has been trained to map samples to a first set of labels" ([0054] "S103. Input one or more data samples without determined labels from the set of data samples without determined labels into the prediction model, and determine the corresponding prediction values output by the model as learning labels of the one or more data samples without determined labels"); "determining if the first label satisfies prediction accuracy criteria" ([0058] "S104. Obtain a sampling subset having the learning labels according to the current set of data samples without determined labels, and check the learning labels of the sampling subset, so as to obtain an accuracy rate of the learning labels"); "when the first label satisfies the prediction accuracy criteria, outputting the first label as the predicted label for the input sample" ([0061] and [0063] "S105. Determine if the accuracy rate of the learning labels meets a preset requirement. If yes, the method proceeds to S107 […] S107. Label the remaining data samples in the set of data labels without determined labels with their corresponding learning labels."); and "when the first label does not satisfy the prediction accuracy criteria" ([0061] "S105. Determine if the accuracy rate of the learning labels meets a preset requirement. If yes, the method proceeds to S107; otherwise, the method proceeds to execute S106 and then return to S102."). More explicitly, Lee teaches "predicting a second label for the input sample using a second ML model that has been trained to map samples to a second set of labels that includes the first set of labels and a set of additional labels, and outputting the second label as the predicted label for the input sample" (Lee [0038] "Referring to FIG.
2, the edge device 300 performs operations up to a previous layer to an exit point EP in view of the exit point EP, and then calculates inference confidence based on the operation result. Next, the edge device 300 compares the calculated inference confidence with a predetermined threshold. If it is determined that the confidence is higher than the threshold, the edge device 300 outputs the processed operation result and stops operations of the layers subsequent the exit point EP. Such an output structure is referred to as a "local exit". If it is determined that the inference confidence is lower than the threshold, the edge device 300 transmits an intermediate operation result obtained by processing up to corresponding layer, to the cloud 400").

It would have been obvious to one having ordinary skill in the art at the time that the invention was effectively filed to combine the teachings of Chen with that of Lee since, by combining the techniques, one can use a smaller model to make predictions, thereby improving computer efficiency, and if necessary use the bigger model when higher accuracy is needed.

Note that independent claims 13 and 20 recite the same substantial subject matter as independent claim 1, only differing in embodiment. The differences in embodiment, a system and a non-transitory computer readable medium, are taught by Chen [0014].

Regarding claims 2 and 14, the Chen and Lee references have been addressed above. Chen further teaches "wherein the determining if the first label satisfies the prediction accuracy criteria comprises evaluating if the input sample is in-distribution relative to a distribution that corresponds to the first set of labels, wherein when the input sample is evaluated to be in-distribution then the first label satisfies the prediction accuracy criteria" ([0061] and [0063] "S105. Determine if the accuracy rate of the learning labels meets a preset requirement. If yes, the method proceeds to S107 […] S107.
Label the remaining data samples in the set of data labels without determined labels with their corresponding learning labels.").

Regarding claims 5 and 17, the Chen and Lee references have been addressed above. Chen further teaches "wherein the first ML model is trained to map samples that fall within the set of additional labels to a further label, and determining if the first label satisfies prediction accuracy criteria comprises, prior to evaluating if the input sample is in-distribution, determining if the first label predicted for the input sample corresponds to the further label, and if so then determining that the first label does not satisfy the prediction accuracy criteria" ([0062] "S106. Label the data samples in the sampling subset with the label checking result that comprises reliable labels for the data samples, and move the sampling subset from the set of data samples without determined labels to the set of data samples with determined labels." and [0064] "Assuming that U.sub.S is used to represent a sampling subset, if the accuracy rate of the learning labels of U.sub.S reaches a criterion (e.g., 95%), it is considered that learning labels of the entire U (or U.sub.0) are trustworthy and may be output directly as labeling results; otherwise, the next round of learning may be triggered.").

Regarding claims 6 and 18, the Chen and Lee references have been addressed above. Lee further teaches "wherein the first ML model is a smaller ML model than the second ML model" (Lee [0004] "the edge device performs the operation processes up to only some layers of the DNN, then determines an inference result, and does not perform any more unnecessary operations if it is determined that the confidence of the inference result is satisfied to some extent", i.e. the edge device model is smaller).

Regarding claim 7, the Chen and Lee references have been addressed above.
Chen further teaches "wherein the first ML model and the second ML model are executed on a first computing system, the method including receiving the input sample at the first computing system through a network and returning the predicted label through the network" ([0089] "The systems, devices, modules or units set forth in the foregoing embodiments may be achieved, for example, by computer chips or entities, or by products with certain functions.").

Regarding claim 8, the Chen and Lee references have been addressed above. Lee further teaches "wherein the first ML model is executed on a first device and the second ML model is executed on a second device, the method comprising transmitting the input sample from the first device to the second device when the first label does not satisfy the prediction accuracy criteria" (Lee abstract "A processor partitions a deep neural network having a plurality of exit points and at least one partition point in a branch corresponding to each of the exit points, for distributed processing in an edge device and a cloud").

Regarding claim 11, the Chen and Lee references have been addressed above. Lee further teaches "wherein the first ML model and the second ML model are deep neural network models and the first ML model has fewer NN layers than the second ML model" (Lee [0038] "Referring to FIG. 2, the edge device 300 performs operations up to a previous layer to an exit point EP in view of the exit point EP, and then calculates inference confidence based on the operation result. Next, the edge device 300 compares the calculated inference confidence with a predetermined threshold. If it is determined that the confidence is higher than the threshold, the edge device 300 outputs the processed operation result and stops operations of the layers subsequent the exit point EP. Such an output structure is referred to as a "local exit".
If it is determined that the inference confidence is lower than the threshold, the edge device 300 transmits an intermediate operation result obtained by processing up to corresponding layer, to the cloud 400").

Claim(s) 3, 12 and 15 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Lee, further in view of Dayan, Peter, et al., "The Helmholtz Machine."

Regarding claims 3 and 15, the Chen and Lee references do not explicitly teach the claim limitation. Dayan however teaches "wherein the first ML model predicts a probability for each of the labels included in the first set of labels, wherein evaluating if the input sample is in-distribution comprises" (Dayan pg. 1 ¶1 "Following Helmholtz, we view the human perceptual system as a statistical inference engine whose function is to infer the probable causes of sensory input. We show that a device of this kind can learn how to perform these inferences without requiring a teacher to label each sensory input vector with its underlying causes. A recognition model is used to infer a probability distribution over the underlying causes from the sensory input, and a separate generative model, which is also learned, is used to train the recognition model"); "determining a free energy value for the input sample based on the predicted probabilities for all of the labels included in the first set of labels" (Dayan pg. 2, free-energy equation reproduced as an image in the original office action).

It would have been obvious to one having ordinary skill in the art at the time that the invention was made to combine the teachings of Chen and Lee with that of Dayan since a combination of known methods would yield predictable results. As shown in Dayan back in 1995, it was known to use free energy as a metric for determining accuracy. Therefore these techniques would operate in a known and predictable manner with the systems above.
Chen further teaches "comparing the free energy value to a defined threshold to determine when prediction accuracy criteria is satisfied" ([0061] "S105. Determine if the accuracy rate of the learning labels meets a preset requirement", with the values of above).

Regarding claim 12, claim 12 is a combination of independent claim 1 and dependent claim 3. Therefore the claim is rejected under the combination of both rejections.

Claim(s) 4 and 16 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Lee, further in view of Hazard et al. US 2019/0311220.

Regarding claims 4 and 16, the Chen and Lee references have been addressed above. The references do not explicitly teach entropy values. Hazard however teaches "wherein the first ML model predicts a probability for each of the labels included in the first set of labels, wherein evaluating if the input sample is in-distribution comprises: determining an entropy value for the input sample based on the predicted probabilities for all of the labels included in the first set of labels" (Hazard [0017] "Various embodiments herein use a measure of information entropy to determine the additional surprisal (or surprise) that a data point provides to a set of data. Information entropy is the expected value of surprisal. Example measures of surprisal are described elsewhere herein."); and "comparing the entropy value to a defined threshold to determine when prediction accuracy criteria is satisfied" ([0156] "For example, a check 730 may be made to determine whether the prediction conviction is above a first threshold and whether the familiarity conviction is below a second threshold. If that condition is met, then, in some embodiments, the case may be excluded 740 from the model.
When the prediction conviction is high and the familiarity conviction is low, then, in some embodiments, it may be the case that the case is easy to "label" or associate with an outcome, but is not needed in the model (e.g., it does not provide much or any additional information), and it therefore may be excluded 740 from the model. Therefore, the case can be excluded without reducing the overall effectiveness of the model by much. The thresholds for high prediction conviction and low familiarity conviction may be any appropriate threshold including, a value scaled by the size of the model, a value scaled by the accuracy of the model, a fixed value, etc. If the conviction measure is used instead of conviction (without being taken as a ratio to the expected value), then additional thresholds may be appropriate including a fixed value of entropy, entropy scaled based on the model, entropy scaled based on other measures of the model, etc.")

It would have been obvious to one having ordinary skill in the art at the time that the invention was made to combine the teachings of Chen and Lee with that of Hazard since a combination of known methods would yield predictable results. As shown in Hazard, entropy is a well-known machine learning loss function and would operate as expected in the systems above.

Claim(s) 9-10 and 19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Chen in view of Lee, further in view of Kurata US 2017/0061330.

Regarding claims 9 and 19, the Chen and Lee references have been addressed above. They do not explicitly teach the claim limitations.
Kurata however teaches "comprising, prior to predicting the first label, training the first model by: predicting labels for a set of unlabeled data samples using the second ML model to generate a set of pseudo-labeled data samples that correspond to the second set of labels" (Kurata [0064] "The selection of the subset may be performed based on a frequency of appearance relevant to each combination in the given training data 140"); "determining a subset of the second set of labels to include in the first set of labels based on the frequency of occurrence of the labels in the set of pseudo-labeled data samples" (Kurata [0064] "The co-occurring combination listing module 132 is configured to list the labels co-occurred in the training data 140 so as to obtain combinations of co-occurring labels expected to be appeared together for an input query. In a preferable embodiment, the co-occurring combination listing module 132 is further configured to select a subset from among the listed combinations. The selection of the subset may be performed based on a frequency of appearance relevant to each combination in the given training data 140"); and "training the first ML model using the set of pseudo-labeled data samples to map samples to the first set of labels" (Kurata [0064] "By selecting the subset, relatively popular co-occurring combinations can be treated in a preferential manner even if the number of the combination exceeds capacity of the novel learning technique owing to the topology of the neural network.").

It would have been obvious to one having ordinary skill in the art at the time that the invention was made to combine the teachings of Chen and Lee with that of Kurata since the frequency of an item is a known metric of its importance. Therefore, by combining frequency measurements, the learning system would operate in a predictable manner with the references as shown above.
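The prediction-accuracy criteria at issue in claims 3-4 and 15-16 turn on a free energy value and an entropy value computed from the first model's predicted probabilities. The application's exact formulas are not reproduced in the office action; the sketch below uses common formulations as stand-ins (negative log-sum-exp of the logits for free energy, Shannon entropy of the probabilities), and the threshold parameters are illustrative, not values from the claims.

```python
import math

def free_energy(logits):
    # One common free-energy formulation: -log(sum(exp(logit))) over all
    # labels, computed stably by factoring out the max logit.
    m = max(logits)
    return -(m + math.log(sum(math.exp(x - m) for x in logits)))

def entropy(probs):
    # Shannon entropy of the predicted label distribution (natural log).
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def satisfies_criteria(logits, probs, energy_threshold, entropy_threshold):
    # Per the claims, each value is compared against a defined threshold;
    # low free energy and low entropy indicate a confident, in-distribution
    # prediction that the first label can be output directly.
    return (free_energy(logits) <= energy_threshold
            and entropy(probs) <= entropy_threshold)
```

Under these formulations a sharply peaked prediction yields low entropy, and large logits yield very negative free energy, so both checks pass for confident samples.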
Regarding claim 10, the Chen, Lee, and Kurata references have been addressed above. Chen further teaches "wherein training the first ML model comprises training the first ML model to map samples that fall within the set of additional labels to a further label that corresponds to all of the second set of labels that are not included in the first set of labels" (abstract "if the accuracy rate does not meet the preset requirement, labeling the data samples in the subset with the determined labels for the data samples in the subset, and moving the subset from the first set of data samples to the second set of data samples; and after the iteration ends, labeling the remaining data samples in the first set with the associated learning labels").

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN W FIGUEROA, whose telephone number is (571) 272-4623. The examiner can normally be reached Monday-Friday, 10AM-6PM EST.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, MIRANDA HUANG, can be reached at (571) 270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format.
For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/Kevin W Figueroa/
KEVIN W FIGUEROA
Primary Examiner, Art Unit 2124
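The training scheme of claims 9-10, which the examiner maps to Kurata, can be sketched as follows. All names here are hypothetical illustrations, not code from the application: the second (larger) model pseudo-labels a set of unlabeled samples, the most frequent pseudo-labels form the first model's label set, and samples outside that subset collapse into a single further "other" label (the further label of claim 10).

```python
from collections import Counter

OTHER = "other"  # the single further label for out-of-subset samples

def build_pseudo_labeled_set(unlabeled_samples, second_model, num_first_labels):
    """Pseudo-label samples with the larger model, pick the most frequent
    labels as the first model's label set, and remap the rest to OTHER."""
    pseudo = [(x, second_model(x)) for x in unlabeled_samples]
    counts = Counter(label for _, label in pseudo)
    first_set = {label for label, _ in counts.most_common(num_first_labels)}
    training_data = [(x, y if y in first_set else OTHER) for x, y in pseudo]
    return sorted(first_set), training_data
```

The returned training_data would then be used to fit the smaller first model over first_set plus the OTHER label; the fitting step itself is omitted here.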

Prosecution Timeline

Dec 05, 2022
Application Filed
Feb 21, 2026
Non-Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586093
SYSTEMS AND METHODS FOR FACILITATING NETWORK CONTENT GENERATION VIA A DYNAMIC MULTI-MODEL APPROACH
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12573477
MOLECULAR STRUCTURE ACQUISITION METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12570281
METHOD FOR EVALUATING DRIVING RISK LEVEL IN TUNNEL BASED ON VEHICLE BUS DATA AND SYSTEM THEREFOR
Granted Mar 10, 2026 (2y 5m to grant)
Patent 12554964
CIRCUIT FOR HANDLING PROCESSING WITH OUTLIERS
Granted Feb 17, 2026 (2y 5m to grant)
Patent 12547873
METHOD AND APPARATUS WITH NEURAL NETWORK INFERENCE OPTIMIZATION IMPLEMENTATION
Granted Feb 10, 2026 (2y 5m to grant)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 70%
With Interview: 91% (+21.0%)
Median Time to Grant: 4y 0m
PTA Risk: Low
Based on 362 resolved cases by this examiner. Grant probability derived from career allow rate.
