Last updated: May 29, 2026
Application No. 18/048,197
System and method for improving efficacy of supervised learning

Status: Final Rejection (§101, §103)
Filed: Oct 20, 2022
Priority: Oct 21, 2021 (provisional 63/270,243)
Examiner: LEE, MICHAEL CHRISTOPHER
Art Unit: 2128
Tech Center: 2100 (Computer Architecture & Software)
Assignee: NFERENCE, INC.
OA Round: 2 (Final)
This examiner grants 61% of cases after interview (+26.0% interview lift). A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 144 resolved cases, 2023–2026
Examiner Intelligence

Examiner: LEE, MICHAEL CHRISTOPHER
Career Allowance Rate: 61% of resolved cases (88 granted / 144 resolved), +6.1% vs TC avg
Interview Lift: +26.0% (strong; resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Currently pending: 19
Total Applications: 195 across all art units (career history)
Statute-Specific Performance

§101: 15.9% (-24.1% vs TC avg)
§103: 79.7% (+39.7% vs TC avg)
§102: 0.8% (-39.2% vs TC avg)
§112: 3.3% (-36.7% vs TC avg)
Tech Center averages are estimates. Based on career data from 144 resolved cases.
Office Action

§101 §103
DETAILED ACTION
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Amendment
Applicant’s Amendment and remarks dated 4/7/2026 have been considered.  Claims 1-45 are pending.
Drawing Objections.  The drawing objections are withdrawn in view of Applicant’s amendments to the specification to add reference characters 309a, 309d, 407, 410, 1007, 1008, and 1105.
35 U.S.C. 112(b) Rejections.  The rejections to claims 15-18, 22-25, 30, 34-35, 40, and 44-45 are withdrawn in view of Applicant’s amendments to such claims.
Written Description.  The examiner notes that at least para. 0142 appears to provide sufficient written description support for the amendments to the independent claims.

Response to Arguments
On page 3 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant asserts that “the claim as a whole, considered as an ordered combination, is not directed to an abstract idea.”
The examiner respectfully disagrees.  Each and every limitation of the claim (the whole claim) was considered, and the claim is almost entirely mental processes as explained in the office action.  Moreover, as an “ordered combination”, claim 1 is merely directed to labeling data, capturing disagreement between labeling agents, and quantifying model uncertainty, which is a mental process.  Claim 1 does not have any limitations related to supervised learning at all, as no machine learning model is even trained using the labeled data.

On pages 3-4 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant asserts that “the claim as a whole, considered as an ordered combination, is not directed to an abstract idea.”

[Images media_image1.png and media_image2.png: excerpts of Applicant's remarks reproduced as images; text not available]

	The examiner respectfully disagrees.  The examiner provided for each identified mental process, a simple example showing how the claim language could practically be mentally performed, which Applicant has not rebutted.  The examiner properly applied the broadest reasonable interpretation standard, and Applicant has not provided any specific argument for any specific limitation about why such interpretation is not reasonable.

On page 4 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant argues:

[Image media_image3.png: excerpt of Applicant's remarks reproduced as an image; text not available]

	The examiner respectfully disagrees.  In response to Applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., “(1) receiving labeling outputs ...; (2) computes a measure of divergence...; and (3) encodes and persists...”) are not recited in the rejected claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
	The broadest reasonable interpretation of this limitation simply requires capturing disagreement between a plurality of labeling agents, which can be as simple as noting that 2 agents disagree on a particular label.

On page 4 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant argues:


[Image media_image4.png: excerpt of Applicant's remarks reproduced as an image; text not available]

The examiner respectfully disagrees.  In response to Applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., “a systematic process of capturing disagreement” and “coordinate the outputs of multiple independent labeling agents across a corpus-scale candidate set, compute a structured divergence measure from those outputs, and retain that measure as a persistent artifact available for retrieval at a later and entirely distinct computation stage”) are not recited in the rejected claims. Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).
	Moreover, it is unclear what “categorical distinction” there is between a mental observation and a “computation pipeline.”  The entire data labeling “pipeline” recited by the claim is a mental process.

On page 4 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant argues:

[Image media_image5.png: excerpt of Applicant's remarks reproduced as an image; text not available]

	The examiner respectfully disagrees.  The broadest reasonable interpretation of this limitation includes the mental process of quantifying model uncertainty (e.g., giving a percentage to the model confidence), “at test time”, which means at the time the model is being tested and quantified.

On pages 4-5 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant argues that “at test time” is a “term of art in machine learning that refers specifically to the inference stage of a trained and deployed model” and argues that a human reviewer cannot operate “at test time.”
The examiner respectfully disagrees.  Applicant’s argument that “at test time” is a “term of art” is unsupported by the specification or any evidence.  The broadest reasonable interpretation simply requires the quantifying of the model uncertainty, using the uncertainty measure at test time, which means at the time of the quantifying of the model.

On page 5 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 1, Applicant argues:

[Image media_image6.png: excerpt of Applicant's remarks reproduced as an image; text not available]

	The examiner respectfully disagrees.  The “earlier steps” of labeling are all mental steps, and as explained above, the newly-added limitations are also mental steps.  Moreover, the claim does not recite or require any “supervised learning” as argued by Applicant.

On pages 5-7 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 2, Applicant argues that “amended claim 1 is squarely focused on a technical improvement to the supervised learning pipeline.”
The examiner respectfully disagrees.  As explained by MPEP 2106.04(d)(1), such improvement needs to be to a technology or “technical field”, and data labeling is a mental process, not a technology or technical field.  After reviewing the portions of the specification identified by Applicant, the examiner disagrees that one of ordinary skill would find the claim to pertain to any improvement to any alleged “supervised learning pipeline” because the claim itself does not even require any supervised learning.

On pages 7-8 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2A, Prong 2, Applicant argues the newly-added limitations “have no meaningful mental process analog” and integrate the judicial exception into a practical application.
The examiner respectfully disagrees.  As explained above, the newly-added limitations are mental processes.  The only non-mental limitation recited is the “pretrained model” limitation, which is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (a generic pretrained model).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).  Moreover, the “pretrained model” does not actually perform any function in the claim, as the pretrained model is passively referred to as having a “pretrained vector space.”

On pages 8-9 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection to claim 1 under 35 U.S.C. 101, with respect to Step 2B, Applicant makes arguments with respect to In re Berkheimer.
The examiner respectfully disagrees with all these arguments.  First, as explained by MPEP 2106.05(d), the well-understood, routine, conventional activity consideration is merely a consideration under Step 2B, and is not a standalone test.  The examiner relied on MPEP 2106.05(f), and explained that the “pretrained model” limitation is recited at a high-level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer component (See MPEP 2106.05(f)).
Second, the examiner acknowledges that the examiner did not provide any evidence under Berkheimer to attempt to establish that any limitations are well-understood, routine, or conventional activity.  However, Applicant has not provided any evidence to show that such limitations are not well-understood, routine, or conventional activity either.  This lack of evidence in either direction does not favor, nor disfavor, a finding of eligibility.

On pages 9-10 of Applicant’s 4/7/2026 Amendment and remarks, Applicant argues that the remaining claims are subject matter eligible for the same reasons argued with respect to claim 1.
The examiner respectfully disagrees for the same reasons explained with respect to claim 1.

On pages 10-11 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection of claim 1 under 35 U.S.C. 103, Applicant asserts that the MEDALION and SHARMA references do not teach the newly-added “capturing disagreement between a plurality of labeling agents on labeling of the first plurality of input candidates as an uncertainty measure” and “quantifying model uncertainty, using the uncertainty measure as test time, on the first plurality of input candidates” limitations.
The examiner agrees that MEDALION and SHARMA do not explicitly teach these limitations.  The previous rejection to claim 1 under 35 U.S.C. 103 is hereby withdrawn.  However, new rejections in view of the MEDALION, SHARMA, and YAO references are provided herein, where such new grounds of rejection are necessitated by Applicant’s amendments to claim 1.

On pages 11-15 of Applicant’s 4/7/2026 Amendment and remarks, with respect to the rejection of claims 2-45 under 35 U.S.C. 103, Applicant asserts that the prior art of record does not teach the newly-added “capturing disagreement between a plurality of labeling agents on labeling of the first plurality of input candidates as an uncertainty measure” and “quantifying model uncertainty, using the uncertainty measure as test time, on the first plurality of input candidates” limitations added to the independent claims.
The examiner agrees that the prior art previously of record does not explicitly teach these limitations.  The previous rejections to claims 2-45 under 35 U.S.C. 103 are hereby withdrawn.  However, new rejections in view of at least the YAO reference are provided herein, where such new grounds of rejection are necessitated by Applicant’s amendments to the independent claims.

Claim Objections
Claims 1, 26, and 36 are objected to because of the following informalities: 
In claim 1, line 11, “as test time” should read “at test time” to be consistent with the arguments made by Applicant on pages 4-5.
In claim 26, line 15, “as test time” should read “at test time” to be consistent with the arguments made by Applicant on pages 4-5.
In claim 36, line 13, “as test time” should read “at test time” to be consistent with the arguments made by Applicant on pages 4-5.
Appropriate correction is required.

Claim Rejections - 35 USC § 101
Claims 1-45 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. 

Regarding Step 1 of the Alice/Mayo framework, Claims 1-25 are directed to a method (a process), Claims 26-35 are directed to a system (a machine), and Claims 36-45 are directed to a non-transitory computer-readable medium (an article of manufacture), which each fall within one of the four statutory categories of inventions.

Regarding Claim 1
Step 2A, prong 1 (Is the claim directed to a law of nature, a natural phenomenon or an abstract idea).
Claim 1 recites the following mental processes that, in each case under the broadest reasonable interpretation, cover performance of the limitation in the mind (including an observation, evaluation, judgment, opinion) or with the aid of pencil and paper but for the recitation of generic computer components (e.g., “pretrained model”).
selecting a first plurality of input candidates from a corpus of data; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally select a plurality of words from a corpus of data, such as an encyclopedia)
mapping the first plurality of input candidates onto a pretrained vector space of a pretrained model; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pretrained vector space of a pretrained model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the pretrained vector space)
clustering the first plurality of input candidates in the pretrained vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pretrained vector space of a pretrained model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the pretrained vector space, and then draw circles around clusters of data points that are close to each other)
adding the first plurality of input candidates to a plurality of queues for labelling; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally add the selected words to a plurality of queues for labeling, where such queues can be mental lists of words for labeling)
labelling the first plurality of input candidates (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label words (the input candidates))
	capturing disagreement between a plurality of labeling agents on labeling of the first plurality of input candidates as an uncertainty measure (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally determine if multiple annotators, such as human annotators, disagree on a particular label and quantify such disagreement, such as a fraction of the number of annotators that agree divided by the number of total annotators)
	quantifying model uncertainty, using the uncertainty measure as test time, on the first plurality of input candidates (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally determine a confidence score in the model output, such as by determining if the model is 100%, 50%, or 0% accurate (rounding to the closest), by using the agreement measure of the annotators as of the time the model uncertainty is being tested)
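
For illustration only, the two examples given for the last two limitations above (an agreement fraction across annotators, and a confidence score rounded to 100%, 50%, or 0%) can be sketched in a few lines of Python. This is a minimal sketch and not anything recited in the claims or the record: the function names `agreement_fraction` and `coarse_confidence` are hypothetical, and reading "annotators that agree" as "annotators who chose the most common label" is an assumption.

```python
from collections import Counter


def agreement_fraction(labels):
    """Fraction of annotators who chose the most common label.

    Assumption: "annotators that agree" means agreement with the
    most common label among all annotators.
    """
    if not labels:
        raise ValueError("need at least one annotator label")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)


def coarse_confidence(agreement):
    """Round an agreement fraction to the closest of 0%, 50%, or 100%."""
    levels = (0.0, 0.5, 1.0)
    # On an exact tie, min() keeps the earlier (lower) level.
    return min(levels, key=lambda level: abs(level - agreement))


# Three annotators label the same word; two of them agree.
votes = ["animal", "animal", "plant"]
score = agreement_fraction(votes)      # 2 / 3
confidence = coarse_confidence(score)  # closest level is 0.5
```

The arithmetic is a single count and a single division per item, which is the scale of computation the example in the rejection describes.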

Step 2A, prong 2 (Does the claim recite additional elements that integrate the judicial exception into a practical application?).
The judicial exception is not integrated into a practical application.  In particular, the claim recites the additional elements (e.g., “pretrained model”) which are recited at a high-level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component (See MPEP 2106.05(f)). 
	Regarding the “pretrained model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception.  In particular, the claim only recites the additional element of a pretrained model.  This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (a generic pretrained model).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).
Accordingly, at Step 2A, prong two, after considering all claim elements individually and as an ordered combination, it is determined that the claims do not integrate the judicial exception into a practical application.

Step 2B (Does the claim recite additional elements that amount to significantly more than the judicial exception?)
	In accordance with Step 2B, the claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.  As discussed above, the additional elements (e.g., “pretrained model”) are recited at a high-level of generality such that they amount to no more than mere instructions to apply the exception using a generic computer component (See MPEP 2106.05(f)).
Regarding the “pretrained model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).
Accordingly, at Step 2B, after considering all claim elements individually and as an ordered combination, it is determined that the claims do not recite additional elements that amount to significantly more than the judicial exception.

Regarding Claim 2
Step 2A, Prong 1
wherein labelling the first plurality of input candidates is performed by humans.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label words with mentally-derived labels)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 3
Step 2A, Prong 1
wherein labelling the first plurality of input candidates is performed algorithmically.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label words using an algorithm to determine the label, e.g., if the word starts with a letter from a-m, label it with a “0” and if it starts with a letter from n-z, label it with a “1”)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
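
For illustration only, the algorithmic labeling rule given as an example for claim 3 above (first letter a-m gives "0", first letter n-z gives "1") can be written as a short Python function. This is a sketch of that example rule alone, not of any labeling method in the application; the name `label_by_first_letter` is hypothetical, and rejecting words that do not start with a letter a-z is an assumption, since the example does not address them.

```python
def label_by_first_letter(word):
    """Label a word "0" if it starts with a-m and "1" if it starts with n-z."""
    if not word:
        raise ValueError("cannot label an empty word")
    first = word[0].lower()
    if "a" <= first <= "m":
        return "0"
    if "n" <= first <= "z":
        return "1"
    # Assumption: anything outside a-z is left unlabeled and reported.
    raise ValueError(f"word does not start with a letter a-z: {word!r}")


labels = {w: label_by_first_letter(w) for w in ["apple", "mango", "november", "Zebra"]}
```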

Regarding Claim 4
Step 2A, Prong 1
wherein labelling comprises identifying cluster centroids in the pretrained vector space.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally identify cluster centroids, such as by mentally selecting a data point of the cluster as a centroid, where such clusters are drawn around points on a x-y axis that represents a 2-dimensional pretrained vector space)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
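
For illustration only, the claim 4 example above (selecting one data point of a cluster as its centroid on a 2-dimensional x-y plot) can be sketched as follows. The selection rule used here, picking the member point closest to the cluster mean, is an assumption added for concreteness; the example in the rejection only says a point is selected. The name `pick_centroid_point` is hypothetical.

```python
import math


def pick_centroid_point(cluster):
    """Return the member of a 2-D cluster that lies closest to the cluster mean.

    Assumption: "closest to the mean" is one possible selection rule;
    the rejection's example does not specify how the point is chosen.
    """
    if not cluster:
        raise ValueError("cluster must contain at least one point")
    mean_x = sum(x for x, _ in cluster) / len(cluster)
    mean_y = sum(y for _, y in cluster) / len(cluster)
    return min(cluster, key=lambda point: math.dist(point, (mean_x, mean_y)))


# Four plotted words; the mean is (1.0, 0.25), so (1, 0) is the nearest member.
cluster = [(0, 0), (2, 0), (1, 1), (1, 0)]
centroid = pick_centroid_point(cluster)
```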

Regarding Claim 5
Step 2A, Prong 1
wherein the pretrained vector space is created by mapping input to sparse/dense distributed representations.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map input words to sparse/dense distributed representations that have few zero values)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 6
Step 2A, Prong 1
wherein the pretrained vector space comprises learned parameters of a probability distribution.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper) calculate a probability distribution and map such probability distribution to the pretrained vector space)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 7
Step 2A, Prong 2
Regarding the “the pretrained vector space is learned by performing density estimation” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result (e.g., any manner of training a model to learn a vector space that uses density estimation in any manner).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
Regarding the “the pretrained vector space is learned by performing density estimation” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 8
Step 2A, Prong 2
Regarding the “wherein the pretrained model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (specific types of architectures for the pretrained model). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.  Moreover, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception.  In particular, the claim only recites the additional element of specific types of architectures for the pretrained model.  This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (specific types of architectures for the pretrained model).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
Regarding the “wherein the pretrained model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception.  MPEP 2106.05(h).  Moreover, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 9
Step 2A, Prong 1
further comprising partitioning the labeled first plurality of input candidates into a train set, a development set, a test set, and an out-of-distribution set, wherein partitioning the labeled first plurality of input candidates comprises: (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally partition input candidates (such as 4 different words), into each of the recited sets (1 in each))
adding labeled cluster centroids in the pretrained vector space from the first plurality of input candidates to the train set; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster centroid to the train set as recited by this limitation)
adding labeled cluster children in the pretrained vector space from the first plurality of input candidates to one of the development set and the test set; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster child to one of the development set and the test set as recited by this limitation)
adding labeled singletons in the pretrained vector space from the first plurality of input candidates to one of the train set and the out-of-distribution set.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a singleton to one of the train set and the out-of-distribution set as recited by this limitation)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
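
For illustration only, the partitioning recited in claim 9 and the four-word example given for it above can be sketched as follows. The claim leaves open which of the two permitted sets receives a cluster child or a singleton, so the choices made here (children alternate between the development and test sets, singletons all go to the out-of-distribution set) are assumptions, as are the function name `partition_labeled_candidates` and the `(word, label, role)` input format.

```python
def partition_labeled_candidates(items):
    """Split (word, label, role) tuples into train, dev, test, and OOD sets.

    role is one of "centroid", "child", or "singleton".
    Assumptions: children alternate dev/test; singletons go to OOD.
    """
    sets = {"train": [], "dev": [], "test": [], "ood": []}
    children_seen = 0
    for word, label, role in items:
        if role == "centroid":
            sets["train"].append((word, label))
        elif role == "child":
            target = "dev" if children_seen % 2 == 0 else "test"
            sets[target].append((word, label))
            children_seen += 1
        elif role == "singleton":
            sets["ood"].append((word, label))
        else:
            raise ValueError(f"unknown role: {role!r}")
    return sets


# Four labeled words, one ending up in each set.
items = [
    ("cat", "animal", "centroid"),
    ("kitten", "animal", "child"),
    ("puppy", "animal", "child"),
    ("quark", "physics", "singleton"),
]
split = partition_labeled_candidates(items)
```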

Regarding Claim 10
Step 2A, Prong 2
Regarding the “further comprising creating a fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result (e.g., any technique for fine tuning a model is covered).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
Regarding the “further comprising creating a fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 11
Step 2A, Prong 2
Regarding the “wherein creating the fine tuned model comprises using the pretrained model to create the fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result (e.g., any technique for fine tuning a pre-trained model is covered).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
Regarding the “wherein creating the fine tuned model comprises using the pretrained model to create the fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation attempts to cover a solution to an identified problem with no restriction on how the result is accomplished, or provides no description of the mechanism for accomplishing the result.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 12
Step 2A, Prong 2
Regarding the “further comprising assigning a first plurality of outputs using the fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception.  In particular, the claim only recites the additional element of a fine tuned model.  This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (a fine tuned model).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
Regarding the “further comprising assigning a first plurality of outputs using the fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 13
Step 2A, Prong 2
Regarding the “wherein the fine tuned model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use (specific types of architectures for the pretrained model). As explained by the Supreme Court, a claim directed to a judicial exception cannot be made eligible "simply by having the applicant acquiesce to limiting the reach of the patent for the formula to a particular technological use." Diamond v. Diehr, 450 U.S. 175, 192 n.14, 209 USPQ 1, 10 n. 14 (1981). Thus, limitations that amount to merely indicating a field of use or technological environment in which to apply a judicial exception do not integrate a judicial exception into a practical application.  Moreover, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception.  In particular, the claim only recites the additional element of specific types of architectures for the pretrained model.  This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (specific types of architectures for the pretrained model).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
Regarding the “wherein the fine tuned model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof” limitation, such limitation amounts to no more than generally linking the use of a judicial exception to a particular technological environment or field of use as explained above, which does not amount to significantly more than the judicial exception.  MPEP 2106.05(h).  Moreover, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 14
Step 2A, Prong 1
further comprising evaluating performance of the fine tuned model on the test set, wherein evaluating the performance of the fine tuned model comprises: (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally evaluate the performance of the fine tuned model on the test set, such as mentally evaluating whether the performance is good or bad in the human’s opinion)
mapping the test set onto a fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the fine tuned vector space)
clustering the test set in the fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the fine tuned vector space, and then draw circles around clusters of data points that are close to each other)
quantifying heterogeneity of test set clusters in the fine tuned vector space; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally review clusters in the test set as mapped and graphed onto a x-y axis, and mentally determine the distance from the centroid to the farthest point in the cluster to quantify the difference between the centroid and farthest point)
providing a confidence score for the fine tuned model.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally evaluate the performance of the fine tuned model and mentally generate a confidence score (e.g., from 1 to 10) about how confident the human is in its analysis of the performance of the model)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
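
For illustration only, the pencil-and-paper reading of the “quantifying heterogeneity” limitation given above (distance from a cluster's centroid to its farthest point in a 2-dimensional space) can be sketched as follows. This is a minimal sketch of the examiner's example, not of the applicant's disclosed method; the function names and the sample points are hypothetical.

```python
import math

def centroid(points):
    """Return the arithmetic mean of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def heterogeneity(points):
    """Distance from the centroid to the farthest point in the cluster."""
    c = centroid(points)
    return max(math.dist(c, p) for p in points)

# Hypothetical cluster of three plotted words.
cluster = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
spread = heterogeneity(cluster)  # centroid is (1.0, 1.0); farthest point is (1.0, 3.0)
```

Under this reading, a larger value indicates a more spread-out cluster.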

Regarding Claim 15
Step 2A, Prong 1
further comprising labelling a second plurality of input candidates from the corpus of data, wherein labelling the second plurality of input candidates comprises: (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label a second set of words from the same corpus)
mapping the train set and development set onto the pretrained vector space and a fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pretrained vector space of a pretrained model and a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the pretrained vector space and fine tuned vector space, respectively)
clustering the train set and development set in the pretrained vector space and the fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pretrained vector space of a pretrained model and a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the pretrained vector space and fine tuned vector space, respectively, and then draw circles around clusters of data points that are close to each other)
identifying heterogeneous clusters and singletons in the fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally identify clusters that have diversity (corresponding to “heterogeneous clusters”) and clusters that only have a single data point)
selecting the second plurality of input candidates such that the second plurality of input candidates are nearest to at least one of the heterogeneous clusters and singletons in the fine tuned vector space; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally review heterogeneous clusters and singletons in the fine tuned vector space and select points that are near to such clusters and/or singletons)
labelling the second plurality of input candidates.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label the second plurality of input candidates)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 16
Step 2A, Prong 1
further comprising partitioning the labeled second plurality of input candidates into the train set, the development set, the test set, and the out-of-distribution set, wherein partitioning the labeled second plurality of input candidates comprises: (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally partition input candidates (such as 4 different words), into each of the recited sets (1 in each))
adding labeled cluster centroids from the second plurality of input candidates to the train set; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster centroid to the train set as recited by this limitation)
adding labeled cluster children from the second plurality of input candidates to one of a development set and the test set; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster child to one of the development set and the test set as recited by this limitation)
adding labeled singletons from the second plurality of input candidates to the one of the train set and the out-of-distribution set.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a singleton to one of the train set and the out-of-distribution set as recited by this limitation)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
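
For illustration only, the partitioning recited in claim 16 can be sketched as follows. This is a minimal sketch under stated assumptions: each cluster is represented as a (centroid, children) pair, a singleton is a cluster with no children, children alternate between the development set and the test set, and singletons go to the out-of-distribution set. The claim permits other choices (for example, singletons to the train set); the function name and sample words are hypothetical.

```python
def partition(clusters):
    """Partition labeled clusters into train/dev/test/out-of-distribution sets.

    clusters: list of (centroid, children) pairs; an empty children list
    marks a singleton (assumed representation, not from the application).
    """
    sets = {"train": [], "dev": [], "test": [], "ood": []}
    for centroid, children in clusters:
        if not children:
            # Singleton: assumed here to go to the out-of-distribution set.
            sets["ood"].append(centroid)
            continue
        # Cluster centroid goes to the train set.
        sets["train"].append(centroid)
        # Cluster children alternate between dev and test (assumed policy).
        for i, child in enumerate(children):
            sets["dev" if i % 2 == 0 else "test"].append(child)
    return sets

result = partition([("cat", ["kitten", "feline", "tabby"]), ("quark", [])])
```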

Regarding Claim 17
Step 2A, Prong 1
wherein labelling of the second plurality of input candidates comprises algorithmically labelling the second plurality of input candidates.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label words using an algorithm to determine the label, e.g., if the word starts with a letter from a-m, label it with a “0” and if it starts with a letter from n-z, label it with a “1”)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
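
For illustration only, the example labelling rule given above (first letter a-m is labelled “0”, first letter n-z is labelled “1”) can be written out as follows. This is a sketch of the examiner's example rule only; the function name is hypothetical and the handling of words that do not start with a letter is an assumption.

```python
def label_word(word):
    """Label a word 0 if it starts with a-m, 1 if it starts with n-z."""
    first = word[0].lower()
    if "a" <= first <= "m":
        return 0
    if "n" <= first <= "z":
        return 1
    # Assumption: anything else is outside the rule and is rejected.
    raise ValueError(f"cannot label {word!r}: does not start with a-z")

labels = [label_word(w) for w in ["apple", "model", "neural", "Zebra"]]
```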

Regarding Claim 18
Step 2A, Prong 1
further comprising assigning a confidence score for the labelling of the second plurality of input candidates using a bipartite graph of the pretrained vector space and the fine tuned vector space.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), draw a bipartite graph of the pretrained vector space and the fine tuned vector space, and utilize such graph when determining a confidence score to assign)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 19
Step 2A, Prong 1
evaluating performance of an ensemble of two or more fine tuned models on the test set, wherein evaluating the performance of the ensemble of two or more fine tuned models comprises determining whether the ensemble of two or more fine tuned models concur on an output; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally determine if the ensemble of two or more fine tuned models concur on an output, and if so, evaluate the performance as being good)
mapping the train, development, and test sets onto one or more pairs of pretrained vector spaces and fine tuned vector spaces; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pair of a pretrained vector space of a pretrained model and a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on x-y axes corresponding to the basis of the pair of the pretrained model vector space and the fine tuned vector space, respectively)
assigning a confidence score for each of the two or more fine tuned models using a bipartite graph for each of the one or more pairs of pretrained vector spaces and fine tuned vector spaces.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), draw bipartite graphs for the pairs of the pretrained vector space and the fine tuned vector space, and utilize such graphs when determining a confidence score to assign)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
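
For illustration only, the “determining whether the ensemble of two or more fine tuned models concur on an output” step of claim 19 can be sketched as a simple equality check over the models' outputs. This is a minimal sketch assuming each model's output is already available as a label; the function name and sample labels are hypothetical.

```python
def ensemble_concurs(outputs):
    """Return True when every model in the ensemble produced the same output."""
    if len(outputs) < 2:
        raise ValueError("an ensemble needs two or more model outputs")
    return len(set(outputs)) == 1

agree = ensemble_concurs(["positive", "positive", "positive"])
disagree = ensemble_concurs(["positive", "negative"])
```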

Regarding Claim 20
Step 2A, Prong 1
selecting a third plurality of input candidates from the corpus of data; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally select a third plurality of words from a corpus of data, such as an encyclopedia)
labeling the third plurality of input candidates ... ; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label words (the input candidates))
mapping the third plurality of input candidates onto the pretrained vector space and a fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pretrained vector space of a pretrained model and a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the pretrained vector space and fine tuned vector space, respectively)
clustering the third plurality of input candidates in the pretrained vector space and the fine tuned vector space; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pretrained vector space of a pretrained model and a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on an x-y axis corresponding to the basis of the pretrained vector space and fine tuned vector space, respectively, and then draw circles around clusters of data points that are close to each other)
identifying heterogeneous clusters and singletons in the fine tuned vector space; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally identify clusters that have diversity (corresponding to “heterogeneous clusters”) and clusters that only have a single data point)
assigning a confidence score for the labelling of the third plurality of input candidates using a bipartite graph of the pretrained vector space and the fine tuned vector space.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), draw a bipartite graph of the pretrained vector space and the fine tuned vector space, and utilize such graph when determining a confidence score to assign)

Step 2A, Prong 2
	Regarding the “using the fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception.  In particular, the claim only recites the additional element of a fine tuned model.  This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (a generic fine tuned model).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
	Regarding the “using the fine tuned model” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 21
Step 2A, Prong 1
labeling a third plurality of input candidates ... ; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label a third plurality of input candidates)
determining whether the ensemble of two or more fine tuned models concur on labeling of the third plurality of input candidates; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally determine if the ensemble of two or more fine tuned models concur on labeling of the third plurality of words)
mapping the third plurality of input candidates onto one or more pairs of pretrained vector spaces and fine tuned vector spaces; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally map words onto a pair of a pretrained vector space of a pretrained model and a fine tuned vector space of a fine tuned model, such as if the vector space is 2-dimensional, and mapping words involves plotting such words on x-y axes corresponding to the basis of the pair of the pretrained model vector space and the fine tuned vector space, respectively)
assigning a confidence score for each of the two or more fine tuned models using a bipartite graph for each of the one or more pairs of pretrained vector spaces and fine tuned vector spaces.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), draw bipartite graphs for the pairs of the pretrained vector space and the fine tuned vector space, and utilize such graphs when determining a confidence score to assign)

Step 2A, Prong 2
	Regarding the “using an ensemble of two or more fine tuned models on the third plurality of input candidates” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception.  In particular, the claim only recites the additional element of an ensemble of two or more fine tuned models.  This additional element is recited at a high-level of generality and amounts to no more than mere instructions to apply the exception using a generic computer component (generic fine tuned models in an ensemble).  Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea (See MPEP 2106.05(f)).

Step 2B
	Regarding the “using an ensemble of two or more fine tuned models on the third plurality of input candidates” limitation, such limitation is recited at a high-level of generality and amounts to no more than adding the words “apply it” (or an equivalent) with the judicial exception, because the limitation merely provides instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea.  Accordingly, this additional element does not add significantly more than the judicial exception. (See MPEP 2106.05(f)).

Regarding Claim 22
Step 2A, Prong 1
further comprising selecting a plurality of failed inputs for examination, wherein the plurality of failed inputs are inputs of the third plurality of inputs candidates that have a lowest confidence score; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally input words into the models, and if the confidence scores are lowest, selecting those inputs as “failed inputs”)
selecting a plurality of neighbors of each of the plurality of failed inputs; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally review a vector space plotted on an x-y axis and select neighbors of each of the plurality of failed inputs)
labelling the plurality of neighbors; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label the neighbors)
partitioning the plurality of neighbors onto the train set, the development set, the test set, and the out-of-distribution set.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally partition neighbors (such as 4 different words), into each of the recited sets (1 in each))

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.
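
For illustration only, the first two limitations of claim 22 (selecting the lowest-confidence inputs as “failed inputs” and selecting neighbors of each failed input in a vector space plotted on an x-y axis) can be sketched as follows. This is a minimal sketch assuming 2-dimensional coordinates and Euclidean distance; the function names, scores, and points are hypothetical.

```python
import math

def lowest_confidence(scores, k):
    """Return the k input names with the lowest confidence scores."""
    return sorted(scores, key=scores.get)[:k]

def nearest_neighbors(target, points, n):
    """Return the n names closest to target by Euclidean distance."""
    others = [name for name in points if name != target]
    others.sort(key=lambda name: math.dist(points[target], points[name]))
    return others[:n]

# Hypothetical confidence scores and 2-dimensional positions.
scores = {"a": 0.2, "b": 0.9, "c": 0.95, "d": 0.6}
points = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (5.0, 5.0), "d": (0.0, 2.0)}

failed = lowest_confidence(scores, 1)
neighbors = {name: nearest_neighbors(name, points, 2) for name in failed}
```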

Regarding Claim 23
Step 2A, Prong 1
further comprising selecting a plurality of failed inputs for examination, wherein the plurality of failed inputs are inputs of the third plurality of inputs candidates that have a lowest confidence score; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally input words into the models, and if the confidence scores are lowest, selecting those inputs as “failed inputs”)
selecting a plurality of neighbors of each of the plurality of failed inputs; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally review a vector space plotted on an x-y axis and select neighbors of each of the plurality of failed inputs)
labelling the plurality of neighbors; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally label the neighbors)
partitioning the plurality of neighbors onto the train set, the development set, the test set, and the out-of-distribution set.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally partition neighbors (such as 4 different words), into each of the recited sets (1 in each))

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 24
Step 2A, Prong 1
wherein partitioning the plurality of neighbors comprises: (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally partition neighbors (such as 4 different words), into each of the recited sets (1 in each))
adding labeled cluster centroids from the plurality of neighbors to the train set; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster centroid to the train set as recited by this limitation)
adding labeled cluster children from the plurality of neighbors to one of the development set and the test set; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster child to one of the development set and the test set as recited by this limitation)
adding labeled singletons from the plurality of neighbors to one of the train set and the out-of-distribution set.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a singleton to one of the train set and the out-of-distribution set as recited by this limitation)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 25
Step 2A, Prong 1
wherein partitioning the plurality of neighbors comprises: (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally partition neighbors (such as 4 different words), into each of the recited sets (1 in each))
adding labeled cluster centroids from the plurality of neighbors to the train set; (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster centroid to the train set as recited by this limitation)
adding labeled cluster children from the plurality of neighbors to one of the development set and the test set; and (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a cluster child to one of the development set and the test set as recited by this limitation)
adding labeled singletons from the plurality of neighbors to one of the train set and the out-of-distribution set.  (under the broadest reasonable interpretation, this limitation can be performed by a human mentally (or with pencil and paper), for example, a human can mentally (or using pencil and paper), add a word corresponding to a singleton to one of the train set and the out-of-distribution set as recited by this limitation)

Regarding Step 2A, Prong 2, the claim does not include any additional elements that integrate the judicial exception into a practical application and regarding Step 2B, there are no additional elements recited that amount to significantly more than the judicial exception.

Regarding Claim 26
Step 2A, Prong 1
	Claim 26 recites a system that corresponds to the method of claim 1, and therefore the analysis under Step 2A, Prong 1 with respect to claim 1 also applies to this claim 26.  While claim 26 recites additional generic computing components (“non-transitory memory”, “one or more hardware processors”, and “instructions”), such additional generic computing components do not change the analysis under Step 2A, Prong 1.

Step 2A, Prong 2
	Claim 26 recites a system that corresponds to the method of claim 1, and therefore the analysis under Step 2A, Prong 2 with respect to claim 1 also applies to this claim 26.  While claim 26 recites additional generic computing components (“non-transitory memory”, “one or more hardware processors”, and “instructions”), such additional generic computing components do not change the analysis under Step 2A, Prong 2.

Step 2B
	Claim 26 recites a system that corresponds to the method of claim 1, and therefore the analysis under Step 2B with respect to claim 1 also applies to this claim 26.  While claim 26 recites additional generic computing components (“non-transitory memory”, “one or more hardware processors”, and “instructions”), such additional generic computing components do not change the analysis under Step 2B.
	Claim 27 depends from claim 26 and claims a system that corresponds to the method of claim 9, and is therefore rejected for the same reasons explained above with respect to claims 9 and 26.
Claim 28 depends from claim 27 and claims a system that corresponds to the method of claim 10, and is therefore rejected for the same reasons explained above with respect to claims 10 and 27.
Claim 29 depends from claim 28 and claims a system that corresponds to the method of claim 14, and is therefore rejected for the same reasons explained above with respect to claims 14 and 28.
Claim 30 depends from claim 28 and claims a system that corresponds to the method of claim 15, and is therefore rejected for the same reasons explained above with respect to claims 15 and 28.
Claim 31 depends from claim 28 and claims a system that corresponds to the method of claim 19, and is therefore rejected for the same reasons explained above with respect to claims 19 and 28.
Claim 32 depends from claim 28 and claims a system that corresponds to the method of claim 20, and is therefore rejected for the same reasons explained above with respect to claims 20 and 28.
Claim 33 depends from claim 28 and claims a system that corresponds to the method of claim 21, and is therefore rejected for the same reasons explained above with respect to claims 21 and 28.
Claim 34 depends from claim 32 and claims a system that corresponds to the method of claim 22, and is therefore rejected for the same reasons explained above with respect to claims 22 and 32.
Claim 35 depends from claim 33 and claims a system that corresponds to the method of claim 23, and is therefore rejected for the same reasons explained above with respect to claims 23 and 33.

Regarding Claim 36
Step 2A, Prong 1
	Claim 36 recites a non-transitory computer-readable medium that corresponds to the method of claim 1, and therefore the analysis under Step 2A, Prong 1 with respect to claim 1 also applies to this claim 36.  While claim 36 recites additional generic computing components (“non-transitory computer-readable medium”, “one or more hardware processors”, and “instructions”), such additional generic computing components do not change the analysis under Step 2A, Prong 1.

Step 2A, Prong 2
	Claim 36 recites a non-transitory computer-readable medium that corresponds to the method of claim 1, and therefore the analysis under Step 2A, Prong 2 with respect to claim 1 also applies to this claim 36.  While claim 36 recites additional generic computing components (“non-transitory computer-readable medium”, “one or more hardware processors”, and “instructions”), such additional generic computing components do not change the analysis under Step 2A, Prong 2.

Step 2B
	Claim 36 recites a non-transitory computer-readable medium that corresponds to the method of claim 1, and therefore the analysis under Step 2B with respect to claim 1 also applies to this claim 36.  While claim 36 recites additional generic computing components (“non-transitory computer-readable medium”, “one or more hardware processors”, and “instructions”), such additional generic computing components do not change the analysis under Step 2B.

	Claim 37 depends from claim 36 and claims a non-transitory computer-readable medium that corresponds to the method of claim 9, and is therefore rejected for the same reasons explained above with respect to claims 9 and 36.
Claim 38 depends from claim 37 and claims a non-transitory computer-readable medium that corresponds to the method of claim 10, and is therefore rejected for the same reasons explained above with respect to claims 10 and 37.
Claim 39 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 14, and is therefore rejected for the same reasons explained above with respect to claims 14 and 38.
Claim 40 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 15, and is therefore rejected for the same reasons explained above with respect to claims 15 and 38.
Claim 41 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 19, and is therefore rejected for the same reasons explained above with respect to claims 19 and 38.
Claim 42 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 20, and is therefore rejected for the same reasons explained above with respect to claims 20 and 38.
Claim 43 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 21, and is therefore rejected for the same reasons explained above with respect to claims 21 and 38.
Claim 44 depends from claim 42 and claims a non-transitory computer-readable medium that corresponds to the method of claim 22, and is therefore rejected for the same reasons explained above with respect to claims 22 and 42.
Claim 45 depends from claim 43 and claims a non-transitory computer-readable medium that corresponds to the method of claim 23, and is therefore rejected for the same reasons explained above with respect to claims 23 and 43.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-4, 26, and 36 are rejected under 35 U.S.C. 103 as being unpatentable over US 20210287261 A1, hereinafter referenced as MEDALION, in view of US 20220019730 A1, hereinafter referenced as SHARMA, and further in view of US 20210366106 A1, hereinafter referenced as YAO.

Regarding Claim 1
	MEDALION teaches:
A method comprising: (MEDALION, para. 0016: “Embodiments of the present disclosure relate to various systems and methods that may predict a business' category based e.g., on a given business description or an associated vendor.”)
selecting a first plurality of input candidates from a corpus of data; (MEDALION, para. 0046: “FIG. 6 is a flow diagram showing a process 600 that may occur within the system 100 of FIG. 1, according to an embodiment of the present disclosure. In some embodiments, process 600 may be used to generate a list of factors associated with businesses registered with a business accounting software. The factors associated with the business' success may be related to the products and/or services offered by the business and the format of which those products and/or services are offered by the business. The factors may also be related to the products and/or services purchased by the business from a vendor and the format of which those products and/or services are purchased from the vendor. At block 602, retrieved according to standard text extraction techniques including, but not limited to, OCR techniques. In one embodiment, the factors may be all factors associated with users or businesses of accounting software stored in database 122.”;
Examiner’s Note: factors are extracted from text (corresponding to recited “first plurality of input candidates”) from database 122 (corresponding to recited “corpus of data”))
mapping the first plurality of input candidates onto a pretrained vector space of a pretrained model; (MEDALION, para. 0026: “Embedding module 110 may be configured to embed text to vector form within a continuous vector space. In some embodiments, embedding module 110 may convert business-related text into a merchant vector within a continuous vector space. In some embodiments, a word2vec model may be used to convert text to the vector space. The word2vec model may be pre-trained”;
MEDALION, para. 0047: “At block 604, embedding module 110 may embed the retrieved factors to a vector space. In some embodiments, the processing of block 604 may include some operations similar to or the same as described in relation to embedding module 110 in the context of FIG. 1. Embedding module may apply a word2vec algorithm using a CBOW approach to generate a vector representation of the business factors.”
Examiner’s Note: the word2vec model corresponds to the recited “pretrained model”, where the embedding module 110 embeds the factors to the vector space of word2vec (corresponding to recited “pretrained vector space of a pretrained model”))
clustering the first plurality of input candidates in the pretrained vector space; (MEDALION, para. 0029: “For example, the clustering module 112 may be configured to generate clusters of merchant vectors within a vector space. In some embodiments, clustering module 112 may also generate common vendor vectors based on clusters of invoice vectors”
MEDALION, para. 0030: “In some embodiments, clustering module 112 may be operable to implement a variety of clustering techniques, such as k-means, affinity propagation, spectral clustering, hierarchical clustering, density-based spatial cluster of applications with noise (DBSCAN), OPTICS, Gaussian mixture modeling, or Birch. For example, k-means clustering techniques may separate samples into a pre-defined number of groups of equal variance. For a k-means algorithm, the centroids of each cluster (e.g., the central point of each business category in the vector space) is chosen ahead of time.”
MEDALION, para. 0048: “At block 606, clustering module 112 may cluster the plurality of factor vectors. In some embodiments, the clustering module 112 may cluster vectors received from a word2vec embedding followed by an LSTM layer, such as in framework 500. In some embodiments, clustering module 112 may form clusters in the vector space based on the factor vectors according to a mean-shift clustering algorithm.”;
Examiner’s Note: With respect to Fig. 6, the factors are clustered in the word2vec vector space)

However, MEDALION fails to explicitly teach:
adding the first plurality of input candidates to a plurality of queues for labelling
labelling the first plurality of input candidates
capturing disagreement between a plurality of labeling agents on labeling of the first plurality of input candidates as an uncertainty measure
quantifying model uncertainty, using the uncertainty measure as test time, on the first plurality of input candidates.

However, in a related field of endeavor (annotating data for machine learning and AI systems, see paras. 0003-0005), SHARMA teaches:
adding the first plurality of input candidates to a plurality of queues for labelling (SHARMA, para. 0025: “The queue table shows the labeling queue, which consists of the following in an example embodiment: 1) unlabeled assets, and 2) assets that had labels, but were deleted because they needed to be relabeled. Assets in the queue are distributed among the registered labelers unless the asset is specifically reserved (indicated by a “Reserved by” field). A reserved asset will become unreserved if it is not labeled within 90 minutes of being reserved. A performance tab metric is where a user can view the average metrics across all labelers or drill down into individual performance for label time or review time.”;
SHARMA, para. 0081: “The example embodiment can be configured for: registering a plurality of labelers to which annotation tasks are assigned (processing block 1010); populating a labeling queue with content data to be annotated (processing block 1020); assigning annotation tasks from the labeling queue to the plurality of labelers (processing block 1030)”;
Examiner’s Note: the MEDALION-SHARMA combination now modifies the system of MEDALION so that the factors extracted from text (corresponding to recited “first plurality of input candidates”) are now put into the labelling queues of SHARMA (1 queue per labeler) for labeling)
labelling the first plurality of input candidates.  (SHARMA, para. 0052: “An example embodiment of the automated content labeling platform provides important tools to facilitate the automation of the asset labeling process. In particular, the platform provides: a model-assisted labeling workflow, a real-time human-in-the-loop labeling workflow, and an automated labeling queue system.”
SHARMA, para. 0055: “The predicted labels can facilitate and improve both an automated labeling workflow and a manual labeling workflow.”
SHARMA, para. 0080: “To label the text data, the user can: select the tool from the left sidebar; and highlight the text to assign an entity (must be in this order).”;
Examiner’s Note: the MEDALION-SHARMA combination now modifies the system of MEDALION so that the factors extracted from text (corresponding to recited “first plurality of input candidates”) are now put into the labelling queues of SHARMA (1 queue per labeler) for human labelers to use a labeling editor to apply labels to each of the factors, or for an automated labeling flow)

Before the effective filing date of the present application, it would have been obvious to modify the text processing system of MEDALION with the teachings of SHARMA as explained above.  As disclosed by SHARMA, one of ordinary skill would have been motivated to produce “good quality training data for an AI system” that has good labels.  (para. 0005).

	However, MEDALION and SHARMA fail to explicitly teach:
capturing disagreement between a plurality of labeling agents on labeling of the first plurality of input candidates as an uncertainty measure 
quantifying model uncertainty, using the uncertainty measure as test time, on the first plurality of input candidates.

	However, in a related field of endeavor (data annotation, see para. 0031), YAO teaches:
capturing disagreement between a plurality of labeling agents on labeling of the first plurality of input candidates as an uncertainty measure (YAO, para. 0042: “The medical scan annotator system 106 can be used to gather annotations of medical scans based on review of the medical scan image data by users of the system such as radiologists or other medical professionals. Medical scans that require annotation, for example, that have been triaged from a hospital or other triaging entity, can be sent to multiple users selected by the medical scan annotator system 106, and the annotations received from the multiple medical professionals can be processed automatically by a processing system of the medical scan annotator system, allowing the medical scan annotator system to automatically determine a consensus annotation of each medical scan.”;
YAO, para. 0043: “Annotation similarity data can be generated by comparing the first annotation data to the second annotation data, and consensus annotation data can be generated based on the first annotation data and the second annotation data in response to the annotation similarity data indicating that the difference between the first annotation data and the second annotation data compares favorably to an annotation discrepancy threshold. The consensus annotation data can be mapped to the medical scan in a medical scan database.”;
Examiner’s Note: YAO discloses that a plurality of medical professionals (corresponding to recited “plurality of labeling agents”) annotate medical scans, and consensus (or lack thereof) is determined as annotation similarity data, which can be compared to a threshold (corresponding to recited “capturing disagreement”); the MEDALION-SHARMA-YAO combination now modifies the system of MEDALION so that the consensus annotation data and annotation similarity data of YAO are also tracked)
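Illustration (not drawn from YAO or from the application): the comparison of annotations from multiple labeling agents against a discrepancy threshold, as discussed in the note above, might be sketched as follows. The function names, the majority-vote rule, and the threshold value of 0.34 are hypothetical choices made only for this sketch; YAO does not specify them.

```python
from collections import Counter


def disagreement(labels):
    """Fraction of annotators whose label differs from the most common label.

    Assumption for this sketch: disagreement is measured against a simple
    majority vote. 0.0 means every annotator agreed.
    """
    top_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - top_count / len(labels)


def consensus_or_none(labels, discrepancy_threshold=0.34):
    """Return the majority label if disagreement is within the threshold.

    The threshold value is hypothetical. None signals that no consensus
    annotation could be formed for this item.
    """
    if disagreement(labels) <= discrepancy_threshold:
        return Counter(labels).most_common(1)[0][0]
    return None


# Three annotators label the same item.
unanimous = ["normal", "normal", "normal"]
two_to_one = ["normal", "normal", "abnormal"]
all_differ = ["normal", "abnormal", "unclear"]

scores = [disagreement(x) for x in (unanimous, two_to_one, all_differ)]
decisions = [consensus_or_none(x) for x in (unanimous, two_to_one, all_differ)]
```

In this sketch the per-item disagreement score is the quantity that could serve as an uncertainty measure, while the thresholded decision plays the role of the consensus annotation.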
quantifying model uncertainty, using the uncertainty measure as test time, on the first plurality of input candidates. (YAO, para. 0342: “In various embodiments, the consensus function compares the confidence data to a confidence threshold. The retroactive discrepancy notification can include the confidence data. The automated assessment data can include a binary normality decision and an automated confidence score and the consensus function can compare the confidence data to the automated confidence score. The human assessment data can further include severity data associated with a medical condition severity indicated by the medical report and wherein the consensus function compares the severity data to a severity threshold.”;
YAO, para. 0358: “A computer vision model, such as any of the inference functions 1-K of the medical scan annotating system 2612, is generated by training on the plurality of medical scans and the plurality of medical labels, wherein a model confidence generated by the computer vision model is calibrated based on the plurality of confidence scores...”;
Examiner’s Note: YAO discloses determining a model confidence score (corresponding to recited “quantify model uncertainty”), where such model confidence score is based on the consensus data at the time the model is being tested for a score; the MEDALION-SHARMA-YAO combination now modifies the system of MEDALION so that the consensus annotation data and annotation similarity data of YAO are utilized to determine a model confidence score, at the time the model is tested for such confidence score, as in YAO)

Before the effective filing date of the present application, it would have been obvious to modify the text processing system of MEDALION with the teachings of SHARMA and YAO as explained above.  As disclosed by YAO, one of ordinary skill would have been motivated to do so in order to determine which annotators are better than others, such that those annotators' scores are given more weight. (para. 0042).

Regarding Claim 2
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION fails to explicitly teach:
wherein labelling the first plurality of input candidates is performed by humans.  

However, in a related field of endeavor (annotating data for machine learning and AI systems, see paras. 0003-0005), SHARMA teaches:
wherein labelling the first plurality of input candidates is performed by humans.  (SHARMA, para. 0052: “An example embodiment of the automated content labeling platform provides important tools to facilitate the automation of the asset labeling process. In particular, the platform provides: a model-assisted labeling workflow, a real-time human-in-the-loop labeling workflow, and an automated labeling queue system.”
SHARMA, para. 0055: “The predicted labels can facilitate and improve both an automated labeling workflow and a manual labeling workflow.”
SHARMA, para. 0080: “To label the text data, the user can: select the tool from the left sidebar; and highlight the text to assign an entity (must be in this order).”;
Examiner’s Note: the MEDALION-SHARMA-YAO combination now modifies the system of MEDALION so that the factors extracted from text (corresponding to recited “first plurality of input candidates”) are now put into the labelling queues of SHARMA (1 queue per labeler) for human labelers to use a labeling editor to apply labels to each of the factors)

Before the effective filing date of the present application, it would have been obvious to modify the text processing system of MEDALION with the teachings of SHARMA and YAO as explained above.  As disclosed by SHARMA, one of ordinary skill would have been motivated to produce “good quality training data for an AI system” that has good labels.  (para. 0005).

Regarding Claim 3
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION fails to explicitly teach:
wherein labelling the first plurality of input candidates is performed algorithmically.  

However, in a related field of endeavor (annotating data for machine learning and AI systems, see paras. 0003-0005), SHARMA teaches:
wherein labelling the first plurality of input candidates is performed algorithmically.  (SHARMA, para. 0043: “Once an asset with a benchmark label gets a human- or computer-generated label, a benchmark score can be automatically calculated.”
SHARMA, para. 0055: “The predicted labels can facilitate and improve both an automated labeling workflow and a manual labeling workflow.”)

Before the effective filing date of the present application, it would have been obvious to modify the text processing system of MEDALION with the teachings of SHARMA with respect to labeling queues for labeling data.  As disclosed by SHARMA, one of ordinary skill would have been motivated to produce “good quality training data for an AI system” that has good labels.  (para. 0005).

Regarding Claim 4
	MEDALION, SHARMA, and YAO teach the method of claim 1.  MEDALION further teaches:
wherein labelling comprises identifying cluster centroids in the pretrained vector space. (MEDALION, para. 0030: “In some embodiments, clustering module 112 may be operable to implement a variety of clustering techniques, .... For example, k-means clustering techniques may separate samples into a pre-defined number of groups of equal variance. For a k-means algorithm, the centroids of each cluster (e.g., the central point of each business category in the vector space) is chosen ahead of time. The algorithm may assign each sample (e.g., each merchant vector or vendor vector) to its nearest centroid, create new centroids/categories by taking the mean value of all the samples, and compute the differences between the old and new centroids. The algorithm may repeat these steps until the difference value is below a certain pre-defined threshold.”;
Examiner’s Note: the MEDALION-SHARMA-YAO combination now modifies the system of MEDALION so that the factors extracted from text (corresponding to recited “first plurality of input candidates”) are now put into the labelling queues of SHARMA (1 queue per labeler) for labeling with respect to the chosen centroid of MEDALION.)
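Illustration (not drawn from MEDALION or from the application): the k-means steps quoted from MEDALION above (assign each sample to its nearest centroid, recompute each centroid as the mean of its assigned samples, and repeat until the centroid change falls below a threshold) might be sketched as follows. The function name, the Euclidean distance metric, the default threshold, and the toy vectors are assumptions made only for this sketch.

```python
import math


def kmeans(samples, centroids, threshold=1e-6, max_iter=100):
    """Iterate assign / re-average / compare until centroids stop moving.

    Assumptions for this sketch: Euclidean distance, caller-supplied
    initial centroids, and a hypothetical default threshold.
    """
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(max_iter):
        # Assign each sample to the index of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(s, centroids[k]))
            for s in samples
        ]
        # Recompute each centroid as the mean of its assigned samples;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [s for s, a in zip(samples, assignments) if a == k]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)
        # Compare old and new centroids; stop once the largest shift is small.
        shift = max(math.dist(o, n) for o, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < threshold:
            break
    return centroids, assignments


# Toy two-dimensional vectors standing in for embedded factors.
vectors = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
final_centroids, labels = kmeans(vectors, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The returned centroids are the points that a labeling step could then target; how labels are attached to them is outside this sketch.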

Before the effective filing date of the present application, it would have been obvious to modify the text processing system of MEDALION (which includes choosing centroids of clusters) with the teachings of SHARMA with respect to labeling queues for labeling data.  As disclosed by SHARMA, one of ordinary skill would have been motivated to produce “good quality training data for an AI system” that has good labels.  (para. 0005).

Regarding Claim 26
	MEDALION teaches:
A system comprising: (MEDALION, para. 0094: “The example computer system, mobile computing system, and/or communication system 700”)
a non-transitory memory; and (MEDALION, para. 0095: “The memory 704 can represent a machine-readable medium on which is stored one or more sets of instructions, software, firmware, or other processing logic (e.g., logic 708) embodying any one or more of the methodologies or functions described and/or claimed herein. ... While the machine-readable medium of an example embodiment can be a single medium, the term “machine-readable medium” should be taken to include a single non-transitory medium or multiple non-transitory media (e.g., a centralized or distributed database, and/or associated caches and computing systems) that stores the one or more sets of instructions.”)
one or more hardware processors configured to read instructions from the non-transitory memory that, when executed cause the one or more hardware processors to perform operations comprising: (MEDALION, para. 0095: “The logic 708, or a portion thereof, may also reside, completely or at least partially within the processor 702 during execution thereof by the computer system, mobile computing system, and/or communication system 700.”)
	The remaining limitations of claim 26 correspond to the method of claim 1 and are therefore rejected for the same reasons explained above with respect to claim 1.

Regarding Claim 36
	MEDALION teaches:
A non-transitory computer-readable medium storing instructions that, (MEDALION, para. 0095: “The memory 704 can represent a machine-readable medium on which is stored one or more sets of instructions, software, firmware, or other processing logic (e.g., logic 708) embodying any one or more of the methodologies or functions described and/or claimed herein. ... While the machine-readable medium of an example embodiment can be a single medium, the term “machine-readable medium” should be taken to include a single non-transitory medium or multiple non-transitory media (e.g., a centralized or distributed database, and/or associated caches and computing systems) that stores the one or more sets of instructions.”) when executed by one or more hardware processors, cause the one or more hardware processors to perform operations comprising: (MEDALION, para. 0095: “The logic 708, or a portion thereof, may also reside, completely or at least partially within the processor 702 during execution thereof by the computer system, mobile computing system, and/or communication system 700.”)
The remaining limitations of claim 36 correspond to the method of claim 1 and are therefore rejected for the same reasons explained above with respect to claim 1.

Claims 5 and 8 are rejected under 35 U.S.C. 103 as being unpatentable over MEDALION in view of SHARMA and YAO and further in view of US 20200334410 A1, hereinafter referenced as YEREBAKAN.

Regarding Claim 5
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION, SHARMA, and YAO fail to explicitly teach:
wherein the pretrained vector space is created by mapping input to sparse/dense distributed representations.  

However, in a related field of endeavor (encoding text for text analysis, see para. 0002), YEREBAKAN teaches:
wherein the pretrained vector space is created by mapping input to sparse/dense distributed representations.  (YEREBAKAN, para. 0033: “Word embeddings may be mappings of individual words or phrases of the vocabulary onto real-valued vectors representative thereof in a multidimensional vector space. Each vector may be a dense distributed representation of the word in the vector space. Word-embeddings may be learned/generated to provide that words or phrases that have a similar meaning have a similar representation in vector space.”; 
Examiner’s Note: the MEDALION-SHARMA-YAO-YEREBAKAN combination now modifies the system of MEDALION so that the factors extracted from text are now mapped to a pretrained vector space created using the dense distributed representation teachings of YEREBAKAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, and YEREBAKAN as explained above.  One of ordinary skill would have been motivated to do so in order to conserve storage resources by using dense distributed representations that have relatively few non-zero values.

Regarding Claim 8
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION, SHARMA, and YAO fail to explicitly teach:
wherein the pretrained model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof. 

However, in a related field of endeavor (encoding text for text analysis, see para. 0002), YEREBAKAN teaches:
wherein the pretrained model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof. (YEREBAKAN, para. 0034: “For example, the training may be implemented using a Recurrent Neural Network (RNN) architecture, in which an internal memory may be used to process arbitrary sequences of inputs. For example, the training may be implemented using a Long Short-Term Memory (LSTM) based Recurrent Neural Network (RNN) architecture, for example comprising one or more LSTM cells for remembering values over arbitrary time intervals, and/or for example comprising gated recurrent units (GRU). The training may be implemented using a convolutional neural network (CNN). Other suitable neural networks may be used. The word embeddings may be learned using various techniques, for example Word2vec, Glove, Fasttext or similar.”
Examiner’s Note: the MEDALION-SHARMA-YAO-YEREBAKAN combination now modifies the system of MEDALION so that the factors extracted from text are now mapped to a pretrained vector space, where such pretrained vector space corresponds to a pretrained RNN or CNN as in YEREBAKAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, and YEREBAKAN as explained above.  As disclosed by YEREBAKAN, one of ordinary skill would have been motivated to do so because YEREBAKAN discloses text analysis methods that “provide an increase in accuracy.” (para. 0073). One of ordinary skill would have been motivated to do so because neural networks are known to be efficient and powerful types of machine learning models.

Claim 6 is rejected under 35 U.S.C. 103 as being unpatentable over MEDALION in view of SHARMA and YAO and further in view of US 20200125996 A1, hereinafter referenced as PAPARAJU.

Regarding Claim 6
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION, SHARMA, and YAO fail to explicitly teach:
wherein the pretrained vector space comprises learned parameters of a probability distribution.  

	However, in a related field of endeavor (machine-learning-based tools, see para. 0002), PAPARAJU teaches:
wherein the pretrained vector space comprises learned parameters of a probability distribution.  (PAPARAJU, para. 0012: “In some examples, the initial vector space is sampled to produce the probability distribution whose parameters are learned using a variational autoencoder and the collaborative data includes user vectors corresponding to the software dependencies.”; 
PAPARAJU, para. 0024: “At block 408, processing device 104 applies variational autoencoder 180 to the initial vector space to learn the parameters for a probability distribution and sample from the probability distribution representative vectors in the latent vector space.”
Examiner’s Note: the MEDALION-SHARMA-YAO-PAPARAJU combination now modifies the system of MEDALION so that the factors extracted from text are now mapped to a pretrained vector space, and where probability distribution parameters are learned and also mapped to the same vector space)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, and PAPARAJU as explained above.  As disclosed by PAPARAJU, one of ordinary skill would have been motivated to do so because PAPARAJU teaches that such distribution representative vectors can be used to reduce dimensionality.  (para. 0024). 

Claim 7 is rejected under 35 U.S.C. 103 as being unpatentable over MEDALION in view of SHARMA and YAO and further in view of Huang, Chin-Wei, et al. "Learnable explicit density for continuous latent space and variational inference." arXiv preprint arXiv:1710.02248 (2017), hereinafter referenced as HUANG.

Regarding Claim 7
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION, SHARMA, and YAO fail to explicitly teach:
the pretrained vector space is learned by performing density estimation.  

	However, in a related field of endeavor (machine learning with respect to continuous latent spaces, see p. 1, section 1), HUANG teaches:
the pretrained vector space is learned by performing density estimation.  (HUANG, p. 3, section 4: “As suggested in sections 2 and 3, we propose to use one-to-one correspondence to define a learnable explicit density (LED) model for both inference and sample generation. First, inspired by (4), we found that updating the prior alone is reminiscent of MLE. One can think of data points projected onto the latent space via Monte Carlo sampling as a data distribution”;
HUANG, p. 4, section 6: “In this paper, we first reinterpret training with the variational lower bound as layer-wise density estimation. Treating the Monte Carlo samples from the approximate posterior distributions as projected data distribution suggests using a flexible prior to avoid overestimate of entropy”
Examiner’s Note: HUANG teaches training a machine learning model using layer-wise density estimation, which means that the vector space learned by the model is learned using such layer-wise density estimation; the MEDALION-SHARMA-YAO-HUANG combination now modifies the system of MEDALION so that the factors extracted from text are now mapped to a pretrained vector space, and where such pretrained vector space is learned using layer-wise density estimation as in HUANG)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, and HUANG as explained above.  As disclosed by HUANG, one of ordinary skill would have been motivated to do so because HUANG teaches that such layer-wise density estimation avoids overestimating entropy.  (p. 4, section 6).

Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over MEDALION in view of SHARMA and YAO, and further in view of Sundar, Vijaya Kumar, et al. "Out-of-distribution detection in multi-label datasets using latent space of β-vae." 2020 IEEE Security and Privacy Workshops (SPW). IEEE, 2020, hereinafter referenced as SUNDAR, and further in view of US 20150286707 A1, hereinafter referenced as LEVITAN.

Regarding Claim 9
	MEDALION, SHARMA, and YAO teach the method of claim 1.  However, MEDALION, SHARMA, and YAO fail to explicitly teach:
further comprising partitioning the labeled first plurality of input candidates into a train set, a development set, a test set, and an out-of-distribution set, wherein partitioning the labeled first plurality of input candidates comprises: 
adding labeled cluster centroids in the pretrained vector space from the first plurality of input candidates to the train set; 
adding labeled cluster children in the pretrained vector space from the first plurality of input candidates to one of the development set and the test set; and 
adding labeled singletons in the pretrained vector space from the first plurality of input candidates to one of the train set and the out-of-distribution set.  

	However, in a related field of endeavor (out-of-distribution detection for autoencoders, see p. 250, section I), SUNDAR teaches:
further comprising partitioning the labeled first plurality of input candidates into a train set, a development set, a test set, and an out-of-distribution set, wherein partitioning the labeled first plurality of input candidates comprises: (SUNDAR, p. 252, section II.B: “As shown in Fig. 2-b, we split the partition data into training, validation and testing subsets. The training set has images of one generative factor value (e.g. day). The validation set has images from the generative factor with values leading to OOD (e.g. night, evening). The test set has a mix of images from all values of generative factors. However, the test images are not used during training or validation.”;
Examiner’s Note: SUNDAR teaches splitting data into a training set (corresponding to recited “train set”), a test set (corresponding to recited “test set”), and a validation set split into two parts, the first having images from the generative factor (corresponding to recited “development set”), and a second portion having images leading to OOD (this second partition corresponding to recited “out-of-distribution set”); the MEDALION-SHARMA-YAO-SUNDAR combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR)
adding labeled cluster centroids in the pretrained vector space from the first plurality of input candidates to the train set; (SUNDAR, p. 252, section II.B: “As shown in Fig. 2-b, we split the partition data into training, validation and testing subsets. The training set has images of one generative factor value (e.g. day). The validation set has images from the generative factor with values leading to OOD (e.g. night, evening). The test set has a mix of images from all values of generative factors. However, the test images are not used during training or validation.”;
Examiner’s Note: SUNDAR teaches having a specific type of data in the training set; the MEDALION-SHARMA-YAO-SUNDAR combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the selected centroids of MEDALION (see para. 0030) are specifically added to the train set as the available data is partitioned as in SUNDAR)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, and SUNDAR as explained above.  As disclosed by SUNDAR, one of ordinary skill would have been motivated to do so because SUNDAR teaches that its partition approach “requires less computational resource compared to the other OOD detection methods we mentioned.”  (p. 251, section I).  One of ordinary skill would further understand the benefit of partitioning data between different phases of machine learning so that the trained model is validated and tested using data that was not used for training at all, to determine how well the trained model handles new data.

	However, MEDALION, SHARMA, YAO, and SUNDAR fail to explicitly teach:
adding labeled cluster children in the pretrained vector space from the first plurality of input candidates to one of the development set and the test set; 
adding labeled singletons in the pretrained vector space from the first plurality of input candidates to one of the train set and the out-of-distribution set.  

	However, in a related field of endeavor (data clustering using “cluster feature (CF)-tree based hierarchical clustering”, see para. 0004), LEVITAN teaches:
adding labeled cluster children in the pretrained vector space from the first plurality of input candidates to one of the development set and the test set; (LEVITAN, para. 0116: “A CF-tree is a very compact summary of dataset in the way that each entry (leaf entry) in the leaf node is a sub-cluster, which absorbs the data cases that are close together, as measured by the tightness index η’ and controlled by a specific threshold value T. The CF-tree is built dynamically as new data case is inserted, it is used to guide to a new insertion into the correct sub-cluster for clustering purpose. The CF-tree is a height-balanced tree with four parameters: 1) the branching factor B for the non-leaf nodes, (The branching factor B is the maximum number of entries that a non-leaf node can hold. A non-leaf entry is of the form [CFi, childi], i=1, . . . , B, in which childi is a pointer to its ith child node and CFi is the cluster feature of the sub-cluster represented by this child.);”;
Examiner’s Note: LEVITAN teaches a CF-tree based hierarchical clustering, where clusters have cluster children; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the cluster children of LEVITAN are specifically added to one of the development set and the test set as the available data is partitioned as in SUNDAR)
adding labeled singletons in the pretrained vector space from the first plurality of input candidates to one of the train set and the out-of-distribution set.  (LEVITAN, para. 0003: “Agglomerative clustering starts with a singleton cluster (i.e. a cluster that contains one data case only) and proceeds by successively merging that cluster with other clusters”;
LEVITAN, para. 0179: “Notice that the distance between the cluster center and a cluster Cj is computed by considering the center of cluster Cs as a singleton cluster Cs′.”
Examiner’s Note: LEVITAN teaches a CF-tree based hierarchical clustering, where clusters having a single data point are called singletons; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the singleton clusters of LEVITAN are specifically added to one of the train set and the out-of-distribution set as the available data is partitioned as in SUNDAR)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, and LEVITAN as explained above. As disclosed by LEVITAN, one of ordinary skill would have been motivated to do so because LEVITAN teaches that its clustering solution results in an “enhanced set of evaluation and diagnostic features enabling insight, interactivity, and an improved overall user experience.” (para. 0086).  One of ordinary skill would further understand the benefit of organizing data as children clusters and singletons, for example, to use children clusters to represent sub-groups and singletons as outliers or anomalies.
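
For illustration only, the partitioning recited in claim 9 can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the "centroid" is assumed to be the cluster member closest to the cluster mean, children are alternated between the development set and the test set, and singletons are sent to the out-of-distribution set (the claim also permits the train set). The function name partition_by_cluster_role and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def partition_by_cluster_role(vectors, cluster_ids):
    """Assign each labeled item to a split based on its role within its cluster.

    vectors: list of equal-length lists of floats (positions in the vector space).
    cluster_ids: one cluster id per vector, produced by any clustering step.
    Returns a dict mapping split name -> list of item indices.
    """
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    splits = {"train": [], "dev": [], "test": [], "ood": []}
    for cid in sorted(members):
        idxs = members[cid]
        if len(idxs) == 1:
            # Singleton: this sketch chooses the out-of-distribution set.
            splits["ood"].append(idxs[0])
            continue
        dim = len(vectors[idxs[0]])
        mean = [sum(vectors[i][d] for i in idxs) / len(idxs) for d in range(dim)]
        # Assumption: the member nearest the cluster mean acts as the centroid.
        centroid = min(idxs, key=lambda i: math.dist(vectors[i], mean))
        splits["train"].append(centroid)
        children = [i for i in idxs if i != centroid]
        # Assumption: children alternate between dev and test.
        for k, i in enumerate(children):
            splits["dev" if k % 2 == 0 else "test"].append(i)
    return splits


# Hypothetical data: one 3-member cluster, one singleton, one 2-member cluster.
vectors = [[0, 0], [1, 0], [2, 0], [10, 10], [5, 5], [5, 6]]
cluster_ids = [0, 0, 0, 1, 2, 2]
splits = partition_by_cluster_role(vectors, cluster_ids)
```

Under these assumptions every cluster contributes exactly one item to the train set, and no item appears in more than one split.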

Claims 10-17, 27-30, and 37-40 are rejected under 35 U.S.C. 103 as being unpatentable over MEDALION in view of SHARMA, YAO, SUNDAR, and LEVITAN, and further in view of US 20230334045 A1, hereinafter referenced as BERGMAN.

Regarding Claim 10
	MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN teach the method of claim 9.  However, MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN fail to explicitly teach:
further comprising creating a fine tuned model.  

However, in a related field of endeavor (natural language processing models, see paras. 0039-0040), BERGMAN teaches:
further comprising creating a fine tuned model.  (BERGMAN, para. 0060: “In some implementations, the first model is a bidirectional encoder representations from transformers (BERT) model. The first model may be initially pre-trained on a large corpus of language data, such as Wikipedia. In some implementations, the operations executed at 304 are for further training (i.e., fine-tuning) the first model to train the first model on the information from the received training dataset for a specific classifier.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION using the dataset splits of SUNDAR and the CF-tree hierarchical clustering of LEVITAN, where fine-tuning is performed as in BERGMAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by BERGMAN, one of ordinary skill would have been motivated to do so because BERGMAN teaches that fine-tuning a pretrained model, rather than re-training a brand new model, “needs less manually-evaluated, human rating data, which is expensive to collect.” (para. 0081).  One of ordinary skill would further understand the benefit of fine-tuning a model that is known to be robust and accurate (such as BERT) and fine-tuning for a particular purpose, as disclosed by BERGMAN.

Regarding Claim 11
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 10.  However, MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN fail to explicitly teach:
wherein creating the fine tuned model comprises using the pretrained model to create the fine tuned model. 

However, in a related field of endeavor (natural language processing models, see paras. 0039-0040), BERGMAN teaches:
wherein creating the fine tuned model comprises using the pretrained model to create the fine tuned model.  (BERGMAN, para. 0060: “In some implementations, the first model is a bidirectional encoder representations from transformers (BERT) model. The first model may be initially pre-trained on a large corpus of language data, such as Wikipedia. In some implementations, the operations executed at 304 are for further training (i.e., fine-tuning) the first model to train the first model on the information from the received training dataset for a specific classifier.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION using the dataset splits of SUNDAR and the CF-tree hierarchical clustering of LEVITAN, where fine-tuning is performed as in BERGMAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by BERGMAN, one of ordinary skill would have been motivated to do so because BERGMAN teaches that fine-tuning a pretrained model, rather than re-training a brand new model, “needs less manually-evaluated, human rating data, which is expensive to collect.” (para. 0081).  One of ordinary skill would further understand the benefit of fine-tuning a model that is known to be robust and accurate (such as BERT) and fine-tuning for a particular purpose, as disclosed by BERGMAN.

Regarding Claim 12
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 10.  However, MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN fail to explicitly teach:
further comprising assigning a first plurality of outputs using the fine tuned model.  

However, in a related field of endeavor (natural language processing models, see paras. 0039-0040), BERGMAN teaches:
further comprising assigning a first plurality of outputs using the fine tuned model.  (BERGMAN, para. 0083: “The fine-tuning of the first model 706 pools data in a single output token. Once the output token has been created, the final model can use the vector representation of the classification output as input and the corresponding label to train.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION using the dataset splits of SUNDAR and the CF-tree hierarchical clustering of LEVITAN, where fine-tuning is performed as in BERGMAN, and then an output is assigned using the fine-tuned model as in BERGMAN).

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by BERGMAN, one of ordinary skill would have been motivated to do so because BERGMAN teaches that fine-tuning a pretrained model, rather than re-training a brand new model, “needs less manually-evaluated, human rating data, which is expensive to collect.” (para. 0081).  One of ordinary skill would further understand the benefit of fine-tuning a model that is known to be robust and accurate (such as BERT) and fine-tuning for a particular purpose, as disclosed by BERGMAN.

Regarding Claim 13
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 10.  However, MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN fail to explicitly teach:
wherein the fine tuned model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof.  

However, in a related field of endeavor (natural language processing models, see paras. 0039-0040), BERGMAN teaches:
wherein the fine tuned model is selected from a group consisting of transformers, convolutional neural networks, recurrent neural networks, graph neural networks, and combinations thereof.  (BERGMAN, para. 0039: “This predictive approach allows a natural language processing (NLP) model, such as a bidirectional encoder representations from transformers (BERT) model, to be pre-trained using a large body of language data, such as Wikipedia, and then to be further trained (i.e., fine-tuned) using datasets of search queries and interpretations for the search queries rated by humans for quality.”;
BERGMAN, para. 0081: “In some implementations, the first model 706 is a BERT model. The first model 706 may support fine-tuning after the first model 706 is pre-trained over a vast corpus of language data, and then specialized over a smaller corpus of human-rated language data for a particular classification problem. This is beneficial because the first model 706 then needs less manually-evaluated, human rating data, which is expensive to collect, in order to make an evaluation. Furthermore, fine-tuning the first model 706 may beneficially increase efficiency and improve the bandwidth of the system, as a smaller amount of human labeled training sets may be used to specialize the first model 706.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION using the dataset splits of SUNDAR and the CF-tree hierarchical clustering of LEVITAN, where fine-tuning is performed as in BERGMAN, and then an output is assigned using the fine-tuned model as in BERGMAN, where the model being fine-tuned can be a BERT model as in BERGMAN, which is a transformer-based model).

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by BERGMAN, one of ordinary skill would have been motivated to do so because BERGMAN teaches that fine-tuning a pretrained model, rather than re-training a brand new model, “needs less manually-evaluated, human rating data, which is expensive to collect.” (para. 0081).  One of ordinary skill would further understand the benefit of fine-tuning a model that is known to be robust and accurate (such as BERT) and fine-tuning for a particular purpose, as disclosed by BERGMAN.

Regarding Claim 14
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 10.  MEDALION further teaches:
mapping the test set onto a fine tuned vector space; (MEDALION, para. 0026: “Embedding module 110 may be configured to embed text to vector form within a continuous vector space. In some embodiments, embedding module 110 may convert business-related text into a merchant vector within a continuous vector space. In some embodiments, a word2vec model may be used to convert text to the vector space. The word2vec model may be pre-trained”;
MEDALION, para. 0047: “At block 604, embedding module 110 may embed the retrieved factors to a vector space. In some embodiments, the processing of block 604 may include some operations similar to or the same as described in relation to embedding module 110 in the context of FIG. 1. Embedding module may apply a word2vec algorithm using a CBOW approach to generate a vector representation of the business factors.”
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION and maps the factors to the updated fine tuned vector space (where fine-tuning is performed as in BERGMAN))
clustering the test set in the fine tuned vector space; (MEDALION, para. 0029: “For example, the clustering module 112 may be configured to generate clusters of merchant vectors within a vector space. In some embodiments, clustering module 112 may also generate common vendor vectors based on clusters of invoice vectors”
MEDALION, para. 0030: “In some embodiments, clustering module 112 may be operable to implement a variety of clustering techniques, such as k-means, affinity propagation, spectral clustering, hierarchical clustering, density-based spatial cluster of applications with noise (DBSCAN), OPTICS, Gaussian mixture modeling, or Birch. For example, k-means clustering techniques may separate samples into a pre-defined number of groups of equal variance. For a k-means algorithm, the centroids of each cluster (e.g., the central point of each business category in the vector space) is chosen ahead of time.”
MEDALION, para. 0048: “At block 606, clustering module 112 may cluster the plurality of factor vectors. In some embodiments, the clustering module 112 may cluster vectors received from a word2vec embedding followed by an LSTM layer, such as in framework 500. In some embodiments, clustering module 112 may form clusters in the vector space based on the factor vectors according to a mean-shift clustering algorithm.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION and clusters the factors in the updated fine tuned vector space (where fine-tuning is performed as in BERGMAN))

However, MEDALION, SHARMA, and YAO fail to explicitly teach:
wherein evaluating the performance of the fine tuned model comprises: 
quantifying heterogeneity of test set clusters in the fine tuned vector space; and 
providing a confidence score for the fine tuned model.  

However, in a related field of endeavor (out-of-distribution detection for autoencoders, see p. 250, section I), SUNDAR teaches:
wherein evaluating the performance of the fine tuned model comprises: (SUNDAR, p. 253, section III: “We evaluated our methodology on nuScenes dataset”; 
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR and then evaluates the model’s performance as in SUNDAR, where such model is fine-tuned as in BERGMAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by SUNDAR, one of ordinary skill would have been motivated to do so because SUNDAR teaches that its partition approach “requires less computational resource compared to the other OOD detection methods we mentioned.”  (p. 251, section I).  One of ordinary skill would further understand the benefit of partitioning data between different phases of machine learning so that the trained model is validated and tested using data that was not used for training at all, to determine how well the trained model handles new data.

However, MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN fail to explicitly teach:
quantifying heterogeneity of test set clusters in the fine tuned vector space; and 
providing a confidence score for the fine tuned model.  

However, in a related field of endeavor (natural language processing models, see paras. 0039-0040), BERGMAN teaches:
quantifying heterogeneity of test set clusters in the fine tuned vector space; and (BERGMAN, para. 0064: “For example, the model training circuit 208 may be configured to train the second model to analyze the number of different clusters a user enters search queries for during a session on average (e.g., the diversity of clusters)”;
Examiner’s Note: the number of different clusters, which have a diversity (corresponding to recited “heterogeneity of test set clusters”) corresponds to the recited “quantifying”; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR and then evaluates the model’s performance as in SUNDAR, where such model is fine-tuned as in BERGMAN, and such model determines the diversity of clusters as in BERGMAN)
providing a confidence score for the fine tuned model.  (BERGMAN, para. 0066: “The final evaluation determined by the second model, via the final evaluation circuit 318, may be associated with a confidence score, such as a percentage value for confidence that the second model did not make an error in its final evaluation.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR and then evaluates the model’s performance as in SUNDAR, where such model is fine-tuned as in BERGMAN, and a confidence score for the accuracy of the model is determined as in BERGMAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above.  As disclosed by BERGMAN, one of ordinary skill would have been motivated to do so because BERGMAN teaches that fine-tuning a pretrained model, rather than re-training a brand new model, “needs less manually-evaluated, human rating data, which is expensive to collect.” (para. 0081).  One of ordinary skill would further understand the benefit of fine-tuning a model that is known to be robust and accurate (such as BERT) and fine-tuning for a particular purpose, as disclosed by BERGMAN, and further generating a confidence score for the accuracy of a model as in BERGMAN.
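
For illustration only, the "quantifying heterogeneity" and "confidence score" limitations of claim 14 can be sketched as follows. Neither the claim language nor the cited passages specify a metric, so this sketch assumes that heterogeneity is the Shannon entropy of the model-assigned labels within each test set cluster, and that the confidence score is one minus the size-weighted average entropy, normalized by the maximum possible entropy. The function names cluster_label_entropy and heterogeneity_report are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cluster_label_entropy(labels):
    """Shannon entropy (in bits) of the label distribution within one cluster."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values()) + 0.0


def heterogeneity_report(cluster_ids, predicted_labels):
    """Return (per-cluster entropy, overall confidence score in [0, 1]).

    Assumption: a cluster whose members all receive the same label is
    homogeneous (entropy 0); mixed labels make it heterogeneous.
    """
    by_cluster = defaultdict(list)
    for cid, label in zip(cluster_ids, predicted_labels):
        by_cluster[cid].append(label)

    entropies = {cid: cluster_label_entropy(labs) for cid, labs in by_cluster.items()}

    n_labels = len(set(predicted_labels))
    max_entropy = math.log2(n_labels) if n_labels > 1 else 1.0
    total = len(predicted_labels)
    weighted = sum(entropies[cid] * len(by_cluster[cid]) for cid in by_cluster) / total
    confidence = 1.0 - weighted / max_entropy
    return entropies, confidence


# Hypothetical test set: cluster 0 is homogeneous, cluster 1 is evenly mixed.
entropies, confidence = heterogeneity_report([0, 0, 1, 1], ["a", "a", "a", "b"])
```

In this hypothetical example cluster 0 has entropy 0 bits, cluster 1 has entropy 1 bit, and the overall score is 0.5.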

Regarding Claim 15
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 10.  MEDALION further teaches:
mapping the train set and development set onto the pretrained vector space and a fine tuned vector space; (MEDALION, para. 0026: “Embedding module 110 may be configured to embed text to vector form within a continuous vector space. In some embodiments, embedding module 110 may convert business-related text into a merchant vector within a continuous vector space. In some embodiments, a word2vec model may be used to convert text to the vector space. The word2vec model may be pre-trained”;
MEDALION, para. 0047: “At block 604, embedding module 110 may embed the retrieved factors to a vector space. In some embodiments, the processing of block 604 may include some operations similar to or the same as described in relation to embedding module 110 in the context of FIG. 1. Embedding module may apply a word2vec algorithm using a CBOW approach to generate a vector representation of the business factors.”
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION and maps the factors to the updated fine tuned vector space and the original vector space (where fine-tuning is performed as in BERGMAN))
clustering the train set and development set in the pretrained vector space and the fine tuned vector space; (MEDALION, para. 0029: “For example, the clustering module 112 may be configured to generate clusters of merchant vectors within a vector space. In some embodiments, clustering module 112 may also generate common vendor vectors based on clusters of invoice vectors”
MEDALION, para. 0030: “In some embodiments, clustering module 112 may be operable to implement a variety of clustering techniques, such as k-means, affinity propagation, spectral clustering, hierarchical clustering, density-based spatial cluster of applications with noise (DBSCAN), OPTICS, Gaussian mixture modeling, or Birch. For example, k-means clustering techniques may separate samples into a pre-defined number of groups of equal variance. For a k-means algorithm, the centroids of each cluster (e.g., the central point of each business category in the vector space) is chosen ahead of time.”
MEDALION, para. 0048: “At block 606, clustering module 112 may cluster the plurality of factor vectors. In some embodiments, the clustering module 112 may cluster vectors received from a word2vec embedding followed by an LSTM layer, such as in framework 500. In some embodiments, clustering module 112 may form clusters in the vector space based on the factor vectors according to a mean-shift clustering algorithm.”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now fine-tunes the machine learning models of MEDALION and clusters the factors in the updated fine tuned vector space and the original vector space (where fine-tuning is performed as in BERGMAN))

However, MEDALION fails to explicitly teach:
further comprising labelling a second plurality of input candidates from the corpus of data, wherein labelling the second plurality of input candidates comprises: 
identifying heterogeneous clusters and singletons in the fine tuned vector space; 
selecting the second plurality of input candidates such that the second plurality of input candidates are nearest to at least one of the heterogeneous clusters and singletons in the fine tuned vector space; and 
labelling the second plurality of input candidates.  

However, in a related field of endeavor (annotating data for machine learning and AI systems, see paras. 0003-0005), SHARMA teaches:
further comprising labelling a second plurality of input candidates from the corpus of data, wherein labelling the second plurality of input candidates comprises:  (SHARMA, para. 0052: “An example embodiment of the automated content labeling platform provides important tools to facilitate the automation of the asset labeling process. In particular, the platform provides: a model-assisted labeling workflow, a real-time human-in-the-loop labeling workflow, and an automated labeling queue system.”
SHARMA, para. 0055: “The predicted labels can facilitate and improve both an automated labeling workflow and a manual labeling workflow.”
SHARMA, para. 0080: “To label the text data, the user can: select the tool from the left sidebar; and highlight the text to assign an entity (must be in this order).”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now labels additional factors from the corpus of MEDALION as taught by SHARMA)
labelling the second plurality of input candidates.  (SHARMA, para. 0052: “An example embodiment of the automated content labeling platform provides important tools to facilitate the automation of the asset labeling process. In particular, the platform provides: a model-assisted labeling workflow, a real-time human-in-the-loop labeling workflow, and an automated labeling queue system.”
SHARMA, para. 0055: “The predicted labels can facilitate and improve both an automated labeling workflow and a manual labeling workflow.”
SHARMA, para. 0080: “To label the text data, the user can: select the tool from the left sidebar; and highlight the text to assign an entity (must be in this order).”;
Examiner’s Note: the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now labels additional factors from the corpus of MEDALION as taught by SHARMA)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by SHARMA, one of ordinary skill would have been motivated to do so in order to produce “good quality training data for an AI system” that has good labels.  (para. 0005).

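However, MEDALION, SHARMA, YAO, and SUNDAR fail to explicitly teach:
identifying ... singletons in the fine tuned vector space; 

	However, in a related field of endeavor (data clustering using “cluster feature (CF)-tree based hierarchical clustering”, see para. 0004), LEVITAN teaches:
identifying ... singletons in the fine tuned vector space; (LEVITAN, para. 0003: “Agglomerative clustering starts with a singleton cluster (i.e. a cluster that contains one data case only) and proceeds by successively merging that cluster with other clusters”;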
LEVITAN, para. 0179: “Notice that the distance between the cluster center and a cluster Cj is computed by considering the center of cluster Cs as a singleton cluster Cs′.”
Examiner’s Note: LEVITAN teaches a CF-tree based hierarchical clustering, where clusters having a single data point are called singletons; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the singleton clusters of LEVITAN are specifically identified)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by LEVITAN, one of ordinary skill would have been motivated to do so because LEVITAN teaches that its clustering solution results in an “enhanced set of evaluation and diagnostic features enabling insight, interactivity, and an improved overall user experience.” (para. 0086).  One of ordinary skill would further understand the benefit of organizing data as children clusters and singletons, for example, to use children clusters to represent sub-groups and singletons as outliers or anomalies.

However, MEDALION, SHARMA, YAO, SUNDAR, and LEVITAN fail to explicitly teach:
identifying heterogeneous clusters 
selecting the second plurality of input candidates such that the second plurality of input candidates are nearest to at least one of the heterogeneous clusters and singletons in the fine tuned vector space; and 

However, in a related field of endeavor (natural language processing models, see paras. 0039-0040), BERGMAN teaches:
identifying heterogeneous clusters (BERGMAN, para. 0064: “For example, the model training circuit 208 may be configured to train the second model to analyze the number of different clusters a user enters search queries for during a session on average (e.g., the diversity of clusters)”;
Examiner’s Note: clusters that have diversity (corresponding to recited “heterogeneous clusters”) are identified by BERGMAN; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR and then evaluates the model’s performance as in SUNDAR, where such model is fine-tuned as in BERGMAN, and further determines clusters having diversity as in BERGMAN)
selecting the second plurality of input candidates such that the second plurality of input candidates are nearest to at least one of the heterogeneous clusters and singletons in the fine tuned vector space; and (BERGMAN, para. 0064: “For example, the model training circuit 208 may be configured to train the second model to analyze the number of different clusters a user enters search queries for during a session on average (e.g., the diversity of clusters)”;
Examiner’s Note: clusters that have diversity (corresponding to recited “heterogeneous clusters”) are identified by BERGMAN; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR and identifies outliers using a distance metric as in LEVITAN (see para. 0014), and then labels the factors of MEDALION that are nearest to the heterogeneous clusters (of BERGMAN) and the singletons (of LEVITAN), where “nearness” is determined using the distance metric of LEVITAN)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by BERGMAN, one of ordinary skill would have been motivated to do so because BERGMAN teaches that fine-tuning a pretrained model, rather than re-training a brand new model, “needs less manually-evaluated, human rating rate, which is expensive to collect.” (para. 0081).  One of ordinary skill would further understand the benefit of fine-tuning a model that is known to be robust and accurate (such as BERT) and fine-tuning for a particular purpose, as disclosed by BERGMAN, and further generating a confidence score for the accuracy of a model as in BERGMAN.
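
For illustration only, the proposed combination's selection step (choosing candidates nearest to heterogeneous clusters and singletons using a distance metric) can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any cited reference or of the application: a "heterogeneous" cluster is assumed here to mean a cluster whose members carry more than one label, nearness is assumed to be Euclidean distance to the cluster centroid, and the names `centroid` and `select_nearest` and all sample data are hypothetical.

```python
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(v[k] for v in vectors) / n for k in range(len(vectors[0])))


def select_nearest(clusters, unlabeled, k):
    """Pick the k unlabeled vectors closest to a singleton or mixed-label cluster.

    clusters: list of clusters, each a list of (vector, label) pairs.
    Assumption: a cluster is a selection target if it has exactly one member
    (singleton) or more than one distinct label (heterogeneous).
    """
    targets = []
    for members in clusters:
        labels = {label for _, label in members}
        if len(members) == 1 or len(labels) > 1:
            targets.append(centroid([vector for vector, _ in members]))
    if not targets:
        return []
    # Rank candidates by distance to their nearest target centroid.
    ranked = sorted(unlabeled, key=lambda u: min(dist(u, t) for t in targets))
    return ranked[:k]


# Hypothetical labeled clusters in an embedding space.
clusters = [
    [((0.0, 0.0), "pos"), ((0.0, 1.0), "pos")],   # homogeneous, not a target
    [((5.0, 5.0), "pos"), ((5.0, 6.0), "neg")],   # heterogeneous
    [((10.0, 0.0), "neg")],                       # singleton
]
unlabeled = [(0.2, 0.4), (5.1, 5.4), (9.5, 0.5), (3.0, 3.0)]
selected = select_nearest(clusters, unlabeled, k=2)
print(selected)
```

Under these assumptions, the two selected candidates are the ones lying next to the mixed-label cluster and the singleton, while the candidate near the homogeneous cluster is passed over.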

Regarding Claim 16
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 15.  However, MEDALION and SHARMA fail to explicitly teach:
further comprising partitioning the labeled second plurality of input candidates into the train set, the development set, the test set, and the out-of-distribution set, wherein partitioning the labeled second plurality of input candidates comprises: 
adding labeled cluster centroids from the second plurality of input candidates to the train set; 
adding labeled cluster children from the second plurality of input candidates to one of a development set and the test set; and 
adding labeled singletons from the second plurality of input candidates to the one of the train set and the out-of-distribution set.  

However, in a related field of endeavor (out-of-distribution detection for autoencoders, see p. 250, section I), SUNDAR teaches:
further comprising partitioning the labeled second plurality of input candidates into the train set, the development set, the test set, and the out-of-distribution set, wherein partitioning the labeled second plurality of input candidates comprises: (SUNDAR, p. 252, section II.B: “As shown in Fig. 2-b, we split the partition data into training, validation and testing subsets. The training set has images of one generative factor value (e.g. day). The validation set has images from the generative factor with values leading to OOD (e.g. night, evening). The test set has a mix of images from all values of generative factors. However, the test images are not used during training or validation.”;
Examiner’s Note: SUNDAR teaches splitting data into a training set (corresponding to recited “train set”), a test set (corresponding to recited “test set”), and a validation set split into two parts, the first having images from the generative factor (corresponding to recited “development set”), and a second portion having images leading to OOD (this second partition corresponding to recited “out-of-distribution set”); the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR for a second set of input candidates)
adding labeled cluster centroids from the second plurality of input candidates to the train set; (SUNDAR, p. 252, section II.B: “As shown in Fig. 2-b, we split the partition data into training, validation and testing subsets. The training set has images of one generative factor value (e.g. day). The validation set has images from the generative factor with values leading to OOD (e.g. night, evening). The test set has a mix of images from all values of generative factors. However, the test images are not used during training or validation.”;
Examiner’s Note: SUNDAR teaches having a specific type of data in the training set; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the selected centroids of MEDALION (see para. 0030) are specifically added to the train set as the available data is partitioned as in SUNDAR for the second set of input candidates)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by SUNDAR, one of ordinary skill would have been motivated to do so because SUNDAR teaches that its partition approach “requires less computational resource compared to the other OOD detection methods we mentioned.”  (p. 251, section I).  One of ordinary skill would further understand the benefit of partitioning data between different phases of machine learning so that the trained model is validated and tested using data that was not used for training at all, to determine how well the trained model handles new data.

However, MEDALION, SHARMA, YAO, and SUNDAR fail to explicitly teach:
adding labeled cluster children from the second plurality of input candidates to one of a development set and the test set; and 
adding labeled singletons from the second plurality of input candidates to the one of the train set and the out-of-distribution set.  

However, in a related field of endeavor (data clustering using “cluster feature (CF)-tree based hierarchical clustering”, see para. 0004), LEVITAN teaches:
adding labeled cluster children from the second plurality of input candidates to one of a development set and the test set; and (LEVITAN, para. 0116: “A CF-tree is a very compact summary of dataset in the way that each entry (leaf entry) in the leaf node is a sub-cluster, which absorbs the data cases that are close together, as measured by the tightness index η’ and controlled by a specific threshold value T. The CF-tree is built dynamically as new data case is inserted, it is used to guide to a new insertion into the correct sub-cluster for clustering purpose. The CF-tree is a height-balanced tree with four parameters: 1) the branching factor B for the non-leaf nodes, (The branching factor B is the maximum number of entries that a non-leaf node can hold. A non-leaf entry is of the form [CFi, childi], i=1, . . . , B, in which childi is a pointer to its ith child node and CFi is the cluster feature of the sub-cluster represented by this child.);”;
Examiner’s Note: LEVITAN teaches a CF-tree based hierarchical clustering, where clusters have cluster children; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the cluster children of LEVITAN are specifically added to one of the development set and the test set as the available data is partitioned as in SUNDAR for the second set of inputs)
adding labeled singletons from the second plurality of input candidates to the one of the train set and the out-of-distribution set.  (LEVITAN, para. 0003: “Agglomerative clustering starts with a singleton cluster (i.e. a cluster that contains one data case only) and proceeds by successively merging that cluster with other clusters”;
LEVITAN, para. 0179: “Notice that the distance between the cluster center and a cluster Cj is computed by considering the center of cluster Cs as a singleton cluster Cs′.”
Examiner’s Note: LEVITAN teaches a CF-tree based hierarchical clustering, where clusters having a single data point are called singletons; the MEDALION-SHARMA-YAO-SUNDAR-LEVITAN-BERGMAN combination now trains the machine learning models of MEDALION using the dataset splits of SUNDAR, where the singleton clusters of LEVITAN are specifically added to one of the train set and the out-of-distribution set as the available data is partitioned as in SUNDAR for the second set of inputs)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by LEVITAN, one of ordinary skill would have been motivated to do so because LEVITAN teaches that its clustering solution results in an “enhanced set of evaluation and diagnostic features enabling insight, interactivity, and an improved overall user experience.” (para. 0086).  One of ordinary skill would further understand the benefit of organizing data as children clusters and singletons, for example, to use children clusters to represent sub-groups and singletons as outliers or anomalies.
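
For illustration only, the partitioning recited in claim 16 (centroids to the train set, cluster children to one of the development set and the test set, singletons to one of the train set and the out-of-distribution set) can be sketched as follows. This is a minimal sketch under stated assumptions: the claim does not say how children are divided between development and test, so children are simply alternated here, and a flag chooses where singletons go. The function name `partition_labeled`, the dictionary layout, and the sample items are hypothetical.

```python
def partition_labeled(clusters, singletons_to_ood=True):
    """Split labeled items into train/dev/test/ood sets by cluster role.

    clusters: list of dicts with a "centroid" item and a "children" list.
    Assumption: a cluster with no children is treated as a singleton.
    """
    splits = {"train": [], "dev": [], "test": [], "ood": []}
    child_count = 0
    for cluster in clusters:
        center, children = cluster["centroid"], cluster["children"]
        if not children:
            # Singleton: route to the out-of-distribution set or the train set.
            splits["ood" if singletons_to_ood else "train"].append(center)
            continue
        splits["train"].append(center)
        for child in children:
            # Assumption: alternate children between dev and test.
            splits["dev" if child_count % 2 == 0 else "test"].append(child)
            child_count += 1
    return splits


# Hypothetical labeled clusters: "b0" has no children, so it is a singleton.
clusters = [
    {"centroid": "a0", "children": ["a1", "a2", "a3"]},
    {"centroid": "b0", "children": []},
    {"centroid": "c0", "children": ["c1"]},
]
splits = partition_labeled(clusters)
print(splits)
```

Passing `singletons_to_ood=False` would instead place the singleton in the train set, which is the other alternative recited in the claim.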

Regarding Claim 17
	MEDALION, SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN teach the method of claim 15.  However, MEDALION fails to explicitly teach:
wherein labelling of the second plurality of input candidates comprises algorithmically labelling the second plurality of input candidates.  

However, in a related field of endeavor (annotating data for machine learning and AI systems, see paras. 0003-0005), SHARMA teaches:
wherein labelling of the second plurality of input candidates comprises algorithmically labelling the second plurality of input candidates.  (SHARMA, para. 0043: “Once an asset with a benchmark label gets a human- or computer-generated label, a benchmark score can be automatically calculated.”
SHARMA, para. 0055: “The predicted labels can facilitate and improve both an automated labeling workflow and a manual labeling workflow.”)

Before the effective filing date of the present application, it would have been obvious to modify the system of MEDALION with the teachings of SHARMA, YAO, SUNDAR, LEVITAN, and BERGMAN as explained above. As disclosed by SHARMA, one of ordinary skill would have been motivated to produce “good quality training data for an AI system” that has good labels.  (para. 0005).


	Claim 27 depends from claim 26 and claims a system that corresponds to the method of claim 9, and is therefore rejected for the same reasons explained above with respect to claims 9 and 26.
Claim 28 depends from claim 27 and claims a system that corresponds to the method of claim 10, and is therefore rejected for the same reasons explained above with respect to claims 10 and 27.
Claim 29 depends from claim 28 and claims a system that corresponds to the method of claim 14, and is therefore rejected for the same reasons explained above with respect to claims 14 and 28.
Claim 30 depends from claim 28 and claims a system that corresponds to the method of claim 15, and is therefore rejected for the same reasons explained above with respect to claims 15 and 28.
	Claim 37 depends from claim 36 and claims a non-transitory computer-readable medium that corresponds to the method of claim 9, and is therefore rejected for the same reasons explained above with respect to claims 9 and 36.
Claim 38 depends from claim 37 and claims a non-transitory computer-readable medium that corresponds to the method of claim 10, and is therefore rejected for the same reasons explained above with respect to claims 10 and 37.
Claim 39 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 14, and is therefore rejected for the same reasons explained above with respect to claims 14 and 38.
Claim 40 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 15, and is therefore rejected for the same reasons explained above with respect to claims 15 and 38.

Allowable Subject Matter
Claims 18-25, 31-35, and 41-45 are objected to as being dependent upon a rejected base claim, but would be allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.

The following is a statement of reasons for the indication of allowable subject matter:  

Claim 18 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome, because none of the references of record either alone or in combination fairly disclose or suggest the combination of limitations specified in claim 18, including at least:  
further comprising assigning a confidence score for the labelling of the second plurality of input candidates using a bipartite graph of the pretrained vector space and the fine tuned vector space.  

The closest prior art of record discloses: 
	MEDALION discloses techniques for mapping text (in the form of “factors”) onto the pre-trained vector space of the word2vec system. (paras. 0026, 0047).
	SHARMA teaches both automatic and manual process flows for labeling content, where such flows include labeling queues.  (paras. 0025, 0055, 0081-0082). 
	YAO teaches consensus annotation data concerning a plurality of human annotators (paras. 0042-0043) and determining a model confidence score. (para. 0358).
	SUNDAR discloses partitioning datasets into training, validation and testing subsets, where the validation set is split into portions to account for out-of-distribution samples. (p. 252, section II.B).
LEVITAN teaches a CF-tree based hierarchical clustering, where clusters have cluster children, and where singletons are clusters with only a single element.  (paras. 0003, 0116, 0179).
	BERGMAN teaches fine-tuning a pretrained model, such as the well-known BERT language model, and determining confidence scores with respect to such models.  (paras. 0039, 0060, 0066). 
US 20200193323 A1, hereinafter referenced as ALESIANI, discloses using bipartite graphs for mixed integer linear programming (MILP) problems, where the bipartite graphs show if embeddings are closer in an embedding space.  (paras. 0093-0095).

However, the examiner has found that the distinct feature of the Applicant's claimed invention over the prior art is the explicit claiming of the aforementioned limitations in combination with all the other limitations as specified in claim 18.  In particular, the prior art of record does not specifically teach using bipartite graphs for both a pretrained vector space and fine tuned vector space specifically to assign a confidence score for the labeling of candidates.  Therefore, because the prior art of record does not anticipate nor make obvious the limitations recited in claim 18, claim 18 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.

Claim 19 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome, because none of the references of record either alone or in combination fairly disclose or suggest the combination of limitations specified in claim 19, including at least:  
assigning a confidence score for each of the two or more fine tuned models using a bipartite graph for each of the one or more pairs of pretrained vector spaces and fine tuned vector spaces

The closest prior art of record discloses: 
	MEDALION discloses techniques for mapping text (in the form of “factors”) onto the pre-trained vector space of the word2vec system. (paras. 0026, 0047).
	SHARMA teaches both automatic and manual process flows for labeling content, where such flows include labeling queues.  (paras. 0025, 0055, 0081-0082). 
YAO teaches consensus annotation data concerning a plurality of human annotators (paras. 0042-0043) and determining a model confidence score. (para. 0358).
	SUNDAR discloses partitioning datasets into training, validation and testing subsets, where the validation set is split into portions to account for out-of-distribution samples. (p. 252, section II.B).
LEVITAN teaches a CF-tree based hierarchical clustering, where clusters have cluster children, and where singletons are clusters with only a single element.  (paras. 0003, 0116, 0179).
	BERGMAN teaches fine-tuning a pretrained model, such as the well-known BERT language model, and determining confidence scores with respect to such models.  (paras. 0039, 0060, 0066). 
US 20200193323 A1, hereinafter referenced as ALESIANI, discloses using bipartite graphs for mixed integer linear programming (MILP) problems, where the bipartite graphs show if embeddings are closer in an embedding space.  (paras. 0093-0095).

However, the examiner has found that the distinct feature of the Applicant's claimed invention over the prior art is the explicit claiming of the aforementioned limitations in combination with all the other limitations as specified in claim 19.  In particular, the prior art of record does not specifically teach using bipartite graphs for pairs of a pretrained vector space and a fine tuned vector space specifically to assign a confidence score for the labeling of candidates.  Therefore, because the prior art of record does not anticipate nor make obvious the limitations recited in claim 19, claim 19 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.

Claim 20 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome, because none of the references of record either alone or in combination fairly disclose or suggest the combination of limitations specified in claim 20, including at least:  
assigning a confidence score for the labelling of the third plurality of input candidates using a bipartite graph of the pretrained vector space and the fine tuned vector space

The closest prior art of record discloses: 
	MEDALION discloses techniques for mapping text (in the form of “factors”) onto the pre-trained vector space of the word2vec system. (paras. 0026, 0047).
	SHARMA teaches both automatic and manual process flows for labeling content, where such flows include labeling queues.  (paras. 0025, 0055, 0081-0082). 
YAO teaches consensus annotation data concerning a plurality of human annotators (paras. 0042-0043) and determining a model confidence score. (para. 0358).
	SUNDAR discloses partitioning datasets into training, validation and testing subsets, where the validation set is split into portions to account for out-of-distribution samples. (p. 252, section II.B).
LEVITAN teaches a CF-tree based hierarchical clustering, where clusters have cluster children, and where singletons are clusters with only a single element.  (paras. 0003, 0116, 0179).
	BERGMAN teaches fine-tuning a pretrained model, such as the well-known BERT language model, and determining confidence scores with respect to such models.  (paras. 0039, 0060, 0066). 
US 20200193323 A1, hereinafter referenced as ALESIANI, discloses using bipartite graphs for mixed integer linear programming (MILP) problems, where the bipartite graphs show if embeddings are closer in an embedding space.  (paras. 0093-0095).

However, the examiner has found that the distinct feature of the Applicant's claimed invention over the prior art is the explicit claiming of the aforementioned limitations in combination with all the other limitations as specified in claim 20.  In particular, the prior art of record does not specifically teach using bipartite graphs for both a pretrained vector space and fine tuned vector space specifically to assign a confidence score for the labeling of candidates.  Therefore, because the prior art of record does not anticipate nor make obvious the limitations recited in claim 20, claim 20 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.

Claim 21 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome, because none of the references of record either alone or in combination fairly disclose or suggest the combination of limitations specified in claim 21, including at least:  
assigning a confidence score for each of the two or more fine tuned models using a bipartite graph for each of the one or more pairs of pretrained vector spaces and fine tuned vector spaces

The closest prior art of record discloses: 
	MEDALION discloses techniques for mapping text (in the form of “factors”) onto the pre-trained vector space of the word2vec system. (paras. 0026, 0047).
	SHARMA teaches both automatic and manual process flows for labeling content, where such flows include labeling queues.  (paras. 0025, 0055, 0081-0082). 
YAO teaches consensus annotation data concerning a plurality of human annotators (paras. 0042-0043) and determining a model confidence score. (para. 0358).
	SUNDAR discloses partitioning datasets into training, validation and testing subsets, where the validation set is split into portions to account for out-of-distribution samples. (p. 252, section II.B).
LEVITAN teaches a CF-tree based hierarchical clustering, where clusters have cluster children, and where singletons are clusters with only a single element.  (paras. 0003, 0116, 0179).
	BERGMAN teaches fine-tuning a pretrained model, such as the well-known BERT language model, and determining confidence scores with respect to such models.  (paras. 0039, 0060, 0066). 
US 20200193323 A1, hereinafter referenced as ALESIANI, discloses using bipartite graphs for mixed integer linear programming (MILP) problems, where the bipartite graphs show if embeddings are closer in an embedding space.  (paras. 0093-0095).

However, the examiner has found that the distinct feature of the Applicant's claimed invention over the prior art is the explicit claiming of the aforementioned limitations in combination with all the other limitations as specified in claim 21.  In particular, the prior art of record does not specifically teach using bipartite graphs for pairs of a pretrained vector space and a fine tuned vector space specifically to assign a confidence score for the labeling of candidates.  Therefore, because the prior art of record does not anticipate nor make obvious the limitations recited in claim 21, claim 21 would be considered allowable if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.

	Claims 22-25 depend from claim 20, and would be allowed for the same reasons explained with respect to claim 20, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 31 depends from claim 28 and claims a system that corresponds to the method of claim 19, and would therefore be allowed for the same reasons explained above with respect to claim 19, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 32 depends from claim 28 and claims a system that corresponds to the method of claim 20, and would therefore be allowed for the same reasons explained above with respect to claim 20, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 33 depends from claim 28 and claims a system that corresponds to the method of claim 21, and would therefore be allowed for the same reasons explained above with respect to claim 21, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 34 depends from claim 32 and claims a system that corresponds to the method of claim 22, and would therefore be allowed for the same reasons explained above with respect to claim 22, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 35 depends from claim 33 and claims a system that corresponds to the method of claim 23, and would therefore be allowed for the same reasons explained above with respect to claim 23, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 41 depends from claim 28 and claims a non-transitory computer-readable medium that corresponds to the method of claim 19, and would therefore be allowed for the same reasons explained above with respect to claim 19, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 42 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 20, and would therefore be allowed for the same reasons explained above with respect to claim 20, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 43 depends from claim 38 and claims a non-transitory computer-readable medium that corresponds to the method of claim 21, and would therefore be allowed for the same reasons explained above with respect to claim 21, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 44 depends from claim 42 and claims a non-transitory computer-readable medium that corresponds to the method of claim 22, and would therefore be allowed for the same reasons explained above with respect to claim 22, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.
Claim 45 depends from claim 43 and claims a non-transitory computer-readable medium that corresponds to the method of claim 23, and would therefore be allowed for the same reasons explained above with respect to claim 23, if rewritten in independent form including all of the limitations of the base claim and any intervening claims, and if the rejections under 35 U.S.C. 101 are overcome.

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 
US 20150149134 A1 (Mehta).  “Additionally or alternatively, the operating states represented by the cluster of data patterns may be determined by presenting the data patterns to experts and receiving label inputs from the experts, in a crowd-sourced collaborative filtering manner, that label the data patterns with corresponding operating states (e.g. normal or abnormal state) and provide additional information on severity of such states. These previously received label inputs may then be used to semantically label incoming unknown data with corresponding identifiers of operating states. If the label inputs are in conflict or disagreement with each other, the label inputs may be weighed and used to probabilistically determine which label corresponds to which data pattern, with some labels having a higher likelihood than others corresponding to a given data pattern.” (para. 0067).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL C LEE whose telephone number is (571)272-4933. The examiner can normally be reached M-F 12:00 pm - 8:00 pm ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at 571-272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.




/MICHAEL C. LEE/
Examiner, Art Unit 2128
Prosecution Timeline

Oct 20, 2022: Application Filed
Oct 07, 2025: Non-Final Rejection mailed (§101, §103)
Mar 03, 2026: Interview Requested
Apr 07, 2026: Response Filed
May 08, 2026: Final Rejection mailed (§101, §103; current)
Precedent Cases

Applications granted by this same examiner with similar technology (based on 5 most recent grants):

17/475,724 (Patent 12603081): METHOD AND SERVER FOR A TEXT-TO-SPEECH PROCESSING. 4y 7m to grant; granted Apr 14, 2026.
17/732,871 (Patent 12602605): QUANTUM COMPUTER ARCHITECTURE BASED ON MULTI-QUBIT GATES. 3y 11m to grant; granted Apr 14, 2026.
17/207,554 (Patent 12591915): METHODS AND SYSTEMS FOR DETERMINING RECOMMENDATIONS BASED ON REAL-TIME OPTIMIZATION OF MACHINE LEARNING MODELS. 5y 0m to grant; granted Mar 31, 2026.
18/885,396 (Patent 12585743): INTERFACE ACCESS PROCESSING METHOD, COMPUTER DEVICE AND STORAGE MEDIUM. 1y 6m to grant; granted Mar 24, 2026.
17/486,877 (Patent 12568935): AI-BASED LIVESTOCK MANAGEMENT SYSTEM AND LIVESTOCK MANAGEMENT METHOD THEREOF. 4y 5m to grant; granted Mar 10, 2026.
Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 61%
With Interview: 87% (+26.0%)
Median Time to Grant: 3y 3m (~0m remaining)
PTA Risk: Moderate
Based on 144 resolved cases by this examiner. Grant probability derived from career allowance rate.