DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application is being examined under the pre-AIA first to invent provisions.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/23/2025 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Claim Rejections - 35 USC § 112(b)
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 1-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “optimal” in claims 1, 2, 13, 14, 15, 18, and 20 is a relative term which renders the claim indefinite. The term “optimal” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. The “set of dimensions” is rendered indefinite by the use of the term “optimal”.
The dependent claims are also rejected under 35 U.S.C. 112(b) because they inherit the deficiencies of the rejected claims from which they depend.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
In reference to claim 1:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“based on the first output, determining an optimal set of dimensions for the labeling task,” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine an optimal set of dimensions for the labeling task based on the first output.
“selecting a first projection from the projected graph, wherein the first projection corresponds to a first datapoint in the semantic graph, and wherein the first datapoint corresponds to a first text input from the sparsely labeled dataset,” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could select a first projection from the projected graph.
“determining a first distance between the first projection and a second projection in the projected graph,” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first distance between the first projection and a second projection.
“determining a first likelihood the first projection has a first label,” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first likelihood the first projection has a first label.
“determining a second likelihood the first projection has a second label based in part on the first distance to the second projection,” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a second likelihood the first projection has a second label based in part on the first distance to the second projection.
“and comparing the first likelihood to the second likelihood;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could compare the first likelihood to the second likelihood.
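By way of illustration only, the recited distance and likelihood determinations amount to simple arithmetic of the kind shown below. All values, the Euclidean distance metric, and the exponential decay are assumptions made for illustration; they are not drawn from the claims or the specification.

```python
import math

# Hypothetical 2-D "projections" of two datapoints (illustrative values only).
first_projection = (0.0, 0.0)    # projection of the unlabeled first datapoint
second_projection = (1.0, 0.5)   # projection already carrying the second label

# First distance between the first and second projections (Euclidean, assumed).
first_distance = math.dist(first_projection, second_projection)

# First likelihood: an assumed prior that the first projection has the first label.
first_likelihood = 0.4

# Second likelihood: assumed to decay with distance to the labeled second projection.
second_likelihood = math.exp(-first_distance)

# Comparing the first likelihood to the second likelihood, as recited.
more_likely_second = second_likelihood > first_likelihood
```

Each of these steps could equally be carried out by a person with pen and paper, consistent with the mental-process analysis above.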
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“A system for propagating labels through a sparsely labeled dataset using a supervised projection of a semantic embedding, the system further comprising: cloud-based storage circuitry configured to store: the sparsely labeled dataset, a language model, wherein the language model has been trained separately from the sparsely labeled dataset, and an artificial intelligence model, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“cloud-based control circuitry configured to: receiving the sparsely labeled dataset, receiving a labeling task,” (insignificant extra-solution activity of mere data gathering; MPEP 2106.05(g))
“determining a semantic graph by embedding the sparsely labeled dataset using the language model, processing the semantic graph in the artificial intelligence model,” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“receiving a first output from the artificial intelligence model,” (insignificant extra-solution activity of mere data gathering; MPEP 2106.05(g))
“determining a projected graph by projecting the semantic graph into the optimal set of dimensions,” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“and cloud-based input/output circuitry configured to: generating for display, on a user interface, a recommendation to stop the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“A system for propagating labels through a sparsely labeled dataset using a supervised projection of a semantic embedding, the system further comprising: cloud-based storage circuitry configured to store: the sparsely labeled dataset, a language model, wherein the language model has been trained separately from the sparsely labeled dataset, and an artificial intelligence model, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“cloud-based control circuitry configured to: receiving the sparsely labeled dataset, receiving a labeling task,” (well-understood, routine, and conventional activity; MPEP 2106.05(d))
“determining a semantic graph by embedding the sparsely labeled dataset using the language model, processing the semantic graph in the artificial intelligence model,” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“receiving a first output from the artificial intelligence model,” (well-understood, routine, and conventional activity; MPEP 2106.05(d))
“determining a projected graph by projecting the semantic graph into the optimal set of dimensions,” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“and cloud-based input/output circuitry configured to: generating for display, on a user interface, a recommendation to stop the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
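By way of illustration only, the recited step of projecting the semantic graph into a set of dimensions is no more than restricting each datapoint to selected coordinates, as sketched below. The embedding values and the selected dimension indices are hypothetical, assumed solely for illustration.

```python
# Hypothetical 4-D semantic embeddings for two text inputs (illustrative values only).
semantic_graph = {
    "doc1": [0.2, 0.9, 0.1, 0.4],
    "doc2": [0.8, 0.3, 0.7, 0.5],
}

# Dimension indices assumed to be the model's output "optimal set of dimensions".
optimal_dimensions = [1, 3]

# Projected graph: each datapoint restricted to the selected dimensions.
projected_graph = {
    name: [vector[i] for i in optimal_dimensions]
    for name, vector in semantic_graph.items()
}
```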
In reference to claim 2:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“based on the first output, determining an optimal set of dimensions for the labeling task;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine an optimal set of dimensions for the labeling task.
“determining a first distance between a first projection and a second projection in the projected graph, wherein the second projection has a second label;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first distance between a first projection and a second projection.
“determining a first likelihood the first projection has a first label;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first likelihood the first projection has a first label.
“determining a second likelihood the first projection has the second label based in part on the first distance to the second projection;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a second likelihood the first projection has the second label based in part on the first distance to the second projection.
“comparing the first likelihood to the second likelihood;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could compare the first likelihood to the second likelihood.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“A method for propagating labels through a sparsely labeled dataset using a supervised projection of a semantic embedding, the method further comprising: receiving a dataset; receiving a labeling task;” (insignificant extra-solution activity of mere data gathering; MPEP 2106.05(g))
“determining a semantic graph by embedding the dataset using a language model;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“processing the semantic graph in an artificial intelligence model to generate a first output, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“determining a projected graph by projecting the semantic graph into the optimal set of dimensions;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“and in response to comparing the first likelihood to the second likelihood, generating for display, on a user interface, a first recommendation to stop the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“A method for propagating labels through a sparsely labeled dataset using a supervised projection of a semantic embedding, the method further comprising: receiving a dataset; receiving a labeling task;” (well-understood, routine, and conventional activity; MPEP 2106.05(d))
“determining a semantic graph by embedding the dataset using a language model;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“processing the semantic graph in an artificial intelligence model to generate a first output, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“determining a projected graph by projecting the semantic graph into the optimal set of dimensions;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“and in response to comparing the first likelihood to the second likelihood, generating for display, on a user interface, a first recommendation to stop the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 3:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 2, wherein determining a first distance between the first projection and a second projection in the projected graph further comprises: selecting a first projection from the projected graph, wherein the first projection corresponds to a first datapoint in the semantic graph, and wherein the first datapoint corresponds to a first text input from the dataset.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could select a first projection from the projected graph.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 4:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Claim 4 inherits the abstract idea of the parent claim
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“The method of claim 2, wherein comparing the first likelihood to the second likelihood further comprises: in response to comparing the first likelihood to the second likelihood, generating for display, on the user interface, a second recommendation to continue the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“The method of claim 2, wherein comparing the first likelihood to the second likelihood further comprises: in response to comparing the first likelihood to the second likelihood, generating for display, on the user interface, a second recommendation to continue the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 5:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 2, wherein determining the second likelihood the first projection has the second label is based in part on a third likelihood the second projection has the second label.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine the second likelihood the first projection has the second label based in part on a third likelihood the second projection has the second label.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 6:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 2, further comprising: determining a composite likelihood from the first likelihood and the second likelihood;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a composite likelihood from the first and second likelihoods.
“determining a first entropy score from the composite likelihood, wherein the first entropy score indicates randomness of label components in the composite likelihood;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first entropy score from the composite likelihood.
“comparing the first entropy score to a first threshold entropy score;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could compare the first entropy score to a first threshold entropy score.
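By way of illustration only, an entropy score over label components can be computed by hand with the standard Shannon formula, as sketched below. The composite likelihood values and the threshold are hypothetical, assumed solely for illustration; Shannon entropy is one conventional way, among others, to quantify the recited "randomness of label components".

```python
import math

# Hypothetical composite likelihood over two label components (illustrative values only).
composite_likelihood = {"first_label": 0.7, "second_label": 0.3}

# Shannon entropy (base 2) of the label components; higher means more random.
first_entropy_score = -sum(
    p * math.log2(p) for p in composite_likelihood.values() if p > 0
)

# Compare against an assumed threshold; a score above it could trigger
# a recommendation to review the label assignment.
first_threshold_entropy_score = 0.9
needs_review = first_entropy_score > first_threshold_entropy_score
```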
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“and in response to comparing the first entropy score to the first threshold entropy score, generating for display, on the user interface, a third recommendation to review the first label being assigned to the first projection.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“and in response to comparing the first entropy score to the first threshold entropy score, generating for display, on the user interface, a third recommendation to review the first label being assigned to the first projection.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 7:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 2, further comprising: determining a first predicted label for the first projection;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first predicted label for the first projection.
“determining a first consistency score for the first projection based on a comparison of the first predicted label and the first label, wherein the first consistency score indicates a degree of consensus between the first predicted label and the first label;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first consistency score for the first projection.
“comparing the first consistency score to a first threshold consistency score;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could compare the first consistency score to a first threshold consistency score.
“in response to comparing the first consistency score to the first threshold consistency score, filtering the first projection to a first group;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could filter the first projection to a first group based on the comparison.
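By way of illustration only, the recited consistency scoring and filtering can be performed as a simple comparison of predicted and given labels, as sketched below. The projections, labels, the 0/1 agreement score, and the threshold are hypothetical assumptions made solely for illustration.

```python
# Hypothetical predicted vs. given labels for several projections (illustrative only).
projections = [
    {"id": 0, "predicted": "first_label", "given": "first_label"},
    {"id": 1, "predicted": "second_label", "given": "first_label"},
    {"id": 2, "predicted": "first_label", "given": "first_label"},
]

def consistency_score(projection):
    # Degree of consensus between the predicted label and the given label
    # (1.0 = full agreement, 0.0 = disagreement; assumed scoring).
    return 1.0 if projection["predicted"] == projection["given"] else 0.0

first_threshold_consistency_score = 0.5

# Filter projections meeting the threshold into a first group, which could then
# be recommended as a training sample for a supervised learning task.
first_group = [
    p for p in projections
    if consistency_score(p) >= first_threshold_consistency_score
]
```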
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“and generating for display, on the user interface, a fourth recommendation to use the first group as a training sample for a supervised learning task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“and generating for display, on the user interface, a fourth recommendation to use the first group as a training sample for a supervised learning task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 8:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 2, further comprising: determining a first outlier score based on the first distance;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could determine a first outlier score based on the first distance.
“comparing the first outlier score to a first threshold outlier score;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could compare the first outlier score to a first threshold outlier score.
“and selecting a recommendation from a plurality of recommendations based on comparing the first outlier score to a first threshold outlier score.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(C)). For example, a person could select a recommendation based on comparing the first outlier score to the first threshold outlier score.
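By way of illustration only, the recited outlier scoring and recommendation selection reduce to a single threshold comparison, as sketched below. The distance value, the use of distance itself as the outlier score, the threshold, and the recommendation texts are hypothetical assumptions made solely for illustration.

```python
# Hypothetical distance from the first projection to its neighbor (illustrative only).
first_distance = 3.2

# Assumed scoring: a larger distance indicates a more likely outlier.
first_outlier_score = first_distance

first_threshold_outlier_score = 2.0

# Select a recommendation from a plurality of recommendations
# based on comparing the outlier score to the threshold.
recommendations = {
    True: "flag the first projection for manual review",
    False: "propagate a neighboring label to the first projection",
}
selected_recommendation = recommendations[
    first_outlier_score > first_threshold_outlier_score
]
```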
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 9:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Claim 9 inherits the abstract idea of the parent claim
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“The method of claim 3, wherein the dataset comprises a first subset having given labels, and a second subset, wherein the second subset makes up between 90 and 99.99 percent of the dataset.” (insignificant extra-solution activity of mere data gathering; MPEP 2106.05(g))
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“The method of claim 3, wherein the dataset comprises a first subset having given labels, and a second subset, wherein the second subset makes up between 90 and 99.99 percent of the dataset.” (well-understood, routine, and conventional activity; MPEP 2106.05(d))
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 10:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 9, wherein the first likelihood is greater than the second likelihood, and wherein comparing the first likelihood to the second likelihood further comprises assigning the first label to the first projection” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could assign the first label to the first projection.
“assigning the first label to the first datapoint,” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could assign the first label to the first datapoint.
“and assigning the first label to the first text input.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could assign the first label to the first text input.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 11:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 10, wherein the first subset comprises the first text input having the first label, and wherein the method further comprises: comparing the first label to a corresponding given label from the given labels of the first subset;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could compare the first label to a corresponding given label from the given labels of the first subset.
“and determining an evaluation of the artificial intelligence model in response to comparing the first label to the corresponding given label.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine an evaluation of the artificial intelligence model in response to the comparison between the first label and the corresponding given label.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 12:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Claim 12 inherits the abstract idea of the parent claim
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“The method of claim 2, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens.” (insignificant extra-solution activity, mere data gathering, MPEP 2106.05(g))
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“The method of claim 2, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens.” (well-understood, routine, and conventional activity, MPEP 2106.05(d))
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 13:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 12, wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine an amount of information present in the semantic graph.
“determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a first optimal dimension from the original dimensions.
“comparing the first amount of information to a cut-off score;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could compare the first amount of information to a cut-off score.
“and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could include/group the first optimal dimension in the optimal set of dimensions.
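For illustration of why the examiner considers these steps performable by hand, the quoted limitations amount to keeping the highest-information dimensions until a running share of the total reaches a cut-off. The following is a minimal sketch of that kind of selection, analogous to choosing principal components by explained variance; it is not the applicant's claimed algorithm, and all names are illustrative.

```python
def select_dimensions(information_per_dim, cutoff=0.95):
    """Keep dimensions in descending order of information until the
    running share of the total information reaches the cut-off score."""
    total = sum(information_per_dim)
    # Rank dimension indices by how much information each carries.
    ranked = sorted(range(len(information_per_dim)),
                    key=lambda i: information_per_dim[i], reverse=True)
    kept, running = [], 0.0
    for i in ranked:
        kept.append(i)
        running += information_per_dim[i] / total
        if running >= cutoff:
            break  # remaining dimensions are discarded
    return kept
```

A person could perform the same ranking and running-total comparison with pen and paper, which is the substance of the mental-process characterization above.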
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 14:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 13, wherein determining the first optimal dimension from the original dimensions comprises: determining an original vector for each original dimension of the original dimensions;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine an original vector for each original dimension.
“determining a correlation between the original vector and all other original vectors, wherein the correlation comprises a measure of shared information;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a correlation between the original vector and all other original vectors.
“determining a first optimal vector based on the correlation;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a first optimal vector based on the correlation.
“and determining the first optimal dimension from the first optimal vector.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine the first optimal dimension from the first optimal vector.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 15:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 13, further comprising: determining a second optimal dimension from the original dimensions, wherein the second optimal dimension has a second amount of information, and wherein the second amount of information is less than the first amount of information;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a second optimal dimension from the original dimensions.
“combining the first amount of information and the second amount of information into a running total amount of information;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could combine the first and second amounts of information into a running total amount of information.
“comparing the running total amount of information to the cut-off score;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could compare the running total amount of information to the cut-off score.
“and in response to comparing the running total amount of information to the cut-off score, discarding the second optimal dimension.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could discard/not consider the second optimal dimension.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 16:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“and determining the cut-off score based on the first user input.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine the cut-off score based on the input.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“The method of claim 13, further comprising: receiving a first user input;” (insignificant extra-solution activity, mere data gathering, MPEP 2106.05(g))
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“The method of claim 13, further comprising: receiving a first user input;” (well-understood, routine, and conventional activity, MPEP 2106.05(d))
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 17:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a process
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The method of claim 13, wherein the cut-off score is between ninety-five and ninety-nine percent of the amount of information in the semantic graph.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a cut-off score to be between ninety-five and ninety-nine percent of the amount of information in the semantic graph.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
In reference to claim 18:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“based on the first output, determine an optimal set of dimensions for the labeling task;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine an optimal set of dimensions for the labeling task.
“determine a first distance between a first projection and a second projection in the projected graph, wherein the second projection has a second label;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a first distance between a first projection and a second projection.
“determine a first likelihood the first projection has a first label;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a first likelihood the first projection has a first label.
“determine a second likelihood the first projection has the second label based in part on the first distance to the second projection;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a second likelihood the first projection has the second label based in part on the first distance to the second projection.
“compare the first likelihood to the second likelihood;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could compare the first and second likelihoods.
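The distance-based likelihood comparison in the limitations quoted above can be sketched as follows. This is an illustrative sketch only, not the claimed method: the exponential decay of likelihood with distance is an assumed example of "based in part on the first distance," and all function and parameter names are hypothetical.

```python
import math

def likelihood_from_distance(distance):
    # Illustrative assumption: likelihood decays exponentially with distance,
    # so a closer labeled projection yields a higher likelihood.
    return math.exp(-distance)

def choose_label(first_label, first_likelihood, second_label, distance_to_second):
    # Derive the second likelihood in part from the first distance,
    # compare the two likelihoods, and keep the more likely label.
    second_likelihood = likelihood_from_distance(distance_to_second)
    return first_label if first_likelihood >= second_likelihood else second_label
```

Because the comparison reduces to evaluating and comparing two numbers, a person could perform it mentally or with pen and paper, consistent with the mental-process characterization.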
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“A non-transitory, computer readable medium storing computer instructions which, when executed by one or more computer processors, cause the one or more computer processors to: receive a dataset; receive a labeling task;” (insignificant extra-solution activity, mere data gathering, MPEP 2106.05(g))
“determine a semantic graph by embedding the dataset using a language model;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“process the semantic graph in an artificial intelligence model to generate a first output, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“determine a projected graph by projecting the semantic graph into the optimal set of dimensions;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“and in response to comparing the first likelihood to the second likelihood, generating for display, on a user interface, a first recommendation to stop the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“A non-transitory, computer readable medium storing computer instructions which, when executed by one or more computer processors, cause the one or more computer processors to: receive a dataset; receive a labeling task;” (well-understood, routine, and conventional activity, MPEP 2106.05(d))
“determine a semantic graph by embedding the dataset using a language model;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“process the semantic graph in an artificial intelligence model to generate a first output, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“determine a projected graph by projecting the semantic graph into the optimal set of dimensions;” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
“and in response to comparing the first likelihood to the second likelihood, generating for display, on a user interface, a first recommendation to stop the labeling task.” is merely reciting the words "apply it" (or an equivalent) with the judicial exception, or merely including instructions to implement an abstract idea on a computer, or merely using a computer as a tool to perform an abstract idea (MPEP 2106.05(f)).
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 19:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
Claim 19 inherits the abstract idea of the parent claim
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
“The non-transitory, computer readable medium of claim 18, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens.” (insignificant extra-solution activity, mere data gathering, MPEP 2106.05(g))
The claim does not include additional elements that are integrated into a practical application.
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
“The non-transitory, computer readable medium of claim 18, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens.” (well-understood, routine, and conventional activity, MPEP 2106.05(d))
The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
In reference to claim 20:
Step 1 - Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a machine
Step 2A Prong 1 - Does the claim recite an abstract idea, law of nature, or natural phenomenon?
“The non-transitory, computer readable medium of claim 19, wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine an amount of information present in the semantic graph.
“determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could determine a first optimal dimension from the original dimensions.
“comparing the first amount of information to a cut-off score;” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could compare the first amount of information to a cut-off score.
“and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions.” which is an abstract idea because it is directed to a mental process, an observation, evaluation, judgment, or opinion. The limitation as drafted, and under a broadest reasonable interpretation, can be performed in the human mind, or by a human using a pen and paper (MPEP 2106.04(a)(2)(III)(c)). For example, a person could include/group the first optimal dimension in the optimal set of dimensions.
Step 2A Prong 2 - Does the claim recite additional elements that integrate the judicial exception into a practical application?
No
Step 2B - Does the claim recite additional elements that amount to significantly more than the judicial exception?
No
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim(s) 1-5, 8-13, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Sewak, US 20220414137 A1, filed on Apr 1, 2022 (hereinafter “Sewak”), in view of Ahmet et al., “Label Propagation for Deep Semi-Supervised Learning,” published in 2019 (hereinafter “Ahmet”), in further view of Dane et al., US 20250022615 A1, filed on Nov 14, 2022 (hereinafter “Dane”), in further view of Malden, US 20210042471 A1, filed on Jul 27, 2020 (hereinafter “Malden”), and in further view of Li et al., US 20220157468 A1, filed on Dec 17, 2021 (hereinafter “Li”).
Regarding claim 1, Sewak teaches A system for propagating labels through a sparsely labeled dataset using a supervised projection of a semantic embedding, the system further comprising: cloud-based storage circuitry configured to store: the [sparsely] labeled dataset, a language model, wherein the language model has been trained separately from the sparsely labeled dataset (Sewak Fig 1 shows a labeling system that comprises cloud-based storage circuitry configured to store (Cloud Service 199 contains storage 180); Sewak Paragraph 0082; "An NLG model taken from repository 162 and employed by labeling service 142 to perform a step in a label scoring service 168 is generally trained over a natural language corpus that is unlabeled." Sewak Paragraph 0083; "method 300 begins at step 303 when the labeling service 142 serves a display page to labeling application 110. At step 305, labeling service 142 receives a text string defining candidate text from a document in corpus 154 or from the labeling application 110. At step 310, the labeling service 142 receives a text string defining a label, e.g. from labeling application 110." Examiner notes that the cloud-based circuitry (Cloud Service) receives the labeled dataset (a text string defining candidate text and a text string defining a label), meaning it has the capability to store the labeled dataset; the language model (NLG model) is trained separately from the sparsely labeled dataset (trained using an unlabeled corpus))
cloud-based control circuitry configured to: receiving the sparsely labeled dataset, (Sewak Paragraph 0083; "method 300 begins at step 303 when the labeling service 142 serves a display page to labeling application 110. At step 305, labeling service 142 receives a text string defining candidate text from a document in corpus 154 or from the labeling application 110. At step 310, the labeling service 142 receives a text string defining a label, e.g. from labeling application 110." Examiner notes that the cloud-based circuitry (Cloud Service) receives the labeled dataset (a text string defining candidate text and a text string defining a label))
receiving a labeling task, (Examiner references the previous mapping to note that the labeling service receives a labeling task (text strings))
determining a first distance between the first projection and a second projection in the projected graph, (Sewak Paragraph 0091; "This might be obtained by computing similarity, e.g. cosine similarity between the embedding vectors" Examiner notes that a first distance is determined from the cosine similarity between the first projection and the second projection (embedding vectors) in the projected graph via the equation cosine distance = 1 – cosine similarity)
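The relationship the examiner relies on above can be written out directly. The following is a minimal illustration of cosine distance as one minus cosine similarity between two embedding vectors; it is not code from Sewak, and the function name is illustrative.

```python
import math

def cosine_distance(u, v):
    # cosine distance = 1 - cosine similarity, where cosine similarity
    # is the dot product of the vectors divided by the product of their norms.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)
```

Identical vectors thus have a cosine distance of 0, and orthogonal vectors a cosine distance of 1, which is how a similarity score maps onto the claimed "first distance."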
and cloud-based input/output circuitry configured to: generating for display, on a user interface (Sewak Paragraph 0056; "A labeling application 110 in the operating environment 100 may present a prompt to the user on a display 120." Examiner notes that Paragraph 0056 and Fig 1 show display 120 (a user interface) used for display)
Sewak does not teach sparsely labeled dataset
a recommendation to stop the labeling task.
However, Ahmet does teach sparsely labeled dataset (Ahmet Page 5075 Paragraph 3; "The training set consists of 50k images… Evaluation is performed with 50, 100, 200, and 400 labeled images per classes, corresponding to l = 500, 1k, 2k, and 4k label images in total." Examiner notes that the training dataset is a sparsely labeled dataset)
a recommendation to stop the labeling task. (Ahmet Page 5076 Paragraph 1; "The training is performed for 180 epochs in total." Examiner notes that the total number of epochs is a recommendation to stop the labeling task)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak and Ahmet. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. One of ordinary skill in the art would have been motivated to combine Sewak and Ahmet to perform the method on more sparsely labeled data for a larger benefit: “The proposed approach performs the best out of the pseudo-label based approaches on CIFAR-10. Results in Figure 6 show that our benefit is larger when the number of labels is reduced” (Ahmet Page 5077 Paragraph 4).
Sewak in view of Ahmet does not teach and an artificial intelligence model, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;
processing the semantic graph in the artificial intelligence model,
receiving a first output from the artificial intelligence model,
based on the first output, determining an optimal set of dimensions for the labeling task,
However, Dane does teach and an artificial intelligence model, wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs; (Dane Paragraph 0031; "the machine learning model preferably comprises a language model for processing the input representation and a graph model, for example a graph neural network, for receiving the graph embedding…In particular the classifier layer is preferably trained to output a probability distribution over the set of possible biological entity identifiers based on the combined representation." Examiner notes that storage circuitry is configured to store an artificial intelligence model (machine learning model), wherein the model is trained to output optimal sets of dimensions (the classifier layer is trained to output a probability distribution) for labeling tasks based on inputted semantic graphs (graph embedding). The term “dimensions” is broad; for examination purposes, “dimensions” will be interpreted as “label/identifier”.)
processing the semantic graph in the artificial intelligence model, (Dane Paragraph 0031; "the machine learning model preferably comprises a language model for processing the input representation and a graph model, for example a graph neural network, for receiving the graph embedding… inputting the graph embedding representing the subgraph into a graph neural network;" Examiner notes that the artificial intelligence model (graph neural network) processes the semantic graph (graph embedding))
receiving a first output from the artificial intelligence model, (Dane Paragraph 0031; "performing cross-attention between the language model and the graph neural network to form the combined representation." Examiner notes that first output (combined representation) is received from AI model (machine learning model))
based on the first output, determining an optimal set of dimensions for the labeling task, (Dane Paragraph 0031; "In particular the classifier layer is preferably trained to output a probability distribution over the set of possible biological entity identifiers based on the combined representation." Examiner notes that, based on the first output (combined representation), an optimal set of dimensions for the labeling task is determined (the probability distribution over the set of possible biological entity identifiers; shown in Fig 4, item 410))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Sewak in view of Ahmet in further view of Dane does not teach determining a semantic graph by embedding the sparsely labeled dataset using the language model,
determining a projected graph by projecting the semantic graph into the optimal set of dimensions,
selecting a first projection from the projected graph, wherein the first projection corresponds to a first datapoint in the semantic graph, and wherein the first datapoint corresponds to a first text input from the sparsely labeled dataset,
However, Malden does teach determining a semantic graph by embedding the sparsely labeled dataset using the language model, (Malden Paragraph 0012; "A semantic graph embedding system may generate a continuous semantic space from discrete examples of data." Examiner notes that semantic graph embedding system (language model) determines a semantic graph (semantic space) by embedding the sparsely labeled dataset (from discrete examples of data))
determining a projected graph by projecting the semantic graph into the optimal set of dimensions, (Malden Paragraph 0032; "At operation 304, the semantic graph embedding system 116 constructs a Markov chain. The Markov chain comprises a plurality of nodes. Each node in the plurality of nodes represents a data value in the dataset." Malden Paragraph 0035; "At operation 310, the distance (e.g., path cost) determined in operation 308 is stored as a dimension in a vector… then the vector representation of the target node would be: [A,B,C,D,E]… the semantic graph embedding system 116 reduces the vector (e.g., an n-dimensional vector) to a three-dimensional vector using principal component analysis (PCA)." Examiner notes that the projected graph (three-dimensional vector) is determined by projecting (reducing via principal component analysis) the semantic graph into the optimal set of dimensions (cost vector))
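The PCA reduction Malden describes (an n-dimensional path-cost vector reduced to three dimensions) can be sketched via the singular value decomposition; the path-cost values below are illustrative assumptions, not data from Malden.

```python
import numpy as np

# Hypothetical 5-dimensional path-cost vectors, one row per node
# (illustrative values only -- not taken from Malden)
X = np.array([
    [1.0, 0.9, 0.2, 0.1, 0.0],
    [0.8, 1.0, 0.3, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.9, 0.8],
    [0.0, 0.1, 0.9, 1.0, 0.7],
])

# PCA via SVD: center the data, then project onto the top 3 components
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X3 = Xc @ Vt[:3].T  # projected graph: each node reduced to 3 dimensions
```

Each row of `X3` is the three-dimensional projection of the corresponding node's cost vector.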
selecting a first projection from the projected graph, wherein the first projection corresponds to a first datapoint in the semantic graph, and wherein the first datapoint corresponds to a first text input from the sparsely labeled dataset, (Malden Fig 3 step 306 shows selecting first projection (target node) and corresponds to the first datapoint in the semantic graph; the semantic graph is determined by sparsely labeled dataset meaning the first datapoint in the semantic graph corresponds to first text input from the sparsely labeled dataset)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, and Malden. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, Dane, and Malden to understand how the data within the dataset is used: “The semantic graph embedding system may further understanding the semantics of the dataset.” (Malden).
Sewak in view of Ahmet in further view of Dane in further view of Malden does not teach determining a first likelihood the first projection has a first label,
determining a second likelihood the first projection has a second label based in part on the first distance to the second projection
and comparing the first likelihood to the second likelihood;
However, Li does teach determining a first likelihood the first projection has a first label, (Li Paragraph 0029; "(4.1) using the t-SNE algorithm to perform dimensionality reduction visualization, specifically: step 1: assuming that a data set X has a total of N data points, and the dimension of each data point x.sub.i is D, reducing the dimensions to two dimensions" Li Paragraph 0031; "calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that t-SNE is used to determine the projection of data points and then calculate/determine a first likelihood (conditional probability) the first projection has a first label)
determining a second likelihood the first projection has a second label based in part on the first distance to the second projection (Li Paragraph 0031; "calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that a second likelihood the first projection has a second label is calculated/determined (conditional probability) based in part on the first distance to the second projection (Euclidean distance))
and comparing the first likelihood to the second likelihood; (Li Paragraph 0034; "minimizing a difference in the conditional probabilities, that is, making the conditional probability Q.sub.j|i approximate to P.sub.j|i; it is achieved by minimizing the Kullback-Leibler divergence between the two conditional probability distributions" Examiner notes that minimizing a difference in the conditional probabilities using Kullback-Leibler is comparing the first likelihood to the second likelihood)
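The Kullback-Leibler comparison Li describes, making Q.sub.j|i approximate P.sub.j|i by minimizing the divergence between the two conditional probability distributions, can be sketched as follows; the probability values are illustrative assumptions, not values from Li.

```python
import numpy as np

# Illustrative conditional probability distributions (not taken from Li):
# p models the high-dimensional similarities P_{j|i},
# q models the low-dimensional similarities Q_{j|i}
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.6, 0.25, 0.15])

# Kullback-Leibler divergence KL(P || Q); it is zero only when the
# distributions match, so minimizing it drives Q toward P
kl = float(np.sum(p * np.log(p / q)))
```

A strictly positive `kl` indicates the two distributions still differ; t-SNE's objective minimizes this quantity over all pairs of points.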
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, and Li. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, Dane, Malden, and Li to maintain a local structure between datapoints to aid in decision making: “The main advantage of t-SNE is the ability to maintain a local structure.” (Li Paragraph 0077).
Regarding claim 2, Sewak teaches A method for propagating labels through a sparsely labeled dataset using a supervised projection of a semantic embedding, the method further comprising: receiving a dataset; (Sewak Paragraph 0083; "method 300 begins at step 303 when the labeling service 142 serves a display page to labeling application 110. At step 305, labeling service 142 receives a text string defining candidate text from a document in corpus 154 or from the labeling application 110. At step 310, the labeling service 142 receives a text string defining a label, e.g. from labeling application 110." Examiner notes that the labeling service receives a labeled dataset (a text string defining candidate text and a text string defining a label))
receiving a labeling task; (Examiner references the previous mapping to note that the labeling service receives a labeling task (the text strings constitute a labeling task))
determining a first distance between a first projection and a second projection in the projected graph, wherein the second projection has a second label; (Sewak Paragraph 0091; "This might be obtained by computing similarity, e.g. cosine similarity between the embedding vectors" Examiner notes that a first distance is determined from the cosine similarity between the first projection and the second projection (embedding vectors) in the projected graph via the equation cosine distance = 1 – cosine similarity)
and in response to comparing the first likelihood to the second likelihood, generating for display, on a user interface, (Sewak Paragraph 0056; "A labeling application 110 in the operating environment 100 may present a prompt to the user on a display 120." Examiner notes that in response to the comparing, paragraph 0056 and Fig 1 shows display 120 (user interface) used to display prompt)
Sewak does not teach a first recommendation to stop the labeling task.
However, Ahmet does teach a first recommendation to stop the labeling task. (Ahmet Page 5076 Paragraph 1; "The training is performed for 180 epochs in total." Examiner notes that the total number of epochs is a recommendation to stop the labeling task)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak and Ahmet. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. One of ordinary skill in the art would have been motivated to combine Sewak and Ahmet to perform the method on more sparsely labeled data for a larger benefit: “The proposed approach performs the best out of the pseudo-label based approaches on CIFAR-10. Results in Figure 6 show that our benefit is larger when the number of labels is reduced” (Ahmet Page 5077 Paragraph 4).
Sewak in view of Ahmet does not teach processing the semantic graph in an artificial intelligence model to generate a first output,
wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;
based on the first output, determining an optimal set of dimensions for the labeling task;
However, Dane does teach processing the semantic graph in an artificial intelligence model to generate a first output, (Dane Paragraph 0031; "the machine learning model preferably comprises a language model for processing the input representation and a graph model, for example a graph neural network, for receiving the graph embedding… inputting the graph embedding representing the subgraph into a graph neural network;" Examiner notes that the artificial intelligence model (graph neural network) processes the semantic graph (graph embedding))
wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs; (Dane Paragraph 0031; "the machine learning model preferably comprises a language model for processing the input representation and a graph model, for example a graph neural network, for receiving the graph embedding…In particular the classifier layer is preferably trained to output a probability distribution over the set of possible biological entity identifiers based on the combined representation." Examiner notes that storage circuitry is configured to store an artificial intelligence model (machine learning model), wherein the model is trained to output optimal sets of dimensions (the classifier layer is trained to output a probability distribution) for labeling tasks based on inputted semantic graphs (graph embedding))
based on the first output, determining an optimal set of dimensions for the labeling task; (Dane Paragraph 0031; "In particular the classifier layer is preferably trained to output a probability distribution over the set of possible biological entity identifiers based on the combined representation." Examiner notes that, based on the first output (combined representation), an optimal set of dimensions for the labeling task is determined (the probability distribution over the set of possible biological entity identifiers; shown in Fig 4, item 410))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Sewak in view of Ahmet in further view of Dane does not teach determining a semantic graph by embedding the dataset using a language model;
determining a projected graph by projecting the semantic graph into the optimal set of dimensions;
However, Malden does teach determining a semantic graph by embedding the dataset using a language model; (Malden Paragraph 0012; "A semantic graph embedding system may generate a continuous semantic space from discrete examples of data." Examiner notes that semantic graph embedding system (language model) determines a semantic graph (semantic space) by embedding the sparsely labeled dataset (from discrete examples of data))
determining a projected graph by projecting the semantic graph into the optimal set of dimensions; (Malden Paragraph 0032; "At operation 304, the semantic graph embedding system 116 constructs a Markov chain. The Markov chain comprises a plurality of nodes. Each node in the plurality of nodes represents a data value in the dataset." Malden Paragraph 0035; "At operation 310, the distance (e.g., path cost) determined in operation 308 is stored as a dimension in a vector… then the vector representation of the target node would be: [A,B,C,D,E]… the semantic graph embedding system 116 reduces the vector (e.g., an n-dimensional vector) to a three-dimensional vector using principal component analysis (PCA)." Examiner notes that the projected graph (three-dimensional vector) is determined by projecting (reducing via principal component analysis) the semantic graph into the optimal set of dimensions (vector))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, and Malden. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, Dane, and Malden to understand how the data within the dataset is used: “The semantic graph embedding system may further understanding the semantics of the dataset.” (Malden).
Sewak in view of Ahmet in further view of Dane in further view of Malden does not teach determining a first likelihood the first projection has a first label,
determining a second likelihood the first projection has a second label based in part on the first distance to the second projection
and comparing the first likelihood to the second likelihood;
However, Li does teach determining a first likelihood the first projection has a first label, (Li Paragraph 0029; "(4.1) using the t-SNE algorithm to perform dimensionality reduction visualization, specifically: step 1: assuming that a data set X has a total of N data points, and the dimension of each data point x.sub.i is D, reducing the dimensions to two dimensions" Li Paragraph 0031; "calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that t-SNE is used to determine the projection of data points and then calculate/determine a first likelihood (conditional probability) the first projection has a first label)
determining a second likelihood the first projection has a second label based in part on the first distance to the second projection (Li Paragraph 0031; "calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that a second likelihood the first projection has a second label is calculated/determined (conditional probability) based in part on the first distance to the second projection (Euclidean distance))
and comparing the first likelihood to the second likelihood; (Li Paragraph 0034; "minimizing a difference in the conditional probabilities, that is, making the conditional probability Q.sub.j|i approximate to P.sub.j|i; it is achieved by minimizing the Kullback-Leibler divergence between the two conditional probability distributions" Examiner notes that minimizing a difference in the conditional probabilities using Kullback-Leibler is comparing the first likelihood to the second likelihood)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, and Li. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, Dane, Malden, and Li to maintain a local structure between datapoints to aid in decision making: “The main advantage of t-SNE is the ability to maintain a local structure.” (Li Paragraph 0077).
Regarding claim 3, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches the limitations of claim 2.
Sewak does not teach The method of claim 2, wherein determining a first distance between the first projection and a second projection in the projected graph further comprises: selecting a first projection from the projected graph, wherein the first projection corresponds to a first datapoint in the semantic graph, and wherein the first datapoint corresponds to a first text input from the dataset.
However, Malden does teach The method of claim 2, wherein determining a first distance between the first projection and a second projection in the projected graph further comprises: selecting a first projection from the projected graph, wherein the first projection corresponds to a first datapoint in the semantic graph, and wherein the first datapoint corresponds to a first text input from the dataset. (Malden Fig 3 step 306 shows selecting first projection (target node) and corresponds to the first datapoint in the semantic graph; the semantic graph is determined by sparsely labeled dataset meaning the first datapoint in the semantic graph corresponds to first text input from the sparsely labeled dataset)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, and Malden. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, Dane, and Malden to understand how the data within the dataset is used: “The semantic graph embedding system may further understanding the semantics of the dataset.” (Malden).
Regarding claim 4, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches the limitations of claim 2.
Sewak further teaches The method of claim 2, wherein comparing the first likelihood to the second likelihood further comprises: in response to comparing the first likelihood to the second likelihood, generating for display, on the user interface, (Sewak Paragraph 0056; "A labeling application 110 in the operating environment 100 may present a prompt to the user on a display 120." Examiner notes that in response to the comparing, paragraph 0056 and Fig 1 shows display 120 (user interface) used to display prompt)
Sewak does not teach a second recommendation to continue the labeling task.
However, Ahmet does teach a second recommendation to continue the labeling task. (Ahmet Page 5076 Paragraph 1; "The training is performed for 180 epochs in total." Examiner notes that when fewer than the 180 total epochs have been performed, there is a second recommendation to continue the labeling task)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak and Ahmet. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. One of ordinary skill in the art would have been motivated to combine Sewak and Ahmet to perform the method on more sparsely labeled data for a larger benefit: “The proposed approach performs the best out of the pseudo-label based approaches on CIFAR-10. Results in Figure 6 show that our benefit is larger when the number of labels is reduced” (Ahmet Page 5077 Paragraph 4).
Regarding claim 5, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches the limitations of claim 2.
Sewak does not teach The method of claim 2, wherein determining the second likelihood the first projection has the second label is based in part on a third likelihood the second projection has the second label.
However, Li does teach The method of claim 2, wherein determining the second likelihood the first projection has the second label is based in part on a third likelihood the second projection has the second label. (Li Paragraph 0031; "calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that a second likelihood the first projection has a second label is calculated/determined (conditional probability) based in part on a third likelihood (calculating a conditional probability) the second projection has the second label (similarity between the datapoints). For examination purposes, the examiner will interpret the claim as meaning that the second likelihood and the third likelihood are associated with each other.)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, and Li. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. One of ordinary skill in the art would have been motivated to combine Sewak, Ahmet, Dane, Malden, and Li to maintain a local structure between datapoints to aid in decision making: “The main advantage of t-SNE is the ability to maintain a local structure.” (Li Paragraph 0077).
Regarding claim 8, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches the limitations of claim 2.
Sewak further teaches The method of claim 2, further comprising: determining a first outlier score based on the first distance; (Sewak Paragraph 0106; "Now assume that the search score of the search engine is cosine similarity between the documents in a semantic space yielding associated scores (GR-EX-L1=0.5, GR-EX-AL2=0.3, GR-EX-AL1=0.21, GR-EX-L2=0.05)." Examiner notes that a first outlier score (search score) is determined based on the first distance (cosine similarity))
comparing the first outlier score to a first threshold outlier score; (Sewak Paragraph 0106; "if a search score threshold of 0.08 is used." Examiner notes that first outlier score (search score) is compared to a first threshold outlier score (search score threshold))
and selecting a recommendation from a plurality of recommendations based on comparing the first outlier score to a first threshold outlier score. (Sewak Paragraph 0106; "reconciliation rule 3 would choose anti-label if a search score threshold of 0.08 is used." Examiner notes that based on comparing the first outlier score to a first threshold outlier score, select a recommendation from a plurality of recommendations (choose or not choose anti-label))
Regarding claim 9, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches the limitations of claim 3.
Sewak does not teach The method of claim 3, wherein the dataset comprises a first subset having given labels, and a second subset, wherein the second subset makes up between 90 and 99.99 percent of the dataset.
However, Ahmet does teach The method of claim 3, wherein the dataset comprises a first subset having given labels, and a second subset, wherein the second subset makes up between 90 and 99.99 percent of the dataset. (Ahmet Page 5075 Paragraph 3; "The training set consists of 50k images coming from 10 classes, while the test set consists of 10k images from the same 10 classes. All images have resolution 32 × 32. Evaluation is performed with 50, 100, 200, and 400 labeled images per classes, corresponding to l = 500, 1k, 2k, and 4k label images in total." Examiner notes that the first subset (labeled images) contains 500 labeled images and the second subset (unlabeled images) contains 50000 - 500 = 49500 unlabeled images; the second subset makes up 99 percent of the dataset (49500/50000 = 0.99), which falls within the claimed range of 90 to 99.99 percent)
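The sparsity proportion underlying this mapping can be checked with simple arithmetic; the figures (50,000 total training images, 500 labeled in the l = 500 setting) come from the Ahmet passage quoted above.

```python
total_images = 50_000   # CIFAR-10 training set size quoted from Ahmet
labeled_images = 500    # the l = 500 evaluation setting quoted from Ahmet

# Fraction of the dataset in the second (unlabeled) subset
unlabeled_fraction = (total_images - labeled_images) / total_images
# 49500 / 50000 = 0.99, i.e. 99 percent, within the claimed 90-99.99 percent range
```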
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak and Ahmet. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. One of ordinary skill in the art would have been motivated to combine Sewak and Ahmet to perform the method on more sparsely labeled data for a larger benefit: “The proposed approach performs the best out of the pseudo-label based approaches on CIFAR-10. Results in Figure 6 show that our benefit is larger when the number of labels is reduced” (Ahmet Page 5077 Paragraph 4).
Regarding claim 10, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches the limitations of claim 9.
Sewak teaches and wherein comparing the first likelihood to the second likelihood further comprises assigning the first label to the first projection, assigning the first label to the first datapoint, and assigning the first label to the first text input. (Sewak Paragraph 0188; "outputting an indication that the candidate text corresponds to a label corresponding to the class." Examiner notes that outputting an indication that the candidate text (first text input) corresponds to a label (the label corresponding to the class) constitutes assigning the label; the first text input is associated with the first datapoint and the first projection, so assigning the label to the first text input also assigns it to the first datapoint and the first projection)
Sewak does not teach The method of claim 9, wherein the first likelihood is greater than the second likelihood,
However, Li does teach The method of claim 9, wherein the first likelihood is greater than the second likelihood, (Li Paragraph 0034; "minimizing a difference in the conditional probabilities, that is, making the conditional probability Q.sub.j|i approximate to P.sub.j|i; it is achieved by minimizing the Kullback-Leibler divergence between the two conditional probability distributions" Examiner notes that a non-zero Kullback-Leibler divergence means the conditional probability Q.sub.j|i (first likelihood) differs from the conditional probability P.sub.j|i (second likelihood), which encompasses the case where the first likelihood is greater than the second likelihood)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, and Li. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, and Li to maintain a local structure between datapoints to aid in decision making: “The main advantage of t-SNE is the ability to maintain a local structure.” (Li Paragraph 0077).
Regarding claim 11, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 10.
Sewak further teaches The method of claim 10, wherein the first subset comprises the first text input [having the first label,] (Sewak Paragraph 0083; "At step 305, labeling service 142 receives a text string defining candidate text from a document in corpus 154 or from the labeling application 110. At step 310, the labeling service 142 receives a text string defining a label, e.g. from labeling application 110." Examiner notes that first text input is text string defining candidate text)
Sewak does not teach [The method of claim 10, wherein the first subset comprises the first text] input having the first label,
and wherein the method further comprises: comparing the first label to a corresponding given label from the given labels of the first subset;
and determining an evaluation of the artificial intelligence model in response to comparing the first label to the corresponding given label.
However, Ahmet does teach [The method of claim 10, wherein the first subset comprises the first text] input having the first label, (Ahmet Page 5075 Paragraph 3; "Evaluation is performed with 50, 100, 200, and 400 labeled images per classes" Examiner notes that first subset (labeled images) comprises the first input (image) having the first label (label per class))
and wherein the method further comprises: comparing the first label to a corresponding given label from the given labels of the first subset; (Ahmet Fig 4 and Page 5077 Paragraph 1; "In Figure 4, we report the progress of the pseudo-label accuracy on unlabeled images XU throughout the training." Examiner notes that comparing the first label (ground truth) to a corresponding given label (predicted pseudo label) from the given labels of the first subset is represented as prediction accuracy)
and determining an evaluation of the artificial intelligence model in response to comparing the first label to the corresponding given label. (Ahmet Page 5077 Paragraph 1; "Diffusion predictions are consistently better than network predictions." Examiner notes that evaluation is determined of the artificial intelligence model (diffusion predication outperforms network predictions) in response to comparing (prediction accuracy))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak and Ahmet. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. One of ordinary skill would have been motivated to combine Sewak and Ahmet to perform the method on more sparsely labeled data for a larger benefit: “The proposed approach performs the best out of the pseudo-label based approaches on CIFAR-10. Results in Figure 6 show that our benefit is larger when the number of labels is reduced” (Ahmet Page 5077 Paragraph 4).
Regarding claim 12, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 2.
Sewak does not teach The method of claim 2, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens.
However, Dane does teach The method of claim 2, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens. (Dane Fig 4 and Paragraph 0111; "The word piece tokens 403 and the positional embeddings 407 may simply be summed to form the input representation of the input text sequence 401… The classification layer 409 is trained to output a probability 410 for each of the possible unique biological target identifiers 411" Examiner notes that dataset (401) has a number of unique tokens (word piece tokens), wherein the semantic graph has original dimensions (possible unique biological target identifier), and wherein the original dimensions have a number less than or equal to the number of unique tokens (10 tokens > 3 identifiers))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Regarding claim 13, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 12.
Sewak does not teach The method of claim 12, wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions;
determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph;
comparing the first amount of information to a cut-off score;
and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions.
However, Dane does teach The method of claim 12, wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions; (Dane Paragraph 0029; "for each additional entity-linked biological entity of the text sequence, selecting related entities from the knowledge graph according to a relevance score, for example selecting entities where the relevance score of the relationship is above a threshold. The relevance score may be normalized Pointwise Mutual Information (nPMI), where nPMI provides a measure of the extent to which two entities tend to co-occur in the same paragraphs of the biomedical corpus." Examiner notes that determining a first optimal dimension from original dimensions is selecting related entities from knowledge graph, wherein the first optimal dimension has a first amount of information (relationships/edges to other entities and relevance score), and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph (edges connecting the related entity in the knowledge graph))
determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph; (Examiner references previous mapping to show that the first amount of information (relevance score) is compared to a cut-off score (threshold))
comparing the first amount of information to a cut-off score; (Dane Paragraph 0029; "selecting related entities from the knowledge graph according to a relevance score" Examiner notes that selected related entity is included in the optimal set of dimensions in response to the comparison)
and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions. (Dane Paragraph 0029; "selecting related entities from the knowledge graph according to a relevance score" Examiner notes that selected related entity is included in the optimal set of dimensions in response to the comparison)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Regarding claim 18, Sewak teaches A non-transitory, computer readable medium storing computer instructions which, when executed by one or more computer processors, cause the one or more computer processors to: (Sewak Paragraph 0191; “The technology described herein may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions, such as program components, being executed by a computer or other machine, such as a personal data assistant or other handheld device.”)
receive a dataset; (Sewak Paragraph 0083; "method 300 begins at step 303 when the labeling service 142 serves a display page to labeling application 110. At step 305, labeling service 142 receives a text string defining candidate text from a document in corpus 154 or from the labeling application 110. At step 310, the labeling service 142 receives a text string defining a label, e.g. from labeling application 110." Examiner notes receives labeled dataset (text string defining candidate text and text string defining label))
receive a labeling task; (Examiner references previous mapping to note that the labeling service receives a labeling task (receipt of the text strings initiates a labeling task))
determine a first distance between a first projection and a second projection in the projected graph, wherein the second projection has a second label; (Sewak Paragraph 0091; "This might be obtained by computing similarity, e.g. cosine similarity between the embedding vectors" Examiner notes that a first distance is determined from a cosine similarity between first projection and second projection (embedding vectors) in the projected graph via equation cosine distance = 1 – cosine similarity)
and in response to comparing the first likelihood to the second likelihood, generate for display, on a user interface, (Sewak Paragraph 0056; "A labeling application 110 in the operating environment 100 may present a prompt to the user on a display 120." Examiner notes that in response to the comparing, paragraph 0056 and Fig 1 shows display 120 (user interface) used to display prompt)
Sewak does not teach a first recommendation to stop the labeling task.
However, Ahmet does teach a first recommendation to stop the labeling task. (Ahmet Page 5076 Paragraph 1; "The training is performed for 180 epochs in total." Examiner notes that the total number of epochs is a recommendation to stop the labeling task)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak and Ahmet. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. One of ordinary skill would have been motivated to combine Sewak and Ahmet to perform the method on more sparsely labeled data for a larger benefit: “The proposed approach performs the best out of the pseudo-label based approaches on CIFAR-10. Results in Figure 6 show that our benefit is larger when the number of labels is reduced” (Ahmet Page 5077 Paragraph 4).
Sewak in view of Ahmet does not teach process the semantic graph in an artificial intelligence model to generate a first output,
wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs;
based on the first output, determine an optimal set of dimensions for the labeling task;
However, Dane does teach process the semantic graph in an artificial intelligence model to generate a first output, (Dane Paragraph 0031; "the machine learning model preferably comprises a language model for processing the input representation and a graph model, for example a graph neural network, for receiving the graph embedding… inputting the graph embedding representing the subgraph into a graph neural network;" Examiner notes that artificial intelligence model (graph neural network) is processing the semantic graph (machine learning model))
wherein the artificial intelligence model is trained to output optimal sets of dimensions for labeling tasks based on inputted semantic graphs; (Dane Paragraph 0031; "the machine learning model preferably comprises a language model for processing the input representation and a graph model, for example a graph neural network, for receiving the graph embedding…In particular the classifier layer is preferably trained to output a probability distribution over the set of possible biological entity identifiers based on the combined representation." Examiner notes that storage circuitry is configured to store an artificial intelligence model (machine learning model), wherein the model is trained to output optimal sets of dimensions (classifier layer is trained to output probability distribution) for labeling tasks based on inputted semantic graphs (graph embedding))
based on the first output, determining an optimal set of dimensions for the labeling task; (Dane Paragraph 0031; "In particular the classifier layer is preferably trained to output a probability distribution over the set of possible biological entity identifiers based on the combined representation." Examiner notes that based on the first output (combined representation), determine an optimal set of dimensions for labeling task (probability distribution over the set of possible biological entity identifiers; shown in Fig 4 410))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Sewak in view of Ahmet in further view of Dane does not teach determine a semantic graph by embedding the dataset using a language model;
determine a projected graph by projecting the semantic graph into the optimal set of dimensions;
However, Malden does teach determine a semantic graph by embedding the dataset using a language model; (Malden Paragraph 0012; "A semantic graph embedding system may generate a continuous semantic space from discrete examples of data." Examiner notes that semantic graph embedding system (language model) determines a semantic graph (semantic space) by embedding the sparsely labeled dataset (from discrete examples of data))
determine a projected graph by projecting the semantic graph into the optimal set of dimensions; (Malden Paragraph 0032; "At operation 304, the semantic graph embedding system 116 constructs a Markov chain. The Markov chain comprises a plurality of nodes. Each node in the plurality of nodes represents a data value in the dataset." Malden Paragraph 0035; "At operation 310, the distance (e.g., path cost) determined in operation 308 is stored as a dimension in a vector… then the vector representation of the target node would be: [A,B,C,D,E]… the semantic graph embedding system 116 reduces the vector (e.g., an n-dimensional vector) to a three-dimensional vector using principal component analysis (PCA)." Examiner notes that the projected graph (three-dimensional vector) is determined by projection (reduction by semantic embedding) the semantic graph into the optimal set of dimension (vector))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, and Malden. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, and Malden to understand how the data within the dataset is being used “The semantic graph embedding system may further understanding the semantics of the dataset.” (Malden).
Sewak in view of Ahmet in further view of Dane in further view of Malden does not teach determine a first likelihood the first projection has a first label,
determine a second likelihood the first projection has a second label based in part on the first distance to the second projection
compare the first likelihood to the second likelihood;
However, Li does teach determine a first likelihood the first projection has a first label, (Li Paragraph 0029; "(4.1) using the t-SNE algorithm to perform dimensionality reduction visualization, specifically: step 1: assuming that a data set X has a total of N data points, and the dimension of each data point x.sub.i is D, reducing the dimensions to two dimensions" Li Paragraph 0031; ''calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that t-SNE is used to determine the projection of data points and then calculate/determine a first likelihood (conditional probability) the first projection has a first label)
determine a second likelihood the first projection has a second label based in part on the first distance to the second projection (Li Paragraph 0031; ''calculating a conditional probability of similarity between the data points in a high-dimensional space; converting the high-dimensional Euclidean distance between the data points into the conditional probability representative of similarity" Examiner notes that a second likelihood the first projection has a second label is calculated/determined (conditional probability) based in part on the first distance to the second projection (Euclidean distance))
and compare the first likelihood to the second likelihood; (Li Paragraph 0034; "minimizing a difference in the conditional probabilities, that is, making the conditional probability Q.sub.j|i approximate to P.sub.j|i; it is achieved by minimizing the Kullback-Leibler divergence between the two conditional probability distributions" Examiner notes that minimizing a difference in the conditional probabilities using Kullback-Leibler is comparing the first likelihood to the second likelihood)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, and Li. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, and Li to maintain a local structure between datapoints to aid in decision making: “The main advantage of t-SNE is the ability to maintain a local structure.” (Li Paragraph 0077).
Regarding claim 19, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 18.
Sewak does not teach The non-transitory, computer readable medium of claim 18, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens.
However, Dane does teach The non-transitory, computer readable medium of claim 18, wherein the dataset has a number of unique tokens, wherein the semantic graph has original dimensions, and wherein the original dimensions have a number less than or equal to the number of unique tokens. (Dane Fig 4 and Paragraph 0111; "The word piece tokens 403 and the positional embeddings 407 may simply be summed to form the input representation of the input text sequence 401… The classification layer 409 is trained to output a probability 410 for each of the possible unique biological target identifiers 411" Examiner notes that dataset (401) has a number of unique tokens (word piece tokens), wherein the semantic graph has original dimensions (possible unique biological target identifier), and wherein the original dimensions have a number less than or equal to the number of unique tokens (10 tokens > 3 identifiers))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Regarding claim 20, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 19.
Sewak does not teach The non-transitory, computer readable medium of claim 19, wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions;
determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph;
comparing the first amount of information to a cut-off score; and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions.
However, Dane does teach The non-transitory, computer readable medium of claim 19, wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions; (Dane Paragraph 0029; "for each additional entity-linked biological entity of the text sequence, selecting related entities from the knowledge graph according to a relevance score, for example selecting entities where the relevance score of the relationship is above a threshold. The relevance score may be normalized Pointwise Mutual Information (nPMI), where nPMI provides a measure of the extent to which two entities tend to co-occur in the same paragraphs of the biomedical corpus." Examiner notes that determining a first optimal dimension from original dimensions is selecting related entities from knowledge graph, wherein the first optimal dimension has a first amount of information (relationships/edges to other entities and relevance score), and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph (edges connecting the related entity in the knowledge graph))
determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph; (Examiner references previous mapping to show that the first amount of information (relevance score) is compared to a cut-off score (threshold))
comparing the first amount of information to a cut-off score; (Dane Paragraph 0029; "selecting related entities from the knowledge graph according to a relevance score" Examiner notes that selected related entity is included in the optimal set of dimensions in response to the comparison)
and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions. (Dane Paragraph 0029; "selecting related entities from the knowledge graph according to a relevance score" Examiner notes that selected related entity is included in the optimal set of dimensions in response to the comparison)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Claim(s) 6 is rejected under 35 U.S.C. 103 as being unpatentable over Sewak; US 20220414137 A1 filed on Apr 1, 2022 (hereinafter “Sewak”) in view of Ahmet et al; “Label Propagation for Deep Semi-Supervised learning” published in 2019 (hereinafter “Ahmet”) in further view of Dane et al; US 20250022615 A1 filed on Nov 14, 2022 (hereinafter “Dane”) in further view of Malden; US 20210042471 A1 filed on Jul 27, 2020 (hereinafter “Malden”) in further view of Li et al; US 20220157468 A1 filed on Dec 17, 2021 (hereinafter “Li”) in further view of Zhao et al; “Cyclic label propagation for graph semi-supervised learning” published on Jun 24, 2021 (hereinafter “Zhao”).
Regarding claim 6, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 2.
Sewak further teaches and in response to comparing the first entropy score to the first threshold entropy score, generating for display, on the user interface, (Sewak Paragraph 0056; "A labeling application 110 in the operating environment 100 may present a prompt to the user on a display 120." Examiner notes that in response to the comparing, paragraph 0056 and Fig 1 shows display 120 (user interface) used to display prompt)
Sewak does not teach The method of claim 2, further comprising: determining a composite likelihood from the first likelihood and the second likelihood;
determining a first entropy score from the composite likelihood, wherein the first entropy score indicates randomness of label components in the composite likelihood;
comparing the first entropy score to a first threshold entropy score;
[and in response to comparing the first entropy score to the first threshold entropy score, generating for display, on the user interface,] a third recommendation to review the first label being assigned to the first projection.
However, Zhao does teach The method of claim 2, further comprising: determining a composite likelihood from the first likelihood and the second likelihood; (Zhao Page 710 Paragraph 3 and Equation 6; "where fik denotes the probability of node vi belonging to class k." Examiner notes that determining a composite likelihood from the first likelihood and the second likelihood is performing a summation of fik probabilities)
[Image: Zhao, Equation 6, reproduced as greyscale image media_image1.png in the original]
determining a first entropy score from the composite likelihood, wherein the first entropy score indicates randomness of label components in the composite likelihood; (Zhao Page 710 Paragraph 3; "The regularizer is composed of a Shannon entropy function H (·)" Examiner notes that a first entropy score (output of Shannon entropy function) is determined from the composite likelihood (summation of probabilities as referenced in previous mapping), wherein the first entropy score indicates randomness of label components in the composite likelihood (Shannon entropy function is a measure of randomness))
comparing the first entropy score to a first threshold entropy score; (Zhao Page 710 Paragraph 3; "If the Shannon entropy of fi is smaller than the threshold, we set ϕi as 1 to indicate that node vi can be utilized as a label context." Examiner notes that comparing the first entropy score (Shannon entropy fi) to a first threshold entropy score (threshold))
[and in response to comparing the first entropy score to the first threshold entropy score, generating for display, on the user interface,] a third recommendation to review the first label being assigned to the first projection. (Zhao Page 710 Paragraph 3; "The binary value of ϕi indicates whether node vi’s learned label is reliable or not and λ acts as a threshold to distinguish the informative labels from the uninformative labels." Examiner notes that binary value determined from Shannon entropy is a third recommendation to review the first label being assigned to the first projection (because it is not reliable))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, Li, and Zhao. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. Zhao teaches using the Shannon entropy function in a label learning task. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, Li, and Zhao to leverage Shannon entropy to select some highly reliable node labels in each training iteration: “If the Shannon entropy of fi is smaller than the threshold, we set ϕi as 1 to indicate that node vi can be utilized as a label context. As the training process goes on, λ is gradually increased such that more learned highly reliable labels can be included in graph embedding procedure to adaptively update node embeddings.” (Zhao Page 710 Paragraph 3).
Claim(s) 7 is rejected under 35 U.S.C. 103 as being unpatentable over Sewak; US 20220414137 A1 filed on Apr 1, 2022 (hereinafter “Sewak”) in view of Ahmet et al; “Label Propagation for Deep Semi-Supervised learning” published in 2019 (hereinafter “Ahmet”) in further view of Dane et al; US 20250022615 A1 filed on Nov 14, 2022 (hereinafter “Dane”) in further view of Malden; US 20210042471 A1 filed on Jul 27, 2020 (hereinafter “Malden”) in further view of Li et al; US 20220157468 A1 filed on Dec 17, 2021 (hereinafter “Li”) in further view of Jason; “Semi-Supervised Learning With Label Propagation” available on Dec 06, 2022 (hereinafter “Jason”).
Regarding claim 7, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 2.
Sewak further teaches The method of claim 2, further comprising: determining a first predicted label [for the first projection]; (Sewak Paragraph 0188; "outputting an indication that the candidate text corresponds to a label corresponding to the class." Examiner notes that a first predicted label is determined for the candidate text)
in response to comparing the first consistency score to the first threshold consistency score, filtering [the first projection] to a first group; (Sewak Paragraph 0188; "outputting an indication that the candidate text corresponds to a label corresponding to the class." Examiner notes that a first predicted label is determined for the candidate text; assigning the label to the candidate text filters/groups the text into the group corresponding to that label)
and generating for display, on the user interface (Sewak Paragraph 0056; "A labeling application 110 in the operating environment 100 may present a prompt to the user on a display 120." Examiner notes that in response to the comparing, paragraph 0056 and Fig 1 shows display 120 (user interface) used to display prompt)
Sewak does not teach [The method of claim 2, further comprising: determining a first predicted label for] the first projection;
[in response to comparing the first consistency score to the first threshold consistency score, filtering] the first projection [to a first group];
However, Malden does teach [The method of claim 2, further comprising: determining a first predicted label for] the first projection; (Malden Paragraph 0012; "A semantic graph embedding system may generate a continuous semantic space from discrete examples of data." Examiner notes that the semantic graph embedding system (language model) determines a semantic graph (semantic space) by embedding the sparsely labeled dataset (from discrete examples of data), which contains the first projection of the candidate text)
[in response to comparing the first consistency score to the first threshold consistency score, filtering] the first projection [to a first group]; (Malden Paragraph 0012; "A semantic graph embedding system may generate a continuous semantic space from discrete examples of data." Examiner notes that the semantic graph embedding system (language model) determines a semantic graph (semantic space) by embedding the sparsely labeled dataset (from discrete examples of data), which contains the first projection of the candidate text)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, and Malden. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, and Malden to understand how the data within the dataset is being used “The semantic graph embedding system may further understanding the semantics of the dataset.” (Malden).
Sewak in view of Malden does not teach determining a first consistency score for the first projection based on a comparison of the first predicted label and the first label, wherein the first consistency score indicates a degree of consensus between the first predicted label and the first label;
comparing the first consistency score to a first threshold consistency score;
[and generating for display, on the user interface,] a fourth recommendation to use the first group as a training sample for a supervised learning task.
However, Jason does teach determining a first consistency score for the first projection based on a comparison of the first predicted label and the first label, wherein the first consistency score indicates a degree of consensus between the first predicted label and the first label; (Jason Section "Label Propagation for Semi-Supervised Learning" Paragraph 13; "model = LabelPropagation() # fit model on training dataset model.fit(X_train_mixed, y_train_mixed) # make predictions on hold out test set yhat = model.predict(X_test) # calculate score for test set score = accuracy_score(y_test, yhat)" Examiner notes that a first consistency score (accuracy score) is determined for the first projection based on comparison of the first predicted label (yhat) and the first label (y_test), wherein the first consistency score indicates a degree of consensus between the first predicted label and the first label (as shown from accuracy score))
comparing the first consistency score to a first threshold consistency score; (Jason Section "Label Propagation for Semi-Supervised Learning" Paragraph 16; "we can see that the label propagation model achieves a classification accuracy of about 85.6 percent, which is slightly higher than a logistic regression fit only on the labeled training dataset that achieved an accuracy of about 84.8 percent." Examiner notes that the first consistency score (85.6 percent) is compared to a first threshold consistency score (84.8 percent))
[and generating for display, on the user interface,] a fourth recommendation to use the first group as a training sample for a supervised learning task. (Jason Section "Label Propagation for Semi-Supervised Learning" Paragraph 16; "So far, so good. Another approach we can use with the semi-supervised model is to take the estimated labels for the training dataset and fit a supervised learning model." Examiner notes that taking the estimated labels for the training dataset and fitting a supervised learning model corresponds to using the first group as a training sample for a supervised learning task)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, Li, and Jason. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. Jason teaches semi-supervised learning with label propagation. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, Li, and Jason to use the approach to improve the classification accuracy of the model: “In this case, we can see that this hierarchical approach of the semi-supervised model followed by supervised model achieves a classification accuracy of about 86.2 percent on the holdout dataset, even better than the semi-supervised learning used alone that achieved an accuracy of about 85.6 percent.” (Jason Section “Label Propagation for Semi-Supervised Learning” Paragraph 25).
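The consistency-score limitation mapped to Jason above amounts to an accuracy-style agreement check between predicted labels and previously assigned labels, followed by a threshold comparison. A hedged sketch of that logic; the label values, threshold, and helper name are hypothetical rather than taken from any cited reference:

```python
def consistency_score(predicted, assigned):
    """Degree of consensus: fraction of projections whose predicted
    label matches the label already assigned to them."""
    matches = sum(p == a for p, a in zip(predicted, assigned))
    return matches / len(assigned)

# Hypothetical predicted labels vs. previously assigned labels.
score = consistency_score([1, 0, 1, 1, 0], [1, 0, 1, 0, 0])
threshold = 0.75

# If the score clears the threshold, the group could be recommended
# as a training sample for a downstream supervised learning task.
recommend_as_training_sample = score >= threshold
```

This mirrors Jason's workflow of scoring propagated labels and, when they prove reliable, reusing them to fit a supervised model.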
Claim(s) 14 is rejected under 35 U.S.C. 103 as being unpatentable over Sewak; US 20220414137 A1 filed on Apr 1, 2022 (hereinafter “Sewak”) in view of Ahmet et al; “Label Propagation for Deep Semi-Supervised learning” published in 2019 (hereinafter “Ahmet”) in further view of Dane et al; US 20250022615 A1 filed on Nov 14, 2022 (hereinafter “Dane”) in further view of Malden; US 20210042471 A1 filed on Jul 27, 2020 (hereinafter “Malden”) in further view of Li et al; US 20220157468 A1 filed on Dec 17, 2021 (hereinafter “Li”) in further view of Sebastian; “Finding Correlation Between Many Variables (Multidimensional Dataset) with Python” available on Jan 21, 2021 (hereinafter “Sebastian”).
Regarding claim 14, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 13.
Sewak does not teach The method of claim 13, wherein determining the first optimal dimension from the original dimensions comprises: determining an original vector for each original dimension of the original dimensions;
determining a first optimal vector based on the correlation;
and determining the first optimal dimension from the first optimal vector.
However, Dane does teach The method of claim 13, wherein determining the first optimal dimension from the original dimensions comprises: determining an original vector for each original dimension of the original dimensions; (Dane Paragraph 0058; "comprising one or more sentences with one or more masked biological entities, as a vector representation;" Examiner notes that an original vector (vector representation) for each original dimension (entities) of the original dimensions is determined)
determining a first optimal vector [based on the correlation]; (Dane Paragraph 0013; "Preferably a representation comprises a feature vector, i.e. a vector encoding important distinguishing attributes of the input data." Examiner notes that a first optimal vector will be determined based on input)
and determining the first optimal dimension from the first optimal vector. (Dane Paragraph 0113; "The input representation is passed to the trained transformer encoder stack 408… unique identifier “Q936106” associated with biological targets SLC5A2, is identified as the highest probability biological target to fill the masked biological target in the input sequence 401." Examiner notes that the first optimal dimension (biological target) is determined from first optimal vector (input representation is a vector/embedding))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Sewak in view of Dane does not teach determining a correlation between the original vector and all other original vectors, wherein the correlation comprises a measure of shared information;
[determining a first optimal] vector based on the correlation;
However, Sebastian does teach determining a correlation between the original vector and all other original vectors, wherein the correlation comprises a measure of shared information; (Sebastian Paragraph 1; "Correlation is any of a broad class of statistical relationships involving dependence." Sebastian Paragraph 5; "The “corr()” method evaluates the correlation between all the features, then it can be graphed with a color coding:" Examiner notes that the corr() method determines a correlation between the original vector and all other original vectors (correlation between all the features), wherein the correlation comprises a measure of shared information (correlation showing relational dependence indicates a measure of shared information))
[determining a first optimal] vector based on the correlation; (Examiner notes that the correlation data is used as input to determine a first optimal vector)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, Li, and Sebastian. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. Sebastian teaches an algorithm to find the most highly correlated pair of variables. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, Li, and Sebastian to find which variables in a dataset are most highly correlated to aid in machine learning tasks: “For several tasks in machine learning it is useful to know which two (or few) variables in a dataset are most highly correlated.” (Sebastian Page 1 Paragraph 1).
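The Sebastian mapping above reduces to scanning a pairwise correlation matrix (what pandas' corr() produces) for the strongest relationship between feature columns. A self-contained sketch without pandas; the sample columns and function names are invented for illustration:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def most_correlated_pair(columns):
    """Return the pair of column names with the largest |r|, mirroring
    what scanning a corr() matrix for its strongest entry would surface."""
    names = list(columns)
    best, best_r = None, 0.0
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) > abs(best_r):
                best, best_r = (names[i], names[j]), r
    return best, best_r

data = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],   # perfectly correlated with "a"
    "c": [4, 1, 3, 2],
}
pair, r = most_correlated_pair(data)
```

With pandas available, `df.corr()` computes the same matrix in one call; the scan above just makes the pair-selection step explicit.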
Claim(s) 15 and 16 are rejected under 35 U.S.C. 103 as being unpatentable over Sewak; US 20220414137 A1 filed on Apr 1, 2022 (hereinafter “Sewak”) in view of Ahmet et al; “Label Propagation for Deep Semi-Supervised learning” published in 2019 (hereinafter “Ahmet”) in further view of Dane et al; US 20250022615 A1 filed on Nov 14, 2022 (hereinafter “Dane”) in further view of Malden; US 20210042471 A1 filed on Jul 27, 2020 (hereinafter “Malden”) in further view of Li et al; US 20220157468 A1 filed on Dec 17, 2021 (hereinafter “Li”) in further view of Atsushi; US 20180060448 A1 filed on Mar 27, 2015 (hereinafter “Atsushi”).
Regarding claim 15, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 13.
Sewak does not teach The method of claim 13, further comprising: determining a second optimal dimension from the original dimensions, wherein the second optimal dimension has a second amount of information, and wherein the second amount of information is less than the first amount of information;
and in response to comparing the running total amount of information to the cut-off score, discarding the second optimal dimension.
However, Dane does teach The method of claim 13, further comprising: determining a second optimal dimension from the original dimensions, wherein the second optimal dimension has a second amount of information, and wherein the second amount of information is less than the first amount of information; (Dane Paragraph 0029; "for each additional entity-linked biological entity of the text sequence, selecting related entities from the knowledge graph according to a relevance score, for example selecting entities where the relevance score of the relationship is above a threshold. The relevance score may be normalized Pointwise Mutual Information (nPMI), where nPMI provides a measure of the extent to which two entities tend to co-occur in the same paragraphs of the biomedical corpus." Examiner notes that determining a second optimal dimension from original dimensions is selecting related entities from knowledge graph, wherein the second optimal dimension has a second amount of information (relationships/edges to other entities and relevance score), and wherein the second amount of information is less than the first amount of information (lower relevance score means less relationships/edges))
and in response to comparing the running total amount of information to the cut-off score, discarding the second optimal dimension. (Dane Fig 12 and Paragraph 0166; "FIG. 12 is a schematic diagram showing an example of the pre-training stage. The pre-training stage is followed optionally by fine-tuning and evaluation stage. The pre-trained model weights are carried forth during the optional fine-tuning and evaluation for predicting the unique identifier of a masked entity in the sentences of an entity-linked corpus." Examiner notes that in the fine-tuning phase, the second optimal dimension is discarded/not considered (the Stephen Hawking entity is set to 0.00))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Sewak in view of Dane does not teach combining the first amount of information and the second amount of information into a running total amount of information;
However, Atsushi does teach combining the first amount of information and the second amount of information into a running total amount of information; (Atsushi Paragraph 0196; "If an edge is added to the edge list 1210, the spanning tree creating section 116 counts the number of edges included in the edge list by incrementing the counter (S2707)." Examiner notes that combining the first amount of information and the second amount of information (adding edges to the edge list) into a running total amount of information (number of edges))
comparing the running total amount of information to the cut-off score; (Atsushi Paragraph 0089; "the control factor calculation section 11 calculates a threshold value for adjusting the number of edges included in the graph data" Examiner notes that running total amount of information (number of edges) is compared to the cut-off score (threshold value))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, Li, and Atsushi. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. Atsushi teaches a method for creating graph data for analysis. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, Li, and Atsushi to reduce the amount of data and perform fast graph processing: “it is possible to reduce the data amount and perform a fast graph processing such as a correlation analysis or a principal component analysis while maintaining necessary accuracy.” (Atsushi Paragraph 0012).
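The claim-15 limitations mapped above describe accumulating per-dimension information into a running total and discarding lower-information dimensions once a cut-off is met. A minimal illustration under assumed inputs; the information values and cut-off fraction are hypothetical and do not come from the cited references:

```python
def select_dimensions(info_amounts, cutoff_fraction):
    """Keep dimensions in descending-information order, accumulating a
    running total, and stop once the cut-off share is reached; any
    remaining (lower-information) dimensions are discarded."""
    total = sum(info_amounts)
    kept, running = [], 0.0
    for idx, amount in sorted(enumerate(info_amounts),
                              key=lambda t: t[1], reverse=True):
        if running >= cutoff_fraction * total:
            break  # running total already meets the cut-off score
        kept.append(idx)
        running += amount
    return kept

# Hypothetical per-dimension information: the first optimal dimension
# (index 1) alone clears a 0.6 cut-off, so the second is discarded.
kept = select_dimensions([0.2, 0.7, 0.1], cutoff_fraction=0.6)
```

Here the second optimal dimension carries less information than the first and is dropped once the running total satisfies the cut-off, matching the claimed ordering.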
Regarding claim 16, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 13.
Sewak does not teach The method of claim 13, further comprising: receiving a first user input;
However, Dane does teach The method of claim 13, further comprising: receiving a first user input; (Dane Paragraph 0136; " the system includes a user device 901 providing a user interface for a user to input a query defining a biological context for which a biological target is sought." Examiner notes that system receives a first user input from user interface)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, and Dane. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. One of ordinary skill would have been motivated to combine Sewak, Ahmet, and Dane to leverage the advantages of a knowledge graph to enhance the ability of the model: “This method therefore benefits from the advantages associated with knowledge graph inference and language models to further enhance the ability of the model to determine biological entities of interest for a given user-specified biological context.” (Dane Paragraph 0026).
Sewak in view of Dane does not teach and determining the cut-off score based on the first user input.
However, Atsushi does teach and determining the cut-off score based on the first user input. (Atsushi Paragraph 0089; "the control factor calculation section 11 calculates a threshold value for adjusting the number of edges" Examiner notes that the control factor calculation determines/calculates a cut-off score (threshold value) based on the first user input)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, Li, and Atsushi. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. Atsushi teaches a method for creating graph data for analysis. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, Li, and Atsushi to reduce the amount of data and perform fast graph processing: “it is possible to reduce the data amount and perform a fast graph processing such as a correlation analysis or a principal component analysis while maintaining necessary accuracy.” (Atsushi Paragraph 0012).
Claim(s) 17 is rejected under 35 U.S.C. 103 as being unpatentable over Sewak; US 20220414137 A1 filed on Apr 1, 2022 (hereinafter “Sewak”) in view of Ahmet et al; “Label Propagation for Deep Semi-Supervised learning” published in 2019 (hereinafter “Ahmet”) in further view of Dane et al; US 20250022615 A1 filed on Nov 14, 2022 (hereinafter “Dane”) in further view of Malden; US 20210042471 A1 filed on Jul 27, 2020 (hereinafter “Malden”) in further view of Li et al; US 20220157468 A1 filed on Dec 17, 2021 (hereinafter “Li”) in further view of Nitzan et al; US 20170204455 A1 filed on Jul 17, 2015 (hereinafter “Nitzan”).
Regarding claim 17, Sewak in view of Ahmet in further view of Dane in further view of Malden in further view of Li teaches claim 13.
Sewak does not teach The method of claim 13, wherein the cut-off score is between ninety-five and ninety-nine percent of the amount of information in the semantic graph.
However, Nitzan does teach The method of claim 13, wherein the cut-off score is between ninety-five and ninety-nine percent of the amount of information in the semantic graph. (Nitzan Paragraph 0040; "the cumulative distribution function (CDF) value of that genetic variant reaches a predefined threshold value (CDF_thresh) of 0.99, 0.995, 0.999, 0.9999, 0.99999 or greater." Examiner notes that the cut-off score (threshold value) is between ninety-five and ninety-nine percent of the amount of information in the semantic graph (0.99))
It would have been obvious to one of ordinary skill in the art before the effective filing date of the present application to combine Sewak, Ahmet, Dane, Malden, Li, and Nitzan. Sewak teaches a system for automatic labeling of text data. Ahmet teaches a method for label propagation for deep semi-supervised learning. Dane teaches using knowledge graphs to predict new biological targets. Malden teaches a method of embedding graphs into a semantic multidimensional space by receiving a dataset. Li teaches a method of visualizing data. Nitzan teaches using a probability distribution model to determine a frequency threshold. One of ordinary skill would have been motivated to combine Sewak, Ahmet, Dane, Malden, Li, and Nitzan to perform identification of entities with improved statistical confidence: “The method also allows for the identification of the presence of genetic mutations at low frequencies with improved statistical confidence.” (Nitzan Paragraph 0032).
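Claim 17's cut-off of ninety-five to ninety-nine percent of the total information behaves like a cumulative-share criterion: keep the fewest leading dimensions whose combined share of the total first reaches the cut-off. A sketch with invented information values; the data and function name are assumptions, not drawn from the cited references:

```python
def dimensions_for_cutoff(info_amounts, cutoff=0.95):
    """Count the leading (largest) dimensions needed for the cumulative
    share of total information to first reach the cut-off fraction."""
    total = sum(info_amounts)
    running = 0.0
    for count, amount in enumerate(sorted(info_amounts, reverse=True), start=1):
        running += amount
        if running / total >= cutoff:
            return count
    return len(info_amounts)

# Hypothetical per-dimension information amounts summing to 10.0:
# cumulative shares are 0.50, 0.80, 0.96, 0.995, 1.0.
n95 = dimensions_for_cutoff([5.0, 3.0, 1.6, 0.35, 0.05], cutoff=0.95)
n99 = dimensions_for_cutoff([5.0, 3.0, 1.6, 0.35, 0.05], cutoff=0.99)
```

Raising the cut-off from 95% toward 99% retains more dimensions, which is the trade-off the claimed range captures.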
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to DANIEL DUC TRAN whose telephone number is (571)272-6870. The examiner can normally be reached Mon-Fri 8:00-5:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo can be reached at (571) 270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/D.D.T./Examiner, Art Unit 2147
/ERIC NILSSON/Primary Examiner, Art Unit 2151