Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
This Office Action is sent in response to Applicant's communication received on 01/23/2023 for application number 18/158060. The Office hereby acknowledges receipt of the following items, which have been placed of record in the file: Specification, Drawings, Abstract, Oath/Declaration, and Claims.
Claims 1-20 are presented for examination.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 07/23/2025 was filed prior to the mailing of this Office Action. The submission is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement has been considered by the examiner.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-2, 4-13, 15, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Huang Zhen, Foreign Application Publication CN 113806494 B (hereinafter Zhen), in view of Kozlowski C et al., Foreign Application Publication CN 116157834 A (hereinafter Kozlowski), further in view of Munro et al., US Patent Application Publication US 20190361966 A1 (hereinafter Munro), and further in view of Meng Xiaojun et al., WO 2023040742 A1 (hereinafter Meng).
Regarding claim 1, Zhen teaches A system for improving label quality in datasets using a quality filter based on a consistency…, the system comprising (Claims 2-3 text; page 3, paragraph 6, wherein Zhen describes improving label prediction by processing input data, generating a prediction result, and checking the consistency of the result with the data's true labels, wherein the process includes filtering falsely labeled data to obtain a high-quality data set and prevent prediction errors).
Zhen incorporates a consistency check, but it does not teach a consistency score.
However, in the analogous art of label quality assurance using consistency scores, Kozlowski teaches using consistency scores (page 24, paragraph 6; page 25, paragraphs 1-2, wherein Kozlowski incorporates label prediction based on comparing labels against a threshold and determining consistency scores between labels); compare the first consistency score to a first threshold consistency score, and in response to comparing the first consistency score to the first threshold consistency score, filter the first text string to a first group (page 24, paragraph 6; page 25, paragraph 1, wherein Kozlowski uses comparison data to determine the accuracy of the consistency score by incorporating a machine learning model that determines label prediction accuracy, using a threshold and a scoring function to compare labels and analyze the label prediction and classification).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Kozlowski with Zhen by incorporating Kozlowski's method of using consistency scores, comparing the first consistency score to a first threshold consistency score and, in response to that comparison, filtering the first text string to a first group, into Zhen's method of improving label quality in datasets using a quality filter based on a consistency, for the purpose of incorporating a scoring function (e.g., quantifying the degree of deviation between the machine learning model's predicted label and the true label) (Kozlowski: page 24, paragraph 6).
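For clarity of the record only, the claimed filtering step (comparing a consistency score to a threshold and routing a text string to a group) may be illustrated by the following non-limiting sketch; the function names, the scoring rule, and the threshold value are hypothetical and do not appear in the cited references:

```python
# Illustrative sketch (hypothetical): filter text strings into groups by
# comparing a consistency score against a threshold consistency score.

def filter_by_consistency(text, predicted_label, given_label, threshold=0.8):
    """Return the group a text string is filtered into."""
    # A simple consistency score: 1.0 on exact agreement, else 0.0.
    score = 1.0 if predicted_label == given_label else 0.0
    # Strings meeting the threshold go to the first (high-quality) group;
    # the rest are routed to a second group for label review.
    return "group_1" if score >= threshold else "group_2"

print(filter_by_consistency("great product", "positive", "positive"))  # group_1
print(filter_by_consistency("great product", "positive", "negative"))  # group_2
```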
Zhen does not teach cloud-based storage circuitry configured to store; wherein the first consistency score indicates a measure of agreement between predicted labels; and cloud-based input/output circuitry configured to: generate for display, on a user interface, a recommendation to use the first group as a training sample for a supervised learning task.
However, in the analogous art of label quality assurance using consistency scores, Munro teaches cloud-based storage circuitry configured to store ([0141], [0146], [0150], wherein Munro describes cloud-based storage systems with circuitry for label predictions); wherein the first consistency score indicates a measure of agreement between predicted labels ([0104-0107], wherein Munro incorporates an agreement score between predicted labels); and cloud-based input/output circuitry configured to: generate for display, on a user interface, a recommendation to use the first group as a training sample for a supervised learning task (FIGS. 5-6, 11, [0017], [0068], [0071-0072], [0103], [0131], wherein Munro displays suggestions on an interface for different groups as illustrated in FIGS. 5-6 and 11, wherein the interface is an agreement interface that includes a learning curve pane, which displays a graphical representation of the number of annotations received for a particular label or task and the agreement among those annotations regarding the accuracy of that label or task).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Munro with Zhen by incorporating Munro's cloud-based storage circuitry configured to store; wherein the first consistency score indicates a measure of agreement between predicted labels; and cloud-based input/output circuitry configured to generate for display, on a user interface, a recommendation to use the first group as a training sample for a supervised learning task, into Zhen's method of improving label quality in datasets using a quality filter based on a consistency, for the purpose of aggregating an annotation agreement score representing the overall accuracy of the ontology as determined by the annotation agreements across all labels or tasks of the ontology (Munro: [0017]).
Zhen does not teach an artificial intelligence model, wherein the artificial intelligence model is trained to output predicted labels given inputted text strings; receive a first given label for a first text string, process the first text string in the artificial intelligence model.
However, in the analogous art of label quality assurance using consistency scores, Meng teaches wherein the artificial intelligence model is trained to output predicted labels given inputted text strings (Abstract; page 2, paragraph 8; page 3, paragraph 1; page 5, paragraph 3; page 7, paragraphs 4-6, wherein Meng incorporates artificial intelligence for processing input text and outputting the predicted labels); receive a first given label for a first text string, and process the first text string in the artificial intelligence model; determine a first predicted label for the first text string (Abstract; page 5, paragraph 3; page 6, paragraph 5; page 7, paragraphs 4-6, wherein Meng processes each text and receives a first label for the first text); determine a first consistency score for the first text string based on a comparison of the first predicted label and the first given label (Abstract; page 3, paragraph 6; page 4, paragraph 1; page 5, paragraph 3; page 6, paragraph 5; page 7, paragraphs 4-6, wherein Meng matches text with labels and generates a matching degree between the text and the labels).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Meng with Zhen by incorporating Meng's method of wherein the artificial intelligence model is trained to output predicted labels given inputted text strings; receive a first given label for a first text string, process the first text string in the artificial intelligence model; determine a first predicted label for the first text string; determine a first consistency score for the first text string based on a comparison of the first predicted label and the first given label, into Zhen's method of improving label quality in datasets using a quality filter based on a consistency, for the purpose of incorporating semantics at the level of text and characters to improve the accuracy of label prediction (Meng: page 5, paragraph 3).
Regarding claim 2, the claim is similar in scope to claim 1; therefore, the claim is rejected under a similar rationale.
Regarding claim 4, Zhen as modified by Kozlowski, Munro and Meng teaches wherein receiving a first given label for a first text string further comprises: processing the first text string in a first machine learning model trained to predict labels for text string inputs, wherein the first machine learning model has been trained separately from the artificial intelligence model; and determining a first given label for the first text string ([0050], [0076], wherein Munro incorporates artificial intelligence to analyze an entire corpus of documents, rapidly identify general trends and commentary across all documents, and recognize keywords, syntax, and relations within documents in a much more timely fashion; by categorizing a collection of documents into topics, with a series of descriptive labels and tasks describing each document, a natural language modeling engine can build an ontology of documents demonstrative of overall sentiments and underlying trends across all documents), (Abstract; page 2, paragraph 3; page 5, paragraph 3; page 6, paragraph 5; page 7, paragraphs 4-6; page 8, paragraph 2, wherein Meng discloses processing the first text via machine learning and predicting the first label).
Regarding claim 5, Zhen as modified by Kozlowski, Munro and Meng teaches wherein receiving a first given label for a first text string further comprises: receiving a first user input; and determining a first given label for the first text string based on the first user input (FIGS. 5-6, 11, [0011-0012], [0017-0018], [0068], [0071-0076], [0103], [0131], wherein Munro displays suggestions on an interface for different groups as illustrated in FIGS. 5-6 and 11, wherein the interface is an agreement interface that includes a learning curve pane, which displays a graphical representation of the number of annotations received for a particular label or task and the agreement among those annotations regarding the accuracy of that label or task, and wherein Munro provides tools via an interface to allow users to provide input and determine the label of a text).
Regarding claim 6, Zhen as modified by Kozlowski, Munro and Meng teaches wherein comparing the first consistency score to the first threshold consistency score in a quality filter further comprises: in response to comparing the first consistency score to the first threshold consistency score, filtering the first text string to a second group; and generating for display, on a user interface, a recommendation to assign a second given label to the first text string (page 24, paragraph 6; page 25, paragraph 1, wherein Kozlowski uses comparison data to determine the accuracy of the consistency score by incorporating a machine learning model that determines label prediction accuracy, using a threshold and a scoring function to compare labels and analyze the label prediction and classification), (Claims 2-3 text; page 3, paragraph 6, wherein Zhen describes improving label prediction by processing input data, generating a prediction result, and checking its consistency with the data's true labels, wherein the process includes filtering falsely labeled data to obtain a high-quality data set and prevent prediction errors), (FIGS. 5-6, 11, [0017], [0068], [0071-0072], [0103], [0131], wherein Munro displays suggestions on an interface for different groups as illustrated in FIGS. 5-6 and 11, wherein the interface is an agreement interface that includes a learning curve pane, which displays a graphical representation of the number of annotations received for a particular label or task and the agreement among those annotations regarding the accuracy of that label or task).
Regarding claim 7, Zhen as modified by Kozlowski, Munro and Meng teaches receiving a second given label for the first text string; and determining a second consistency score for the first text string based on a comparison of the first predicted label and the second given label (page 24, paragraph 6; page 25, paragraph 1, wherein Kozlowski describes a scoring function for comparing consistency between a true label and the predicted labels).
Regarding claim 8, Zhen as modified by Kozlowski, Munro and Meng teaches wherein the second consistency score for the first text string is based on a comparison of the first predicted label, the first given label, and the second given label (page 24, paragraph 6; page 25, paragraph 1, wherein Kozlowski describes a scoring function for comparing consistency between a true label and the predicted labels, wherein the true label is the first label and the predicted label is the second label).
Regarding claim 9, Zhen as modified by Kozlowski, Munro and Meng teaches wherein the first predicted label comprises a first probability distribution having a first set of possible labels with first likelihoods, and wherein the first given label comprises a second probability distribution having a second set of possible labels with second likelihoods ([0014], [0082-0084], [0089-0090], [0103-0104], wherein, based on a natural language modeling engine's prediction of a label or task of a document, Munro populates the given labels and provides means for the user to select among the possible labels).
Regarding claim 10, Zhen as modified by Kozlowski, Munro and Meng teaches wherein determining the first consistency score further comprises: determining a first measure of center and a first variance metric from the first probability distribution; determining a second measure of center and a second variance metric from the second probability distribution; and determining the degree of consensus based on a comparison of the first measure of center with the second measure of center, combined with a comparison of the first variance metric with the second variance metric ([0104-0105], [0126], [0130-0131], wherein Munro describes computing learning-curve metrics and provides a labels feedback pane populated with each label of an ontology, wherein agreement scores are compared and an interface is created for side-by-side comparison of agreements, e.g., whether labels should be distinguished from one another or combined), (Abstract; page 3, paragraph 6; page 4, paragraph 1; page 5, paragraph 3; page 6, paragraph 5; page 7, paragraphs 4-6, wherein Meng matches text with labels and generates a matching degree between the text and the labels).
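For clarity of the record only, the claim 10 step of comparing measures of center and variance metrics of two label probability distributions may be illustrated by the following non-limiting sketch; the functions, the index-based encoding of labels, and the consensus formula are hypothetical and do not appear in the cited references:

```python
# Illustrative sketch (hypothetical): degree of consensus from comparing
# measures of center and variance of two label probability distributions.

def center_and_variance(dist):
    """Treat each label's sorted index as its value; dist maps label -> probability."""
    labels = sorted(dist)
    center = sum(i * dist[l] for i, l in enumerate(labels))      # mean index
    variance = sum(dist[l] * (i - center) ** 2 for i, l in enumerate(labels))
    return center, variance

def degree_of_consensus(dist_a, dist_b):
    c1, v1 = center_and_variance(dist_a)
    c2, v2 = center_and_variance(dist_b)
    # Consensus is higher when both the centers and the variances agree.
    return 1.0 / (1.0 + abs(c1 - c2) + abs(v1 - v2))

same = degree_of_consensus({"neg": 0.2, "pos": 0.8}, {"neg": 0.2, "pos": 0.8})
diff = degree_of_consensus({"neg": 0.2, "pos": 0.8}, {"neg": 0.9, "pos": 0.1})
assert same > diff  # identical distributions yield the higher consensus
```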
Regarding claim 11, Zhen as modified by Kozlowski, Munro and Meng teaches wherein determining the first consistency score further comprises: determining an intersection between the first probability distribution and the second probability distribution, wherein the intersection comprises a set of possible labels shared by the first probability distribution and the second probability distribution; and determining the degree of consensus based on a cardinality of the intersection, wherein the degree of consensus increases along with a number of possible labels shared by the first probability distribution and the second probability distribution ([0018], [0021], [0125], [0127], wherein Munro describes analysis of the natural language processing engine's ontology to indicate trends and sentiments contained within the collection of documents for that ontology with a certain degree of reliance (the annotation agreement score most readily serving as a proxy for reliance on the ontology); wherein an ontology comprising a collection of thousands of "tweets" from a Twitter hashtag of #Tesla could divide the tweets into labels based on common words across the tweets, such as "battery," "autonomous," and "Elon Musk," with tasks related to each label such as "positive" or "negative," and display the number of tweets that fall within each label and task, the number of annotations to each tweet, and the annotation agreement among annotators to give a fast overview of the general disposition of the tweets within the ontology; Munro thus teaches common labels that are shared between multiple collections based on an agreement score).
Regarding claim 12, Zhen as modified by Kozlowski, Munro and Meng teaches wherein determining the degree of consensus further comprises: comparing the first set of possible labels with the second set of possible labels; in response to comparing the first set of possible labels with the second set of possible labels, determining a set of shared labels; and determining a divergence between the first probability distribution and the second probability distribution, based on a comparison of the first likelihoods with the second likelihoods in the set of shared labels (page. 24, paragraph 6, page. 25, paragraph 1 wherein Kozlowski describes a scoring function for comparing consistency between a true label and the predicted labels, wherein the true label is the first label and the predicted label is the second label), ([0018], [0021], [0125], [0127] wherein Munro describes analysis of the natural language process engine ontology and indicate trends and sentiments contained within the collection of documents for that ontology with a certain degree of reliance (the annotation agreement score most readily serving as a proxy for reliance of the ontology). Wherein an ontology comprising a collection of thousands of “tweets” from a Twitter hashtag of #Tesla could divide the tweets into labels based on common words across the tweets, such as “battery,” “autonomous,” and “Elon Musk” with tasks related to each label such as “positive” or “negative” and display the number of tweets that fall within each label and task and the number of annotations to each tweet and the annotation agreement amongst annotator to give a fast overview of the general disposition of the tweets within the ontology. Munro teaches about common labels that shared between multiple collections based on agreement score).
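For clarity of the record only, the claim 11 intersection/cardinality step and the claim 12 divergence step may be illustrated by the following non-limiting sketch; the function name and the choice of a symmetrised Kullback-Leibler divergence over the shared labels are hypothetical and do not appear in the cited references:

```python
# Illustrative sketch (hypothetical): degree of consensus from the
# intersection of two label probability distributions, plus a divergence
# over the shared labels.
import math

def consensus(dist_a, dist_b):
    shared = set(dist_a) & set(dist_b)   # labels present in both distributions
    cardinality = len(shared)            # consensus grows with |shared|
    # A divergence over the shared labels (symmetrised KL here).
    divergence = sum(
        dist_a[l] * math.log(dist_a[l] / dist_b[l]) +
        dist_b[l] * math.log(dist_b[l] / dist_a[l])
        for l in shared
    )
    return cardinality, divergence

card, div = consensus({"pos": 0.7, "neg": 0.3}, {"pos": 0.6, "neu": 0.4})
print(card)  # 1 shared label ("pos")
```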
Regarding claim 13, Zhen as modified by Kozlowski, Munro and Meng teaches wherein the first given label is a first hard label, wherein the first predicted label is a second hard label, and wherein determining the first consistency score further comprises: determining an edit distance between the first hard label and the second hard label, wherein the edit distance comprises a measure of single-character edits needed to change the first hard label into the second hard label; and determining the degree of consensus based on the edit distance (page 22, paragraphs 5-7, wherein Meng describes using a Hamming loss function to indicate the similarity between the prediction result corresponding to the text to be processed and the expected result corresponding to the text to be processed, wherein the Hamming loss measures the distance).
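For clarity of the record only, the claimed single-character edit distance may be illustrated by the following non-limiting sketch; note that this sketch uses the Levenshtein edit distance (insertions, deletions, substitutions), whereas Meng's cited passage concerns a Hamming loss, and the labels shown are hypothetical:

```python
# Illustrative sketch (hypothetical): single-character edit distance
# (Levenshtein) between two hard labels.

def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

print(edit_distance("positive", "negative"))  # 4
```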
Regarding claim 15, Zhen as modified by Kozlowski, Munro and Meng teaches determining a first outlier score based on the distance; comparing the first outlier score to a first threshold outlier score; and selecting a recommendation from a plurality of recommendations based on comparing the first outlier score to the first threshold outlier score (page 24, paragraph 6; page 25, paragraph 1, wherein Kozlowski uses comparison data to determine the accuracy of the consistency score by incorporating a machine learning model that determines label prediction accuracy, using a threshold and a scoring function to compare labels and analyze the label prediction and classification), (Claims 2-3 text; page 3, paragraph 6, wherein Zhen describes improving label prediction by processing input data, generating a prediction result, and checking its consistency with the data's true labels, wherein the process includes filtering falsely labeled data to obtain a high-quality data set and prevent prediction errors), (FIGS. 5-6, 11, [0017], [0068], [0071-0072], [0103], [0131], wherein Munro displays suggestions on an interface for different groups as illustrated in FIGS. 5-6 and 11, wherein the interface is an agreement interface that includes a learning curve pane, which displays a graphical representation of the number of annotations received for a particular label or task and the agreement among those annotations regarding the accuracy of that label or task).
Regarding claim 18, the claim is similar in scope to claim 1; therefore, the claim is rejected under a similar rationale.
Regarding claim 19, the claim is similar in scope to claim 3; therefore, the claim is rejected under a similar rationale.
Regarding claim 20, the claim is similar in scope to claim 6; therefore, the claim is rejected under a similar rationale.
Claims 3, 14, and 16-17 are rejected under 35 U.S.C. 103 as being unpatentable over Huang Zhen, Foreign Application Publication CN 113806494 B (hereinafter Zhen), in view of Kozlowski C et al., Foreign Application Publication CN 116157834 A (hereinafter Kozlowski), further in view of Munro et al., US Patent Application Publication US 20190361966 A1 (hereinafter Munro), further in view of Meng Xiaojun et al., WO 2023040742 A1 (hereinafter Meng), and further in view of Shah et al., US Patent Application Publication US 20230161964 A1 (hereinafter Shah).
Regarding claim 3, Zhen, Kozlowski, Munro and Meng do not teach wherein determining the first predicted label for the first text string further comprises: determining a first datapoint by embedding the first text string in a semantic graph using a language model; receiving a second label for a second text string; embedding the second text string in the semantic graph using the language model, wherein the second text string is represented by a second datapoint; determining a distance between the first datapoint and the second datapoint; and determining the first predicted label based in part on the distance.
However, in the analogous art of label quality assurance using consistency scores, Shah teaches wherein determining the first predicted label for the first text string further comprises: determining a first datapoint by embedding the first text string in a semantic graph using a language model; receiving a second label for a second text string; embedding the second text string in the semantic graph using the language model, wherein the second text string is represented by a second datapoint; determining a distance between the first datapoint and the second datapoint; and determining the first predicted label based in part on the distance (Abstract; Claims 1, 8, 14-15 text; [0003], [0005], [0024], [0034], [0053], [0060-0067], wherein Shah predicts a set of labels for each sentence using a multi-label classifier, the multi-label classifier including a self-attended contextual word embedding backbone layer, a bank of trainable unigram convolutions, a bank of trainable bigram convolutions, and a fully connected layer, the multi-label classifier being trained using a weakly labeled data set, and labels the document based on the set of labels. Shah describes a training data generation process that extracts sentences from contextual documents and generates a contextual sentence embedding for each sentence using the output of the last layer of a universal sentence encoder (or similar model). As a result, the process can generate a large number of sentence embeddings and cluster each batch of sentence embeddings in a manner capable of handling complex clustering, performed by spectral decomposition of a graph into subgraphs, wherein the data points are sentences, and wherein the spectral clustering is used to ease the calculation of intra-sentence distances and the finding of graph neighborhoods).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine Shah with Zhen, Kozlowski, Munro and Meng by incorporating Shah's method of wherein determining the first predicted label for the first text string further comprises: determining a first datapoint by embedding the first text string in a semantic graph using a language model; receiving a second label for a second text string; embedding the second text string in the semantic graph using the language model, wherein the second text string is represented by a second datapoint; determining a distance between the first datapoint and the second datapoint; and determining the first predicted label based in part on the distance, into the method of improving label quality in datasets using a quality filter based on a consistency of Zhen, Kozlowski, Munro and Meng, in order to include clustering each batch of sentence embeddings using a spectral clustering routine that is capable of handling complex clustering, since the clustering is done by spectral decomposition of a graph into subgraphs, where the nodes or data points are the sentences; the spectral clustering can be used to ease the calculation of intra-sentence distances (e.g., cosine distances) and the finding of graph neighborhoods (Shah: [0067]).
Regarding claim 14, Zhen as modified by Kozlowski, Munro, Meng and Shah teaches wherein the first predicted label has a first likelihood, wherein the first given label has a second likelihood, and wherein the method comprises: determining a composite likelihood from the first likelihood and the second likelihood; determining a first entropy score from the composite likelihood, wherein the first entropy score indicates randomness of label components in the composite likelihood; comparing the first entropy score to a first threshold entropy score; and in response to comparing the first entropy score to the first threshold entropy score, generating for display, on a user interface, a recommendation to review the first predicted label ([0080], wherein Shah teaches that the method can include training a machine learning model (e.g., multi-label classifier 300) using the weakly supervised training data generated using the processes described in FIG. 7. In an embodiment, the method can use a binary cross-entropy loss and convert the set of labels per sentence into multi-label binarized vectors. In some embodiments, the method can use Adam as the gradient descent method. Various alternative training processes can be used. During the training, the method updates various weights used by the model (e.g., neuron weights) based on comparing the predicted labels to the labels generated using the weakly supervised training data generation process).
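For clarity of the record only, the claim 14 steps of forming a composite likelihood, computing an entropy score over it, and triggering a review recommendation against a threshold may be illustrated by the following non-limiting sketch; the averaging rule, the Shannon-entropy formulation, and the threshold value are hypothetical and do not appear in the cited references:

```python
# Illustrative sketch (hypothetical): an entropy score over a composite
# likelihood, used to flag a predicted label for review.
import math

def entropy_score(first_likelihood, second_likelihood):
    # Composite likelihood: normalised average of the two label likelihoods.
    composite = {}
    for label in set(first_likelihood) | set(second_likelihood):
        composite[label] = (first_likelihood.get(label, 0.0) +
                            second_likelihood.get(label, 0.0)) / 2.0
    total = sum(composite.values())
    probs = [v / total for v in composite.values() if v > 0]
    # Shannon entropy: higher values indicate more randomness among
    # the label components of the composite likelihood.
    return -sum(p * math.log2(p) for p in probs)

THRESHOLD = 0.9
score = entropy_score({"pos": 0.9, "neg": 0.1}, {"pos": 0.2, "neg": 0.8})
if score > THRESHOLD:
    print("Recommendation: review the first predicted label")
```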
Regarding claim 16, Zhen as modified by Kozlowski, Munro, Meng and Shah teaches determining an optimal set of dimensions based on the semantic graph; determining a projected graph by projecting the semantic graph into the optimal set of dimensions; selecting a first projection from the projected graph, wherein the first projection corresponds to the first datapoint in the semantic graph; selecting a second projection from the projected graph, wherein the second projection corresponds to the second datapoint in the semantic graph; determining a projected distance between the first projection and the second projection; and determining the first predicted label based in part on the projected distance between the first projection and the second projection (Abstract, Claims 1, 8, 14-15 text, [0003], [0005], [0024], [0034], [0053], [0060-0067] wherein Shah predicts a set of labels for each sentence using a multi-label classifier, the multi-label classifier including a self-attended contextual word embedding backbone layer, a bank of trainable unigram convolutions, a bank of trainable bigram convolutions, and a fully connected layer the multi-label classifier trained using a weakly labeled data set; and labeling the document based on the set of labels. Shah describes a training data generation process that extract sentences from contextual documents and generate contextual sentence embedding for each sentence using output of the last layer of universal sentence encoder (or similar model). As a result, the process can generate a large number of sentence embeddings and clustering each batch of sentence embeddings that is capable of handling complex clustering that is done by spectral decomposition of a graph into subgraphs, wherein the data points are sentences, wherein the spectral clustering is used to ease calculation of intra-sentence distances and finding graph neighborhoods).
Regarding claim 17, Zhen as modified by Kozlowski, Munro, Meng and Shah teaches wherein the semantic graph has original dimensions, and wherein determining the optimal set of dimensions further comprises: determining an amount of information present in the semantic graph, wherein the amount of information is unevenly distributed among the original dimensions; determining a first optimal dimension from the original dimensions, wherein the first optimal dimension has a first amount of information, and wherein the first amount of information comprises a portion of the amount of information present in the semantic graph; comparing the first amount of information to a cut-off score; and in response to comparing the first amount of information to the cut-off score, including the first optimal dimension in the optimal set of dimensions (page 24, paragraph 6; page 25, paragraphs 1-2, wherein Kozlowski incorporates label prediction based on comparing labels against a threshold and determining consistency scores between labels, and uses comparison data to determine the accuracy of the consistency score by incorporating a machine learning model that determines label prediction accuracy, using a threshold and a scoring function to compare labels and analyze the label prediction and classification).
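For clarity of the record only, the claim 17 step of comparing each dimension's share of information to a cut-off score may be illustrated by the following non-limiting sketch; the use of per-dimension variance as the information measure, the function name, and the cut-off value are hypothetical and do not appear in the cited references:

```python
# Illustrative sketch (hypothetical): selecting an "optimal" set of
# dimensions by comparing each dimension's share of information
# (here, its share of total variance) to a cut-off score.

def optimal_dimensions(embeddings, cutoff=0.05):
    """embeddings: list of equal-length vectors; returns indices of
    dimensions whose share of total variance meets the cut-off."""
    n = len(embeddings)
    n_dims = len(embeddings[0])
    means = [sum(v[d] for v in embeddings) / n for d in range(n_dims)]
    variances = [sum((v[d] - means[d]) ** 2 for v in embeddings) / n
                 for d in range(n_dims)]
    total = sum(variances)
    # Keep a dimension when its portion of the information meets the cut-off.
    return [d for d in range(n_dims) if variances[d] / total >= cutoff]

points = [[1.0, 5.0, 0.01], [2.0, 1.0, 0.02], [3.0, 9.0, 0.01]]
print(optimal_dimensions(points))  # [0, 1]: dimension 2 carries almost no variance
```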
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to HASSAN MRABI whose telephone number is (571)272-8875. The examiner can normally be reached Monday-Friday, 7:30am-5pm EST (alternate Fridays).
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Viker Lamardo, can be reached at 571-270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR. Status information for unpublished applications is available through Private PAIR only. For more information about the PAIR system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/HASSAN MRABI/Examiner, Art Unit 2144