DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1,
Step 1: Is the claim to a process, machine, manufacture or composition of matter?
Yes, the claim is directed to a method/process.
Step 2A Prong One: Does the claim recite an abstract idea, law of nature, or natural phenomenon?
The limitations of:
annotating the unlabeled dataset by labeling pairs of examples which are labeled with binary "YES" or "No" answers; (a mental judgment: a human can look at the data, using their mind or pen and paper, and label it manually)
generating, in response to annotating the unlabeled dataset, an annotated labeled dataset having annotated examples; (a mental evaluation: after mentally labeling the data, the human would possess the annotated dataset that they labeled themselves)
propagating constraints based on a preconfigured rule in a manner such that a user can go back to previously annotated examples from the annotated labeled dataset; (a mental judgment: using arbitrary rules or constraints, a human can enforce those constraints on the data)
labeling the previously annotated examples correctly when the user determines that the previously annotated examples are mis-labeled; (a mental evaluation: a human can decide whether a constraint is violated and make a subsequent correction to the data)
as drafted, are processes that, under their broadest reasonable interpretation, cover performance in the human mind. Accordingly, they fall within the mental processes grouping of abstract ideas.
Step 2A Prong Two: Does the claim recite additional elements that integrate the judicial exception into a practical application?
The limitations of:
receiving, by a machine learning data model, an unlabeled dataset; (mere data gathering, insignificant extra-solution activity that ties the use to generic high-level machine learning, MPEP 2106.05(g))
querying, by utilizing a user interface, the machine learning data model to fetch the unlabeled dataset; (mere data gathering, insignificant extra-solution activity executed on a generic computer, MPEP 2106.05(g))
automatically training the machine learning data model with correct labeling of the annotated examples; (high-level recitation of generic machine learning; mere instructions to apply the judicial exception using a machine learning model, MPEP 2106.05(f))
amount to no more than insignificant extra-solution activity and generic computer implementation. Accordingly, the additional elements do not integrate the abstract idea into a practical application.
Step 2B: Does the claim recite additional elements that amount to significantly more than the judicial exception?
The limitations of:
receiving, by a machine learning data model, an unlabeled dataset; (mere data gathering, insignificant extra-solution activity that ties the use to generic high-level machine learning, MPEP 2106.05(g), transmitting data is well-understood, routine, and conventional, MPEP 2106.05(d)(II)(i))
querying, by utilizing a user interface, the machine learning data model to fetch the unlabeled dataset; (mere data gathering, insignificant extra-solution activity executed on a generic computer, MPEP 2106.05(g); transmitting data is well-understood, routine, and conventional, MPEP 2106.05(d)(II)(i))
automatically training the machine learning data model with correct labeling of the annotated examples; (high-level recitation of generic machine learning; mere instructions to apply the judicial exception using a machine learning model, MPEP 2106.05(f))
considered individually and in combination, do not amount to significantly more than the judicial exception. Claim 1 is therefore not patent eligible.
Note that independent claims 12 and 20 recite substantially the same subject matter as independent claim 1, differing only in embodiment. The differences in embodiment, a system and a non-transitory computer-readable medium as opposed to a method, do not meaningfully change the above analysis, and therefore those claims are subject to the same rejection.
Dependent claim 2 recites filtering examples, a mental evaluation.
Dependent claim 3 recites filtering out non-English prose, a mental evaluation.
Dependent claim 4 recites determining whether data belongs to a class and annotating it, a mental evaluation.
Dependent claim 5 recites chaining questions to generate a label, a mental process of asking questions.
Dependent claim 6 recites generating default answers from the machine learning model, insignificant extra-solution activity, MPEP 2106.05(d)(II)(i).
Dependent claim 7 recites fetching, storing, and propagating, using generic computer components to carry out the abstract idea, MPEP 2106.05(f) and 2106.05(d)(II)(iv).
Dependent claim 8 recites receiving user input and storing it, mental evaluation and generic computer components to carry out the abstract idea, MPEP 2106.05(f) and 2106.05(d)(II)(iv).
Dependent claim 9 recites caching and undoing, insignificant extra-solution activity, MPEP 2106.05(d)(II)(iv).
Dependent claim 10 recites active learning, a well-understood, routine, and conventional concept as evidenced by the prior art; see the abstracts of Munro1 and Munro2.
Dependent claim 11 recites a classification model, generally linking the abstract idea to a particular field of use, MPEP 2106.05(h).
Dependent claims 13-19 correspond to dependent claims 2-8 and are rejected for the same reasons.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1, 4-8, 10-12, and 15-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Munro et al., US 2019/0361966 [herein Munro1].
Regarding claims 1, 12, and 20, Munro1 teaches “a method for annotating an unlabeled dataset with constraint propagation by utilizing one or more processors along with allocated memory, the method comprising: receiving, by a machine learning data model, an unlabeled dataset” (abstract “receive confirmation of classifications made by a natural language modeling engine to improve organization of a collection of documents into an hierarchical structure.” wherein receiving classifications implies it was not classified before, i.e. receiving unlabeled data);
“querying, by utilizing a user interface, the machine learning data model to fetch the unlabeled dataset” ([0056] “In some embodiments, natural language modeling engine 210 comprises […] an input/output (I/O) module configured to receive and transmit information throughout interface system 200”);
“annotating the unlabeled dataset by labeling pairs of examples which are labeled [with binary "YES" or "No" answers]” ([0050] “By categorizing a collection of documents into topics, with a series of descriptive labels and tasks describing each document, a natural language modeling engine can build an ontology of documents demonstrative of overall sentiments and underlying meaning of trends across all documents” i.e. annotating the data with labels);
“generating, in response to annotating the unlabeled dataset, an annotated labeled dataset having annotated examples” (previous citation, by labeling the data, the annotated labeled dataset is generated);
“propagating constraints based on a preconfigured rule in a manner such that a user can go back to previously annotated examples from the annotated labeled dataset” ([0053] “Database 115 may store a collection of documents for server machine 110 to access and create an ontology around, or may store artificial intelligence rules for server machine 110 to access and apply to sorting a collection of documents” rules or a constraint);
“labeling the previously annotated examples correctly when the user determines that the previously annotated examples are mis-labeled” ([0082] “Examples of such human readable prompt include, but are not limited to, […] “does this label apply to the document?” with yes and no labels in label pane 830” wherein yes or no are binary labels); and
“automatically training the machine learning data model with correct labeling of the annotated examples” ([0123] “An annotation agreement score displays the overall agreement between annotators of the whole collection of documents within the ontology being analyzed, and can indicate the general disposition or accuracy of the entire ontology and whether further by-label or by-task analysis or manipulation is warranted given the overall agreement score or whether annotators should [be] retrained on the definition of particular categories.”).
Note that independent claims 12 and 20 recite substantially the same subject matter as independent claim 1, differing only in embodiment. The system and non-transitory computer-readable medium embodiments are taught by Munro1 as well, at [0141] and [0139] respectively.
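For explanatory context only, and not as part of the claim mapping above: the pairwise annotation with constraint propagation recited in claim 1 can be illustrated with a short sketch. All names below are hypothetical and are drawn from neither the claims nor the cited art; a transitivity rule stands in for the claimed "preconfigured rule."

```python
def propagate(labels):
    """Apply a preconfigured transitivity rule: if pairs (a, b) and
    (b, c) are both labeled "YES" (same class), then (a, c) is "YES"."""
    changed = True
    while changed:
        changed = False
        yes_pairs = [p for p, ans in labels.items() if ans == "YES"]
        for a, b in yes_pairs:
            for c, d in yes_pairs:
                if b == c and a != d and (a, d) not in labels:
                    labels[(a, d)] = "YES"  # edge induced by the rule
                    changed = True
    return labels

annotations = {("x1", "x2"): "YES", ("x2", "x3"): "YES"}
annotations = propagate(annotations)
# The rule induces ("x1", "x3") = "YES". If the user later decides an
# earlier pair was mis-labeled, they can go back and correct it:
annotations[("x1", "x2")] = "NO"
```

The induced pair illustrates how a single rule can propagate labels across the annotated dataset, and why a user may need to revisit and correct earlier annotations.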
Regarding claims 4 and 15, Munro1 teaches “in annotating the unlabeled dataset, the method further comprising: determining whether examples presented in the unlabeled dataset belong to same class or not” ([0005] “Artificial intelligence tools can attempt to analyze and classify these information sources through metadata or other identifiers” determining whether unlabeled data belong to a class or not is classification); and
“annotating the examples accordingly when it is determined that the examples presented in the unlabeled dataset belong to the same class” ([0006] “An annotation to a document or subset of a document (referred to as a “span”) generally includes information indicating how the document or span should be classified into one or more topics or categories”).
Regarding claims 5 and 16, Munro1 teaches “in annotating the unlabeled dataset, the method further comprising: chaining multiple questions to generate multi-label annotations in a manner such that, based on an answer to a current question, the user can follow up with another question” ([0116] “In some embodiments, the human readable prompt is generated by an intelligent queuing module of a natural language modeling engine. In some embodiments, the human readable prompt is a question requesting selection of the most applicable label or task assigned in the label pane at 1232 for the document assigned in the document pane at 1231. In some embodiments, the human readable prompts requests selection of all applicable labels or tasks assigned in the label pane of a work unit interface at 1232. One having skill in the art can envision additional human readable prompts requesting a task of a document. At 1234, the generated human readable prompt is assigned to a prompt pane of the work unit interface.”)
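For explanatory context on the claimed question chaining: a follow-up question selected by the answer to the current question can be sketched as a small decision table. All names and questions below are hypothetical illustrations, not taken from the claims or the cited art.

```python
# Hypothetical chain: the "YES"/"NO" answer to each question selects
# the follow-up question (None means the chain ends).
QUESTION_CHAIN = {
    "Is this English prose?": {"YES": "Is it a complaint?", "NO": None},
    "Is it a complaint?": None,  # end of chain
}

def chain_annotate(ask):
    """`ask` answers each question with "YES" or "NO"; based on the
    answer to the current question, a follow-up question may be asked,
    yielding a multi-label annotation."""
    annotations = {}
    question = "Is this English prose?"
    while question is not None:
        answer = ask(question)
        annotations[question] = answer
        follow_up = QUESTION_CHAIN.get(question)
        question = follow_up.get(answer) if isinstance(follow_up, dict) else None
    return annotations

result = chain_annotate(lambda q: "YES")  # answer "YES" to everything
```

Answering "YES" to the first question triggers the follow-up; answering "NO" would end the chain after one label.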
Regarding claims 6 and 17, Munro1 teaches “in annotating the unlabeled dataset, the method further comprising: generating default answers from the machine learning model” ([0050] “By categorizing a collection of documents into topics, with a series of descriptive labels and tasks describing each document, a natural language modeling engine can build an ontology of documents demonstrative of overall sentiments and underlying meaning of trends across all documents” wherein the output of the classification model is interpreted as the “default” answer).
Regarding claims 7 and 18, Munro1 teaches “when it is determined that a certain default answer among the generated default answers is acceptable, the method further comprising: fetching a default value corresponding to the default answer; storing the default value as a new default value; and propagating constraints based on the new default value” ([0053] “Database 115 […] may store artificial intelligence rules for server machine 110 to access and apply to sorting a collection of documents, or may store guidelines and labels or tasks that have been used in other ontologies that server machine 110 can access to build additional ontologies.”).
Regarding claims 8 and 19, Munro1 teaches “when it is determined that a certain default answer among the generated default answers is not acceptable, the method further comprising: displaying a "Yes" or "No" icon onto the user interface to receive user input corresponding to the labeling pairs of examples as a new value” ([0082] “Examples of such human readable prompt include, but are not limited to, […] “does this label apply to the document?”);
“storing the new value as a new default value” ([0053] “Database 115 may store a collection of documents for server machine 110 to access and create an ontology around, or may store artificial intelligence rules for server machine 110 to access and apply to sorting a collection of documents, or may store guidelines and labels or tasks that have been used in other ontologies that server machine 110 can access to build additional ontologies.”); and
“propagating constraints based on the new default value” (previous citation: “Database 115 […] may store artificial intelligence rules for server machine 110 to access and apply to sorting a collection of documents, or may store guidelines and labels or tasks that have been used in other ontologies that server machine 110 can access to build additional ontologies.”).
Regarding claim 10, Munro1 teaches “further comprising: implementing an active learning algorithm to generate a next sample for annotation” ([0003] “The subject matter disclosed herein generally relates to creating one or more interfaces for processing human verification of natural language model accuracies” wherein human verification of model outputs feeding back into the model is active learning).
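For explanatory context on the mapping above: active learning is commonly implemented as uncertainty sampling, in which the next sample presented for human annotation is the one the model is least certain about. A minimal sketch follows, with hypothetical names and a toy stand-in for a trained classifier; it is illustrative only and not drawn from the claims or the cited art.

```python
def next_sample(unlabeled, predict_proba):
    """Uncertainty sampling: pick the unlabeled example whose predicted
    positive-class probability is closest to 0.5 (least certain)."""
    return min(unlabeled, key=lambda x: abs(predict_proba(x) - 0.5))

# Toy stand-in for a classifier's probability estimates per example.
probs = {"a": 0.95, "b": 0.52, "c": 0.10}
chosen = next_sample(["a", "b", "c"], probs.get)  # "b" is least certain
```

The example the model is most confident about ("a" at 0.95) is skipped; annotating the least certain example tends to improve the model fastest per label.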
Regarding claim 11, Munro1 teaches “wherein the machine learning model is a classification model” ([0006] “An annotation to a document or subset of a document (referred to as a “span”) generally includes information indicating how the document or span should be classified into one or more topics or categories”).
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2 and 13 are rejected under 35 U.S.C. 103 as being unpatentable over Munro1 in view of Munro et al., US 2016/0162462 [herein Munro2].
Regarding claims 2 and 13, the Munro1 reference has been addressed above. Munro1 does not explicitly teach the claim limitations. Munro2 however teaches “further comprising: filtering out duplicate examples from the unlabeled dataset prior to annotating the unlabeled dataset” (Munro2 [0069] “according to some example embodiments, the seed set of documents may be selected such that they are evenly distributed among different document types. According to some example embodiments, exact duplicates and/or near duplicates may be removed from the seed set of documents”).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Munro1 with those of Munro2, since a combination of known methods would yield predictable results. As shown in Munro2, it is known in the art to filter out duplicate entries in order to improve the processing of data. The removal of duplicate entries would operate the same way in any system, in a predictable manner, such as the system above.
Claims 3 and 14 are rejected under 35 U.S.C. 103 as being unpatentable over Munro1 in view of Ganesh et al., US Patent 11,921,768 [herein Ganesh].
Regarding claims 3 and 14, the Munro1 reference has been addressed above. Munro1 does not explicitly teach the claim limitations. Ganesh, however, teaches “further comprising: filtering out non-English prose from the unlabeled dataset prior to annotating the unlabeled dataset” (Ganesh col. 8 ¶3 “The predictive power of machine learning models, like natural language processing (NLP) models, is strongly related to the pre-processing 304 steps that are used. In NLP, this can translate to various pre-processes, such as […] removal of non-English documents (if the model is for use with English text)”).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Munro1 with those of Ganesh, since a combination of known methods would yield predictable results. One would want non-English data to be removed, since it could negatively affect the data classification, and as shown in Ganesh this technique is known. Therefore, it would operate in a known and predictable manner with the systems above.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Munro1 in view of Segler, Jr. et al., US Patent 10,726,041 [herein Segler].
Regarding claim 9, the Munro1 reference has been addressed above. Munro1 does not explicitly teach the claim limitations. Segler, however, teaches “further comprising: implementing a caching algorithm to keep a dictionary of edges which are induced by an annotation in a manner such that when the user "undo" an annotated edge, the user can revert the edges that are induced by the edge” (Segler col. 24 ¶2 “For example, the computing device 104 may identify the particular node or individual edit revision that is associated with an undesirable condition. The computing device 104 may then utilize the change list of the baseline of the data structure to identify changes or change commands associated with the identified entity, and the computing device 104 may “undo” the change by performing a change command inverse to the change command that caused the undesirable condition to occur” which would entail storing the changes).
It would have been obvious to one having ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of Munro1 with those of Segler, since a combination of known methods would yield predictable results. Undo functionality is a standard feature of modern computing, and as shown in Segler it can be utilized to undo an edge. Therefore, this standard computer feature would operate in a known and predictable manner with the system above.
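For explanatory context on the caching algorithm recited in claim 9: a dictionary keyed by annotated edges, recording the edges each annotation induced, allows an "undo" to revert the induced edges together with the annotation itself. The following is a minimal sketch under assumed data structures; all names are hypothetical and not drawn from the claims or the cited art.

```python
induced_by = {}  # annotated edge -> list of edges it induced
labels = {}      # edge -> "YES" / "NO"

def annotate(edge, answer, induced=()):
    """Record a user annotation and cache the edges it induces."""
    labels[edge] = answer
    induced_by[edge] = list(induced)
    for e in induced:
        labels[e] = answer  # propagate the same answer to induced edges

def undo(edge):
    """Revert an annotated edge and every edge it induced."""
    for e in induced_by.pop(edge, []):
        labels.pop(e, None)
    labels.pop(edge, None)

annotate(("x1", "x2"), "YES", induced=[("x1", "x3")])
undo(("x1", "x2"))
# Both ("x1", "x2") and the induced ("x1", "x3") are reverted.
```

Caching the induced edges per annotation is what makes the undo complete: without the dictionary, reverting an annotation would leave its induced edges behind.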
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Chai, Chengliang, and Guoliang Li. "Human-in-the-loop Techniques in Machine Learning." IEEE Data Eng. Bull. 43.3 (2020): 37-52.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KEVIN W FIGUEROA whose telephone number is (571)272-4623. The examiner can normally be reached Monday-Friday, 10AM-6PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, MIRANDA HUANG can be reached at (571)270-7092. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
KEVIN W FIGUEROA
Primary Examiner
Art Unit 2124
/Kevin W Figueroa/ Primary Examiner, Art Unit 2124