DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claims 1-6 and 8 are presented for examination.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on March 30, 2023, is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Specification
Examiner objects to the specification for containing various grammatical informalities. Examiner has attached a marked-up copy of the specification indicating where errors have occurred. To the extent that the markings are not self-explanatory and are not corrected, Examiner will enumerate the remaining objections in a subsequent Office Action.
The title of the invention is not descriptive. A new title is required that is clearly indicative of the invention to which the claims are directed.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 2-4 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
The term “improve” in claim 2 is a relative term which renders the claim indefinite. The term “improve” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. Neither the specification nor the claim delineates how the performance is improved or what metrics are used to determine improvement, and Examiner is unaware of any generally art-accepted definition of “improve[ment]” in this context. Moreover, it is unclear what the baseline is relative to which the performance is improved.
All claims dependent on a claim rejected hereunder are also rejected for being dependent on a rejected base claim.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-6 and 8 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more. The analysis of the claims will follow the 2019 Revised Patent Subject Matter Eligibility Guidance, 84 Fed. Reg. 50 (“2019 PEG”).
Claim 1
Step 1: The claim recites a method; therefore, it is directed to the statutory category of processes.
Step 2A Prong 1: The claim recites, inter alia, “learning, based on the plurality of input data sets, an estimation model for estimating a parameter of a topic model from a smaller amount of data than an amount of data included in the plurality of data sets.” Given that the claim does not indicate that the estimation model is a machine learning model, this limitation could encompass learning a mental model for estimating a parameter of a topic model.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim further recites “inputting a plurality of data sets.” This limitation is directed to the insignificant extra-solution activity of mere data gathering and output. MPEP § 2106.05(g).
Step 2B: The claim does not contain significantly more than the judicial exception. The claim further recites “inputting a plurality of data sets.” This limitation is directed to the well-understood, routine, and conventional activity of receiving or transmitting data over a network. MPEP § 2106.05(d)(II); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network). As an ordered whole, the claim is directed to a mentally performable process of developing a model for estimating a parameter of a topic model. Nothing in the claim provides significantly more than this. As such, the claim is not patent eligible.
Claim 2
Step 1: A process, as above.
Step 2A Prong 1: The claim recites:
[G]enerating a first data set for estimating the parameter of the topic model and a second data set for evaluating the parameter of the topic model based on one data set included in the plurality of data sets: This limitation could encompass mentally generating the two datasets.
[E]stimating the parameter of the topic model, the parameter of the topic model conforming to the first data set and a prior distribution of the parameter of the topic model: This limitation could encompass mentally estimating the parameter of the topic model in accordance with the claimed criteria.
[E]valuating performance of the topic model having the estimated parameter based on the second data set: This limitation could encompass mentally evaluating the topic model performance.
[U]pdating a parameter of the estimation model based on the evaluation to improve the performance of the topic model: This limitation could involve mentally updating the parameter of the estimation model; note again that neither the estimation model nor the topic model is necessarily a machine learning model.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 1 analysis.
Step 2B: The claim does not contain significantly more than the judicial exception. See claim 1 analysis.
Claim 3
Step 1: A process, as above.
Step 2A Prong 1: The claim recites, inter alia:
[T]he learning includes calculating a representation of the first data set … based on the first data set: This limitation could encompass mentally calculating the representation of the dataset; alternatively, this represents a mathematical concept.
[C]alculating the prior distribution … based on the first data set and the representation: This limitation could encompass mentally calculating the prior distribution.
[T]he updating includes updating the parameter of the estimation model: This limitation could encompass mentally updating the parameter.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim further recites that “the estimation model includes at least a first neural network and a second neural network” and that the parameters updated are “of the first neural network and … of the second neural network”. However, these are mere instructions to apply the judicial exception using a generic computer programmed with a generic class of computer algorithm. MPEP § 2106.05(f).
Step 2B: The claim does not contain significantly more than the judicial exception. The claim further recites that “the estimation model includes at least a first neural network and a second neural network”, that the calculating steps are performed by neural networks, and that the parameters updated are “of the first neural network and … of the second neural network”. However, these are mere instructions to apply the judicial exception using a generic computer programmed with a generic class of computer algorithm. MPEP § 2106.05(f).
Claim 4
Step 1: A process, as above.
Step 2A Prong 1: The claim recites “generating the first data set and the second data set by setting a first value and a second value obtained by randomly dividing a value of data included in the one data set as a value of data included in the first data set and a value of data included in the second data set, respectively”. This limitation could encompass mentally generating the data sets by setting the two values; random division is also a mathematical concept.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. See claim 2 analysis.
Step 2B: The claim does not contain significantly more than the judicial exception. See claim 2 analysis.
Claim 5
Step 1: The claim recites a method; therefore, it is directed to the statutory category of processes.
Step 2A Prong 1: The claim recites, inter alia, “estimating, based on the input data set, a parameter of a topic model by an estimation model learned in advance by use of a plurality of data sets including a larger amount of data than an amount of data included in the data set”. Since, as noted above, the claim does not specify that the models are machine learning models, this could encompass mentally estimating a parameter of a topic model using a mental estimation model.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim further recites “inputting a data set”, which is directed to the insignificant extra-solution activity of mere data gathering and output. MPEP § 2106.05(g).
Step 2B: The claim does not contain significantly more than the judicial exception. The claim further recites “inputting a data set”, which is directed to the well-understood, routine, and conventional activity of receiving or transmitting data over a network. MPEP § 2106.05(d)(II); OIP Techs., Inc. v. Amazon.com, Inc., 788 F.3d 1359, 1363, 115 USPQ2d 1090, 1093 (Fed. Cir. 2015) (sending messages over a network). As an ordered whole, the claim is directed to a mentally performable process of estimating a parameter of a topic model. Nothing in the claim provides significantly more than this. As such, the claim is not patent eligible.
Claim 6
Step 1: The claim recites a learning device comprising a processor and a memory; therefore, it is directed to the statutory category of machines.
Step 2A Prong 1: The claim recites the same judicial exceptions as in claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The analysis at this step mirrors that of claim 1, except insofar as this claim is directed to a “learning device comprising: a processor; and a memory that includes instructions, which when executed, cause the processor to execute [the method]”. However, this amounts to a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).
Step 2B: The claim does not contain significantly more than the judicial exception. The analysis at this step mirrors that of claim 1, except insofar as this claim is directed to a “learning device comprising: a processor; and a memory that includes instructions, which when executed, cause the processor to execute [the method]”. However, this amounts to a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).
Claim 8
Step 1: The claim recites a non-transitory computer-readable recording medium; therefore, it is directed to the statutory category of articles of manufacture.
Step 2A Prong 1: The claim recites the same judicial exceptions as in claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. The claim further recites a “non-transitory computer-readable recording medium storing a program that causes a computer to execute the learning method”. However, this amounts to a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).
Step 2B: The claim does not contain significantly more than the judicial exception. The claim further recites a “non-transitory computer-readable recording medium storing a program that causes a computer to execute the learning method”. However, this amounts to a mere instruction to apply the judicial exception using a generic computer. MPEP § 2106.05(f).
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1-2, 5-6, and 8 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Azarbonyad et al., “Hierarchical Re-Estimation of Topic Models for Measuring Topical Diversity,” arXiv preprint arXiv:1701.04273 (2017) (“Azarbonyad”).
Regarding claim 1, Azarbonyad discloses “[a] learning method executed by a computer (stopwords in a standard stop word list from Python’s NLTK package are removed [implying that the method is carried out using a computer comprising a processor and a memory] – Azarbonyad, sec. 4, last paragraph before “Measuring topical diversity”), the learning method comprising:
inputting a plurality of data sets (hierarchical re-estimation process for topic models outperforms current state-of-the-art on a publicly available dataset commonly used for evaluating document diversity – Azarbonyad, sec. 1, last paragraph; RCV1, 20-NewsGroups, and Ohsumed datasets were used [i.e., a plurality of datasets] – id. at sec. 6, subsection entitled “Datasets and metrics”); and
learning, based on the plurality of input data sets, an estimation model for estimating a parameter of a topic model from a smaller amount of data than an amount of data included in the plurality of data sets (hierarchical re-estimation process for making the distributions P(w|d), P(w|t), and P(t|d), where w is a word, d is a document, and t is a topic, more sparse; the parameters of these distributions are re-estimated [i.e., an estimation model for re-estimating these parameters is learned] so that general, collection-wide items are removed and only salient items are kept [i.e., the re-estimation is performed using a smaller amount of data than contained in the datasets] – Azarbonyad, sec. 1, penultimate paragraph).”
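For illustration only, the re-estimation Azarbonyad describes (removing general, collection-wide words so that only salient words remain, yielding sparser distributions) can be sketched as follows. This sketch is the examiner's simplification, not the reference's actual procedure; all names and the threshold value are hypothetical and form no part of the record.

```python
from collections import Counter

def reestimate(doc_counts, collection_counts, threshold=0.01):
    """Zero out words that are common collection-wide, then renormalize.

    doc_counts: word -> count within one document
    collection_counts: word -> count across the whole collection
    threshold: collection-frequency cutoff above which a word is "general"
    """
    total = sum(collection_counts.values())
    # Keep only words whose collection-wide relative frequency is low (salient words)
    kept = {w: c for w, c in doc_counts.items()
            if collection_counts[w] / total <= threshold}
    norm = sum(kept.values())
    # Renormalize the surviving counts into a sparser distribution P(w|d)
    return {w: c / norm for w, c in kept.items()} if norm else {}

docs = [["topic", "model", "sparse"],
        ["the", "the", "model"],
        ["the", "sparse", "estimation"]]
collection = Counter(w for d in docs for w in d)
p_w_d = reestimate(Counter(docs[1]), collection, threshold=0.25)
```

Here the general word “the” (collection frequency 3/9) is removed from the second document's distribution, leaving only the salient word “model”, consistent with the sparsification described in the mapping above.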
Claim 6 is a device claim corresponding to method claim 1 and is rejected for the same reasons as given in the rejection of that claim.
Regarding claim 2, Azarbonyad discloses that “the learning includes
generating a first data set for estimating the parameter of the topic model and a second data set for evaluating the parameter of the topic model based on one data set included in the plurality of data sets (synthetic dataset of documents with high and low diversity is generated [high-diversity dataset = first dataset; low-diversity dataset = second dataset], and topic models are re-estimated [including their parameters] and the re-estimated models are used to measure topical diversity of documents [i.e., evaluate the parameter of the model]; of 300,000 documents selected, 20 journals are selected to create ten pairs of journals, and for each pair of journals 50 articles are selected [i.e., the evaluation is performed based on one data set in the plurality] – Azarbonyad, sec. 4, first three paragraphs),
estimating the parameter of the topic model, the parameter of the topic model conforming to the first data set and a prior distribution of the parameter of the topic model (hierarchical topic model re-estimation (HiTR) includes document re-estimation, which re-estimates P(w|d), topic re-estimation, which re-estimates P(w|t), and topic assignment re-estimation, which re-estimates P(t|d) [these three probabilities = parameters of the topic model; note that, by Bayes’ theorem, these three conditional probabilities depend on/conform to the prior distributions P(w) and P(t)] – Azarbonyad, sec. 3; the model is tested against, inter alia, a set of documents with high topical diversity [i.e., the parameters are used to evaluate, i.e., conform to, the first data set] – id. at sec. 4, second paragraph),
evaluating performance of the topic model having the estimated parameter based on the second data set (to measure the performance of topic models on the topical diversity task, ROC curves are used and AUC values are reported; coherence is estimated using normalized pointwise mutual information among the top N words within a topic – Azarbonyad, sec. 4, second paragraph under “Measuring topical diversity” [note that the evaluation is based on both the high-diversity and the low-diversity datasets, i.e., based at least in part on the second dataset]), and
updating a parameter of the estimation model based on the evaluation to improve the performance of the topic model (10-fold cross-validation is performed, including one fold to tune [update] the parameter lambda for document re-estimation, topic re-estimation, and topic assignment re-estimation [note that the adjustment of the parameter is intended to improve the performance of the model and is based on the prior training/evaluation of the models] – Azarbonyad, penultimate paragraph before sec. 5).”
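For illustration only, the cross-validation tuning Azarbonyad describes (using held-out folds to tune the parameter lambda) can be sketched as follows. The scoring function below is a stand-in invented by the examiner for a real held-out evaluation (e.g., AUC on the diversity task); all names and values are hypothetical and form no part of the record.

```python
import random

def kfold_tune(data, candidate_lambdas, k=10, seed=0):
    """Pick the lambda that scores best on average across held-out folds."""
    rng = random.Random(seed)
    items = data[:]
    rng.shuffle(items)
    folds = [items[i::k] for i in range(k)]  # k roughly equal folds

    def score(train, held_out, lam):
        # Toy stand-in objective: prefer lambdas close to the held-out mean.
        # A real tuner would fit on `train` and evaluate on `held_out`.
        target = sum(held_out) / len(held_out)
        return -abs(lam - target)

    best_lam, best_avg = None, float("-inf")
    for lam in candidate_lambdas:
        avg = sum(
            score([x for j, f in enumerate(folds) if j != i for x in f],
                  folds[i], lam)
            for i in range(k)
        ) / k
        if avg > best_avg:
            best_lam, best_avg = lam, avg
    return best_lam

best = kfold_tune(list(range(100)), [0.1, 0.5, 0.9, 50.0], k=10)
```

Each candidate lambda is evaluated on every held-out fold in turn, and the candidate with the best average held-out score is selected, mirroring the one-fold-for-tuning arrangement cited in the mapping above.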
Regarding claim 5, Azarbonyad discloses “[a]n estimation method executed by a computer, the estimation method comprising:
inputting a data set (hierarchical re-estimation process for topic models outperforms current state-of-the-art on a publicly available dataset commonly used for evaluating document diversity – Azarbonyad, sec. 1, last paragraph); and
estimating, based on the input data set, a parameter of a topic model by an estimation model learned in advance by use of a plurality of data sets including a larger amount of data than an amount of data included in the data set (hierarchical re-estimation process for making the distributions P(w|d), P(w|t), and P(t|d), where w is a word, d is a document, and t is a topic, more sparse; the parameters of these distributions are re-estimated [i.e., an estimation model for re-estimating these parameters is learned] so that general, collection-wide items are removed and only salient items are kept [i.e., the estimation of the parameters of the topic model uses fewer data points than the data points used to train/learn the estimation model] – Azarbonyad, sec. 1, penultimate paragraph; see also sec. 3, subsections entitled “Document re-estimation” and “Topic re-estimation” (disclosing that the model is trained after having removed the stopwords and high- and low-frequency words and that topic re-estimation removes general words thereafter, i.e., the topic re-estimation uses fewer data than used to train the model)).”
Regarding claim 8, Azarbonyad discloses “[a] non-transitory computer-readable recording medium storing a program that causes a computer to execute the learning method according to claim 1 (stopwords in a standard stop word list from Python’s NLTK package are removed [implying that the method is carried out using a computer comprising a non-transitory computer-readable medium comprising a program for carrying out the method] – Azarbonyad, sec. 4, last paragraph before “Measuring topical diversity”).”
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Azarbonyad in view of Voinea et al. (US 11797705) (“Voinea”).
Regarding claim 3, the rejection of claim 2 is incorporated. Azarbonyad further discloses that “the learning includes
calculating a representation of the first data set … based on the first data set (for each pair of journals, 50 articles are selected to create 50 probability distributions [representations] over topics; thus, for each pair of journals 50 documents with a high diversity value are generated [i.e., the probability distributions/representations are based on the first data set] – Azarbonyad, sec. 4, third paragraph), and
calculating the prior distribution … based on the first data set and the representation (for each pair of journals, 50 articles are selected to create 50 probability distributions [representations] over topics; thus, for each pair of journals 50 documents with a high diversity value are generated [i.e., the probability distributions/representations are based on the first data set, and the distribution over topics P(t) is the prior distribution of the topic assignment P(t|d)] – Azarbonyad, sec. 4, third paragraph), and
the updating includes
updating the parameter of the estimation model (10-fold cross-validation is performed, including one fold to tune [update] the parameter lambda for document re-estimation, topic re-estimation, and topic assignment re-estimation – Azarbonyad, penultimate paragraph before sec. 5) ….”
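For illustration only, the “representation” and “prior distribution” elements mapped above can be sketched as computing a per-document distribution over topics P(t|d) and then a corpus-level prior P(t) as the average of those distributions. This is the examiner's illustrative simplification; the word-to-topic assignments and all names below are hypothetical and form no part of the record.

```python
from collections import Counter

def topic_representation(doc, word_topic):
    """Per-document distribution over topics P(t|d), from hard word-topic labels."""
    counts = Counter(word_topic[w] for w in doc if w in word_topic)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def topic_prior(representations):
    """Corpus-level prior P(t): average of the per-document topic distributions."""
    prior = Counter()
    for rep in representations:
        for t, p in rep.items():
            prior[t] += p / len(representations)
    return dict(prior)

word_topic = {"ball": "sports", "goal": "sports", "vote": "politics"}
docs = [["ball", "goal"], ["vote", "ball"]]
reps = [topic_representation(d, word_topic) for d in docs]   # representations
prior = topic_prior(reps)                                    # prior distribution
```

The prior is thus calculated from the documents (the first data set) and their representations, paralleling the mapping of the “calculating the prior distribution” limitation above.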
Azarbonyad appears not to disclose explicitly the further limitations of the claim. However, Voinea discloses that “the … model includes at least a first neural network and a second neural network (GAN may include generator [first neural network] and discriminator [second neural network] – Voinea, col. 9, l. 56-col. 10, l. 59),
the learning includes
calculating a [first value] by the first neural network (generator [first neural network] may train to generate [calculate] synthetic sensitive data to simulate the training set which are known to contain sensitive data – Voinea, col. 9, l. 56-col. 10, l. 59) …, and
calculating the [second value] by the second neural network (discriminator [second neural network] may receive synthetic data and real data and train to distinguish synthetic vs. real data by providing [calculating] a binary label – Voinea, col. 9, l. 56-col. 10, l. 59) …, and
the updating includes
updating the parameter … including a parameter of the first neural network and a parameter of the second neural network (generator [first neural network] and discriminator [second neural network] of a GAN may update their respective parameters with backpropagation based on gradient descent – Voinea, col. 11, ll. 6-30).”
Voinea and the instant application both relate to neural networks and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Azarbonyad to employ two neural networks to perform the method and update the parameters of both, as disclosed by Voinea, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would allow the use of a known system for performing the relevant operations, reducing the need for manual programming. See Voinea, col. 9, l. 56-col. 10, l. 59.
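For illustration only, the generator/discriminator update scheme Voinea describes (two networks updating their respective parameters by gradient descent) can be sketched on a one-dimensional toy problem with analytic gradients. This is the examiner's simplification, not Voinea's disclosed implementation; all names, values, and the single-parameter "networks" are hypothetical and form no part of the record.

```python
import math
import random

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def gan_step(theta, w, c, real, noise, lr=0.05):
    """One gradient-descent update for a toy one-dimensional GAN.

    Generator: fake = theta + noise.  Discriminator: D(x) = sigmoid(w*x + c).
    Gradients of the standard GAN losses are computed analytically.
    """
    fake = theta + noise

    # Discriminator step: descend -log D(real) - log(1 - D(fake))
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = -(1.0 - d_real) * real + d_fake * fake
    grad_c = -(1.0 - d_real) + d_fake
    w, c = w - lr * grad_w, c - lr * grad_c

    # Generator step: descend the non-saturating loss -log D(fake)
    d_fake = sigmoid(w * fake + c)
    theta = theta - lr * (-(1.0 - d_fake) * w)
    return theta, w, c

rng = random.Random(0)
real = 3.0  # the "real" data point the generator should imitate
theta, w, c = 0.0, 0.1, 0.0
for _ in range(200):
    theta, w, c = gan_step(theta, w, c, real, rng.gauss(0.0, 1.0))
```

Each iteration updates the discriminator's parameters and then the generator's parameter from the gradients of their respective losses, paralleling the backpropagation-based updates of the two networks cited from Voinea.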
Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Azarbonyad in view of Amad (US 20220092406) (“Amad”).
Regarding claim 4, the rejection of claim 2 is incorporated. Azarbonyad further discloses that “the generating includes
generating the first data set and the second data set by setting a first value and a second value obtained by … dividing a value of data included in the one data set as a value of data included in the first data set and a value of data included in the second data set, respectively1 (from the 300,000 documents selected [one data set], 20 journals are selected, for each pair of journals, 50 articles are selected to generate 50 documents with a high diversity value, and the procedure is repeated to generate non-diverse documents [i.e., the first and second datasets are selected by dividing the one data set into two subsets] – Azarbonyad, sec. 4, third paragraph).”
Azarbonyad appears not to disclose explicitly the further limitations of the claim. However, Amad discloses “generating the first data set and the second data set … by randomly dividing … the one data set (splitting a complete dataset may include randomly dividing the complete dataset into equal groupings for the first dataset, the plurality of first data, and the plurality of second data – Amad, paragraph 3) ….”
Amad and the instant application both relate to machine learning and are analogous. It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Azarbonyad to divide the dataset randomly, as disclosed by Amad, and an ordinary artisan could reasonably expect to have done so successfully. Doing so would reduce processing needs by reducing the need for the application of a separate deterministic rule for dividing the data. See Amad, paragraph 3.
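For illustration only, the random division Amad describes (randomly dividing a complete dataset into groupings) can be sketched as follows; the function name and values are hypothetical and form no part of the record.

```python
import random

def random_split(data, seed=0):
    """Randomly divide one data set into two halves (first/second data sets)."""
    items = data[:]
    random.Random(seed).shuffle(items)  # random permutation of the one data set
    mid = len(items) // 2
    return items[:mid], items[mid:]

first, second = random_split(list(range(10)))
```

The shuffle removes any need for a separate deterministic partitioning rule; each item lands in exactly one of the two resulting data sets.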
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to RYAN C VAUGHN whose telephone number is (571)272-4849. The examiner can normally be reached M-R 7:00a-5:00p ET.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kamran Afshar, can be reached at 571-272-7796. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/RYAN C VAUGHN/ Primary Examiner, Art Unit 2125
1 This language is inscrutable at best. The specification does not shed any light on what “randomly dividing a value of data included in the one data set as a value of data included in the first data set and a value of data included in the second data set” means, and at most paragraph 28 discloses that the document sets are randomly selected out of a larger superset of documents. For purposes of examination, then, Examiner will construe this language as meaning that the first and second datasets are randomly selected out of the “one data set”.