DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Drawings
The drawings are objected to because the text in the boxes is difficult to read because of the shading in Figure 10. Applicant is requested to amend Figure 10 so that the text in the boxes can be easily read.
Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office Action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, Applicant will be notified and informed of any required corrective action in the next Office Action. The objection to the drawings will not be held in abeyance.
Specification
The abstract of the disclosure is objected to because it is not in narrative form. MPEP 608.01 I. C. states that the abstract should avoid the form and legal phraseology of patent claims and should be in narrative form. Applicant’s abstract has the form and legal phraseology of a patent claim, and is not in narrative form. Applicant should submit a new abstract in narrative form on a separate sheet as required by 37 CFR 1.72(b). A corrected abstract of the disclosure is required and must be presented on a separate sheet, apart from any other text. See MPEP §608.01(b).
The disclosure is objected to because of the following informalities:
Applicant’s Specification includes numerous instances of ungrammatical language characteristic of a deficient translation to English. A Substitute Specification may be appropriate to correct these informalities. Applicant may discover additional grammatical errors, but these errors include:
On page 2, lines 1 to 4, “A method enabling . . .” is not a complete grammatical sentence because it lacks a proper verb.
On page 2, lines 27 to 29, “In one first aspect . . .” is not a complete grammatical sentence because it lacks a proper verb.
On page 3, line 12, “is” should be “being”.
On page 3, lines 13 to 16, “The analyzing each . . .” is not a complete grammatical sentence because it lacks a proper verb.
On page 8, line 15, “are” should be deleted.
On page 9, line 4, “does” should be “do”.
On page 9, lines 26 to 28, “Preferably, an algorithm . . .” is not a grammatical sentence, but “having” could be changed to “has”.
On page 9, line 29, “label” should be “labelling”.
On page 9, line 35, “may identical words used” should be “identical words may be used”.
On page 10, lines 13 to 16, “In other examples . . .” is not a proper grammatical sentence, but could be corrected by deleting “may”.
On page 10, lines 27 to 28, “the training the dictionary” should be “training the dictionary”.
On page 12, line 5, “statistic” should be “statistics”.
On page 12, line 12, “used to for” should be “used for”.
On page 12, lines 32 to 33, “miss leading” should be “misleading”.
On page 14, line 20, “to segmenting” should be “to segment” or “for segmenting”.
On page 14, line 25, “An initial step . . .” is not a complete grammatical sentence because it does not have a proper verb.
On page 14, line 27, “open-text based question” should be “an open-text based question”.
On page 15, lines 25 to 26, “one type of sources” should be “one type of source”.
On page 16, line 10, “one type of sources” should be “one type of source”.
On page 16, lines 13 to 14, “Thereby generating . . .” is not a complete grammatical sentence because it lacks a proper verb.
On page 16, line 18, “may be to setting” should be “may be to set”.
On page 16, line 35 to page 17, line 1, “the words that has not been” should be “the words that have not been”.
On page 18, line 19, “the categories does” should be “the categories do”.
On page 19, line 11, “has been found 8 times” should be “have been found 8 times”.
On page 19, line 16, “there for” should be “therefore”.
On page 19, line 16, “has” should be “have”.
On page 19, line 20, “has” should be “have”.
On page 20, line 27, “relate to subject of interest” should be “relate to subjects of interest”.
On page 21, line 7, “date” should be “data”.
On page 21, line 8, “withing” should be “within”.
On page 21, line 10, “wither” should be “whether”.
On page 21, line 16, “and extract” should be “and extracting”.
On page 21, line 21, “and extract” should be “and extracting”.
On page 21, lines 28 to 32, “For example . . .” is not a complete grammatical sentence because it lacks a proper verb.
On page 22, lines 1 to 6, “For example . . .” is not a complete grammatical sentence because it lacks a proper verb.
On page 22, line 2, “determining the determining” is redundant, but could be “determining the”.
On page 22, line 10, “date” should be “data”.
On page 22, lines 14 to 15, it is not clear that a vector is being described in Figure 8, but a vector is described in Figure 4.
On page 22, line 22, “is the typical quote selected” should be “the typical quote is selected”.
On page 22, line 24, “matching” should be “matches”.
On page 22, lines 30 to 31, “may the method generating” should be “the method may generate”.
On page 23, lines 3 to 4, “may the typical quote be determined” should be “the typical quote may be determined”.
On page 23, lines 6 to 7, “may the typical quote be determined” should be “the typical quote may be determined”.
On page 23, lines 9 to 10, “may the typical quote be determined” should be “the typical quote may be determined”.
On page 23, line 29, “data point” should be “data points”.
On page 23, line 34, “where” should be “were”.
On page 24, line 3, there are two double quotes with “absence””, which should be “absence”.
Appropriate correction is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 15 to 16 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.
The claims do not fall within at least one of the four categories of patent eligible subject matter because they set forth computer programs that can be construed as ‘a computer program per se’ and ‘signals per se’. Claim 15 can be construed as a computer program per se and claim 16 can be construed as a signal per se. The USPTO takes the position that claims to computer programs and computer readable media should be broadly interpreted in light of the Specification for purposes of determining patent-eligible subject matter under 35 U.S.C. §101. Here, Applicant’s Specification does not limit a computer program or a computer readable medium to embodiments that are non-transitory. Patent case law has held that transitory forms of signal transmission (for example, a propagating electrical or electromagnetic signal per se) represent non-patent eligible subject matter under 35 U.S.C. §101. See In re Nuijten, 500 F.3d 1346, 1357, 84 USPQ2d 1495, 1503 (Fed. Cir. 2007) and MPEP §2106 II and §2106.03. Moreover, computer programs per se do not fall within one of the four statutory categories of invention as an article of manufacture if they are not embodied in a computer readable medium. Computer programs per se that are not claimed as being embodied or stored in a computer-readable medium are not patent-eligible subject matter. Applicant can overcome this rejection by amending claim 15 to set forth “A computer program product stored in a non-transitory computer readable medium . . .” and claim 16 to set forth “A non-transitory computer-readable medium . . . .”
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claim 10 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA 35 U.S.C. 112, the applicant), regards as the invention.
Claim 10 sets forth the phrase “such as”, which renders the claim indefinite because it is unclear whether the limitations following the phrase are part of the claimed invention. See MPEP §2173.05(d). Here, it is unclear if a vector system requires the vector to be 2-dimensional due to the limitation of “such as”. (This claim language is being broadly construed as not requiring that a vector be two-dimensional for purposes of examination.)
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1 to 7 and 14 to 16 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Dillard et al. (U.S. Patent No. 9,672,555).
Regarding independent claim 1, Dillard et al. discloses a method of extracting quotes from customer reviews, comprising:
“obtaining a dataset comprising a plurality of text-based data from a plurality of users or sources, wherein each text-based data of said plurality of text-based data is provided by a user or a source from said plurality of users or sources, said text-based data is unstructured” – individual sentences or phrases contained in customer reviews regarding an item are parsed into a collection of sentences (Abstract); customer review data 132 may include any free-form text comments (“said text-based data is unstructured”) in any format regarding items (column 4, lines 39 to 61: Figure 1); customer reviews 202 contain comments 212 that may include free form text provided by customer 102 regarding the associated item (column 6, lines 21 to 33: Figure 2); here, customer reviews (“a dataset”) are obtained from a plurality of customers (“from a plurality of users or sources”);
“categorizing said plurality of text-based data by at least determining a typical category and a typical sentiment, wherein said typical sentiment is associated to said typical category” – a list of topics (“categories”) is generated from the collection of sentences, and the most relevant topics from the list of topics are identified for a particular item (Abstract); a collection of sentences is analyzed to determine an overall majority sentiment regarding a topic to extract specific sentences or phrases expressing a particular sentiment for display to a customer (column 2, lines 41 to 46); each quote 302 may contain topic assignments 306 and sentiment indicator 308; topic assignments 306 may indicate one or more general topics regarding the item to which the sentences in excerpt 304 are directed, and sentiment indicator 308 may provide an indication of the sentiment expressed by the excerpt; topic assignments 306 and sentiment indicator 308 for the extracted quote may be established by quote extraction module 134 in a quote extraction process (column 7, lines 3 to 18: Figure 3); a quote extraction process may produce more salient topics; the most relevant topics (“a typical category”) determined for a group of items may include the reliability of the item, the quality of construction of the item, and the price or value of the item (column 8, lines 1 to 15: Figure 4: Step 410); quote extraction module 134 selects the most relevant topics from the list of topics for a particular item; the most relevant topics are the topics most discussed in customer reviews 202 associated with that item; quote extraction module 134 selects one or more representative sentences from among sentences parsed from customer reviews 202; representative sentences for a topic are those sentences that are representative both in terms of sentiment and in terms of subject matter (“a typical category and a typical sentiment, wherein said typical sentiment is associated to said typical category”); in order to select the most representative sentences for a topic, quote extraction module 134 first determines the majority sentiment (positive or negative) from the sentences assigned to that topic; from among those sentences expressing the majority sentiment (“a typical sentiment”), quote extraction module 134 then selects the one or more sentences that are most relevant to the topic (column 9, line 44 to column 10, line 5: Figure 4: Steps 408 to 410); here, “said typical sentiment is associated to said typical category” because most relevant topics and majority sentiments are associated with a same representative sentence;
“determining a typical quote representing said plurality of text-based data based on said typical category and said typical sentiment” – quote extraction module 134 scans customer reviews in order to extract representative comments or ‘quotes’ for items that summarize the information contained in the customer reviews for the items both as to content and sentiment; quote extraction module 134 may determine the sentiment expressed by the extracted quotes (column 4, line 62 to column 5, line 31: Figure 1); extracted quote data 136 may contain extracted quotes 302A to 302N containing excerpt 304; excerpt 304 may contain one or more representative sentences or phrases extracted from customer reviews 202 for an item or group of items that summarizes the information contained in the customer reviews for the items both as to content and sentiment (column 6, lines 50 to 62: Figure 3); quote extraction module 134 selects one or more representative sentences from customer reviews 202 that are representative in terms of sentiment and subject matter (column 9, lines 58 to 65: Figure 4: Step 410).
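For illustration only, the two-stage selection described in the cited passage (first determining the majority sentiment for a topic, then choosing the most relevant sentence among those expressing it) could be sketched as follows. This is not code from Dillard et al. or from Applicant’s Specification; the function name, the tuple layout, and the precomputed relevance scores are assumptions made for the sketch.

```python
from collections import Counter

def select_representative(sentences):
    """sentences: list of (text, sentiment, relevance) tuples for one topic.

    Hypothetical layout: sentiment is "positive" or "negative", and
    relevance is a precomputed score (higher means more on-topic).
    """
    # Stage 1: majority sentiment among the sentences assigned to the topic.
    counts = Counter(sentiment for _, sentiment, _ in sentences)
    majority, _ = counts.most_common(1)[0]
    # Stage 2: among majority-sentiment sentences, pick the most relevant one.
    candidates = [s for s in sentences if s[1] == majority]
    best = max(candidates, key=lambda s: s[2])
    return majority, best[0]

# Made-up example data for a single topic ("battery").
reviews = [
    ("Battery lasts all day.", "positive", 0.9),
    ("Battery died in an hour.", "negative", 0.8),
    ("Great battery overall.", "positive", 0.6),
]
result = select_representative(reviews)
print(result)
```

In this sketch the selected sentence necessarily carries both the topic and the majority sentiment, which mirrors the association between category and sentiment discussed above.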
Regarding claim 2, Dillard et al. discloses that topic assignment module 306 may utilize word stemming to select the most used and relevant term or sequence of terms from among sentences assigned to the topic as the topic label (“at least one category word”) (column 10, lines 39 to 52: Figure 3); routine 500 takes sentences manually labeled in terms of sentiment and learns which words and sequence of words make a sentence positive, negative, mixed, or neutral (“at least one sentiment word”); routine 500 then utilizes classifiers trained on the sentiment classifications of these words and sequences to determine a sentiment for each sentence or phrase in the collection of sentences (column 12, lines 29 to 40: Figure 5). Consequently, determining representative topics and sentiments are both based on words or terms in sentences (“wherein determining said typical category and said typical sentiment includes finding at least one category word in each text-based data of said plurality of text-based data related to at least one category and at least one sentiment word associated with each of said at least one category word found in each text-based data of said plurality of text-based data”).
Regarding claims 3 to 4 and 7, Dillard et al. discloses that quote extraction module 134 selects representative sentences that may be filtered in order to extract more salient quotes; quote extraction module 134 may filter selected sentences for a minimum specificity in order to remove sentences with broad language, and favor more specific sentences; extracted sentences may be filtered based on a number of words in the sentence and the minimum average word length (column 10, lines 12 to 26: Figure 4); here, “a typical length” is defined as “a median or average value of said length of each text-based data of said plurality of text-based data”; consequently, a minimum average word length is “a typical length” that is “defined as . . . average value of said length of each text-based data”. Selecting sentences for a quote based on a minimum average word length is equivalent to “wherein said typical quote is selected from said plurality of text-based data based on which of said text-based data closest matching at least said typical length of said plurality of text-based data” because sentences for a quote that satisfy a filtering criterion of a minimum average word length are “closest matching” to a minimum average word length.
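For illustration only, the quoted claim language (a typical length defined as a median or average of the lengths, and a quote selected as the text that most closely matches that length) could be sketched as follows. This is not code from Dillard et al. or from Applicant’s Specification; measuring length in whitespace-separated words and the function name are assumptions made for the sketch.

```python
from statistics import mean, median

def closest_to_typical_length(texts, use_median=False):
    # Assumption: "length" is counted in whitespace-separated words.
    lengths = [len(t.split()) for t in texts]
    typical = median(lengths) if use_median else mean(lengths)
    # Pick the text whose length deviates least from the typical length.
    best = min(texts, key=lambda t: abs(len(t.split()) - typical))
    return typical, best

# Made-up example data.
texts = [
    "Good.",
    "The battery lasts a long time.",
    "I bought this for my daughter and she uses it every single day.",
]
typical, quote = closest_to_typical_length(texts)
print(typical, quote)
```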
Regarding claim 5, Dillard et al. discloses that quote extraction module 134 selects one or more sentences that are most relevant to a topic using cosine similarity with term frequency-inverse document frequency (TF-IDF) weighting (column 10, lines 2 to 9: Figure 4); quote extraction module 134 may utilize word stemming and TF-IDF to select the most used and relevant term or sequence of terms from among the sentences assigned to the topic as the topic label (column 10, lines 48 to 52: Figure 4). Here, a topic is determined by a cosine similarity and TF-IDF for specific terms (words) or sequence of terms (sequence of words), and a cosine similarity and TF-IDF ‘quantify’ a degree to which words (“at least one category word”) represent a topic (“wherein categorizing includes quantifying said at least one category word in said plurality of text-based data to determine said typical category”).
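For illustration only, ranking sentences against a topic by cosine similarity with TF-IDF weighting could be sketched as follows. This is not code from Dillard et al.; the function names, the unsmoothed IDF formula, and the simplification of computing document frequencies over the sentences together with the topic string are assumptions made for the sketch.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    # Assumption: lowercase whitespace tokenization and unsmoothed IDF.
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(term for doc in tokenized for term in set(doc))
    vectors = []
    for doc in tokenized:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors

def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Made-up example data.
sentences = [
    "battery life is excellent",
    "the screen is bright",
    "shipping was slow",
]
topic = "battery life"
vecs = tfidf_vectors(sentences + [topic])
scores = [cosine(v, vecs[-1]) for v in vecs[:-1]]
best = sentences[scores.index(max(scores))]
print(best)
```

The score for each sentence is a number between 0 and 1, which is the sense in which such a measure quantifies how strongly the words of a sentence represent a topic.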
Regarding claim 6, Dillard et al. discloses that routine 500 takes sentences manually labeled in terms of sentiment and learns which words and sequence of words make a sentence positive, negative, mixed, or neutral (“at least one sentiment word”); routine 500 then utilizes classifiers trained on the sentiment classifications of these words and sequences to determine a sentiment for each sentence or phrase in the collection of sentences (column 12, lines 29 to 40: Figure 5); quote extraction module 134 applies machine learning techniques to score each word or term in the list of terms as to positive, negative, mixed, and neutral sentiment in which the words and terms occur (column 13, lines 50 to 55: Figure 5). Consequently, scoring words associated with a sentiment to produce a positive, negative, mixed, or neutral sentiment is “wherein categorizing includes quantifying said sentiment words in said plurality of text-based data to determine said typical sentiment.”
Regarding claim 14, Dillard et al. discloses that quote extraction module 134 selects one or more sentences that are most relevant to a topic using cosine similarity with term frequency-inverse document frequency (TF-IDF) weighting (column 10, lines 2 to 9: Figure 4); quote extraction module 134 may utilize word stemming and TF-IDF to select the most used and relevant term or sequence of terms from among the sentences assigned to the topic as the topic label (column 10, lines 48 to 52: Figure 4). Here, a topic determined by TF-IDF is “based on statistical information of said dataset”. That is, TF-IDF represents “statistical information of said dataset”. Quote extraction module 134 applies machine learning techniques to score each word or term in the list of terms as to positive, negative, mixed, and neutral sentiment in which the words and terms occur (column 13, lines 50 to 55: Figure 5). Here, scoring each word or term according to sentiment is “based on scoring of each text-based data based on quantification of each text-based data of said plurality of text-based data.”
Regarding claims 15 to 16, Dillard et al. discloses that the invention may be implemented as a computer-readable storage medium (column 2, lines 47 to 53); mass storage device 28 may store application programs including quote extraction module 134 (column 18, lines 22 to 26: Figure 7).
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claim 8 is rejected under 35 U.S.C. 103 as being unpatentable over Dillard et al. (U.S. Patent No. 9,672,555) in view of Galitsky et al. (U.S. Patent Publication 2009/0282019).
Dillard et al. discloses generating quotes for typical categories and typical sentiments, but does not specifically disclose “generating a synthetic quote by combining a name of said typical category and a name of said typical sentiment.” However, Galitsky et al. teaches similar sentiment extraction from consumer reviews with a feature wherein the recommendation is accompanied by a quotation expressing a sentiment about a feature of an item. (Abstract) Specifically, Galitsky et al. teaches features of quotes can include ‘good for children’, ‘good for pets’, ‘safe for female travelers’, and ‘safe for teenagers’. Here, ‘good’ and ‘safe’ are “said typical sentiment” and ‘children’, ‘female travelers’, and ‘teenagers’ are “said typical category”. An objective is to provide user quotes from documents relevant to features of interest to a user. (¶[0002]) It would have been obvious to one having ordinary skill in the art to extract quotes in Dillard et al. by combining a name of a typical category and a name of a typical sentiment as taught by Galitsky et al. for a purpose of providing user quotes from documents relevant to features of interest to a user.
Claim 9 is rejected under 35 U.S.C. 103 as being unpatentable over Dillard et al. (U.S. Patent No. 9,672,555) in view of Sundaresan et al. (U.S. Patent No. 9,514,156).
Dillard et al. discloses extracting quotes from customer reviews according to words of a category (topic) and words of a sentiment, but does not appear to disclose “wherein a distance between said category word and an associated sentiment word is used as a parameter when selecting said quote from said plurality of text-based data.” However, Sundaresan et al. teaches topic extraction and opinion mining that identifies, from a plurality of polarity words included in a document, a document polarity word based on a syntactic distance between a dominant polarity word and the topic in a syntactic tree. (Abstract) Syntax analyzer 222 builds a syntactic tree and an impact assignment may be a factor which indicates how much impact a polarity word has on a given topic. An impact score may be determined by a syntactic distance between the word and the topic in the syntactic tree. (Column 10, Line 31 to Column 11, Line 27) An objective is to identify essential topics using business judgement to identify polarity of comments in text including key phrases and community reaction to product launches and initiatives. (Column 2, Line 66 to Column 3, Line 16) It would have been obvious to one having ordinary skill in the art to extract quotes using category words and sentiment words in Dillard et al. according to a distance between a category word and a sentiment word as taught by Sundaresan et al. for a purpose of judging reactions to product launches and initiatives.
Claim 10 is rejected under 35 U.S.C. 103 as being unpatentable over Dillard et al. (U.S. Patent No. 9,672,555) in view of Zhao et al. (U.S. Patent Publication 2020/0159829).
Dillard et al. discloses extracting quotes based on categorizing text-based data, but does not disclose using a vector system. However, Zhao et al. teaches displaying a sentiment of user text comments comprising a sequence of words and providing a vector sequence representing the sequence of words to a sentiment configured to output a sequence of sentiment scores as a vector sequence. (Abstract) Topic module 224 determines relevance score 226 for a vector representation of text, and outputs relevance scores for each individual vector of the vector representation. Relevance scores include an indication of which keyword is the most relevant to each vector. (¶[0044] - ¶[0046]) An objective is to use machine learning techniques to analyze text comments. (¶[0001]) It would have been obvious to one having ordinary skill in the art to use a vector system to categorize text-based data as taught by Zhao et al. to extract quotes in Dillard et al. for a purpose of analyzing text comments using machine learning techniques.
Claims 11 to 12 are rejected under 35 U.S.C. 103 as being unpatentable over Dillard et al. (U.S. Patent No. 9,672,555) in view of Dow et al. (U.S. Patent Publication 2020/0125966).
Dillard et al. discloses that words and terms are associated with topics and that words and terms are associated with sentiments, but does not disclose “a dictionary is created for at least one of category words and unique words, and related sentiment words and/or expressions surrounding said at least one of category words and unique words” and “wherein said dictionary is built using open-source data.” However, Dow et al. teaches a Linguistic Inquiry and Word Count (LIWC) dictionary that is an open source dictionary used as a means for determining sentiment and emotion in structured and unstructured data. (¶[0005]) A method includes identifying parts of speech, key words, and polarity using at least one LIWC dictionary. (¶[0019]) An objective is to capture knowledge or actionable intelligence from structured or unstructured data sources including information on negative sentiments about a topic of a new product release. (Abstract; ¶[0007]) It would have been obvious to one having ordinary skill in the art to determine category words and sentiment words in Dillard et al. using an open source dictionary as taught by Dow et al. for a purpose of capturing knowledge and actionable intelligence from unstructured data sources.
Claim 13 is rejected under 35 U.S.C. 103 as being unpatentable over Dillard et al. (U.S. Patent No. 9,672,555) in view of Dow et al. (U.S. Patent Publication 2020/0125966) as applied to claims 1 and 11 above, and further in view of Beller et al. (U.S. Patent No. 10,303,763).
Dow et al. teaches an open-source dictionary for discovering actionable intelligence from sentiment and topics, but does not provide for “adapting said dictionary depending on an area of said plurality of text-based data.” However, Beller et al. teaches domain adaptation of dictionary activities that identifies a corpus of documents of an evaluation domain and generates a lexicon based on the corpus of documents. (Abstract) Semi-autonomous natural language processing domain adaptation for domains having a dynamically changing corpus of documents determines a sufficiency of domain adaptation for a dictionary of terms that span one or more subject areas. (Column 2, Lines 49 to 65) An objective is to determine a sufficiency of domain adaptation for domains that may be regularly changing in a certain field or subject area. (Column 1, Lines 18 to 35) It would have been obvious to one having ordinary skill in the art to adapt a dictionary depending on an area of a corpus of documents as taught by Beller et al. in an open-source dictionary of topics and sentiments of Dow et al. for a purpose of determining a sufficiency of domain adaptation for domains that may be regularly changing in a certain field or subject area.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARTIN LERNER whose telephone number is (571) 272-7608. The examiner can normally be reached Monday-Thursday 8:30 AM-6:00 PM.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Richemond Dorvil, can be reached at (571) 272-7602. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MARTIN LERNER/
Primary Examiner, Art Unit 2658
December 1, 2025