DETAILED ACTION
This communication is a Final Office Action rejection on the merits. Claims 1-20 are currently pending and have been addressed below.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant's arguments filed on 03/16/2026 (related to the 103 Rejection) have been fully considered but are moot in view of new grounds of rejection. Applicant's amendments necessitated the new ground(s) of rejection presented in this Office action. Rejection based on a newly cited reference(s) follows.
Applicant's arguments filed on 03/16/2026 (related to the 101 Rejection) have been fully considered but they are not persuasive.
Applicant states, on pages 9-17, that Examples 47-49 of the 2024 AI Guidance illustrate how AI-related claims can be eligible when they involve specific hardware implementations or when they apply abstract ideas using particular machines or techniques that improve computer functionality or other technology. These examples collectively illustrate how the USPTO is approaching the eligibility analysis of AI-related inventions. They emphasize that while certain applications of AI may not be eligible, claims that demonstrate specific technical improvements or practical applications of AI technology can meet the eligibility requirements under 35 U.S.C. 101. As with prior guidance, practitioners will have to navigate these new examples when prosecuting their clients' AI-related inventions. The present claims certainly fall into these categories.
Examiner respectfully disagrees with Applicant. Although Applicant mentions various examples in the arguments (examples 47, 49, and 37 of the 101 Guidance), Examiner notes that the example that is most similar to amended claim 1 is Example 47, claim 2 of the 2024 AI Guidance Update on Patent Subject Matter Eligibility.
Independent claims 1, 16, and 20 are considered to be abstract ideas because they are directed to “certain methods of organizing human activity” which include “managing personal behavior.” In this case, “generating a summary review for each feature based on a worthiness score” is merely filtering content (see MPEP 2106.04(a)(2)). Also, the limitations of “conducting a search that related to different features in a plurality of other user submissions that have adjacent correspondence to said particular subject matter,” “determining when different features that are topically unrelated have any relational correspondence,” “generating statistical insights relating to said plurality of features and features reviews,” “assigning a worthiness score to each feature depending on…,” “ranking each feature …,” and “generating a summary review …” are merely analysis steps recited at a high level of generality such that they could practically be performed in the human mind, which are considered a “mental process” (see MPEP 2106.04(a), a claim to collecting information, analyzing it, and displaying certain results of the collection and analysis, where the data analysis steps are recited at a high level of generality such that they could practically be performed in the human mind). If a claim limitation, under its broadest reasonable interpretation, covers managing personal behavior or evaluations, then it falls within the “method of organizing human activity” or “mental processes” grouping of abstract ideas. Accordingly, the claims recite an abstract idea.
The additional element of a self-learning machine engine is merely used to: collect data (e.g., search a plurality of user submissions relating to a particular subject matter and having information relating to said subject matter), analyze the data (e.g., assign a worthiness score to each feature and generate statistical insights relating to said plurality of features and features reviews), and display certain results of the collection and analysis (e.g., generate a summary review based on the worthiness score). Those are functions that the courts have described as merely indicating a field of use or technological environment in which to apply a judicial exception (see MPEP 2106.05(h)). Also, although the self-learning machine engine includes an intelligent dictionary that is updated over time, the “updated” step is recited at a high level of generality, which amounts to mere instructions to “apply it.” In this case, the plain meaning of the “learning/updating” step merely describes how the machine learning engine receives data continuously in order to iteratively adjust the values/parameters included in a dictionary. However, the claim and specification do not provide any details about how the self-learning machine engine operates (see 2024 AI Guidance, Example 47, Claim 2). Lastly, the limitation of “executing corrective action” is merely updating a dictionary with new data (see Paragraph 0069 of Applicant’s specification, feedback may include dictionary update such as synonymous features). This is considered a well-understood, routine, and conventional function because it amounts to “receiving or transmitting data over a network” and “performing repetitive calculations” (MPEP 2106.05(d)). Therefore, the claims do not demonstrate any specific technical improvement over existing self-learning machine learning engines.
The claims fail to recite any improvements to another technology or technical field, improvements to the functioning of the computer itself, use of a particular machine, effecting a transformation or reduction of a particular article to a different state or thing, adding unconventional steps that confine the claim to a particular useful application, and/or meaningful limitations beyond generally linking the use of an abstract idea to a particular environment. See 84 Fed. Reg. 55. Viewed individually or as a whole, these additional claim element(s) do not provide meaningful limitation(s) to transform the abstract idea into a patent eligible application of the abstract idea such that the claim(s) amounts to significantly more than the abstract idea itself. The claim is not patent eligible.
Claims 2-15 and 17-19 are rejected for having the same deficiencies as those set forth with respect to the claims that they depend from, independent claims 1 and 16.
Claim Rejections - 35 USC § 101
Claims 16 and 20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter. The claim(s) does/do not fall within at least one of the four categories of patent eligible subject matter because the computer-readable medium as claimed and described does not explicitly exclude transitory media. Although the claim and specification recite a tangible storage medium, Examiner notes that some tangible media may still be transitory. Examiner recommends changing “tangible storage medium” to “non-transitory storage medium.”
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., an abstract idea) without reciting significantly more.
Independent Claim 1
Step One - Pursuant to Step 1 of the January 2019 Revised Patent Subject Matter Eligibility Guidance (“2019 PEG”), 84 Fed. Reg. 53, claim 1 is directed to a method, which is a statutory category.
Step 2A, Prong One - Claim 1 recites: A method for providing an automatic review summary of user submissions, comprising: obtaining a plurality of user submissions relating to a particular subject matter and having information relating to said subject matter; extracting at least one features and at least one feature review relating to each of said plurality of user submissions; providing said extracted features to create a dictionary for every domain; conducting a search that related to different features in a plurality of other user submissions that have adjacent correspondence to said particular subject matter; determining when different features that are topically unrelated have any relational correspondence, especially those retrieved using said search; generating statistical insights relating to said plurality of features and features reviews, including those features that have relational correspondence to one another, and other related information relating from said user submissions; determining a feature worthiness score based on one or more sub-features and merging any duplicate features; assigning a worthiness score to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features; ranking each feature according to said worthiness score and removing features that fall below a certain score; updating each of said features according to their rank; and generating a summary review pertaining to said plurality of user submissions; wherein said summary review includes information about each feature, feature review and statistical insights; executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement. 
These claim elements are considered to be abstract ideas because they are directed to “certain methods of organizing human activity” which include “managing personal behavior.” In this case, “generating a summary review for each feature based on a worthiness score” is merely filtering content. Also, the limitations of “conducting a search that related to different features in a plurality of other user submissions that have adjacent correspondence to said particular subject matter,” “determining when different features that are topically unrelated have any relational correspondence,” “generating statistical insights relating to said plurality of features and features reviews,” “assigning a worthiness score to each feature depending on…,” “ranking each feature …,” and “generating a summary review …” are merely analysis steps recited at a high level of generality such that they could practically be performed in the human mind, which are considered a “mental process” (see MPEP 2106.04(a), a claim to collecting information, analyzing it, and displaying certain results of the collection and analysis, where the data analysis steps are recited at a high level of generality such that they could practically be performed in the human mind). If a claim limitation, under its broadest reasonable interpretation, covers managing personal behavior or evaluations, then it falls within the “method of organizing human activity” or “mental processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2 - The judicial exception is not integrated into a practical application. Claim 1 includes additional elements: a self-learning machine engine to create an intelligent dictionary for every domain for training an Artificial Intelligence (AI) engine.
The self-learning machine engine is merely used to: determine the feature worthiness score of extracted features based on various parameters (Paragraph 0063); generate an intelligent summary review that has the features, reviews and statistical insights (Paragraph 0067); and learn new context of interpretations of certain text when processing so better future processing can be conducted (Paragraphs 0044 & 0069). Merely stating that the step is performed by a computer component (e.g., self-learning machine engine) results in “apply it” on a computer (MPEP 2106.05(f)). This element of “self-learning machine engine” is recited at a high level of generality such that it amounts to no more than mere instructions to apply the exception using a generic computer element. Although the self-learning machine engine includes an intelligent dictionary that is updated over time, the “updated” step is recited at a high level of generality, which amounts to mere instructions to “apply it.” In this case, the plain meaning of the “learning/updating” step merely describes how the machine learning engine receives data continuously in order to iteratively adjust the values/parameters included in a dictionary. However, the claim and specification do not provide any details about how the self-learning machine engine operates (see 2024 AI Guidance, Example 47, Claim 2). Accordingly, alone and in combination, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. Therefore, the claim is directed to an abstract idea.
Step 2B - The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. As discussed above with respect to integration of the abstract idea into a practical application, the claims describe how to generally “apply” the concept of generating a summary review pertaining to said plurality of user submissions based on a worthiness score (e.g., filtering data). The specification shows that the self-learning machine engine is merely used to: determine the feature worthiness score of extracted features based on various parameters (Paragraph 0063); generate an intelligent summary review that has the features, reviews and statistical insights (Paragraph 0067); and learn new context of interpretations of certain text when processing so better future processing can be conducted (Paragraphs 0044 & 0069). Also, the new limitation of “executing corrective action” is merely updating a dictionary with new data (see Paragraph 0069 of Applicant’s specification, feedback may include dictionary update). This is considered a well-understood, routine, and conventional function because it amounts to “receiving or transmitting data over a network” and “performing repetitive calculations” (MPEP 2106.05(d)). Thus, nothing in the claim adds significantly more to the abstract idea. The claim is ineligible.
Independent claim 16 is directed to a system at Step 1, which is a statutory category. Claim 16 recites similar limitations as claim 1 and is rejected for the same reasons at Step 2A, Prong One; Step 2A, Prong Two; and Step 2B. Claim 16 further recites: a processor, a computer-readable memory, and a computer-readable tangible storage medium, which are treated as just an explicit “processor/computer” for executing the operations and are treated under MPEP 2106.05(f) in the same manner as claim 1. Accordingly, these additional elements are viewed as “apply it on a computer” at Step 2A, Prong Two and Step 2B. Thus, the claim is ineligible.
Independent claim 20 is directed to an article of manufacture at Step 1, which is a statutory category. Claim 20 recites similar limitations as claim 1 and is rejected for the same reasons at Step 2A, Prong One; Step 2A, Prong Two; and Step 2B. Claim 20 further recites: a processor and a computer-readable storage medium, which are treated as just an explicit “processor/computer” for executing the operations and are treated under MPEP 2106.05(f) in the same manner as claim 1. Accordingly, these additional elements are viewed as “apply it on a computer” at Step 2A, Prong Two and Step 2B. Thus, the claim is ineligible.
Dependent claims 2-3 and 17 are directed to additional elements such as: an intelligent dictionary database. The intelligent dictionary database is merely used to store rules and policies that will be useful in analyzing the responses that may be obtained, for example, chunk grammar rules to extract "feature" and "feature review" for a domain (Paragraphs 0029-0031). However, using a database is considered a “field of use” limitation under MPEP 2106.05(h) at Step 2A, Prong 2, since the database is not improved and the data is merely stored there. At Step 2B, storing information in a memory remains a conventional function (see MPEP 2106.05(d)). Thus, nothing in the claim adds significantly more to the abstract idea. The claim is ineligible.
Dependent claims 4-11 are not directed to any additional claim elements. Rather, these claims offer further descriptive limitations of the abstract idea mentioned above - such as: wherein said features and feature reviews are mapped together; wherein each feature and feature review are provided with a worthiness score; wherein said worthiness score is calculated based on frequency, reviewer rating, and characteristics of said features and feature reviews; wherein one or more augmenting multiples can be given to a particular feature or feature review; wherein same or synonymous feature and feature reviews are extracted and provided only once in said summary review; wherein user reviews about same or synonymous features will include worthiness score of a feature or feature review; and wherein said user reviews can be obtained from a multiple of platforms. These processes are similar to the abstract idea noted in the independent claim because they further the limitations of the independent claim which are directed to certain methods of organizing human activity or mental processes (see MPEP 2106.04(a), evaluations recited at a high level of generality such that they could practically be performed in the human mind). In addition, no additional elements are integrated into the abstract idea. Therefore, the claims still recite an abstract idea that can be grouped into certain methods of organizing human activity or mental processes.
Dependent claims 12-15 and 18-19 are directed to additional elements such as: a summary graph. The summary graph is merely used to show a variety of different types of nodes - "feature," "feature review," "statistics," etc. (Paragraph 0087). Merely stating that the step is performed by a computer component results in “apply it” on a computer (MPEP 2106.05(f)) being applicable at both Step 2A, Prong 2 and Step 2B. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Further, instructions to display and/or arrange information in a graphical user interface may not be sufficient to show an improvement in computer-functionality (MPEP 2106.05(a)). Thus, nothing in the claim adds significantly more to the abstract idea. The claim is ineligible.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2, 4-5, and 9-20 are rejected under 35 U.S.C. 103 as being unpatentable over Devanathan et al. (US 2018/0260860 A1), in view of Sundaresan et al. (US 2011/0078167 A1), in further view of Kirwin (US 2022/0129958 A1).
Regarding claim 1 (Currently Amended), Devanathan et al. discloses a method for providing an automatic review summary of user submissions, comprising (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques—all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualisation techniques like treemaps. It therefore offers the following benefits):
obtaining a plurality of user submissions relating to a particular subject matter and having information relating to said subject matter (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualisation. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g.—in case of smartphones—battery, or camera, or display, or processor) that is being described in the review);
extracting at least one features and at least one feature review relating to each of said plurality of user submissions (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualisation. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g. - in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination - (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; In this case, Examiner interprets: the “battery” as the at least one “feature”; and “heated up” as the “feature review”);
providing said extracted features to a self-learning machine engine to create an intelligent dictionary for every domain for training an Artificial Intelligence (AI) engine using said self-learning machine (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; As stated in Paragraph 0044 of Applicant’s specification, machine learning is self-learning. Also, Examiner interprets the lexicon files as the intelligent dictionary because they are trained to extract specific keywords from reviews);
conducting by said self-learning machine engine a search that related to different features in a plurality of other user submissions that have adjacent correspondence to said particular subject matter (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps; Paragraph 0040, The product attribute is detected (e.g. - in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination - (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0062, In every review sentence, the presence of aspect and sentiment words are searched. After parsing the sentence, the sentiment word which is closest to the aspect word is selected and the sentence is tagged with the corresponding aspect, sentiment tuple; Examiner interprets the battery as at least one feature associated with said particular subject matter, wherein said particular subject matter is the smartphone);
determining, using said self-learning machine engine, when different features that are topically unrelated have any relational correspondence (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Examiner notes that the words “issue” and “big pain” are topically unrelated but both words are used to express a negative review for the battery), especially those retrieved using said search conducted by said self-learning machine engine (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps);
generating, using said self-learning machine engine and said intelligent dictionary and statistical insights relating to said plurality of features and features reviews, including those features that have relational correspondence to one another, and other related information relating from said user submissions (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. The summary visualization encapsulates all the information in the reviews in a succinct manner; Paragraph 0124, The entire information of the reviews is available in a single treemap that can be easily interpreted by users, see FIG. 4; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; As stated in Paragraph 0041 of Applicant’s specification, statistical insights may provide information of how many reviews were positive. Therefore, based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses statistical insights since it provides information of how many users provided a positive, neutral, or negative review. See FIG. 4, 55 users provided a positive review of the camera, wherein in the positive review the users stated that the camera is the best camera);
said self-learning machine determining a feature worthiness … based on one or more sub-features and merging any duplicate features (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. "battery gets heated up" can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0066, Aspect and sentiment classifier — The machine learning approaches is used to predict the aspect class and sentiment class by using labelled review sentences in following step; Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0054 of Applicant’s specification, reviews that mean the same are merged into one category for generating the summary. Therefore, based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses “merging into one category” since it can merge/cluster different phrases that are considered the same (e.g., big pain and issue) into one category (e.g., negative reviews) for generating the summary);
assigning a worthiness … to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. "battery gets heated up" can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0066, Aspect and sentiment classifier — The machine learning approaches is used to predict the aspect class and sentiment class by using labelled review sentences in following step; Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. 
The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0063 of Applicant’s specification, the worthiness may be based on various parameters like frequency of occurrence of the features in reviews, reviewer rating, and other parameters. Based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses a worthiness since it can identify important features based on term-frequency and defined key-phrases for a particular type of feature. In this case, “heated up" and “too hot” are key phrases for detecting the overall negative sentiment of the subject);
ranking each feature according to said worthiness … and removing features that fall below a certain … (see Figure 4 and related text in Paragraph 0041, 2. The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; In this case, only the key phrases and/or keywords are included in the summary); …
and generating a summary review pertaining to said plurality of user submissions, using said self-learning machine engine, wherein said summary review includes information about each feature, feature review and statistical insights (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0116, E.g. Smartphone user reviews; Paragraph 0117, 1. There are over thousands of reviews for each smartphone product across various e-commerce websites; Paragraph 0118, 2. Each smartphone can be considered as being composed of the following 4 attributes (A1 to A4)—namely camera, battery, display and processor; Paragraph 0119, 3. Each of these reviews may describe one or more of the above attributes and may have a positive or negative polarity associated with it; Paragraph 0120, 4. Each review is processed by the sentiment analysis algorithm which detects the said attributes per review and the associated polarity with those attributes. The algorithm also detects the keywords that generate the above polarity/attribute combination (see FIG. 2); Paragraph 0121, 5. The clustering algorithm uses the detected keywords as a basis to perform a semantic clustering of the reviews; Paragraph 0122, 6. Each semantically generated cluster is named appropriately based on its constituent elements; Paragraph 0123, 7. The final data set—with reviews grouped under attribute/polarity type and sub-grouped by well-named semantic clusters—is displayed as a treemap visualization; Paragraph 0124, 8. The entire information of the reviews is available in a single treemap that can be easily interpreted by users (see FIG. 4); Examiner interprets the number of reviews in Fig. 4 that are positive, neutral, or negative as the statistical insights);
executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement (Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below: Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; Paragraph 0063, In case if multiple similar tags get associated with a sentence, fine tuning is carried out with the aspect and sentiment tags; Examiner interprets “fine tuning” as the corrective action to improve any feature).
Although Devanathan et al. discloses a worthiness/importance for each feature (e.g., based on frequency of occurrence and defined key-phrases/words for each type of feature), Devanathan et al. does not specifically disclose that the worthiness/importance of each feature is a score.
However, Sundaresan et al. discloses said self-learning machine determining a feature worthiness score based on one or more sub-features and merging any duplicate features (Paragraph 0019, In one embodiment, a natural language process is used to identify key phrases related to the topic of interest among the various documents. Further, such processing may apply a machine learning method to extract key phrases covered in the discussion posts and other documents. Once a group of essential ranking documents is identified, the method applies a clustering technique to the group of documents, which infers a relationship(s) among topics that belong to that group; Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0039, The sentiment analyzer 222, and modules therein, may access information stored in files relating to the topic, such as a file 203 of topics and opinions. The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. 
Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together; Examiner interprets the “weight” as the “score.” Also, Examiner interprets “grouping synonyms to generate one pattern” as “merging any duplicate feature”);
assigning a worthiness score to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features (Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. 
Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Examiner interprets the “weight” as the “score”);
ranking each feature according to said worthiness score and removing features that fall below a certain score (Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0043, According to some embodiments, the phrases generated from the various KEA methods are ranked as a function of weights applied by at least one KEA method. The phrase rankings are evaluated with respect to a threshold, those phases having ranks that exceed the threshold are considered essential topics, at operation 408; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together);
… said self-learning machine engine of each of said features according to their rank (Paragraph 0019, In one embodiment, a natural language process is used to identify key phrases related to the topic of interest among the various documents. Further, such processing may apply a machine learning method to extract key phrases covered in the discussion posts and other documents; Paragraph 0038, The resultant classification is used to understand opinions and expressions of sentiment about the topic. To this end, a polarity dictionary 230 may be used to identify specific polarity words, such as "good" or "horrible." The sentiment analyzer 222 includes a polarity detection unit 224, used with the polarity dictionary 230, to identify key phrases which indicate a sentiment or opinion; Paragraph 0039, The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. 
The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights);
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of Devanathan et al. for providing an automatic review summary of user submissions, which determines important phrases/words for each type of feature (e.g., based on frequency of occurrence and key-phrases), to further specify that the worthiness/importance of each feature is a score (e.g., a weight based on the frequency of occurrence), as taught by Sundaresan et al., because doing so would allow the method to use a weight to evaluate the importance, significance or relevance of a word or phrase (see Sundaresan et al., Paragraph 0042). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
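For illustration only, the TF-IDF weighting, ranking, and threshold filtering described in the cited passages of Sundaresan et al. (Paragraphs 0042-0043) can be sketched as follows. This is a minimal sketch and not the reference's implementation: the function names, the sample reviews, the use of the maximum weight per term, and the threshold value are all assumptions, and the sketch applies the threshold to the weight directly rather than to a separate rank value.

```python
import math
from collections import Counter


def tfidf_weights(documents):
    """Return {term: highest TF-IDF weight in any document} for a tokenized corpus."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = {}
    for tokens in documents:
        counts = Counter(tokens)
        for term, count in counts.items():
            tf = count / len(tokens)                   # term frequency within the document
            idf = math.log(n_docs / doc_freq[term])    # rarer across the corpus -> larger
            weights[term] = max(weights.get(term, 0.0), tf * idf)
    return weights


def rank_and_filter(weights, threshold):
    """Sort terms by weight (descending) and keep those exceeding the threshold."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [(term, weight) for term, weight in ranked if weight > threshold]


# Hypothetical review snippets, already tokenized.
reviews = [
    "battery gets hot battery drains".split(),
    "camera is great".split(),
    "camera is sharp battery is fine".split(),
]
weights = tfidf_weights(reviews)
essential = rank_and_filter(weights, threshold=0.15)  # threshold value is an assumption
```

In this sketch the weight plays the role that the rejection maps to the claimed "score," and terms whose weight does not exceed the threshold are dropped from the ranked list.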
The combination of Devanathan et al. and Sundaresan et al. discloses a self-learning machine engine for providing an automatic review summary of user submissions, wherein the self-learning machine engine is trained to identify important features according to their rank (e.g., important words and/or key phrases). Although the combination of Devanathan et al. and Sundaresan et al. further discloses a lexical dictionary with key phrases that indicate a sentiment or opinion (see Devanathan et al., Paragraphs 0055-0056, lexicon files; see Sundaresan et al., Paragraph 0039, lexical dictionary), the combination of Devanathan et al. and Sundaresan et al. does not specifically disclose how the lexical dictionary is updated (see Paragraphs 0044 & 0069 of Applicant’s specification, the dictionary may be updated based on the user or business’s feedback).
However, Kirwin discloses updating said self-learning machine engine of each of said features according to their rank; …; executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement (Paragraph 0079, That is, the NLP 1430 engine, which is representative of the NLP engines mentioned earlier, parses the text and determines a semantic meaning 1435 for the unstructured sentiment data 1425; Paragraph 0081, The NLP 1430 engine is able to parse the text provided in the unstructured sentiment data to determine that text's semantic meaning. To do so, the NLP 1430 engine is able to compare and contrast the text against a repository, storehouse, or database of other sentiment data to determine whether the combination of words reflects positive, negative, or neutral feelings towards the product. The NLP 1430 engine may be trained over time to continuously improve its semantic understanding of text, words, and phrases. In some cases, a lexical or semantic dictionary may be maintained by the NLP 1430 engine and may be updated over time to reflect an improved understanding of language meaning).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of Devanathan et al. and Sundaresan et al. for providing an automatic review summary of user submissions, which determines important phrases/words for each type of feature in a dictionary (e.g., based on frequency of occurrence and key-phrases), to further specify that the dictionary is updated over time, as taught by Kirwin, because doing so would allow the method to update a lexical or semantic dictionary over time to reflect an improved understanding of language meaning (see Kirwin, Paragraph 0081). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claims 2 and 17 (Original), which are dependent of claims 1 and 16, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claims 1 and 16. Devanathan et al. further discloses wherein said extracting is performed by retrieving information from an intelligent dictionary database specific rules and policies for dealing with unstructured text submission (Paragraph 0038, Step 1 – Analysis of Reviews Using Sentiment Engine; Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualization. The machine learning techniques are used to do sentiment analysis of the user reviews; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords – “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster. Every cluster has a unique cluster ID, and a number of elements associated with it (six in the above case). The clusters detected above, are named, in an intuitive way so that the user is able to understand easily; Paragraph 0056, Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; Examiner interprets the lexicon files as the intelligent dictionary since it’s trained to extract specific keywords from reviews).
Devanathan et al. discloses an intelligent dictionary (e.g., lexicon files), wherein the intelligent dictionary clusters/aggregates similar data (e.g., detects words that have the same meaning, such as heating and too hot). Although Devanathan et al. discloses rules for clustering/aggregating reviews, Devanathan et al. does not specifically disclose wherein the rules are stored in a database.
However, Sundaresan et al. discloses wherein said extracting is performed by retrieving information from an intelligent dictionary database, wherein said dictionary has domain specific rules and policies for dealing with unstructured text submission (Paragraph 0039, A syntactic parser 226 receives the polarity information from the polarity detection unit 224, and applies a syntactic parsing operation to the received information. The syntactic parser 226 may be used to build syntactic tree of a sentence or portion of text, and may apply heuristic rules to identify or filter particular portions of the sentence or portion of text. The syntactic parser 226 receives the text that must be analyzed as a set of sentences or strings. It mainly includes word tokenization, part of speech tagging and phrase chunking and phrase relation recognition components. Finally, a sentence is represented as a syntactic tree structure. The results from the syntactic parser 226 are applied to a lexical pattern matcher 228. The sentiment analyzer 222, and modules therein, may access information stored in files relating to the topic, such as a file 203 of topics and opinions. The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of Devanathan et al. for providing an automatic review summary of user submissions, in which machine learning is used to convert unstructured data into structured data based on rules, to further specify that the rules are stored in a dictionary database, as taught by Sundaresan et al., because doing so would allow the method to use information from a lexical dictionary, which includes terms organized and grouped according to relationships of synonyms and so forth (see Sundaresan et al., Paragraph 0039). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
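For illustration only, a lexical dictionary of the kind described in the cited passage of Sundaresan et al. (Paragraph 0039), in which words carry a signed integer polarity and synonyms are grouped together, can be sketched as follows. This is a minimal sketch under stated assumptions: the dictionary entries, the synonym groups, and the function name are hypothetical and do not come from either reference.

```python
# Hypothetical polarity dictionary: each word maps to a signed integer polarity.
POLARITY = {"good": 1, "great": 1, "horrible": -1, "hot": -1}

# Hypothetical synonym groups: each variant maps to one canonical dictionary entry.
SYNONYM_GROUPS = {"heating": "hot", "heated": "hot", "overheats": "hot"}


def sentence_polarity(sentence):
    """Sum the polarity of every word found in the dictionary, after synonym grouping."""
    score = 0
    for raw in sentence.lower().split():
        word = raw.strip(".,!?\"'")                   # drop surrounding punctuation
        canonical = SYNONYM_GROUPS.get(word, word)    # collapse synonyms to one entry
        score += POLARITY.get(canonical, 0)           # unknown words contribute nothing
    return score


negative = sentence_polarity("Battery gets heated quickly.")
positive = sentence_polarity("The camera is great!")
neutral = sentence_polarity("It arrived on Tuesday")
```

Storing the word list and synonym groups as data, rather than in the classifier logic, is what allows the rules to be retrieved from, and maintained in, a separate dictionary store.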
Regarding claim 4 (Original), which is dependent of claim 1, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 1. Devanathan et al. further discloses wherein said features and feature reviews are mapped together (Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. The summary visualization encapsulates all the information in the reviews in a succinct manner; Paragraph 0124, The entire information of the reviews is available in a single treemap that can be easily interpreted by users, see FIG. 4).
Regarding claims 5 and 19 (Previously Presented), which are dependent of claims 1 and 18, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claims 1 and 18. Devanathan et al. further discloses wherein each feature and feature review are provided with a worthiness … also based on reviews are provided in said summary review (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. "battery gets heated up" can be defined as a key phrase for detection of “battery negative class”; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0051 of Applicant’s specification, the worthiness of the feature includes filtering out of the less important features based on rules. Based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses a worthiness since it’s only selecting the most important phrases).
Although Devanathan et al. discloses a worthiness/importance for each feature (e.g., based on frequency of occurrence and defined key-phrases/words for each type of feature), Devanathan et al. does not specifically disclose that the worthiness/importance of each feature is a score.
However, Sundaresan et al. discloses wherein each feature and feature review are provided with a worthiness score also based on reviews are provided in said summary review (Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0043, According to some embodiments, the phrases generated from the various KEA methods are ranked as a function of weights applied by at least one KEA method. The phrase rankings are evaluated with respect to a threshold, those phases having ranks that exceed the threshold are considered essential topics, at operation 408; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method of Devanathan et al. for providing an automatic review summary of user submissions, which determines important phrases/words for each type of feature (e.g., based on frequency of occurrence and key-phrases), to further specify that the worthiness/importance of each feature is a score (e.g., a weight based on the frequency of occurrence), as taught by Sundaresan et al., because doing so would allow the method to use a weight to evaluate the importance, significance or relevance of a word or phrase (see Sundaresan et al., Paragraph 0042). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claim 9 (Original), which is dependent of claim 5, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 5. Devanathan et al. further discloses wherein same or synonymous feature and feature reviews are extracted and provided only once in said summary review (Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords—“battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster. Every cluster has a unique cluster ID, and a number of elements associated with it (six in the above case). The clusters detected above, are named, in an intuitive way so that the user is able to understand easily).
Regarding claim 10 (Original), which is dependent of claim 1, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 1. Devanathan et al. further discloses wherein user reviews about same or synonymous features will include worthiness … of a feature or feature review (Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords—“battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster. Every cluster has a unique cluster ID, and a number of elements associated with it (six in the above case). The clusters detected above, are named, in an intuitive way so that the user is able to understand easily; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. 
The summary visualization encapsulates all the information in the reviews in a succinct manner; It can be noted that the claim language is written in alternative form. The limitation taught by Devanathan et al. is based on “feature review”).
Although Devanathan et al. discloses a worthiness/importance to each feature (e.g., based on frequency of occurrence and defined key-phrases/words for each type of feature), Devanathan et al. does not specifically disclose wherein the worthiness/importance to each feature is a score.
However, Sundaresan et al. discloses wherein user reviews about same or synonymous features will include worthiness score of a feature or feature review (Paragraph 0039, The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. to further specify wherein the worthiness/importance to each feature is a score (e.g., weight based on the frequency of occurrence) of the invention of Sundaresan et al. because doing so would allow the method to use a weight to evaluate the importance, significance or relevance of a word or phrase (see Sundaresan et al., Paragraph 0042). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
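For illustration only, the TF-IDF weighting described in Paragraph 0042 of Sundaresan et al. can be sketched as follows. This is a minimal sketch of the textbook TF-IDF formula, not the implementation of Sundaresan et al. or of Applicant; the function names tf_idf_weights and rank_terms, the whitespace tokenization, and the sample reviews are assumptions introduced here.

```python
import math
from collections import Counter


def tf_idf_weights(documents):
    """Return one {term: weight} dict per document (hypothetical helper).

    Assumes plain whitespace tokenization and the textbook formula
    weight = (term count / document length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def rank_terms(term_weights):
    """Order the terms of one document from highest to lowest weight."""
    return sorted(term_weights, key=term_weights.get, reverse=True)


# Hypothetical sample reviews, used only to exercise the sketch.
reviews = [
    "battery heats up fast",
    "battery lasts long",
    "battery display bright",
]
weights = tf_idf_weights(reviews)
print(rank_terms(weights[0]))
```

In this sketch a term that appears in every review (here, "battery") receives a weight of zero, while terms specific to one review receive a positive weight, which is consistent with the cited passage stating that importance is weighted against the frequency of the word in the corpus.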
Regarding claim 11 (Original), which is dependent of claim 1, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 1. Devanathan et al. further discloses wherein said user reviews can be obtained from a multiple of platforms (Paragraph 0003, To complicate the issue further, finding relevant information has become increasing more difficult with the sheer volume of information now available on the internet combined with the information being made available on a daily basis on internet and other systems; Paragraph 0017, User reviews have been an ubiquitous fixture ever since the advent of online commerce and user-generated content on the internet).
Regarding claims 12 and 18 (Original), which are dependent of claims 1 and 16, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claims 1 and 16. Devanathan et al. further discloses generating a summary graph from said summary review (Paragraph 0036, The information in user reviews can easily be mined for insights by using the herein disclosed automated system, and these insights could be presented in an easily-understandable graphical manner to the user—thereby allowing to instantly receive the full depth of knowledge and information about a product (as contained in its reviews), without having to manually process all the information; Paragraph 0116, E.g. Smartphone user reviews; Paragraph 0117, 1. There are over thousands of reviews for each smartphone product across various e-commerce websites; Paragraph 0118, 2. Each smartphone can be considered as being composed of the following 4 attributes (A1 to A4)—namely camera, battery, display and processor; Paragraph 0119, 3. Each of these reviews may describe one or more of the above attributes and may have a positive or negative polarity associated with it; Paragraph 0120, 4. Each review is processed by the sentiment analysis algorithm which detects the said attributes per review and the associated polarity with those attributes. The algorithm also detects the keywords that generate the above polarity/attribute combination (see FIG. 2); Paragraph 0121, 5. The clustering algorithm uses the detected keywords as a basis to perform a semantic clustering of the reviews; Paragraph 0122, 6. Each semantically generated cluster is named appropriately based on its constituent elements; Paragraph 0123, 7. The final data set—with reviews grouped under attribute/polarity type and sub-grouped by well-named semantic clusters—is displayed as a treemap visualization; Paragraph 0124, 8. 
The entire information of the reviews is available in a single treemap that can be easily interpreted by users (see FIG. 4); Examiner interprets the “treemap visualization” as the “summary graph”).
Regarding claim 13 (Previously Presented), which is dependent of claim 12, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 12. Devanathan et al. further discloses wherein said summary graph presents relation between any feature(s) and any feature review(s) with statistical insights (Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. The summary visualization encapsulates all the information in the reviews in a succinct manner; Paragraph 0124, The entire information of the reviews is available in a single treemap that can be easily interpreted by users, see FIG. 4; Examiner interprets “displaying the number of users that provided a positive, neutral, or negative review” as the “statistical insights” (see FIG. 4), wherein the data is displayed by feature (e.g., camera) and feature review (e.g., very good)).
Regarding claim 14 (Previously Presented), which is dependent of claim 13, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 13. Devanathan et al. further discloses generating an objective automated summary to be provided with said summary graph, wherein said automated summary is generated by traversing through an intelligent summary graph to provide additional textual information (Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. The summary visualization encapsulates all the information in the reviews in a succinct manner; Paragraph 0124, The entire information of the reviews is available in a single treemap that can be easily interpreted by users, see FIG. 4; Examiner notes that Fig. 4 provides additional textual information since it provides an additional summary of the positive and negative reviews).
Regarding claim 15 (Previously Presented), which is dependent of claim 12, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 12. Devanathan et al. further discloses wherein any information obtained in generating said summary graph that is deemed not to be in a dictionary … (Paragraph 0041, 2. The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; Examiner interprets the lexicon files as the intelligent dictionary since it’s trained to extract specific keywords from reviews).
Devanathan et al. discloses a dictionary (e.g., lexicon files), wherein the dictionary clusters/aggregates similar data (e.g., detects words that have the same meaning, such as heating and too hot). Although Devanathan et al. discloses rules for clustering/aggregating reviews, Devanathan et al. does not specifically disclose updating the dictionary database.
However, Kirwin discloses wherein any information obtained in generating said summary graph that is deemed not to be in said dictionary is stored in said dictionary and said dictionary is updated (Paragraph 0079, That is, the NLP 1430 engine, which is representative of the NLP engines mentioned earlier, parses the text and determines a semantic meaning 1435 for the unstructured sentiment data 1425; Paragraph 0081, The NLP 1430 engine is able to parse the text provided in the unstructured sentiment data to determine that text's semantic meaning. To do so, the NLP 1430 engine is able to compare and contrast the text against a repository, storehouse, or database of other sentiment data to determine whether the combination of words reflects positive, negative, or neutral feelings towards the product. The NLP 1430 engine may be trained over time to continuously improve its semantic understanding of text, words, and phrases. In some cases, a lexical or semantic dictionary may be maintained by the NLP 1430 engine and may be updated over time to reflect an improved understanding of language meaning).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature in a dictionary (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. to further specify wherein the dictionary is updated over time of the invention of Kirwin because doing so would allow the method to update a lexical or semantic dictionary over time to reflect an improved understanding of language meaning (see Kirwin, Paragraph 0081). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claim 16 (Currently Amended), Devanathan et al. discloses a computer system for providing an automatic review summary of user submissions, comprising (Paragraph 0001, Method and automated system for performing consumer research; Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques—all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualisation techniques like treemaps. It therefore offers the following benefits):
one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage medium, and program instructions stored on at least one of the one or more tangible storage medium for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising (Paragraph 0028, In another embodiment there is provided a computer program product comprising at least one non-transitory computer-readable medium containing program instructions that can be executed by a computer or other device, causing it to perform a disclosed method essentially as described herein; Claim 14, one processor to implement the steps):
obtaining a plurality of user submissions relating to a particular subject matter and having information relating to said subject matter (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualisation. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g.—in case of smartphones—battery, or camera, or display, or processor) that is being described in the review);
extracting at least one features and at least one feature review relating to each of said plurality of user submissions (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualisation. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g. - in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination - (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; In this case, Examiner interprets: the “battery” as the at least one “feature”; and “heated up” as the “feature review”);
providing said extracted features to a self-learning machine engine to create an intelligent dictionary for every domain for training an Artificial Intelligence (AI) engine using said self-learning machine (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; As stated in Paragraph 0044 of Applicant’s specification, a machine learning is a self-learning. Also, Examiner interprets the lexicon files as the intelligent dictionary since it’s trained to extract specific keywords from reviews);
conducting by said self-learning machine engine a search that related to different features in a plurality of other user submissions that have adjacent correspondence to said particular subject matter (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps; Paragraph 0040, The product attribute is detected (e.g. - in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination - (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0062, In every review sentence, the presence of aspect and sentiment words are searched. After parsing the sentence, the sentiment word which is closest to the aspect word is selected and the sentence is tagged with the corresponding aspect, sentiment tuple; Examiner interprets the battery as at least one feature associated to said particular matter, wherein the said particular matter is the smartphone);
determining, using said self-learning machine engine, when different features that are topically unrelated have any relational correspondence (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Examiner notes that the words “issue” and “big pain” are topically unrelated but both words are used to express a negative review for the battery), especially those retrieved using said search conducted by said self-learning machine engine (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps);
generating, using said self-learning machine engine and said intelligent dictionary and statistical insights relating to said plurality of features and features reviews, including those features that have relational correspondence to one another, and other related information relating from said user submissions (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. The summary visualization encapsulates all the information in the reviews in a succinct manner; Paragraph 0124, The entire information of the reviews is available in a single treemap that can be easily interpreted by users, see FIG. 4; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; As stated in Paragraph 0041 of Applicant’s specification, statistical insights may provide information of how many reviews were positive. Therefore, based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses statistical insights since it provides information of how many users provided a positive, neutral, or negative review. See FIG. 4, 55 users provided a positive review of the camera, wherein in the positive review the users stated that the camera is the best camera);
said self-learning machine determining a feature worthiness … based on one or more sub-features and merging any duplicate features (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. "battery gets heated up" can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0066, Aspect and sentiment classifier — The machine learning approaches is used to predict the aspect class and sentiment class by using labelled review sentences in following step; Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0054 of Applicant’s specification, reviews that mean the same are merged into one category for generating the summary. 
Therefore, based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses “merging into one category” since it can merge/cluster different phrases that are considered the same (e.g., big pain and issue) into one category (e.g., negative reviews) for generating the summary);
assigning a worthiness … to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. "battery gets heated up" can be defined as a key phrase for detection of “battery negative class”; Paragraph 0066, Aspect and sentiment classifier — The machine learning approaches is used to predict the aspect class and sentiment class by using labelled review sentences in following step; Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0063 of Applicant’s specification, the worthiness may be based on various parameters like frequency of occurrence of the features in reviews, reviewer rating, and other parameters. Based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses a worthiness since it can identify important features based on term-frequency and defined key-phrases for a particular type of feature. In this case, “heated up" is defined as a key phrase for detection of “battery negative class”);
ranking each feature according to said worthiness … and removing features that fall below a certain … (see Figure 4 and related text in Paragraph 0041, 2. The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; In this case, only the key phrases and/or keywords are included in the summary); …
and generating a summary review pertaining to said plurality of user submissions, using said self-learning machine engine, wherein said summary review includes information about each feature, feature review and statistical insights (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0116, E.g. Smartphone user reviews; Paragraph 0117, 1. There are over thousands of reviews for each smartphone product across various e-commerce websites; Paragraph 0118, 2. Each smartphone can be considered as being composed of the following 4 attributes (A1 to A4)—namely camera, battery, display and processor; Paragraph 0119, 3. Each of these reviews may describe one or more of the above attributes and may have a positive or negative polarity associated with it; Paragraph 0120, 4. Each review is processed by the sentiment analysis algorithm which detects the said attributes per review and the associated polarity with those attributes. The algorithm also detects the keywords that generate the above polarity/attribute combination (see FIG. 2); Paragraph 0121, 5. The clustering algorithm uses the detected keywords as a basis to perform a semantic clustering of the reviews; Paragraph 0122, 6. Each semantically generated cluster is named appropriately based on its constituent elements; Paragraph 0123, 7. The final data set—with reviews grouped under attribute/polarity type and sub-grouped by well-named semantic clusters—is displayed as a treemap visualization; Paragraph 0124, 8. The entire information of the reviews is available in a single treemap that can be easily interpreted by users (see FIG. 4); Examiner interprets the number of reviews in Fig. 4 that are positive, neutral, or negative as the statistical insights);
executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement (Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; Paragraph 0063, In case if multiple similar tags get associated with a sentence, fine tuning is carried out with the aspect and sentiment tags; Examiner interprets “fine tuning” as the corrective action to improve any feature).
Although Devanathan et al. discloses a worthiness/importance to each feature (e.g., based on frequency of occurrence and defined key-phrases/words for each type of feature), Devanathan et al. does not specifically disclose wherein the worthiness/importance to each feature is a score.
However, Sundaresan et al. discloses said self-learning machine determining a feature worthiness score based on one or more sub-features and merging any duplicate features (Paragraph 0019, In one embodiment, a natural language process is used to identify key phrases related to the topic of interest among the various documents. Further, such processing may apply a machine learning method to extract key phrases covered in the discussion posts and other documents. Once a group of essential ranking documents is identified, the method applies a clustering technique to the group of documents, which infers a relationship(s) among topics that belong to that group; Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0039, The sentiment analyzer 222, and modules therein, may access information stored in files relating to the topic, such as a file 203 of topics and opinions. The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. 
Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together; Examiner interprets the “weight” as the “score.” Also, Examiner interprets “grouping synonyms to generate one pattern” as “merging any duplicate feature”);
assigning a worthiness score to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features (Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. 
Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Examiner interprets the “weight” as the “score”);
ranking each feature according to said worthiness score and removing features that fall below a certain score (Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0043, According to some embodiments, the phrases generated from the various KEA methods are ranked as a function of weights applied by at least one KEA method. The phrase rankings are evaluated with respect to a threshold, those phases having ranks that exceed the threshold are considered essential topics, at operation 408; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together);
… said self-learning machine engine of each of said features according to their rank (Paragraph 0019, In one embodiment, a natural language process is used to identify key phrases related to the topic of interest among the various documents. Further, such processing may apply a machine learning method to extract key phrases covered in the discussion posts and other documents; Paragraph 0038, The resultant classification is used to understand opinions and expressions of sentiment about the topic. To this end, a polarity dictionary 230 may be used to identify specific polarity words, such as "good" or "horrible." The sentiment analyzer 222 includes a polarity detection unit 224, used with the polarity dictionary 230, to identify key phrases which indicate a sentiment or opinion; Paragraph 0039, The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. 
The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights);
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. to further specify wherein the worthiness/importance to each feature is a score (e.g., weight based on the frequency of occurrence) of the invention of Sundaresan et al. because doing so would allow the method to use a weight to evaluate the importance, significance or relevance of a word or phrase (see Sundaresan et al., Paragraph 0042). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
The combination of Devanathan et al. and Sundaresan et al. discloses a self-learning machine engine for providing an automatic review summary of user submissions, wherein the self-learning machine engine is trained to identify important features according to their rank (e.g., important words and/or key phrases). Although the combination of Devanathan et al. and Sundaresan et al. further discloses a lexical dictionary with key phrases that indicate a sentiment or opinion (see Devanathan et al., Paragraphs 0055-0056, lexicon files; see Sundaresan et al., Paragraph 0039, lexical dictionary), the combination of Devanathan et al. and Sundaresan et al. does not specifically disclose how the lexical dictionary is updated (see Paragraphs 0044 & 0069 of Applicant’s specification, the dictionary may be updated based on the user or business’s feedback).
However, Kirwin discloses updating said self-learning machine engine of each of said features according to their rank; …; executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement (Paragraph 0079, That is, the NLP 1430 engine, which is representative of the NLP engines mentioned earlier, parses the text and determines a semantic meaning 1435 for the unstructured sentiment data 1425; Paragraph 0081, The NLP 1430 engine is able to parse the text provided in the unstructured sentiment data to determine that text's semantic meaning. To do so, the NLP 1430 engine is able to compare and contrast the text against a repository, storehouse, or database of other sentiment data to determine whether the combination of words reflects positive, negative, or neutral feelings towards the product. The NLP 1430 engine may be trained over time to continuously improve its semantic understanding of text, words, and phrases. In some cases, a lexical or semantic dictionary may be maintained by the NLP 1430 engine and may be updated over time to reflect an improved understanding of language meaning).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature in a dictionary (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. and Sundaresan et al. to further specify wherein the dictionary is updated over time of the invention of Kirwin because doing so would allow the method to update a lexical or semantic dictionary over time to reflect an improved understanding of language meaning (see Kirwin, Paragraph 0081). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claim 20 (Currently Amended), Devanathan et al. discloses a computer program product for providing an automatic review summary of user submissions, comprising (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques—all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualisation techniques like treemaps. It therefore offers the following benefits; Paragraph 0028, In another embodiment there is provided a computer program product comprising at least one non-transitory computer-readable medium containing program instructions that can be executed by a computer or other device, causing it to perform a disclosed method essentially as described herein):
one or more computer-readable storage medium and program instructions stored on at least one of the one or more tangible storage medium, the program instructions executable by a processor, the program instructions comprising (Paragraph 0028, In another embodiment there is provided a computer program product comprising at least one non-transitory computer-readable medium containing program instructions that can be executed by a computer or other device, causing it to perform a disclosed method essentially as described herein; Claim 14, one processor to implement the steps):
obtaining a plurality of user submissions relating to a particular subject matter and having information relating to said subject matter (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualisation. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g.—in case of smartphones—battery, or camera, or display, or processor) that is being described in the review);
extracting at least one features and at least one feature review relating to each of said plurality of user submissions (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualisation. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g. - in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination - (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; In this case, Examiner interprets: the “battery” as the at least one “feature”; and “heated up” as the “feature review”);
providing said extracted features to a self-learning machine engine to create an intelligent dictionary for every domain for training an Artificial Intelligence (AI) engine using said self-learning machine (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; As stated in Paragraph 0044 of Applicant’s specification, machine learning is self-learning. Also, Examiner interprets the lexicon files as the intelligent dictionary since it’s trained to extract specific keywords from reviews);
conducting by said self-learning machine engine a search that related to different features in a plurality of other user submissions that have adjacent correspondence to said particular subject matter (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps; Paragraph 0040, The product attribute is detected (e.g. - in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination - (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0062, In every review sentence, the presence of aspect and sentiment words are searched. After parsing the sentence, the sentiment word which is closest to the aspect word is selected and the sentence is tagged with the corresponding aspect, sentiment tuple; Examiner interprets the battery as at least one feature associated to said particular matter, wherein the said particular matter is the smartphone);
determining, using said self-learning machine engine, when different features that are topically unrelated have any relational correspondence (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Examiner notes that the words “issue” and “big pain” are topically unrelated but both words are used to express a negative review for the battery), especially those retrieved using said search conducted by said self-learning machine engine (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system. Our approach overcomes the drawbacks of information overload in user reviews, by automatically mining information from the entire body of reviews, aggregating, grouping this information and displaying it using easily comprehensible visualization techniques like treemaps);
generating, using said self-learning machine engine and said intelligent dictionary and statistical insights relating to said plurality of features and features reviews, including those features that have relational correspondence to one another, and other related information relating from said user submissions (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0044, At the beginning of this step, the generated list of reviews for each product that are grouped by sentiment polarity and attribute type. For e.g., under “battery negative” which may have over 300 reviews, while under “display positive” may have another 500. These 300 reviews are also too many to process visually, even though they have been organized thematically. Therefore, at this step, we further simplify the structure of the data by grouping the reviews under each attribute/sentiment combination using a clustering algorithm. The clustering algorithm does a semantic clustering of the reviews under each attribute sentiment combination, using the highlighted text fragment as inputs; Paragraph 0048, The data thus annotated, is now ready to be displayed on a treemap visualization (see working examples as shown in FIGS. 2 & 4). The tree map clearly conveys the data about all reviews. Users can click on a particular cluster and navigate to read the full text of reviews under that cluster, if they choose to. The summary visualization encapsulates all the information in the reviews in a succinct manner; Paragraph 0124, The entire information of the reviews is available in a single treemap that can be easily interpreted by users, see FIG. 4; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; As stated in Paragraph 0041 of Applicant’s specification, statistical insights may provide information of how many reviews were positive. Therefore, based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses statistical insights since it provides information of how many users provided a positive, neutral, or negative review. See FIG. 4, 55 users provided a positive review of the camera, wherein in the positive review the users stated that the camera is the best camera);
said self-learning machine determining a feature worthiness … based on one or more sub-features and merging any duplicate features (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. "battery gets heated up" can be defined as a key phrase for detection of “battery negative class”; Paragraph 0045, For e.g, if there are 6 reviews which have the following sets of detected keywords — “battery gets heated up”, “heating problem in battery”, “battery too hot”, “extreme heating battery”, “battery heating is a big pain”, “major battery heating issue” etc, they will be assigned to the same cluster; Paragraph 0066, Aspect and sentiment classifier — The machine learning approaches is used to predict the aspect class and sentiment class by using labelled review sentences in following step; Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0054 of Applicant’s specification, reviews that mean the same are merged into one category for generating the summary. 
Therefore, based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses “merging into one category” since it can merge/cluster different phrases that are considered the same (e.g., big pain and issue) into one category (e.g., negative reviews) for generating the summary);
assigning a worthiness … to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features (see Figure 2; Paragraph 0022, A pipeline is described herein for the analysis of reviews which includes steps like preprocessing of the reviews to clean them, identify key-phrases from the reviews, sentence boundary detection, semi-supervised labelling of reviews, training machine learning classifier to compute the prediction scores and computing the sentiment scores of review; Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0066, Aspect and sentiment classifier — The machine learning approaches is used to predict the aspect class and sentiment class by using labelled review sentences in following step; Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0108, iii. For each cluster the text data closest to its centroid is selected. The selected text data are sorted according to sentiment classifier confidence score and at maximum 20 reviews are selected; As stated in Paragraph 0063 of Applicant’s specification, the worthiness may be based on various parameters like frequency of occurrence of the features in reviews, reviewer rating, and other parameters. Based on broadest reasonable interpretation in light of the specification, Devanathan et al. discloses a worthiness since it can identify important features based on term-frequency and defined key-phrases for a particular type of feature. In this case, “heated up” is defined as a key phrase for detection of “battery negative class”);
ranking each feature according to said worthiness … and removing features that fall below a certain … (see Figure 4 and related text in Paragraph 0041, 2. The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; In this case, only the key phrases and/or keywords are included in the summary); …
and generating a summary review pertaining to said plurality of user submissions, using said self-learning machine engine, wherein said summary review includes information about each feature, feature review and statistical insights (Paragraph 0024, Therefore such as herein described there is provided a method for interpreting the information in user reviews, using natural language processing, machine learning (clustering) and data visualization techniques all incorporated into a single automated system; Paragraph 0116, E.g. Smartphone user reviews; Paragraph 0117, 1. There are over thousands of reviews for each smartphone product across various e-commerce websites; Paragraph 0118, 2. Each smartphone can be considered as being composed of the following 4 attributes (A1 to A4)—namely camera, battery, display and processor; Paragraph 0119, 3. Each of these reviews may describe one or more of the above attributes and may have a positive or negative polarity associated with it; Paragraph 0120, 4. Each review is processed by the sentiment analysis algorithm which detects the said attributes per review and the associated polarity with those attributes. The algorithm also detects the keywords that generate the above polarity/attribute combination (see FIG. 2); Paragraph 0121, 5. The clustering algorithm uses the detected keywords as a basis to perform a semantic clustering of the reviews; Paragraph 0122, 6. Each semantically generated cluster is named appropriately based on its constituent elements; Paragraph 0123, 7. The final data set—with reviews grouped under attribute/polarity type and sub-grouped by well-named semantic clusters—is displayed as a treemap visualization; Paragraph 0124, 8. The entire information of the reviews is available in a single treemap that can be easily interpreted by users (see FIG. 4); Examiner interprets the number of reviews in Fig. 4 that are positive, neutral, or negative as the statistical insights);
executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement (Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; Paragraph 0063, In case if multiple similar tags get associated with a sentence, fine tuning is carried out with the aspect and sentiment tags; Examiner interprets “fine tuning” as the corrective action to improve any feature).
Although Devanathan et al. discloses a worthiness/importance to each feature (e.g., based on frequency of occurrence and key-phrases/words for each type of feature), Devanathan et al. does not specifically disclose wherein the worthiness/importance to each feature is a score.
However, Sundaresan et al. discloses said self-learning machine determining a feature worthiness score based on one or more sub-features and merging any duplicate features (Paragraph 0019, In one embodiment, a natural language process is used to identify key phrases related to the topic of interest among the various documents. Further, such processing may apply a machine learning method to extract key phrases covered in the discussion posts and other documents. Once a group of essential ranking documents is identified, the method applies a clustering technique to the group of documents, which infers a relationship(s) among topics that belong to that group; Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0039, The sentiment analyzer 222, and modules therein, may access information stored in files relating to the topic, such as a file 203 of topics and opinions. The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. 
Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together; Examiner interprets the “weight” as the “score.” Also, Examiner interprets “grouping synonyms to generate one pattern” as “merging any duplicate feature”);
assigning a worthiness score to each feature depending on a plurality of factors including number of reviews, type of features and any relational correspondence to one or more other features (Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. 
Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Examiner interprets the “weight” as the “score”);
ranking each feature according to said worthiness score and removing features that fall below a certain score (Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Paragraph 0043, According to some embodiments, the phrases generated from the various KEA methods are ranked as a function of weights applied by at least one KEA method. The phrase rankings are evaluated with respect to a threshold, those phases having ranks that exceed the threshold are considered essential topics, at operation 408; Paragraph 0044, Once the list of topics and subtopics are identified, the process associates documents with corresponding topics at operation 410. For a given topic, those documents in which the topic (e.g., essential key phrase) appears are simply grouped together);
… said self-learning machine engine of each of said features according to their rank (Paragraph 0019, In one embodiment, a natural language process is used to identify key phrases related to the topic of interest among the various documents. Further, such processing may apply a machine learning method to extract key phrases covered in the discussion posts and other documents; Paragraph 0038, The resultant classification is used to understand opinions and expressions of sentiment about the topic. To this end, a polarity dictionary 230 may be used to identify specific polarity words, such as "good" or "horrible." The sentiment analyzer 222 includes a polarity detection unit 224, used with the polarity dictionary 230, to identify key phrases which indicate a sentiment or opinion; Paragraph 0039, The polarity detection unit 224 further uses information from a lexical dictionary 240, which includes terms organized and grouped according to relationships of synonyms and so forth. Operation of the sentiment analyzer 222 is detailed in FIG. 5. The polarity detection unit 224 uses a lexical dictionary containing a set of words with an associated integer (+) or (-) representing its polarity. A sentiment expression may be a combination between polarity words and lexical words. For example, the lexical words, "anymore," "at all," "again," "any longer" may show negative meaning when following "not do . . . . ," although they are not necessarily polarity words. So we will group these synonyms and generate one pattern; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. 
The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights);
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. to further specify wherein the worthiness/importance to each feature is a score (e.g., weight based on the frequency of occurrence) of the invention of Sundaresan et al. because doing so would allow the method to use a weight to evaluate the importance, significance or relevance of a word or phrase (see Sundaresan et al., Paragraph 0042). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
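For illustration only, the TF-IDF weighting and threshold step quoted above from Sundaresan et al. (Paragraphs 0042-0043) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of any cited reference: the function names, the whitespace tokenization, the natural-log IDF variant, the choice to keep the maximum weight per term, the sample review text, and the threshold value of 0.2 are all hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return {term: highest TF-IDF weight of that term in any document}.

    `docs` is a list of tokenized documents (lists of strings).
    Assumption: TF is the relative frequency within a document and
    IDF is ln(N / document frequency); other variants exist.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = {}
    for doc in docs:
        if not doc:
            continue  # skip empty documents to avoid dividing by zero
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / df[term])
            weights[term] = max(weights.get(term, 0.0), tf * idf)
    return weights


def rank_and_filter(weights, threshold):
    """Rank terms by weight (highest first) and keep those exceeding the threshold."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(term, w) for term, w in ranked if w > threshold]


# Hypothetical review snippets, tokenized on whitespace.
reviews = [
    "battery gets heated up".split(),
    "camera is good".split(),
    "battery is good".split(),
]
weights = tfidf_weights(reviews)
essential = rank_and_filter(weights, threshold=0.2)
```

In this sketch, terms that appear in most documents (such as "battery" here) receive a low weight and fall below the threshold, while terms concentrated in a single document rank higher. A production system would typically score multi-word key phrases rather than single tokens.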
The combination of Devanathan et al. and Sundaresan et al. discloses a self-learning machine engine for providing an automatic review summary of user submissions, wherein the self-learning machine engine is trained to identify important features according to their rank (e.g., important words and/or key phrases). Although the combination of Devanathan et al. and Sundaresan et al. further discloses a lexical dictionary with key phrases that indicate a sentiment or opinion (see Devanathan et al., Paragraphs 0055-0056, lexicon files; see Sundaresan et al., Paragraph 0039, lexical dictionary), the combination of Devanathan et al. and Sundaresan et al. does not specifically disclose how the lexical dictionary is updated (see Paragraphs 0044 & 0069 of Applicant’s specification, the dictionary may be updated based on the user or business’s feedback).
However, Kirwin discloses updating said self-learning machine engine of each of said features according to their rank; …; executing corrective action using said summary search review relating to each feature to improve any feature that needs improvement (Paragraph 0079, That is, the NLP 1430 engine, which is representative of the NLP engines mentioned earlier, parses the text and determines a semantic meaning 1435 for the unstructured sentiment data 1425; Paragraph 0081, The NLP 1430 engine is able to parse the text provided in the unstructured sentiment data to determine that text's semantic meaning. To do so, the NLP 1430 engine is able to compare and contrast the text against a repository, storehouse, or database of other sentiment data to determine whether the combination of words reflects positive, negative, or neutral feelings towards the product. The NLP 1430 engine may be trained over time to continuously improve its semantic understanding of text, words, and phrases. In some cases, a lexical or semantic dictionary may be maintained by the NLP 1430 engine and may be updated over time to reflect an improved understanding of language meaning).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature in a dictionary (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. and Sundaresan et al. to further specify wherein the dictionary is updated over time of the invention of Kirwin because doing so would allow the method to update a lexical or semantic dictionary over time to reflect an improved understanding of language meaning (see Kirwin, Paragraph 0081). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Devanathan et al. (US 2018/020860 A1), in view of Sundaresan et al. (US 2011/0078167 A1), in further view of Kirwin (US 2022/0129958 A1) and Clark (US 2017/0124575 A1).
Regarding claim 3 (Original), which is dependent of claim 2, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 2. Devanathan et al. further discloses wherein said dictionary provides chunk grammar information for analyzing unstructured text so as to extract features, feature reviews and … (Paragraph 0039, This step converts the unstructured data of reviews into structured data, that can be used for the visualization. The machine learning techniques are used to do sentiment analysis of the user reviews. At the end of this step, we achieve the following; Paragraph 0040, The product attribute is detected (e.g. – in case of smartphones—battery, or camera, or display, or processor) that is being described in the review. For accomplishing this machine learning and natural language processing techniques are used. The polarity of the sentiment (positive/negative/neutral) in the review is also detected. As a result of this step, have every review annotated by the detected attribute class/sentiment class combination – (for e.g. battery negative, camera positive etc.); Paragraph 0041, The text fragments that generate this positive or negative sentiment are simultaneously detected for the detected attribute, using machine learning techniques. For e.g. “battery gets heated up” can be defined as a key phrase for detection of “battery negative class”; Paragraph 0042, Thus at the end of step one, for each product, A list of reviews that is annotated is generated by a combination of attribute-sentiment polarity and the keywords that generated that combination; Paragraph 0055, Creation of sentiment and aspect lexicons—Aspect based sentiment analysis on user reviews is carried out using machine learning and natural language processing. Supervised machine learning algorithms needs labelled data for training. The steps to generate labelled training data in semi-supervised setting are as below; Paragraph 0056, a. 
Extraction of keywords for all sentiment and aspect classes from reviews to build lexicon files. These lexicons are used to do data annotation in reviews; In this case, Examiner interprets: the “battery” as the at least one “feature”; and “heated up” as the “feature review”).
Although the combination of Devanathan et al. and Sundaresan et al. discloses wherein said dictionary provides chunk grammar information for analyzing unstructured text so as to extract features and feature reviews (see Devanathan et al., battery gets heated up; see Sundaresan et al., Camera is good), the combination of Devanathan et al. and Sundaresan et al. does not specifically disclose analyzing feature values.
However, Clark et al. discloses wherein said dictionary provides chunk grammar information for analyzing unstructured text so as to extract features, feature reviews and feature values (Paragraph 0069, For another example, a product review may state the specific form only and leave the product feature implicit (e.g., “$250 is too much to pay for a hotel room” or “a 1.4 liter engine is too small”); Paragraph 0073, This may include, for example, a computer system analyzing the sentiment associated with a specific form of a product feature not only for determining whether to increase the preliminary sentiment score for the product feature (as is described in operations 303 and 304 of FIG. 3)).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions, wherein a sentiment engine is used to convert unstructured data into structured data by extracting features and feature reviews of the invention of Devanathan et al. and Sundaresan et al. to further incorporate extracting feature values of the invention of Clark et al. because doing so would allow the method to analyze the sentiment associated with a specific form (see Clark et al., Paragraphs 0069 & 0073, 1.4 liter is too small). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Claims 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Devanathan et al. (US 2018/020860 A1), in view of Sundaresan et al. (US 2011/0078167 A1), in further view of Kirwin (US 2022/0129958 A1) and Rashid (Rashid, A. and Huang, C.Y., 2021. Sentiment analysis on consumer reviews of Amazon products. International Journal of Computer Theory and Engineering).
Regarding claim 6 (Previously Presented), which is dependent of claim 1, the combination of Devanathan et al., Sundaresan et al., and Kirwin discloses all the limitations in claim 1. Devanathan et al. further discloses wherein said worthiness … is calculated based on frequency, reviewer rating, and characteristics of said features and feature reviews (Paragraph 0071, e. using term-frequency, inverse document frequency, bigram and key phrases as features for the logistic regression based sentiment classifier; Paragraph 0072, f. selecting those review sentences for which the sentiment classifier prediction agrees with the labelled data which is commonly known as diagonal elements of the classifier confusion matrix; Paragraph 0094, A. The important phrases are extracted in the corpus using data driven approach as mentioned in Kumar (2014) and annotate the corpus with phrases; Paragraph 0118, 2. Each smartphone can be considered as being composed of the following 4 attributes (A1 to A4)—namely camera, battery, display and processor; Paragraph 0119, 3. Each of these reviews may describe one or more of the above attributes and may have a positive or negative polarity associated with it; For example, the words mobile handset becomes mobile_handset etc.; Examiner interprets key-phrases as characteristics of said features).
Although Devanathan et al. discloses a worthiness/importance to each feature (e.g., based on frequency of occurrence and defined key-phrases/words for each type of feature), Devanathan et al. does not specifically disclose wherein the worthiness/importance to each feature is a score.
However, Sundaresan et al. discloses wherein said worthiness score is calculated based on frequency, …, and characteristics of said features and feature reviews (Paragraph 0034, The topic extraction and sentiment analysis module 32 of FIG. 1 may further be described as in FIG. 3, which is a block diagram illustrating a system 200 and apparatus for implementing a topic extraction method, according to an example embodiment. The system 200 extracts topic information by identifying key phrases in documents and other texts, and then ranks the key phrases according to import of the key phrases in identifying sentiment or opinion in the document or text; Paragraph 0035, The target of topic extraction is a set of documents within a given set or corpus. A document as used herein refers to information in a textual form, such as comments submitted to a community forum. The system 200 identifies information related to a specific topic, such as a digital camera, and from this information determines opinions and other sentiment related to the topic. The topic may be broadly defined, and may include multiple subtopics; Paragraph 0042, Continuing with FIG. 4, the text extractor 202 is further to apply a ranking algorithm to key phrases to identify essential key phrases at operation 406. In some embodiments, the ranking is done by Term Frequency-Inverse Document Frequency (TF-IDF) techniques where a weight is used to evaluate the importance, significance or relevance of a word or phrase. A TF-IDF weight is a statistical measure used in information retrieval and text mining. The TF-IDF weight indicates how important a word is within a document in a collection or corpus. In some embodiments, the importance increases proportionally to the number of times a word appears in a given document, which may be weighted against the frequency of the word in the corpus. 
Some embodiments implement a ranking function that is computed as a function of the TF-IDF weights; Examiner interprets the “weight” as the “score”).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/words for each type of feature (e.g., based on frequency of occurrence and key-phrases) of the invention of Devanathan et al. to further specify wherein the worthiness/importance to each feature is a score (e.g., weight based on the frequency of occurrence) of the invention of Sundaresan et al. because doing so would allow the method to use a weight to evaluate the importance, significance or relevance of a word or phrase (see Sundaresan et al., Paragraph 0042). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Although the combination of Devanathan et al. and Sundaresan et al. discloses a worthiness/importance score based on frequency and characteristics of said features and feature reviews, the combination of Devanathan et al. and Sundaresan et al. does not specifically disclose wherein the score is based on reviewer rating.
However, Rashid discloses wherein said worthiness … is calculated based on … reviewer rating, … (Page 30, B. Number of Helpful Reviews by User, In Fig. 12 below we can see that users NF, Bashaw, Adam, Alex S and so on have a higher number of helpful reviews as compared to other users, so we can assume that these users reviews about any amazon product will be more helpful for any consumer who is reading reviews before buying any product).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/features (e.g., worthiness score based on frequency of occurrence and defined key-phrases) of the invention of Devanathan et al. and Sundaresan et al. to further specify wherein the worthiness/importance to each feature is calculated based on reviewer rating of the invention of Rashid because doing so would allow the method to determine which user reviews are more helpful for any consumer (see Rashid, Page 30, B. Number of Helpful Reviews by User). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Regarding claim 7 (Original), which is dependent of claim 6, the combination of Devanathan et al., Sundaresan et al., Kirwin, and Rashid discloses all the limitations in claim 6. Although the combination of Devanathan et al. and Sundaresan et al. discloses selecting features to be provided in a summary review based on frequency and/or characteristics of said features and feature reviews, the combination of Devanathan et al. and Sundaresan et al. does not specifically disclose selecting features to be provided in a summary review based on a source that said user providing said submission or said user submission source.
However, Rashid discloses wherein one or more augmenting multiples can be given to a particular feature or feature review based on a source that said user providing said submission or said user submission source (Page 30, B. Number of Helpful Reviews by User, In Fig. 12 below we can see that users NF, Bashaw, Adam, Alex S and so on have a higher number of helpful reviews as compared to other users, so we can assume that these users reviews about any amazon product will be more helpful for any consumer who is reading reviews before buying any product; In this case, Examiner notes that Rashid is augmenting a source from users that provide a higher number of helpful reviews).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/features (e.g., worthiness score based on frequency of occurrence and defined key-phrases) of the invention of Devanathan et al. and Sundaresan et al. to further specify wherein the worthiness/importance to each feature is calculated based on a user submission source of the invention of Rashid because doing so would allow the method to determine which user reviews are more helpful for any consumer (see Rashid, Page 30, B. Number of Helpful Reviews by User). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
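For illustration only, one way a source-based augmenting multiple could be applied to a worthiness score is sketched below. This is a hypothetical sketch and is not drawn from Rashid or from the claims: Rashid is cited only for counting helpful reviews per user, and the multiplier value, the minimum-helpful-review cutoff, the function names, and the sample reviewer identifiers are all assumptions introduced here.

```python
from collections import Counter


def helpful_counts(reviews):
    """Count helpful reviews per reviewer.

    `reviews` is an iterable of (reviewer, is_helpful) pairs.
    """
    return Counter(reviewer for reviewer, is_helpful in reviews if is_helpful)


def augment(base_score, reviewer, counts, min_helpful=2, multiple=1.5):
    """Scale `base_score` by `multiple` when the reviewer has at least
    `min_helpful` helpful reviews; otherwise return it unchanged.

    Both `min_helpful` and `multiple` are hypothetical parameters.
    """
    if counts[reviewer] >= min_helpful:
        return base_score * multiple
    return base_score


# Hypothetical data: user_a has two helpful reviews, user_b has one.
reviews = [
    ("user_a", True),
    ("user_a", True),
    ("user_b", False),
    ("user_b", True),
]
counts = helpful_counts(reviews)
score_a = augment(2.0, "user_a", counts)
score_b = augment(2.0, "user_b", counts)
```

Because `Counter` returns zero for unknown keys, a reviewer with no recorded helpful reviews simply keeps the base score.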
Regarding claim 8 (Previously Presented), which is dependent of claim 6, the combination of Devanathan et al., Sundaresan et al., Kirwin, and Rashid discloses all the limitations in claim 6. Although Devanathan et al. discloses selecting features to be provided in a summary review based on frequency and/or characteristics of said features and feature reviews, Devanathan et al. does not specifically disclose augmenting based on importance and/or uniqueness of said feature or feature reviews.
However, Rashid discloses wherein one or more augmenting multiples can be given to a particular feature or feature review based on importance and/or characteristic of said feature or feature reviews (Page 38, A. Common Words Used in Reviews, Fig. 10 shows the common words used in the reviews which have good ratings [21]. The words with higher frequency are greater in font size and are darker in color. The font size of the word decreases, and the color becomes lighter as the frequency of the words in the reviews).
It would have been obvious to one of ordinary skill in the art before the effective filing date to modify the method for providing an automatic review summary of user submissions by determining important phrases/features of the invention of Devanathan et al. to further specify wherein the important phrases/features are augmented of the invention of Rashid because doing so would allow the method to increase font size for words with higher frequency (see Rashid, Page 4, 2.1 Statistical Approach for Extractive Summarization; Page 38, A. Common Words Used in Reviews). Further, the claimed invention is merely a combination of old elements, and in combination each element would have performed the same function as it did separately, and one of ordinary skill in the art would have recognized that the results of the combination were predictable.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant’s disclosure.
Hu (Hu, M. and Liu, B., 2004, August. Mining and summarizing customer reviews. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 168-177)) – discloses summarizing all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results (see at least Abstract).
Gupta (US 2023/0040315 A1) – discloses determining one or more additional listings based on textual similarity or dissimilarity between a portion of the text included in the first review and text included in one or more reviews for the one or more additional listings (see Abstract).
Qiao et al. (US 2019/0361987 A1) – discloses reviews that have the largest helpfulness scores (i.e., the reviews that are determined to be most helpful) are selected as candidates for summarization. A modeling algorithm, such as a bootstrap topic-modeling algorithm, may be used to learn a unique set of semantic topics for each product or service, and the probability of a review matching one of the topics is estimated (see at least Paragraph 0019).
Bandaru et al. (US 7,930,302 B2) – discloses a sentence parser and tokenizer extracts sentences from each review stored in the review database, and tags each sentence with an indication of its “part of speech”, such as noun or adjective, prior to passing the sentence to semantic processor. The semantic processor uses both heuristic databases and scoring algorithms to generate a score for each relevant sentence. The semantic scoring algorithm extracts all key aspects of a user review associated with a particular business or service, and generates a single score for each pre-defined core attribute, such as price or service. A synopsis generator uses both relevant fragments and complete sentences, which are semantically tagged, to generate a summary from all the reviews collected in order to provide a searcher with results that include both substance, such as facts and figures, and sentiment, such as feelings, and which are then formatted in an easy to understand way in order to present the search results in a logical and quickly accessible manner (see at least Column 3, lines 50-67).
Agarwal (US 2021/0312124 A1) – discloses in an embodiment, in case a given adjective may not be listed (or present) or the corresponding polarity value for a given adjective is not provided (or missing) in the adjective-polarity database 308, the adjective-polarity module 316 may configure the processing unit 310 to automatically detect the polarity value of the given adjective. The processing unit is further configured to assign, via the adjective-polarity module 316, the polarity value for at least one adjective identified in the received tagged natural language text content by first identifying at least a predefined number of other available tagged natural language text content having the same at least one adjective. Herein, the predefined number may be a minimum number of user reviews having a given adjective, considered to determine the polarity value for the given adjective (see at least Paragraph 0069).
Moreau et al. (US 2017/0091816 A1) – discloses a review of a “Mobile Device” containing a link to a mobile device product webpage recites “The camera is quite nice for one thing—one of the best I've used on any [mobile device] smartphone. The phone's fingerprint sensor has worked flawlessly too. One of the big problems I faced on this phone is battery life.” As illustrated, the targeted sentiments related to various targets such as “mobile device” 402, “Mobile Device Smartphone” 404, “battery life” 406, etc. are provided, with the keywords of “battery life” and “big problems” indicated as a negative sentiment 408 and 410, respectively. In this example, it is assumed that any reader that reads this excerpt will have a positive sentiment towards camera, but negative sentiment towards battery life (see at least Paragraph 0051).
Mary (IN 202141022951 A) – discloses tools provided by natural language processing and deep learning or machine learning along with other direct programming tools to work with large volumes of text, makes it possible to begin extracting sentiments from social media, which works on Twitter social media and gives the polarity which can be used in product profiling, trend analysis and forecasting. The aspect based sentiment analysis (ABSA) identifies not only the sentiments of customers but also the quality of service and other related useful information. The promising results are further developed to cater business environment needs through sentiment analysis in social media (see at least Abstract).
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MARJORIE PUJOLS-CRUZ whose telephone number is (571)272-4668. The examiner can normally be reached Mon-Thu 7:30 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Patricia H Munson can be reached at (571)270-5396. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/M.P./ Examiner, Art Unit 3624
/PATRICIA H MUNSON/ Supervisory Patent Examiner, Art Unit 3624