DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
2. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 9/18/2025 has been entered.
Accordingly, claims 1-20 are pending in this application. Claims 1, 8, and 15 are currently amended.
Response to Arguments
Applicant’s arguments with respect to amended pending claims filed on 9/18/2025 have been fully considered. In view of the claim amendment filed, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made.
Further, regarding the new limitations recited in claims 1, 8, and 15, it is submitted that they are properly addressed by the new ground of rejection.
Furthermore, it is also submitted that all limitations in pending claims, including those not specifically argued, are properly addressed. The reason is set forth in the rejections. See claim analysis below for detail.
Double Patenting
4. The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. A nonstatutory double patenting rejection is appropriate where the conflicting claims are not identical, but at least one examined application claim is not patentably distinct from the reference claim(s) because the examined application claim is either anticipated by, or would have been obvious over, the reference claim(s). See, e.g., In re Berg, 140 F.3d 1428, 46 USPQ2d 1226 (Fed. Cir. 1998); In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) or 1.321(d) may be used to overcome an actual or provisional rejection based on nonstatutory double patenting provided the reference application or patent either is shown to be commonly owned with the examined application, or claims an invention made as a result of activities undertaken within the scope of a joint research agreement. See MPEP § 717.02 for applications subject to examination under the first inventor to file provisions of the AIA as explained in MPEP § 2159. See MPEP § 2146 et seq. for applications not subject to examination under the first inventor to file provisions of the AIA. A terminal disclaimer must be signed in compliance with 37 CFR 1.321(b).
The filing of a terminal disclaimer by itself is not a complete reply to a nonstatutory double patenting (NSDP) rejection. A complete reply requires that the terminal disclaimer be accompanied by a reply requesting reconsideration of the prior Office action. Even where the NSDP rejection is provisional the reply must be complete. See MPEP § 804, subsection I.B.1. For a reply to a non-final Office action, see 37 CFR 1.111(a). For a reply to final Office action, see 37 CFR 1.113(c). A request for reconsideration while not provided for in 37 CFR 1.113(c) may be filed after final for consideration. See MPEP §§ 706.07(e) and 714.13.
The USPTO Internet website contains terminal disclaimer forms which may be used. Please visit www.uspto.gov/patent/patents-forms. The actual filing date of the application in which the form is filed determines what form (e.g., PTO/SB/25, PTO/SB/26, PTO/AIA/25, or PTO/AIA/26) should be used. A web-based eTerminal Disclaimer may be filled out completely online using web-screens. An eTerminal Disclaimer that meets all requirements is auto-processed and approved immediately upon submission. For more information about eTerminal Disclaimers, refer to www.uspto.gov/patents/apply/applying-online/eterminal-disclaimer.
Claims 1-20 are rejected on the ground of nonstatutory double patenting as being unpatentable over claims 1-5, 9-13, and 17-19 of U.S. Patent No. 11,921,754 (reference patent).
Although the claims at issue are not identical, they are not patentably distinct from each other because both are directed to a similar invention with similar limitations as demonstrated in the table below.
Claim chart: each claim of Instant Application No. 18/420,152 is listed first, followed by the corresponding claim of the reference, U.S. Patent No. 11,921,754.
1. A system comprising:
a computing device configured to:
obtain a plurality of data items from an incoming database;
select a categorization model from a model database based on the plurality of data items;
for each data item of the plurality of data items, determine whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item;
determine, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics;
compare the at least one unknown data item to public data of public resources;
generate a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data;
categorize the at least one unknown data item as the new topic;
generate a visualization including: a first indication indicating a frequency of topics of the plurality of data items, a second indication indicating that the at least one unknown data item was categorized as an unknown topic, and a third indication indicating that public resources were used to determine the title of the new topic; and
transmit the visualization to at least one of: (i) a user interface of an analyst device or (ii) a categorized database
1. A system comprising:
a computing device configured to:
obtain a plurality of data items over a threshold analysis period from an incoming database in response to a threshold analysis interval elapsing, the plurality of data items corresponding to at least one parameter;
select a categorization model from a model database based on the at least one parameter of the plurality of data items;
for each data item of the plurality of data items, apply the categorization model to the data item and identify at least one topic associated with the corresponding data item, by:
comparing the data item to a set of known topics; determining a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and
identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics;
for at least one unknown data item in the plurality of data items: access, via a distributed communications network, public resources;
compare the at least one unknown data item to public data of the public resources;
generate a new topic, outside the set of known topics, for the at least one unknown data item, wherein the new topic has a title determined based on the public data; and
categorize the at least one unknown data item as the new topic;
generate a visualization indicating a frequency of topics based on data items corresponding to each topic, the visualization including the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category and public resources were used to determine the title of the new topic; and
transmit the visualization to at least one of: (i) a user interface of an analyst device and (ii) a categorized database.
2. The system of claim 1, wherein the processor is configured to:
for each data item of the plurality of data items, apply a sentiment model to the data item to identify a sentiment of the data item; and
generate the visualization to include the identified sentiment of each data item of the plurality of data items.
2. The system of claim 1, wherein the computing device is configured to:
for each data item of the plurality of data items, apply a sentiment model to the data item to identify a sentiment of the data item; and
generate the visualization to include the identified sentiment of each data item of the plurality of data items.
3. The system of claim 1, wherein the categorization model is selected based on a language of the plurality of data items.
4. The system of claim 3, wherein the categorization model corresponds to at least one language.
4. The system of claim 1, wherein the categorization model implements a transformer-based machine learning model to determine at least one topic corresponding to each data item of the plurality of data items.
5. The system of claim 1, wherein the categorization model implements a transformer-based machine learning model to determine the at least one topic corresponding to each data item of the plurality of data items.
5. The system of claim 1, wherein the categorization model is configured to:
compare each data item of the plurality of data items to a set of known topics;
determine a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorize the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and identify the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics
1. A system comprising…
comparing the data item to a set of known topics;
determining a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics
6. The system of claim 1, wherein the visualization includes the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category.
1. A system comprising…
the visualization including the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category
7. The system of claim 1, wherein the visualization includes the new topic with an indication that public resources were used to determine the title of the new topic.
1. A system comprising:…
the visualization including…
public resources were used to determine the title of the new topic
8. A method comprising:
obtaining a plurality of data items from an incoming database;
select a categorization model from a model database based on the plurality of data items;
for each data item of the plurality of data items, determine whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item;
determine, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics;
compare the at least one unknown data item to public data of public resources;
generate a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data;
categorize the at least one unknown data item as the new topic;
generate a visualization including: a first indication indicating a frequency of topics of the plurality of data items, a second indication indicating that the at least one unknown data item was categorized as an unknown topic, and a third indication indicating that public resources were used to determine the title of the new topic; and
transmit the visualization to at least one of: (i) a user interface of an analyst device or (ii) a categorized database
6. A method comprising:
obtaining a plurality of data items over a threshold analysis period from an incoming database in response to a threshold analysis interval elapsing, the plurality of data items corresponding to at least one parameter;
selecting a categorization model from a model database based on the at least one parameter of the plurality of data items
for each data item of the plurality of data items, applying the categorization model to the data item and identify at least one topic associated with the corresponding data item, by:
comparing the data item to a set of known topics; determining a similarity based on a distance value between each known topic of the set of known topics and the data item; categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics;
for at least one unknown data item in the plurality of data items: accessing, via a distributed communications network, public resources;
comparing the at least one unknown data item to public data of the public resources;
generating a new topic, outside the set of known topics, for the at least one unknown data item, wherein the new topic has a title determined based on the public data; and
categorizing the at least one unknown data item as the new topic;
generating a visualization indicating a frequency of topics based on data items corresponding to each topic, the visualization including the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category and public resources were used to determine the title of the new topic; and
transmitting the visualization to at least one of: (i) a user interface of an analyst device and (ii) a categorized database.
9. The method of claim 8, further comprising: for each data item of the plurality of data items, applying a sentiment model to the data item to identify a sentiment of the data item; and
generating the visualization to include the identified sentiment of each data item of the plurality of data items.
7. The method of claim 6, further comprising: for each data item of the plurality of data items, applying a sentiment model to the data item to identify a sentiment of the data item; and generating the visualization to include the identified sentiment of each data item of the plurality of data items.
10. The method of claim 8, wherein the categorization model is selected based on a language of the plurality of data items.
8. The method of claim 6, wherein the at least one parameter is a language of the data item.
11. The method of claim 8, wherein the categorization model implements a transformer-based machine learning model to determine at least one topic corresponding to each data item of the plurality of data items.
10. The method of claim 6, wherein the categorization model implements a transformer-based machine learning model to determine the at least one topic corresponding to each data item of the plurality of data items.
12. The method of claim 8, further comprising: comparing each data item of the plurality of data items to a set of known topics;
determining a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and
identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics.
6. A method comprising…
comparing the data item to a set of known topics;
determining a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and
identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics;
13. The method of claim 8, wherein the visualization includes the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category.
6. A method comprising…
the visualization including the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category
14. The method of claim 8, wherein the visualization includes the new topic with an indication that public resources were used to determine the title of the new topic.
6. A method comprising…
public resources were used to determine the title of the new topic.
15. A non-transitory computer readable medium having instructions stored thereon, wherein the instructions, when executed by at least one processor, cause a device to perform operations comprising:
obtaining a plurality of data items from an incoming database;
select a categorization model from a model database based on the plurality of data items;
for each data item of the plurality of data items, determine whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item;
determine, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics;
compare the at least one unknown data item to public data of public resources;
generate a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data;
categorize the at least one unknown data item as the new topic;
generating a visualization including: a first indication indicating a frequency of topics of the plurality of data items, a second indication indicating that the at least one unknown data item was categorized as an unknown topic, and a third indication indicating that public resources were used to determine the title of the new topic; and
transmit the visualization to at least one of: (i) a user interface of an analyst device or (ii) a categorized database.
11. A non-transitory computer readable medium having instructions stored thereon, wherein the instructions, when executed by at least one processor, cause a device to perform operations comprising:
obtaining a plurality of data items over a threshold analysis period from an incoming database in response to a threshold analysis interval elapsing, the plurality of data items corresponding to at least one parameter;
selecting a categorization model from a model database based on the at least one parameter of the plurality of data items;
for each data item of the plurality of data items, applying the categorization model to the data item and identify at least one topic associated with the corresponding data item, by:
comparing the data item to a set of known topics; determining a similarity based on a distance value between each known topic of the set of known topics and the data item; categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics;
for at least one unknown data item in the plurality of data items: accessing, via a distributed communications network, public resources;
comparing the at least one unknown data item to public data of the public resources;
generating a new topic, outside the set of known topics, for the at least one unknown data item, wherein the new topic has a title determined based on the public data; and
categorizing the at least one unknown data item as the new topic;
generating a visualization indicating a frequency of topics based on data items corresponding to each topic, the visualization including the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category and public resources were used to determine the title of the new topic; and
transmitting the visualization to at least one of: (i) a user interface of an analyst device and (ii) a categorized database.
16. The non-transitory computer readable medium of claim 15, wherein the instructions, when executed by the at least one processor, cause the device to further perform operations comprising:
for each data item of the plurality of data items, applying a sentiment model to the data item to identify a sentiment of the data item; and generating the visualization to include the identified sentiment of each data item of the plurality of data items.
12. The non-transitory computer readable medium of claim 11, wherein the instructions include:
for each data item of the plurality of data items, applying a sentiment model to the data item to identify a sentiment of the data item; and generating the visualization to include the identified sentiment of each data item of the plurality of data items.
17. The non-transitory computer readable medium of claim 15, wherein the categorization model is selected based on a language of the plurality of data items.
13. The non-transitory computer readable medium of claim 11, wherein: the at least one parameter is a language of the data item, the categorization model corresponds to at least one language
18. The non-transitory computer readable medium of claim 15, wherein the categorization model implements a transformer-based machine learning model to determine at least one topic corresponding to each data item of the plurality of data items.
13. and the categorization model implements a transformer-based machine learning model to determine the at least one topic corresponding to each data item of the plurality of data items.
19. The non-transitory computer readable medium of claim 15, wherein the operations further comprise:
comparing each data item of the plurality of data items to a set of known topics;
determining a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and
identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics.
11. A non-transitory computer readable medium…
comparing the data item to a set of known topics;
determining a similarity based on a distance value between each known topic of the set of known topics and the data item;
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic; and
identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics.
20. The non-transitory computer readable medium of claim 15, wherein the visualization includes the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category and public resources were used to determine the title of the new topic.
11. A non-transitory computer readable medium…
the visualization including the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category and public resources were used to determine the title of the new topic
As demonstrated by the mappings in the table above, US Patent No. 11,921,754 discloses or renders obvious all the features of the claims of the instant application.
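For illustration only, the threshold-distance categorization recited in both claim sets (a data item is categorized as a known topic when it is within a threshold distance of that topic, and is identified as unknown when it is outside the threshold distance of every known topic) may be sketched as follows. Neither the instant claims nor the reference claims specify a distance metric or a data representation; the Euclidean metric, the tuple vectors, the inclusive threshold comparison, and the names `euclidean` and `categorize` are assumptions made solely for this sketch and are not taken from either document.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def categorize(item_vec, known_topics, threshold):
    """Return the nearest known topic if it lies within `threshold`.

    Returns None when the item is outside the threshold distance of
    every known topic, i.e. the item is an "unknown data item".
    `known_topics` maps a hypothetical topic name to a vector.
    """
    best_topic, best_dist = None, float("inf")
    for topic, topic_vec in known_topics.items():
        dist = euclidean(item_vec, topic_vec)
        if dist < best_dist:
            best_topic, best_dist = topic, dist
    # "Within a threshold distance" is read here as <= threshold (assumption).
    return best_topic if best_dist <= threshold else None


# Hypothetical topics and vectors, for demonstration only.
known = {"billing": (0.0, 0.0), "shipping": (10.0, 0.0)}
near_item = categorize((1.0, 0.0), known, threshold=2.0)
far_item = categorize((5.0, 5.0), known, threshold=2.0)
```

Under these assumptions, `near_item` is categorized as a known topic, while `far_item` is identified as an unknown data item that would proceed to the public-resource comparison recited in the independent claims.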
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Matthew et al. (US 20130231920 A1) in view of Lightner et al. (US 20170116203 A1), Zhang et al. (US 20160292157 A1), and Parris et al. (US 20150269138 A1).
Regarding Claim 1, Matthew discloses a system comprising: a processor; and a non-transitory memory storing instructions that, when executed, cause the processor to ([0027]: FIG. 13 depicts an exemplary architecture for implementing a computing device 1300):
obtain a plurality of data items from an incoming database (Fig. 1; [0039] System 100 may include enterprise server 110, database server 120, one or more external sources 130, one or more internal sources 140; Fig. 2; [0055]: Using the identified criteria by block 220, a query may be performed in block 235 to retrieve unstructured text associated with the specified criteria from the entire set of available text documents 230);
select a categorization model from a model database based on the plurality of data items (Fig. 3; [0100]: Using a predefined categorization model 330, sentences from step 315 may be mapped to a category topic in step 335);
However, Matthew does not explicitly teach “for each data item of the plurality of data items, determine whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item; determine, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics; compare the at least one unknown data item to public data of public resources; generate a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data; categorize the at least one unknown data item as the new topic; generate a visualization indicating a frequency of topics of the plurality of data items; and transmit the visualization to at least one of: (i) a user interface of an analyst device or (ii) a categorized database.”
On the other hand, in the same field of endeavor, Lightner teaches for each data item of the plurality of data items,
determine whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item (Fig. 2; [Abstract]: According to an embodiment, topics within documents from a corpus may be discovered by applying multiple topic identification (ID) models… Relatedness between discovered topics may be determined by analyzing co-occurring topic IDs from the different models, assigning topic relatedness scores);
determine, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics ([0036]: The disclosed method for automated discovery of topic relatedness may be employed to… discover information from an unknown corpus; [0036]: Change detection computer module 116 may produce zero or more topics that are not represented in the old model);
compare the at least one unknown data item to public data of public resources (Fig. 1; [0036]: Change detection computer module 116 may also use term vector differences to compare and measure the significance of topics based on established thresholds; [0043]: The linked topics may be compared, in step 206, across all documents; [0034]: Some embodiments may develop multiple computer executed topic models against a large corpus of documents, including various types of knowledge bases from sources such as the internet and internal networks, among others);
generate a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data ([0036]: PNM computer module 110 produces another set of topics 112 defined by a new set 114 of term vectors. Change detection computer module 116 executes computer readable instructions for measuring differences between topics 106 and topics 112 to identify new topics that are not represented);
categorize the at least one unknown data item as the new topic (Fig. 2; [0043]: In step 202, topics in a document corpus are identified, via the computer system 100, for automated discovery of new topics. In step 204, the identified topics may be linked with documents within a corpus under analysis; [0039]: According to various embodiments, a large corpus of documents may be classified employing all the models generated in the system 100 for automated discovery of topics).
Additionally, Zhang teaches generate a visualization including (Fig. 6; [0075]: Presentation component(s) 616 present data indications to a user or other device):
a first indication indicating a frequency of topics of the plurality of data items ([0004]: Natural language processing is utilized to identify candidate topics that are then ranked by an Accumulated Term Frequency-Inverse Document Frequency (ATF-IDF) algorithm to identify trending topics... classified into categories; [0030]-[0031]: The term “Inverse Document Frequency” (IDF) refers to an indication of how common or rare a particular term is among a collection of posts, such as in a social media stream… The “relevance score” is the numerical indication of the relevance of a particular topic),
a second indication indicating that the at least one unknown data item was categorized as an unknown topic ([0034]: An “unknown topic” is an extracted topic that cannot be classified by the classification rules. In these instances, dictionary sources may be utilized to classify the unknown topics; Fig. 2; [0058]: After recognition component 216 applies all rules, recognition component 216 employs dictionary sources (e.g., Wikipedia) to assign category labels for unknown topics),
a third indication indicating that public resources were used to determine the title of the new topic ([0033]: The term “dictionary sources” refers to online dictionary sources, such as Wikipedia, that may be used to classify extracted topics into categories when the classification rules fail to properly classify an extracted topic); and
Furthermore, Parris teaches transmit the visualization ([0074]: Visualizations which may include, visual, interactive representations of the frequency and relationships of topics or ideas within a publication, may be generated by the scope tool using virtually any of the above discussed analysis techniques and inputs) to at least one of: (i) a user interface of an analyst device or (ii) a categorized database ([0107] In some cases, a display location, e.g., a website, for a visualization may have as associated publication metadata, including Title… displayed in an organized fashion).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Matthew to incorporate the teachings of Lightner, Zhang, and Parris to generate a topic for an unknown data item and transmit a visualization to a user interface of an analyst device or a categorized database.
The motivation for doing so would be to perform real-time topic analysis as recognized by Zhang ([Abstract]: Real-time topic analysis for social listening is performed to help users and organizations in discovering and understanding trending topics in varying degrees of granularity), and generate visualizations that present information to support consumer decisions, as recognized by Parris ([Abstract] of Parris: The visualizations present information to support consumer decisions to read, submit, or otherwise interact with the publication).
Regarding Claim 2, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the system of claim 1.
Matthew further teaches wherein the processor is configured to: for each data item of the plurality of data items, apply a sentiment model to the data item to identify a sentiment of the data item ([0020]: FIG. 6 depicts an exemplary logic flow 600 for mapping sentences to observations and baselines for a non-trend report where a user selects a sentiment observation to explain in accordance with one or more embodiments; [0044]: Sentiment scoring engine 112 may identify a value representing the general feeling, attitude or opinion that an author of a section of unstructured text is expressing towards a situation or event); and
Additionally, Parris teaches generate the visualization to include the identified sentiment of each data item of the plurality of data items ([0104]: Visualizations may also present representations of user feedback and activity… Questions of sentiment can be gathered from Sentiment Analysis and from direct interrogation of users).
Regarding Claim 3, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the system of claim 1.
Matthew further teaches wherein the categorization model is selected based on a language of the plurality of data items ([0043]: Natural language processing engine 111 may include subsystems to process unstructured text, including, but not limited to, language detection).
Regarding Claim 4, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the system of claim 1.
Matthew further teaches wherein the categorization model implements a transformer-based machine learning model to determine at least one topic corresponding to each data item of the plurality of data items (Fig. 3; [0100]: Using a predefined categorization model 330, sentences from step 315 may be mapped to a category topic in step 335. The mapping may be applied using predefined mapping rules that are part of model 330 or may be applied using a trained machine learning model).
Regarding Claim 5, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the system of claim 1.
Matthew further teaches wherein the categorization model is configured to: compare each data item of the plurality of data items to a set of known topics (Fig. 2; [0059]: Using criteria identified in block 225, a query may be performed to retrieve all unstructured text associated with the comparison criteria, 240, from the entire set of available text documents 230);
determine a similarity based on a distance value between each known topic of the set of known topics and the data item ([0060]: From this set of text documents, 240, qualifying features may be identified, aggregated and quantified in terms of volume, sentiment, customer satisfaction and any other metric used for analysis in block 255. The features identified may include words, word relationships (e.g. a pair of syntactically linked words), topics of discussion… and topic categorization).
Regarding Claim 6, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the system of claim 1.
Parris further teaches wherein the visualization includes the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category ([0074]: Visualizations which may include, visual, interactive representations of the frequency and relationships of topics or ideas within a publication, may be generated by the scope tool… [0079]: Interactions include changing time scale, looking at different topic ‘resolutions’, e.g. scientific fields, general topic descriptors, specific topic descriptors or keywords and changing classification systems or ontologies).
Regarding Claim 7, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the system of claim 1.
Parris further teaches wherein the visualization includes the new topic with an indication that public resources were used to determine the title of the new topic ([0074]: Visualizations which may include, visual, interactive representations of the frequency and relationships of topics or ideas within a publication, may be generated by the scope tool… [0079]: Interactions include changing time scale, looking at different topic ‘resolutions’, e.g. scientific fields, general topic descriptors, specific topic descriptors or keywords and changing classification systems or ontologies).
Regarding Claim 8, Matthew discloses a method comprising:
obtaining a plurality of data items from an incoming database (Fig. 1; [0039]: System 100 may include enterprise server 110, database server 120, one or more external sources 130, one or more internal sources 140; Fig. 2; [0055]: Using the identified criteria by block 220, a query may be performed in block 235 to retrieve unstructured text associated with the specified criteria from the entire set of available text documents 230);
selecting a categorization model from a model database based on the plurality of data items (Fig. 3; [0100]: Using a predefined categorization model 330, sentences from step 315 may be mapped to a category topic in step 335).
However, Matthew does not explicitly teach “for each data item of the plurality of data items, determining whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item; determining, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics; comparing the at least one unknown data item to public data of public resources; generating a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data; categorizing the at least one unknown data item as the new topic; generating a visualization indicating a frequency of topics of the plurality of data items; and transmit the visualization to at least one of: (i) a user interface of an analyst device or (ii) a categorized database.”
On the other hand, in the same field of endeavor, Lightner teaches for each data item of the plurality of data items,
determining whether the data item is associated with a topic in a set of known topics by applying the categorization model to the data item (Fig. 2; [Abstract]: According to an embodiment, topics within documents from a corpus may be discovered by applying multiple topic identification (ID) models… Relatedness between discovered topics may be determined by analyzing co-occurring topic IDs from the different models, assigning topic relatedness scores);
determining, among the plurality of data items, at least one unknown data item that is not associated with any topic in the set of known topics ([0036]: The disclosed method for automated discovery of topic relatedness may be employed to… discover information from an unknown corpus; [0036]: Change detection computer module 116 may produce zero or more topics that are not represented in the old model);
comparing the at least one unknown data item to public data of public resources (Fig. 1; [0036]: Change detection computer module 116 may also use term vector differences to compare and measure the significance of topics based on established thresholds; [0043]: The linked topics may be compared, in step 206, across all documents; [0034]: Some embodiments may develop multiple computer executed topic models against a large corpus of documents, including various types of knowledge bases from sources such as the internet and internal networks, among others),
generating a new topic, which is different from any topic in the set of known topics, for the at least one unknown data item based on the comparison, wherein the new topic has a title determined based on the public data ([0036]: PNM computer module 110 produces another set of topics 112 defined by a new set 114 of term vectors. Change detection computer module 116 executes computer readable instructions for measuring differences between topics 106 and topics 112 to identify new topics that are not represented);
categorizing the at least one unknown data item as the new topic (Fig. 2; [0043]: In step 202, topics in a document corpus are identified, via the computer system 100, for automated discovery of new topics. In step 204, the identified topics may be linked with documents within a corpus under analysis; [0039]: According to various embodiments, a large corpus of documents may be classified employing all the models generated in the system 100 for automated discovery of topics).
Additionally, Zhang teaches generating a visualization including (Fig. 6; [0075]: Presentation component(s) 616 present data indications to a user or other device):
a first indication indicating a frequency of topics of the plurality of data items ([0004]: Natural language processing is utilized to identify candidate topics that are then ranked by an Accumulated Term Frequency-Inverse Document Frequency (ATF-IDF) algorithm to identify trending topics... classified into categories; [0030]-[0031]: The term “Inverse Document Frequency” (IDF) refers to an indication of how common or rare a particular term is among a collection of posts, such as in a social media stream… The “relevance score” is the numerical indication of the relevance of a particular topic),
a second indication indicating that the at least one unknown data item was categorized as an unknown topic ([0034]: An “unknown topic” is an extracted topic that cannot be classified by the classification rules. In these instances, dictionary sources may be utilized to classify the unknown topics; Fig. 2; [0058]: After recognition component 216 applies all rules, recognition component 216 employs dictionary sources (e.g., Wikipedia) to assign category labels for unknown topics),
a third indication indicating that public resources were used to determine the title of the new topic ([0033]: The term “dictionary sources” refers to online dictionary sources, such as Wikipedia, that may be used to classify extracted topics into categories when the classification rules fail to properly classify an extracted topic); and
Furthermore, Parris teaches transmitting the visualization ([0074]: Visualizations which may include, visual, interactive representations of the frequency and relationships of topics or ideas within a publication, may be generated by the scope tool using virtually any of the above discussed analysis techniques and inputs) to at least one of: (i) a user interface of an analyst device or (ii) a categorized database ([0107]: In some cases, a display location, e.g., a website, for a visualization may have as associated publication metadata, including Title… displayed in an organized fashion).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the teachings of Matthew to incorporate the teachings of Lightner, Zhang, and Parris to generate a topic for an unknown data item and transmit a visualization to a user interface of an analyst device or a categorized database.
The motivation for doing so would be to perform real-time topic analysis as recognized by Zhang ([Abstract]: Real-time topic analysis for social listening is performed to help users and organizations in discovering and understanding trending topics in varying degrees of granularity), and generate visualizations that present information to support consumer decisions, as recognized by Parris ([Abstract] of Parris: The visualizations present information to support consumer decisions to read, submit, or otherwise interact with the publication).
Regarding Claim 9, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the method of claim 8.
Matthew further teaches, for each data item of the plurality of data items, applying a sentiment model to the data item to identify a sentiment of the data item ([0020]: FIG. 6 depicts an exemplary logic flow 600 for mapping sentences to observations and baselines for a non-trend report where a user selects a sentiment observation to explain in accordance with one or more embodiments; [0044]: Sentiment scoring engine 112 may identify a value representing the general feeling, attitude or opinion that an author of a section of unstructured text is expressing towards a situation or event); and
Additionally, Parris teaches generating the visualization to include the identified sentiment of each data item of the plurality of data items ([0104]: Visualizations may also present representations of user feedback and activity… Questions of sentiment can be gathered from Sentiment Analysis and from direct interrogation of users).
Regarding Claim 10, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the method of claim 8.
Matthew further teaches wherein the categorization model is selected based on a language of the plurality of data items ([0043]: Natural language processing engine 111 may include subsystems to process unstructured text, including, but not limited to, language detection).
Regarding Claim 11, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the method of claim 8.
Matthew further teaches wherein the categorization model implements a transformer-based machine learning model to determine at least one topic corresponding to each data item of the plurality of data items (Fig. 3; [0100]: Using a predefined categorization model 330, sentences from step 315 may be mapped to a category topic in step 335. The mapping may be applied using predefined mapping rules that are part of model 330 or may be applied using a trained machine learning model).
Regarding Claim 12, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the method of claim 8.
Matthew further teaches comparing each data item of the plurality of data items to a set of known topics (Fig. 2; [0059]: Using criteria identified in block 225, a query may be performed to retrieve all unstructured text associated with the comparison criteria, 240, from the entire set of available text documents 230);
determining a similarity based on a distance value between each known topic of the set of known topics and the data item ([0060]: From this set of text documents, 240, qualifying features may be identified, aggregated and quantified in terms of… any other metric used for analysis in block 255. The features identified may include words, word relationships (e.g. a pair of syntactically linked words), topics of discussion… and topic categorization);
categorizing the data item as a corresponding known topic of the set of known topics when the data item is within a threshold distance of the corresponding known topic ([0060]: The features may be identified by a natural language processing engine that supports… topic categorization); and
identifying the data item as an unknown data item when the data item is outside the threshold distance of each known topic of the set of known topics ([0060]: From this set of text documents, 240, qualifying features may be identified, aggregated and quantified in terms of… any other metric used for analysis in block 255).
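For illustration only, the threshold-distance categorization recited in claim 12 may be sketched as follows. This is a minimal sketch under stated assumptions and is not drawn from Matthew or any other cited reference: the vector representation of data items and topics, the use of cosine distance, the function names cosine_distance and categorize, the sample topics, and the threshold value of 0.3 are all hypothetical.

```python
import math

# Hypothetical sketch: none of these names or values come from the cited references.

def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 is returned if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def categorize(item_vec, known_topics, threshold=0.3):
    """Assign the nearest known topic when it lies within the threshold distance;
    otherwise flag the data item as unknown."""
    best_topic, best_dist = None, float("inf")
    for name, topic_vec in known_topics.items():
        dist = cosine_distance(item_vec, topic_vec)
        if dist < best_dist:
            best_topic, best_dist = name, dist
    if best_dist <= threshold:
        return best_topic
    return "UNKNOWN"

# Assumed toy topic vectors, for demonstration only.
KNOWN_TOPICS = {
    "billing": [1.0, 0.0, 0.0],
    "shipping": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    print(categorize([0.9, 0.1, 0.0], KNOWN_TOPICS))
    print(categorize([0.0, 0.0, 1.0], KNOWN_TOPICS))
```

In this sketch, a data item outside the threshold distance of every known topic is returned as "UNKNOWN", which corresponds to the identifying step of the claim; any other distance measure could be substituted.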
Regarding Claim 13, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the method of claim 8.
Parris further teaches wherein the visualization includes the new topic with an indication that the at least one unknown data item was categorized into an unknown topic category ([0074]: Visualizations which may include, visual, interactive representations of the frequency and relationships of topics or ideas within a publication, may be generated by the scope tool… [0079]: Interactions include changing time scale, looking at different topic `resolutions`, e.g. scientific fields, general topic descriptors, specific topic descriptors or keywords and changing classification systems or ontologies).
Regarding Claim 14, the combined teachings of Matthew, Lightner, Zhang, and Parris disclose the method of claim 8.
Parris further teaches wherein the visualization includes the new to