Prosecution Insights
Last updated: April 19, 2026
Application No. 18/653,885

QUESTION MINING METHOD, ELECTRONIC DEVICE, AND NON-TRANSITORY STORAGE MEDIA

Final Rejection: §101, §103
Filed: May 02, 2024
Examiner: BLACK, LINH
Art Unit: 2163
Tech Center: 2100 — Computer Architecture & Software
Assignee: Mashang Consumer Finance Co. Ltd.
OA Round: 2 (Final)
Grant Probability: 51% (Moderate)
OA Rounds: 3-4
To Grant: 5y 1m
With Interview: 62%

Examiner Intelligence

Career Allow Rate: 51% (222 granted / 437 resolved; -4.2% vs TC avg)
Interview Lift: +11.5% on resolved cases with interview (moderate, approx. +12% lift)
Avg Prosecution: 5y 1m (typical timeline)
Career History: 477 total applications across all art units; 40 currently pending

Statute-Specific Performance

§101: 12.3% (-27.7% vs TC avg)
§103: 64.0% (+24.0% vs TC avg)
§102: 16.5% (-23.5% vs TC avg)
§112: 3.3% (-36.7% vs TC avg)
Tech Center averages are estimates • Based on career data from 437 resolved cases

Office Action

§101 §103
DETAILED ACTION

This communication is in response to the application filed 12/26/2025. Claims 1-20 are pending in the application.

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Arguments

Applicant's arguments filed 12/26/2025 have been fully considered but they are not persuasive. In response to applicant's argument that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., "automated question mining based on a pre-built standard question database", "(1) an 'improved TF-IDF algorithm'", "the point-wise mutual information (PMI) method", "processing large-scale text datasets, performing complex statistical calculations", etc.) are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

Regarding the argument that "the human brain lacks the capacity to process massive text datasets, execute sophisticated statistical algorithms, or access structured databases with the speed and accuracy required by the claimed method", the claims do not contain the limitations "massive", "execute sophisticated statistical algorithms", or "structured databases". Data mining has long been practiced in the technology art, e.g., extracting keywords and analyzing high-value search terms from various data sources to identify current trends and events in real life, including weather, disease, popular products, discussions relating to a topic/category, etc. The limitation "a pre-built standard question database" reads on, e.g., the SAT Suite or curated collections of standardized questions, FAQs, etc., where question sets are pre-built to speed up the process of learning and facilitate searching or data gathering for users. Specification para. 25 states "the importance degree of each word to the first intent category may be based on the occurrence information (such as the number of occurrences, frequency of occurrences, etc.) of each word of the first standard question text". Under the BRI, the steps of claim 1 are mental processes of evaluations or judgments that are performable mentally and/or with pen and paper, and "the first standard question text" can contain one or more words.

Regarding arguments II-III, the features "automatically generating high-quality target question texts with improved semantic generalization and accurate intent classification", "(using an improved TF-IDF algorithm)", and "Critically, the entire mining process is fully automated (i.e., requiring no human intervention)" are not recited in the claims. As to the argument of "improving computer functionality", under the BRI the claims are broad and are mental processes; although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

In response to applicant's arguments on page 6 of the Remarks that the references fail to show certain features of the invention, it is noted that the features upon which applicant relies (i.e., "mining question texts from an external text source (i.e., a text set separate from the original database)", "employs a cross-domain question-mining scheme comprising a standard question database + external target text set", "applies these rules to an external target text set to automatically mine target question texts", etc., and "This cross-domain approach, which utilizes a standard database to inform mining from an external text source") are not recited in the rejected claim(s). Although the claims are interpreted in light of the specification, limitations from the specification are not read into the claims. See In re Van Geuns, 988 F.2d 1181, 26 USPQ2d 1057 (Fed. Cir. 1993).

In response to applicant's argument that there is no teaching, suggestion, or motivation to combine the references on page 8 of the Remarks, the examiner recognizes that obviousness may be established by combining or modifying the teachings of the prior art to produce the claimed invention where there is some teaching, suggestion, or motivation to do so found either in the references themselves or in the knowledge generally available to one of ordinary skill in the art. See In re Fine, 837 F.2d 1071, 5 USPQ2d 1596 (Fed. Cir. 1988); In re Jones, 958 F.2d 347, 21 USPQ2d 1941 (Fed. Cir. 1992); and KSR International Co. v. Teleflex Inc., 550 U.S. 398, 82 USPQ2d 1385 (2007). In this case, motivations to combine the references are on pages 11 and 23 of the Office action dated 10/1/2025. In addition, term frequency is the count of occurrences of a word within a document, used to determine its importance within said document. Even though Kim teaches term frequency (TF-IDF) at paras. 34 and 78, Kim does not explicitly mention occurrences. The teachings of Kostoff were applied to teach said limitation; see the cited columns and lines in the Office action dated 10/1/2025.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claims 1-10 fall within the statutory category of a process. Claims 11-19 fall within the statutory category of an apparatus or system. Claim 20 falls within the statutory category of an article of manufacture. Please see below.

Step 2A, Prong One: the claims recite a judicial exception.
Claim 1's preamble, "a question-mining method", is intended use stating the abstract idea at a high level: a mental evaluation or judgment. The step of "obtaining a pre-built standard question database; wherein the standard question database comprises a first standard question text, the first standard question text corresponds to a first intent category, and the first standard question text comprises a plurality of words" is a mental judgment or selection of elements, or an additional element of insignificant extra-solution activity as "selecting information" for analysis per MPEP 2106.05(g).

The step of "mining keywords of the first intent category from the plurality of words according to an importance degree of each word of the first standard question text corresponding to the first intent category, wherein the plurality of words comprises the keywords and non-keywords" is a mental process, an evaluation or judgment, and mathematical calculations. Extracting/analyzing questions, including words/keywords or non-keywords/stop-words, is performable mentally and with pen and paper.

The step of "determining a co-occurrence word of the keywords according to co-occurrence information of the keywords and the non-keywords in the standard question database; and mining a target question text from a pre-obtained target text set according to the co-occurrence word of the keywords" is a mental process as an evaluation or judgment, i.e., reviewing and extracting top candidates for the target question set.

The independent claims 11 and 20 recite limitations of commensurate scope. For the reasons stated above for claim 1, claims 11 and 20 also recite mental processes and mathematical calculations, which are an abstract idea.
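For orientation, the claim 1 flow characterized above (importance-based keyword mining, co-occurrence word determination, then target-text mining) can be sketched in a few lines. Everything here is an illustrative assumption: the toy data, the frequency-ratio importance measure, and the 0.9/0.5 thresholds are not the application's actual algorithm.

```python
# Toy standard question database of (intent category, question text)
# pairs. All data, measures, and thresholds below are illustrative
# assumptions, not taken from the application.
standard_db = [
    ("billing", "how do i pay my bill"),
    ("billing", "where can i see my bill amount"),
    ("login",   "how do i reset my password"),
]

def importance_degree(word, intent, db):
    """Crude importance: the word's frequency inside the intent's
    questions divided by its frequency across the whole database."""
    in_intent = sum(q.split().count(word) for c, q in db if c == intent)
    overall = sum(q.split().count(word) for _, q in db)
    return in_intent / overall if overall else 0.0

intent = "billing"
words = {w for c, q in standard_db if c == intent for w in q.split()}
# Keywords: words whose importance degree meets a preset threshold.
keywords = {w for w in words
            if importance_degree(w, intent, standard_db) >= 0.9}
non_keywords = words - keywords

def cooccurs(kw, nk, db):
    """Fraction of questions containing the keyword that also contain
    the non-keyword, compared against an arbitrary threshold."""
    both = sum(1 for _, q in db if kw in q.split() and nk in q.split())
    with_kw = sum(1 for _, q in db if kw in q.split())
    return bool(with_kw) and both / with_kw >= 0.5

cooccurrence_words = {nk for kw in keywords for nk in non_keywords
                      if cooccurs(kw, nk, standard_db)}

# Mine target question texts: keep texts from a pre-obtained target set
# that contain a keyword together with one of its co-occurrence words.
target_set = ["when is my bill due", "what is the weather today"]
mined = [t for t in target_set
         if any(kw in t.split() for kw in keywords)
         and any(cw in t.split() for cw in cooccurrence_words)]
```

On this toy data only the billing question survives, since it contains both a mined keyword and one of its co-occurrence words.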
The limitations "An electronic device, comprising a processor and a memory electrically connected to the processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program stored the memory to perform operations comprising" in claim 11 and "A non-transitory computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is executed by a processor to perform operations" in claim 20 are mere "apply it" on a computer per MPEP 2106.05(f). Thus, the claims are directed to abstract ideas as a mental process and mathematical calculations.

Step 2A, Prong Two: the exception is not integrated into a practical application.

The judicial exception is not integrated into a practical application because the additional elements, individually and in combination, do not impose meaningful limits on the judicial exception. In particular, the additional elements are the device and storage-medium limitations of claims 11 and 20 quoted above, which merely apply the abstract idea on a computer per MPEP 2106.05(f) and do not provide integration into a practical application or significantly more. Thus, claims 1, 11, and 20 are directed to abstract ideas.

Step 2B: "Inventive Concept" or "Significantly More"

The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the additional elements, considered both individually and as an ordered combination, do not amount to significantly more than the abstract idea.
Here, the claims recite generic computer components (e.g., “An electronic device, comprising a processor and a memory electrically connected to the processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program stored the memory to perform operations comprising” in claim 11 and “A non-transitory computer-readable storage medium, wherein the storage medium stores a computer program, and the computer program is executed by a processor to perform operations” in claims 11 and 20) performing generic computing functions that are well understood, routine, and conventional (e.g., plotting data, reorganizing data, forecasting data). See Alice, 573 U.S. at 226 (“Nearly every computer will include a “communications controller’ and [a] ‘data storage unit’ capable of performing the basic calculation, storage, and transmission functions required by the method claims.”); In re TLI Commc’ns LLC Pat. Litig., 823 F.3d 607, 614 (Fed. Cir. 2016) (holding generic computer components insufficient to add an inventive concept to an otherwise abstract idea); buySAFE, Inc. v. Google, Inc., 765 F.3d 1350, 1355 (Fed. Cir. 2014) (“That a computer receives and sends the information over a network--with no further specification--is not even arguably inventive”). Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. Looking at the limitations as an ordered combination adds nothing that is not already present when looking at the elements taken individually. There is no indication that the combination of elements improves the functioning of the computer or improves another technology. The claims do not amount to significantly more than the underlying abstract idea. 
Claims 2-5, and similar claims 12-15, recite:

… "determining a target long text corresponding to the first intent category according to the first standard question text, wherein the target long text comprises at least one first standard question text; determining a first occurrence information of each word of the target long text in the target long text, and determining a second occurrence information of each word of the target long text in the standard question database; determining an importance degree of each word of the target long text corresponding to the first intent category according to the first occurrence information and the second occurrence information; and mining the keywords of the first intent category from the plurality of words according to the importance degree of each word of the target long text corresponding to the first intent category, wherein the importance degrees of the keywords are higher than or equal to a preset importance degree threshold";

… "determining a first occurrence number of a first word of the target long text; wherein the first word is any word of the target long text; and determining the occurrence frequency of the first word of the target long text according to first occurrence number and a total number of words of the target long text";

… "determining a first text number of the target long text corresponding to each intent category in the standard question database that comprises a second word; determining the inverse document frequency corresponding to the second word according to the first text number and a total number of texts of the target long text";

… "determining a second text number of the standard question text that comprises the keywords in N standard question texts; and determining a third text number of the standard question texts that comprise both the keywords and the non-keywords in the N standard question texts; determining the co-occurrence degree of the keywords and the non-keywords in the standard question database according to the second text number, the third text number and the total number of the N standard question texts; and determining that the non-keywords as the co-occurrence words in response to that the co-occurrence degree of the keywords and the non-keywords is greater than or equal to a preset threshold".

Under a broadest reasonable interpretation, said limitations recite mental processes and mathematical calculations, which are abstract ideas, because the query intent categories are defined and the occurrence frequency of each query word is calculated to determine the degree of importance of each query word for the final output of results to the users. The claims cover performance of the limitations in the mind or with pen and paper and fall within the "Mental Processes" and "Mathematical Concepts" groupings of abstract ideas. Accordingly, the claims recite an abstract idea.

Claims 6-10 and similar claims 16-19 add further limitations which are also directed to an abstract idea.
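The three computations recited for claims 3-5 above (occurrence frequency as a count over total words, inverse document frequency from text counts, and a thresholded co-occurrence degree) can be sketched as follows. The data is invented, and the conditional-frequency form of the co-occurrence degree is one plausible reading; the claims do not recite an exact formula.

```python
import math

# Illustrative data: a "target long text" (concatenation of one intent's
# standard questions) and a toy standard question database.
long_text = "how do i pay my bill how do i check my bill".split()
db_texts = [
    "how do i pay my bill".split(),
    "how do i check my bill".split(),
    "how do i reset my password".split(),
]

# Claims 3/13: occurrence frequency = occurrence number / total words.
tf = {w: long_text.count(w) / len(long_text) for w in set(long_text)}

# Claims 4/14: inverse document frequency from the number of texts
# containing the word and the total number of texts.
def idf(word, texts):
    containing = sum(1 for t in texts if word in t)
    return math.log(len(texts) / containing) if containing else 0.0

# Claims 5/15: co-occurrence degree from the number of texts containing
# the keyword (second text number), the number containing both keyword
# and non-keyword (third text number), and the total N. Conditional
# frequency is an assumed formula, not one recited in the claims.
def cooccurrence_degree(kw, nk, texts):
    second = sum(1 for t in texts if kw in t)
    third = sum(1 for t in texts if kw in t and nk in t)
    return third / second if second else 0.0

# A non-keyword becomes a co-occurrence word when the degree meets a
# preset threshold (0.5 here is arbitrary).
is_cooccurrence_word = cooccurrence_degree("bill", "my", db_texts) >= 0.5
```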
The claims recite:

… "screening a candidate question text from the target text set; wherein the candidate question text comprises both the keywords and the non-keywords; predicting the intent category to which the candidate question text belongs, and obtaining a prediction result of the candidate question text; determining whether the candidate question text is the target question text according to the prediction result";

… "clustering the N standard question texts to obtain a clustering result, wherein the clustering result comprises a plurality of question text sets, and each of the question text sets comprises a plurality of the standard question texts; determining a central question text for each of the question text sets, wherein the central question text is the standard question text closest to a clustering center corresponding to the question text set; from a plurality of central question texts, selecting a central question text with a highest degree of similarity with the candidate question text; and determining the intent category of the central question text with the highest similarity with the candidate question text as the first prediction intent category";

… "in response to that the first prediction intent category is the same as the intent category corresponding to the keyword, determining the candidate question text as the target question text; and in response to that the first prediction intent category and the intent category corresponding to the keyword are different, determining the candidate question text not to be the target question text";

… "using a pre-trained intent recognition model to predict the intent category to which the candidate question text belongs, and obtaining the probability that the candidate question text belongs to each intent category; wherein the intent recognition model is obtained by training according to sample question texts and sample intent categories of the sample question texts";

… "calculating an information entropy of the candidate question text according to the probability that the candidate question text belongs to each intent category; in response to that the information entropy is greater than or equal to the preset information entropy threshold, determining the candidate question text to be the target question text; in response to that the information entropy is less than the preset information entropy threshold, determining the candidate question text not to be the target question text".

Said steps can be performed using human mental evaluation or judgment, and fall into the abstract idea of mental processes performable mentally or with pen and paper, and mathematical concepts, similar to the independent claims. Because the additional elements do not impose meaningful limitations on the judicial exception and the additional elements are well-understood, routine, and conventional functionalities in the art, the claims are directed to an abstract idea and are not patent eligible.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6 and 11-16 are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 20190392066) in view of Kostoff (US 6886010).

As per claims 1, 11, 20: Kim et al. teaches a question-mining method, comprising: obtaining a pre-built standard question database; wherein the standard question database comprises a first standard question text, the first standard question text corresponds to a first intent category, and the first standard question text comprises a plurality of words (para. 38-40: a semantic analysis-based procedural natural language QA system can perform semantic analysis on the user query (question) and candidate query results (more specifically, the titles of the candidate query results) that are both in a natural language to generate structured semantic representations of the user query and the candidate query results. The candidate query results are retrieved from a pre-built query result repository, where each query result in the query result repository includes a query result excerpted from an official document written by professionals (e.g., product manuals and official help websites).
A paraphrase mining technique is used to extract paraphrasing rules from user interaction data, where each paraphrasing rule includes structured semantic representations of a pair of similar queries and the associated similarity score); mining keywords of the first intent category from the plurality of words according to an importance degree of each word of the first standard question text corresponding to the first intent category, wherein the plurality of words comprise the keywords and non-keywords (para. 34, 72: because the triples could be different in terms of their importance, the query result scoring engine treats each triple differently. For example, in the semantic representation of the title of the candidate query result shown in FIG. 5B, triple ("save," Object, "gif format") may be more important than triple ("save," Context_Product, "photoshop") because, if the semantic representation of the query does not include a triple mapped to ("save," Object, "gif format"), the candidate query result is unlikely to be a correct one; fig. 5A: keywords: create, gif; non-keywords: how to; para. 78).

Kim et al. does not explicitly teach determining a co-occurrence word of the keywords according to co-occurrence information of the keywords and the non-keywords in the standard question database; and mining a target question text from a pre-obtained target text set according to the co-occurrence word of the keywords. Kostoff teaches said limitations at col. 3:26-39: using a test query to retrieve a relative sample of documents from a database, classifying the retrieved documents as relevant or not relevant, finding text element (phrase) frequencies and text element co-occurrences in at least the relevant documents, grouping the extracted text elements into thematic categories, and then using the thematic grouping and phrase frequency data to develop new queries and query terms; col. 4:34-38: co-occurrence: occurrence of two or more text elements (e.g., words, phrases) in the same text domain (e.g., sentence, paragraph, Abstract); co-occurrence frequency: frequency of occurrence of two or more text elements in the same text domain; col. 7:14-16: use text and data mining to identify topical matters that have been emphasized in prior research.

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim to include the determining of a co-occurrence word of the keywords and the mining of a target question text of Kostoff in order to determine co-occurring words and leverage them for mining target question texts, identifying relationships and enhancing retrieval precision. Co-occurrence analysis reveals semantically related terms, which provides an advantage by expanding a query's context to find relevant questions that may not use the exact keyword.

As per claims 2, 12: Kim et al. teaches wherein the mining keywords of the first intent category from the plurality of words according to the importance degree of each word of the first standard question text to the first intent category comprises: determining a target long text corresponding to the first intent category according to the first standard question text, wherein the target long text comprises at least one first standard question text (para. 26: a paraphrase mining technique is used to generate paraphrasing rules from user interaction data. Each paraphrasing rule includes structured semantic representations of two user queries and a similarity score between the two structured semantic representations. The paraphrasing rules can be used to more accurately align the semantic structures and determine the similarity between user queries and candidate query results, thereby further improving the accuracy of natural language query result retrieval, in particular, in cases where the user query and the query result have a same meaning but are expressed in different ways; para. 38: the candidate query results are retrieved from a pre-built query result repository, where each query result in the query result repository includes a query result excerpted from an official document written by professionals (e.g., product manuals and official help websites) or is specifically written for the QA system by professionals, rather than a short span of texts extracted from a document. These query results are formally written and thus are grammatical, allowing the in-depth semantic analysis to be performed on them; para. 72: because the triples could be different in terms of their importance, the query result scoring engine treats each triple differently); determining an importance degree of each word of the target long text corresponding to the first intent category (para. 72; para. 78: Goal_Status would have a higher weight for the action "save" than Context_Product because Goal_Status carries more important information than Context_Product. The importance of the role generally depends on the type of action); mining the keywords of the first intent category from the plurality of words according to the importance degree of each word of the target long text corresponding to the first intent category, wherein the importance degrees of the keywords are higher than or equal to a preset importance degree threshold (para. 40-42: the paraphrasing rules (including the similarity score between the paraphrases) are used to score the candidate query results with respect to the user query. When a user submits a query, the QA system first retrieves multiple candidate query results from the query result repository. The QA system determines a match score for each of the candidate query results with respect to the user query. A candidate query result associated with the highest match score is selected from the candidate query results. If the selected query result has a match score greater than a threshold value, the selected query result is provided to the user (e.g., sent to the user device) as the query result for the user's query; para. 78: to model the importance of each individual word, the TF-IDF score can be derived from the query result repository based on the frequency the word appears in the query result repository and the inverse function of the number of documents in which the word appears).

See also Kim at para. 28 (the term "natural language query" refers to a query in a natural language, such as a sentence or a long phrase, rather than merely a set of one or more keywords; the meaning of a natural language query often depends on not only the individual terms in the natural language query but also the dependency or other relationship between the individual terms), para. 2 (query answering (QA) relates to using a computer system to provide query results for natural language queries to users; a QA system provides query results by, for example, querying a structured database of knowledge or information or by pulling query results from unstructured collections of natural language documents; the queries can include factual queries or procedural queries), and para. 23 (based on the natural language query, the QA system retrieves a set of candidate query results from a query result repository (e.g., a database) using, for example, keyword-based searching techniques, query expansion techniques, or other information retrieval techniques; the QA system then generates structured semantic representations (e.g., triples) for the candidate query results, and computes a match score between the natural language query and a given candidate query result based on these semantic representations; based on the match score between the natural language query and each candidate query result, one or more candidate query results with the highest match score can be selected as the query result).

Kim et al. does not teach occurrence in determining a first occurrence information of each word of the target long text in the target long text, and determining a second occurrence information of each word of the target long text in the standard question database, according to the first occurrence information and the second occurrence information. Kostoff teaches said limitations at col. 3:26-39, col. 4:34-38, and col. 7:14-16, as cited above for claim 1.

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim to include the determining of a co-occurrence word of the keywords and the mining of a target question text of Kostoff in order to determine co-occurring words and leverage them for mining target question texts, identifying relationships and enhancing retrieval precision. Co-occurrence analysis reveals semantically related terms, which provides an advantage by expanding a query's context to find relevant questions that may not use the exact keyword.

As per claims 3, 13: Kim et al. does not teach co-occurrence frequency. Kostoff teaches wherein the first occurrence information comprises an occurrence frequency and the determining the first occurrence information of each word of the target long text in the target long text comprises: determining a first occurrence number of a first word of the target long text (col. 4:34-38: co-occurrence: occurrence of two or more text elements (e.g., words, phrases) in the same text domain (e.g., sentence, paragraph, Abstract/long text); co-occurrence frequency: frequency of occurrence of two or more text elements in the same text domain); wherein the first word is any word of the target long text; and determining the occurrence frequency of the first word of the target long text according to the first occurrence number and a total number of words of the target long text (col. 3:55-58: the text element frequencies of occurrence within each group are summed to indicate a level of emphasis for each group; col. 6, first paragraph; col. 18:49-51: the co-occurrence of word pairs in the same document (word co-occurrence frequency) was computed, and a correlation matrix (659x659) of word pairs was generated).
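Kostoff's co-occurrence frequency as cited above (two text elements occurring in the same text domain) reduces to counting word pairs per domain. A minimal sketch, assuming sentence-level domains and invented sentences:

```python
from itertools import combinations
from collections import Counter

# Toy sentences; the sentence-level "text domain" is one of the domain
# choices Kostoff names (sentence, paragraph, Abstract).
sentences = [
    "the dog chased the cat",
    "the cat slept",
    "the dog barked",
]

# Count each unordered word pair once per domain in which both words
# appear, mirroring the "number of domains in which text element i
# co-occurs with text element j" form of the co-occurrence matrix.
pair_counts = Counter()
for s in sentences:
    for a, b in combinations(sorted(set(s.split())), 2):
        pair_counts[(a, b)] += 1
```

A full Kostoff-style pipeline would populate a symmetric matrix from these counts; a plain pair counter keeps the sketch short.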
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim to include the occurrence frequency of the word of Kostoff in order to uncover patterns and gain insights into the material or provide a quick summary of the text contents. As per claims 4, 14, Kim et al. teaches wherein the second occurrence information comprises an inverse document frequency and the determining the second occurrence information of each word of the target long text in the standard question database comprises: determining a first text number of the target long text corresponding to each intent category in the standard question database that comprises a second word; determining the inverse document frequency corresponding to the second word according to the first text number and a total number of texts of the target long text (para. 34: information retrieval techniques, such as term frequency-inverse document frequency (TF-IDF) or BM25, have been used for QA. These techniques generally assume independency between query terms (e.g., words), and score the query terms independently without considering its context; para. 78: to model the importance of each individual word, the TF-IDF score can be derived from the query result repository based on the frequency the word appears in the query result repository). As per claims 5, 15, Kim et al. does not teach co-occurrence degree. Kostoff teaches wherein the co-occurrence information comprises a co-occurrence degree, and the determining a co-occurrence word of the keywords according to the co-occurrence information of the keywords and the non-keywords in the standard question database comprises: determining a second text number of the standard question text that comprises the keywords in N standard question texts (col. 3:55-58: the text element frequencies of occurrence within each group are summed to indicate a level of emphasis for each group; col. 
6:5-9: For example, if "dog" is the first phrase, and "cat" is the second phrase, and n is 50, each occurrence of "dog" within fifty words of "cat" is a co-occurrence. A phrase co-occurrence may also be defined as the occurrence of phrases within the same paragraph or sentence); determining a third text number of the standard question texts that comprise both the keywords and the non-keywords in the N standard question texts; determining the co-occurrence degree of the keywords and the non-keywords in the standard question database according to the second text number, the third text number and the total number of the N standard question texts (col. 11:47-55: a co-occurrence matrix of the highest frequency text elements in each relevance category is generated. Each element of the text element Mij co-occurrence matrix is the number of times that text element i occurs in the same spatial domain as text element j. In practice, the co-occurrence matrix element is usually the number of domains in which text element i co-occurs with text element j; col. 12:12-28: The co-occurrence of text elements in the frequency analyzed documents is then analyzed to generate a list of co-occurrence pairs. Each of these co-occurrence pairs includes an anchor text element (selected so that each major thematic category generated by the grouping of text elements is represented by at least one anchor text element) and another extracted text element. This analysis generates a list of co-occurrence pairs including co-occurrence data for each listed co-occurrence pair. The co-occurrence data is combined with the frequency data for the extracted text elements; col. 33:5-55); determining the non-keywords as the co-occurrence words in response to that the co-occurrence degree of the keywords and the non-keywords is greater than or equal to a preset threshold (col. 39:20-26; col. 
47:6-16: phrase-this entry was a single, adjacent double, or adjacent triple word phrase that was located within a specified number of words from the theme phrase in one or both of the relevant/non-relevant categories. The capability also allowed specification of co-occurrence within the same Abstract, paragraph or sentence. The phrase survived a filtering by a trivial phrase algorithm, and the frequency of its occurrence in combination with the theme phrase in either the relevant or non-relevant category in the aggregate sample was above some pre-defined threshold. Each phrase contains keyword(s)/word(s) and non-keyword(s)/stopword(s)).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim to include the occurrence frequency of the word of Kostoff in order to uncover patterns and gain insights into the material or provide a quick summary of the text contents and/or provide relevant query results to the users.

As per claims 6, 16, Kim et al. does not teach co-occurrence word of the keywords. Kostoff teaches wherein the mining the target question text from the pre-obtained target text set according to the co-occurrence word of the keywords comprises: screening a candidate question text from the target text set; wherein the candidate question text comprises both the keywords and the non-keywords (col. 3:26-39: use text and data mining to identify topical matters that have been emphasized in prior research; by using a test query to retrieve a relative sample of documents from a database, classifying the retrieved documents as relevant or not relevant, finding text element (phrase) frequencies and text element co-occurrences in at least the relevant documents, grouping the extracted text elements into thematic categories, and then using the thematic grouping and phrase frequency data to develop new queries and query terms; col. 
12:23-28: expert system then reviews the frequency data for the extracted text elements and the co-occurrence data. From this analysis, the expert or expert system selects candidate query terms, thus forming a list. The list of candidate query terms should represent each of the thematic candidate terms); predicting the intent category to which the candidate question text belongs, and obtaining a prediction result of the candidate question text; determining whether the candidate question text is the target question text according to the prediction result (col. 3:26-39: use text and data mining to identify topical matters that have been emphasized in prior research; col. 10:23-28: the sample of retrieved documents is classified according to the documents' relevance to the subject matter of the search. The relevance classification may be binary, such as either 'relevant' or 'not relevant', or may be graded or ranked on a verbal or numerical scale (e.g., 1-5, or 'strongly relevant', 'moderately relevant', etc.); col. 11:33-36: the purpose of the groupings in each relevance category is to insure that the query has representation from each of the major themes of each relevance category. This will insure a balanced query, and that major themes are not overlooked; col. 35:23-41: highly relevant, or similar, articles provided comprehensive retrieval of papers in the specific target field of interest; col. 41:1-66).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim to include the occurrence of the word of Kostoff in order to uncover patterns and gain insights into the material or provide a quick summary of the text contents and/or provide relevant query results to the users.

Claim(s) 7-9, 17-19 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 20190392066) in view of Kostoff (US 6886010) and further in view of Di Fabbrizio et al. 
(US 20230106590).

As per claims 7-9, 17-19, Kim and Kostoff do not explicitly teach said claims. Di Fabbrizio et al. teaches wherein the prediction result comprises a first prediction intent category and the predicting the intent category to which the candidate question text belongs and obtaining a result of the candidate question text comprises: clustering the N standard question texts to obtain a clustering result, wherein the clustering result comprises a plurality of question text sets, and each of the question text sets comprises a plurality of the standard question texts (para. 84-85: the clustering step includes using deep learning clustering techniques, such as K-means, or Approximate Nearest Neighbors techniques, to cluster the QA pair data into clusters. In this way, each cluster can be associated to its centroid question; if the QA pair data includes data related to a dimension attribute of a product, the cluster including these questions could be assigned a label related to a dimension, such as “dimensions_questions”); determining a central question text for each of the question text sets, wherein the central question text is the standard question text closest to a clustering center corresponding to the question text set (para. 75: a clustering similarity metric can include a function that can quantify a measure of similarity between a numerical representation of a question and clusters of questions represented by centroids generated using a clustering technique as described herein. The clusters can represent questions that share the same topic. In some embodiments, the text classification similarity metric can include a function that can quantify a measure of similarity between two questions based on a numerical representation of two questions. For example, “fridges with an ice maker” and “fridges that have an ice maker” will have a high similarity measure; para. 
84); from a plurality of central question texts, selecting a central question text with a highest degree of similarity with the candidate question text; and determining the intent category of the central question text with the highest similarity with the candidate question text as the first prediction intent category (para. 3-4: the query type can be determined based on at least one of a clustering similarity metric, a text classification similarity metric, or retrieved information. Providing the updated mapping of attributes and attribute values can also include applying a threshold to the similarity metric and selecting an attribute whose similarity metric is above the threshold for inclusion in the updated mapping of attributes and attribute values. Providing the answer can further include generating the answer using a natural language engine based on the updated mapping of attributes and attribute values; para. 84-85: the clustering step can include using deep learning clustering techniques, such as K-means, or Approximate Nearest Neighbors techniques, to cluster the QA pair data into clusters. In this way, each cluster can be associated to its centroid question. The classification can include classifying the QA pair data into categories using the cluster centroids or other classification techniques).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim and Kostoff to include the clustering of question text of Di Fabbrizio et al. in order to provide query responses to users more accurately and quickly than existing systems, in that the question-answer expansion approach enables an enhanced user experience for search or query related tasks, providing users with a fulfilling interactive dialog when searching and relevant search results.

As per claims 8, 18, Kim and Kostoff do not explicitly teach said claims. Di Fabbrizio et al. 
teaches wherein the determining whether the candidate question text is the target question text according to the prediction result comprises: in response to that the first prediction intent category is the same as the intent category corresponding to the keyword, determining the candidate question text as the target question text (para. 86-89: The trained predictive model can extract the target topic of the questions. For example, the question “Does this fridge has an icemaker?” can be determined to include a target topic of “ice maker”; the target topics can be extracted from user queries using an unsupervised model, such as a tree-cut model. Entity linking can be performed to match the target topic of the query to one or more attributes in the product catalog 620. The product catalog 620 can include a taxonomy of products, product attributes, and the corresponding values of the product attributes. Once the linked attributes and the target product corresponding to the user's query are known, a text-based similarity metric can be determined between the product attributes and the detected attributes. The text-based similarity metric can be determined using a language model); and in response to that the first prediction intent category and the intent category corresponding to the keyword are different, determining the candidate question text not to be the target question text (para. 85: the QA pair data can be cleaned after classifying to remove spam questions, or unrelated questions from the original set of QA pair data. In this way, unrelated questions will not be erroneously associated with a cluster of questions and can be removed; para. 89-91: a threshold can be applied to retain only the most relevant attributes. A similarity metric value is determined for each attribute generated in step 2. For example, “ice maker” has a similarity metric value of 0.9, while “ice shape” has a similarity metric value of 0.5. 
In step 4, attributes are filtered and those having a similarity metric value below a predetermined threshold (e.g., 0.70) are discarded).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim and Kostoff to include the determining of the candidate question text as the target question text of Di Fabbrizio et al. in order to provide query responses to users more accurately and quickly than existing systems and to enhance the user experience for search or query related tasks, providing users with a fulfilling interactive dialog when searching and relevant search results.

As per claims 9, 19, Kim and Kostoff do not explicitly teach said claims. Di Fabbrizio et al. teaches wherein the prediction result comprises: a probability that the candidate question text belongs to each intent category, and the predicting the intent category to which the candidate question text belongs, and obtaining a prediction result of the candidate question text comprises: using a pre-trained intent recognition model to predict the intent category to which the candidate question text belongs, and obtaining the probability that the candidate question text belongs to each intent category (para. 60: the NLU module 336 can include one or more of intent classifiers (IC), named entity recognition (NER), and a model-selection component that can evaluate performance of various IC and NER components in order to select the configuration most likely to generate contextually accurate conversational results. The NLU module 336 can include competing models which can predict the same labels but using different algorithms and domain models where each model produces different labels (customer care inquiries, search queries, FAQ, etc.); para. 72: tree-cut models can be applied to a data structure, such as a question tree data structure. 
A tree-cut model can be used to identify cuts or partitions in the question tree data structure corresponding to question topics and question focus. In some embodiments, the tree-cut model can include a minimum description length (MDL)-based model. MDL models can utilize taxonomies associated with questions to determine a topic profile and a specificity measure. A topic profile can include a probability distribution of the topics distributed into one or more topic categories. Thus, the probability that the candidate question text belongs to each intent category); wherein the intent recognition model is obtained by training according to sample question texts and sample intent categories of the sample question texts (para. 5: training the predictive model can include clustering the question-answer pair data associated with the plurality of items in the item catalog. The question-answer pair data can include a first data element characterizing a query by a user for information associated with an item and a second data element characterizing a natural language answer to the query. Training the predictive model can also include determining at least one centroid question based on the clustering and categorizing the question-answer pair data based on the clustering; para. 31-32: the ASR engines 140 can include automated speech recognition engines configured to receive spoken or textual natural language inputs and to generate textual outputs corresponding to the inputs. For example, the ASR engines 140 can process the user's verbalized query or utterance “What kind of ice cubes does the Acme SLX2 refrigerator make?” into a text string of natural language units characterizing the query. The QA processing platform 120 can dynamically select a particular ASR engine 140 that best suits a particular task, dialog, or received user query. The processing can include classifying an intent of the text string and extracting information from the text string). 
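The per-category probability output described for claims 9 and 19 can be pictured as a softmax over per-intent scores. The sketch below is a generic illustration under assumed toy scores and intent names; it is not the model of Di Fabbrizio or of the application:

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Convert raw per-intent scores into probabilities that sum to 1."""
    peak = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical similarity scores of one candidate question against three intents
scores = {"billing": 2.0, "shipping": 0.5, "returns": 0.1}
probs = dict(zip(scores, softmax(list(scores.values()))))
# probs["billing"] is the largest entry, and the three values sum to 1.0
```

Any classifier that yields a normalized distribution over intent categories would serve the same role in the claimed method.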
Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim and Kostoff to include the predicting of the intent category to which the candidate question text belongs of Di Fabbrizio et al. in order to effectively assign a user's query to a predefined category that represents the user's underlying goals and respond accurately to user requests.

Claim(s) 10 is/are rejected under 35 U.S.C. 103 as being unpatentable over Kim et al. (US 20190392066) in view of Kostoff (US 6886010) and further in view of Di Fabbrizio et al. (US 20230106590) and further in view of Arya et al. (US 20230037894).

As per claim 10, Kim, Kostoff, Di Fabbrizio do not explicitly teach said claim. Arya et al. teaches wherein the determining whether the candidate question text is the target question text according to the prediction result comprises: calculating an information entropy of the candidate question text according to the probability that the candidate question text belongs to each intent category; in response to that the information entropy is greater than or equal to the preset information entropy threshold, determining the candidate question text to be the target question text; in response to that the information entropy is less than the preset information entropy threshold, determining the candidate question text not to be the target question text (para. 44: the probability distribution may be performed by moment analysis including at least one of computing kurtosis, skewness, entropy and thresholding. The evaluation based on probability distribution may check if the utterances fall above a third predefined threshold and if yes, the corresponding intent may be concluded as the new variation of the existing intent. If the utterances fall below a third predefined threshold, the corresponding intent is concluded as falling under the bucket corresponding to the new intent; para. 
46: a recommended result may be derived using the training files and trained classifier. The recommended result may be at least one from a recommended intent, answer or que code, training main question, new variation, new intent, similar training question and recommended question code).

Thus, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify the teachings of Kim, Kostoff, and Di Fabbrizio to include the calculating of an information entropy of the candidate question text of Arya et al. in order to effectively identify new categories in high-entropy questions and thus improve query responses to user requests.

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Japa et al. (US 12164873) teaches at col. 7:14-15: the QA server 183 is collocated with the storage device.

THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to LINH BLACK whose telephone number is (571)272-4106. The examiner can normally be reached 9AM-5PM EST M-F. 
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Tony Mahmoudi, can be reached on 571-272-4078. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/LINH BLACK/
Examiner, Art Unit 2163
1/28/2026

/TONY MAHMOUDI/
Supervisory Patent Examiner, Art Unit 2163
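The information-entropy screen recited in claim 10, addressed above (keep a candidate question only when the entropy of its intent-probability distribution reaches a preset threshold), reduces to a few lines. The function names and the 1.0-bit threshold below are illustrative assumptions, not taken from the claims or the cited art:

```python
import math

def intent_entropy(probs: list[float]) -> float:
    """Shannon entropy, in bits, of an intent-probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def is_target_question(probs: list[float], threshold: float = 1.0) -> bool:
    """Claim 10's test: high entropy (no dominant intent) marks a target question."""
    return intent_entropy(probs) >= threshold

is_target_question([0.90, 0.05, 0.05])  # one dominant intent: entropy ~0.57, returns False
is_target_question([0.34, 0.33, 0.33])  # near-uniform: entropy ~1.58, returns True
```

Under this logic, a near-uniform distribution signals a question the existing intent categories do not confidently cover, the high-entropy case that Arya's para. 44 associates with discovering a new intent.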

Prosecution Timeline

May 02, 2024
Application Filed
Sep 26, 2025
Non-Final Rejection — §101, §103
Dec 26, 2025
Response Filed
Jan 29, 2026
Final Rejection — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602376
SYSTEMS AND METHODS FOR DATA CURATION IN A DOCUMENT PROCESSING SYSTEM
2y 5m to grant Granted Apr 14, 2026
Patent 12530339
DISTRIBUTED PLATFORM FOR COMPUTATION AND TRUSTED VALIDATION
2y 5m to grant Granted Jan 20, 2026
Patent 12468835
SYSTEM AND METHOD FOR SESSION-AWARE DATASTORE FOR THE EDGE
2y 5m to grant Granted Nov 11, 2025
Patent 12461923
SUITABILITY METRICS BASED ON ENVIRONMENTAL SENSOR DATA
2y 5m to grant Granted Nov 04, 2025
Patent 12450239
METHODS AND APPARATUS FOR IMPROVING SEARCH RETRIEVAL
2y 5m to grant Granted Oct 21, 2025
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

3-4
Expected OA Rounds
51%
Grant Probability
62%
With Interview (+11.5%)
5y 1m
Median Time to Grant
Moderate
PTA Risk
Based on 437 resolved cases by this examiner. Grant probability derived from career allow rate.
