Last updated: May 29, 2026
Application No. 17/895,730
TEXT MINING USING A RELATIVELY LOWER DIMENSION REPRESENTATION OF DOCUMENTS

Non-Final OA §103§112
Filed
Aug 25, 2022
Examiner
WILLIS, AMANDA LYNN
Art Unit
2156
Tech Center
2100 — Computer Architecture & Software
Assignee
International Business Machines Corporation
OA Round
2 (Non-Final)
This examiner grants 36% of cases after interview

— +26.4% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.
Based on 348 resolved cases, 2023–2026
Examiner Intelligence

WILLIS, AMANDA LYNN
Grants only 36% of cases
Career Allowance Rate
124 granted / 348 resolved
-19.4% vs TC avg
Strong +26% interview lift
Interview Lift: +26.4% (resolved cases with interview, compared with cases without)
Typical timeline
4y 9m
Avg Prosecution
16 currently pending
Career history
374
Total Applications
across all art units
Statute-Specific Performance

§101: 1.6% (-38.4% vs TC avg)
§103: 86.2% (+46.2% vs TC avg)
§102: 5.9% (-34.1% vs TC avg)
§112: 1.1% (-38.9% vs TC avg)
Tech Center averages are estimates; based on career data from 348 resolved cases
Office Action

DETAILED ACTION
Receipt of Applicant’s Amendment, filed January 2, 2026, is acknowledged.
Claims 1-3, 5, 6, 8, 11-13, 15, 16, 18, and 20 were amended.
Claims 7 and 17 were cancelled.
Claims 1-6, 8-16, 18-20 are pending in this office action.

Claim Interpretation
With regard to claims 1, 11, and 20, claim 1 recites “wherein clusters of the words represent features”.  Claims 11 and 20 appear to recite substantially similar language.  Paragraph [0076] of the original specification recites “The clusters each thereby represent a different feature”.  Paragraph [0062] recites “Accordingly, a plurality of clusters may be established, that each include two or more words determined to have a relatively high correlation with one another… Each cluster of the words represents a feature of at least one of the documents. For context, a feature defines a plurality of words that have a relatively high correlation with one another in one or more of the documents”.  One of ordinary skill in the art would therefore recognize the terms ‘feature’ and ‘cluster’ as describing the same thing, e.g. a collection of “two or more words determined to have a relatively high correlation with one another”.

Claim Objections
Claims 5-7 and 15-17 are objected to because of the following informalities.  Appropriate correction is required.
With regard to claims 5-6 and 15-16, claim 5 recites “wherein elements of the first matrix”.  The remaining claims recite similar language, with claims 6 and 16 referring to the second matrix.  This claim limitation lacks antecedent basis as the claims have not previously recited any elements of the first or second matrix.  Claim 1 recites “elements” as being an aspect of the third matrix, not the first.  Each unique claim element is expected to have a unique claim label, and each claim label is expected to refer consistently to the same claim element.  The use of the single label “element” to refer to two distinct claim elements (e.g. something within the third matrix versus something within the first matrix) raises an antecedent basis issue.  Distinct labels should be used (e.g. first elements and third elements for the respective first and third matrix).
It is suggested that the claims be amended to clarify the structure of the matrixes as the current claim language leaves the reader to make assumptions regarding the meaning of the terms.  For examination purposes the ‘elements’ of the respective matrixes have been interpreted as referring to the values of the matrix (within the exemplified matrixes, these appear to be the frequency values).

Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.

The following is a quotation of the first paragraph of pre-AIA  35 U.S.C. 112:
The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor of carrying out his invention.

Claims 2-4, 8, 12-14 and 18 are rejected under 35 U.S.C. 112(a) or 35 U.S.C. 112 (pre-AIA ), first paragraph, as failing to comply with the written description requirement. The claim(s) contains subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, or for applications subject to pre-AIA  35 U.S.C. 112, the inventor(s), at the time the application was filed, had possession of the claimed invention. 

With regard to claims 2 and 12, claim 2 recites “wherein the first matrix is a relatively higher dimension representation of the documents and includes words of the document”.  Claim 12 appears to recite substantially similar language.  There is no support for the instant claim limitation.  Figure 4A, element 406, depicts the first matrix, which shows the matrix including only the frequency.  There, the ‘words’ and documents are the identifiers for the rows and columns, not the elements included within the matrix itself.  Paragraph [0051] describes the first matrix as one whose elements include only the frequency value, and states that the ‘columns of the first matrix may represent the words of the document’.  This does not state that the matrix includes the words themselves, merely that the columns represent the words.  One of ordinary skill in the art would recognize this as describing the matrix depicted in the figure, in which the matrix does not include the words, but instead includes the frequency, with the columns representing the words.  For examination purposes, the claim limitations have been construed in light of Figure 4A, 406 and ¶51.
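The distinction drawn above, between what labels a matrix and what the matrix stores, can be illustrated with a minimal Python sketch. The function name `build_first_matrix`, the whitespace tokenizer, the `D1`, `D2` identifiers, and the sample documents are illustrative assumptions and are not taken from the specification. The words and documents appear only as column and row labels, while every stored element is a frequency.

```python
from collections import Counter


def build_first_matrix(documents):
    """Return (row_labels, column_labels, matrix) for a document-by-word frequency matrix.

    Assumption: lowercase whitespace tokenization stands in for whatever
    tokenizer a real system would use.
    """
    tokenized = [doc.lower().split() for doc in documents]
    # Column labels: the vocabulary, sorted for a stable order.
    words = sorted({w for tokens in tokenized for w in tokens})
    # Row labels: hypothetical document identifiers.
    doc_ids = [f"D{i + 1}" for i in range(len(documents))]
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # A Counter returns 0 for words that do not occur in this document.
        matrix.append([counts[w] for w in words])
    return doc_ids, words, matrix


doc_ids, words, first_matrix = build_first_matrix(
    ["the dog chased the cat", "the cat slept"]
)
print(words)         # column labels only
print(first_matrix)  # elements are frequencies, never the words themselves
```

In this sketch the words are recoverable only from the separate `words` list, which is the sense in which the columns represent the words without the matrix including them.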

With regard to claims 2 and 12, claim 2 recites “wherein the third matrix is a relatively lower dimension representation of the documents and includes the features”.  Claim 12 appears to recite substantially similar language.  There is no support for the instant claim limitation.  Figure 4F, element 432, and Paragraph [0076] were identified as depicting a third matrix.  Within the drawings, the matrix only includes the ‘ffidf11’ elements, where ‘F1 … FK’ are the column headings and not elements included in the matrix.  Paragraph [0076] of the original specification makes it clear that ‘F1 … FK’ represent the ‘clusters’.  One of ordinary skill in the art would recognize the depicted matrix as not including the features, but instead including the frequencies, with the columns representing the features.  For examination purposes the claim limitations have been construed in light of Figure 4F, 432; ¶76.

With regard to claims 8 and 18, claim 8 recites “wherein the second matrix includes the determined frequencies and counts”.  Claim 18 recites substantially similar language.  There is no support for this claim limitation.
Figure 4C, element 418, and Paragraph [0059] of the original specification depict the second matrix.  Within the figure, the matrix includes a single value for each element entry (e.g. tficf11 … tficfML).  Paragraph [0059] provides Equation (4), which depicts the formula that is used to calculate this value.  Within the original specification, the frequency (variable ‘f1jl’), the total count (variable ‘L’), and the count of chunks (variable ‘cfj’) are used within the equation to calculate the final value stored in the matrix, but this does not mean that the frequency, total count, or count are themselves included in the matrix.  For examination purposes this claim limitation has been construed in light of Paragraph [0059] to include a value that is calculated based on the frequency, total count, and count.
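The point that a matrix entry can be calculated from several quantities without storing them can be sketched in Python. Equation (4) is not reproduced in this action, so the formula below is an assumption: a conventional TF-IDF-style weighting applied to chunks, frequency × log(total chunks / chunks containing the word). The function names `tficf` and `build_second_matrix`, the whitespace tokenizer, and the sample chunks are likewise hypothetical.

```python
import math


def tficf(frequency, total_chunks, chunks_with_word):
    """Single weighted value for one (word, chunk) entry.

    Assumed formula (NOT the specification's Equation (4)):
        frequency * log(total_chunks / chunks_with_word)
    """
    if chunks_with_word == 0:
        return 0.0
    return frequency * math.log(total_chunks / chunks_with_word)


def build_second_matrix(chunks):
    """Rows are words, columns are chunks; each entry is one tficf value."""
    tokenized = [chunk.lower().split() for chunk in chunks]
    words = sorted({w for tokens in tokenized for w in tokens})
    total = len(tokenized)  # total count of chunks
    matrix = []
    for w in words:
        # Count of chunks in which the word occurs at least once.
        cf = sum(1 for tokens in tokenized if w in tokens)
        matrix.append([tficf(tokens.count(w), total, cf) for tokens in tokenized])
    return words, matrix


words, second_matrix = build_second_matrix(
    ["the dog chased the cat", "the cat slept", "a dog barked"]
)
```

Each entry of `second_matrix` is a single float; the per-chunk frequency, the total count, and the chunk count are inputs to the calculation and are discarded once the value is computed.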

The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA ), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claims 8 and 18 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA ), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or for applications subject to pre-AIA  35 U.S.C. 112, the applicant), regards as the invention.

With regard to claims 8 and 18, claim 8 recites “wherein the second matrix includes the determined frequencies and counts”.  Claim 18 recites substantially similar language.  This claim limitation lacks antecedent basis.
It is unclear which count is being referenced.  The claim has previously recited “total count” and “count” as two distinct elements.  It is unclear whether applicant is referring to the count, or attempting to refer to both the total count and the count.
Similarly, it is unclear which ‘frequencies’ are being referenced.  The claim has previously recited “frequencies that the feature appears” in the parent claim, and “a frequency that a first word occurs” in the instant claim.
It is suggested that the claim be amended to clearly reference the exact claim labels consistently.  Please see the 112(a) rejection above regarding the claim interpretation used for examination purposes.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-6, 8-16, 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Holt [6701305] in view of Baker [2015/0339288].

With regard to claim 1 Holt teaches A computer-implemented method, comprising:
generating a first matrix (Holt, Column 6, lines 16-19 “The text data collection is represented by a term-by-document matrix having a plurality of entries with each entry representing the frequency of occurrence of a term in a respective document”) based on words extracted (Holt, Column 9, lines 59-62 “the logic of generating a term list. The logic of FIG. 2 moves from a start block to block 130 where terms are tokenized according to a tokenizing policy”) from documents (Holt, Column 9, lines 5-8 “If so, the logic moves to block 104 where a term list is generated from the initial document collection. Generating a term list from the initial document collection is illustrated in detail FIG. 2,”);
generating a second matrix as working matrix Ak(Holt, Column 11, 53-60 “For example, the working matrix A can be projected into a k dimensional subspace, thereby defining the subspace representation Ak. While the working matrix A can be projected into the subspace according to a variety of techniques including a variety of orthogonal decompositions, the projection of A into the subspace is typically performed via a two-sided orthogonal matrix decomposition”) based on [[ as sets (Holt, Column 11, lines 66 – Column 12, line 5 “Statistically, the effect of the TURV is to combine the original large set of variables into a smaller set of more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”), wherein the [[ as terms of the document (Holt, Column 12, lines 3-5 “more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”);
performing word clustering (Holt, Column 18, lines 26-30 “In this regard and as with any classification method, there is a training phase where a training sample is used to determine a classifier and a classification phase that uses this classifier to determine the manner in which new documents will be classified into classes.”) based on results of as the document being classified into classes (Holt, Column 18, lines 26-31 “In this regard and as with any classification method, there is a training phase where a training sample is used to determine a classifier and a classification phase that uses this classifier to determine the manner in which new documents will be classified into classes”) an analysis as the classification (Id; Holt, Column 18, lines 36-41 “A transformation for generating a subspace representation of the classes is then generated from the matrix by using a two-sided orthogonal decomposition, analogous to the indexing of a term-by-document matrix D for information retrieval”) performed on the second matrix as the subspace representation Ak (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”), wherein clusters of the words (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class”) represent features as the class represents related terms of the document (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”; Column 12, lines 1-5 “The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features.” Please note this claim limitation has been interpreted in light of paragraph [0062] which recites “a feature defines a plurality of words that have a relatively high correlation with one another in one or more of the documents”) of at least one of the documents as the document (Id), wherein the analysis comprises calculating, based on word vectors, distances between the words (Holt, Column 17, line 66 - Column 18, line 2 “The similarity between the query vector and the document vectors is then determined by measuring the distance there between”);
generating a third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class”) based on the first matrix, as the information stored in the term-by-document matrix is the term-to-document frequencies used to generate the term-by-class matrix (Holt, Column 18, lines 35-37 “The entries of this matrix are the frequencies of the terms in the documents that belong to a given class.”; Column 6, lines 16-19), and the clusters as the class (Holt, Column 18, lines 15-17 “documents can be classified into none, one or more of a plurality of predefined classes as shown in FIGS. 7”), wherein the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”) includes elements of the third matrix that define rows and columns of the third matrix as a matrix with the rows and columns being the term and the assigned class, and the entry being the frequency therebetween (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class.”), wherein the elements of the third matrix indicate frequencies that the features appear in the at least one of the documents as the frequency of occurrence of that term in the documents (Id); and
performing text mining (Holt, Column 9, lines 47-50 “If so, the logic moves to block 118 for performance of a text mining operation, namely, an information retrieval operation as depicted in FIG. 6.”; Column 19, lines 33-35 “A display 66 is provided for viewing text mining data, and interacting with a user interface to request text mining operations.”) using the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”).
Holt does not explicitly teach deduplication chunks.  Baker teaches deduplication chunks (Baker, ¶40 “Once this has been done, for each pair of articles, deduplication module 210 may extract the uni grams, bi grams and trigrams from each pair of preprocessed bodies of text and converted into sets of tokens.”; ¶46 “the article that is contained by the superset article is classified as a duplicate and removed”).
It would have been obvious to one of ordinary skill in the art to which said subject matter pertains, at the time the invention was filed, to have implemented the projection of the data into sets of a reduced number of dimensions using the deduplication analysis taught by Baker, as it yields the predictable result of reducing the storage space for the article (Baker, ¶35 “Deduplication module 210 may first use the titles of each article in the set as a filtering stage to reduce the search space to be explored for article deduplication”).
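For orientation, the sequence of limitations mapped in claim 1 (distances between word vectors, clusters of words acting as features, and a third matrix whose elements are frequencies of those features in the documents) can be sketched in Python. This is a sketch under stated assumptions and is not asserted to be the method of the application, Holt, or Baker: Euclidean distance, transitive threshold merging, and plain summation of word frequencies per cluster are all assumed, and the word vectors, threshold, and function names are hypothetical.

```python
import math


def cluster_words(word_vectors, threshold):
    """Group words whose vectors lie within `threshold` of each other.

    Assumption: Euclidean distance with transitive (single-linkage) merging.
    """
    words = list(word_vectors)
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if math.dist(word_vectors[a], word_vectors[b]) <= threshold:
                parent[find(b)] = find(a)

    clusters = {}
    for w in words:
        clusters.setdefault(find(w), []).append(w)
    return list(clusters.values())


def build_third_matrix(first_matrix, words, clusters):
    """Document-by-feature matrix: each entry sums the frequencies of a cluster's words."""
    col = {w: j for j, w in enumerate(words)}
    return [[sum(row[col[w]] for w in cluster) for cluster in clusters]
            for row in first_matrix]


# Hypothetical inputs: column labels, 2-D word vectors, and a document-by-word matrix.
words = ["cat", "dog", "slept", "ran"]
vectors = {"cat": (0.0, 0.0), "dog": (0.1, 0.0), "slept": (5.0, 5.0), "ran": (5.0, 5.2)}
first_matrix = [[1, 2, 0, 1],   # document 1
                [0, 1, 3, 0]]   # document 2

clusters = cluster_words(vectors, threshold=0.5)
third_matrix = build_third_matrix(first_matrix, words, clusters)
print(clusters)      # [['cat', 'dog'], ['slept', 'ran']]
print(third_matrix)  # [[3, 1], [1, 3]]
```

The four word columns collapse into two feature columns, which is the sense in which the third matrix is a lower dimension representation than the first. Raising `threshold` merges more words per cluster and yields fewer columns.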

With regard to claims 2 and 12, the proposed combination further teaches wherein the first matrix is a relatively higher dimension representation of the documents (Holt, Column 2, lines 30-33 “individual documents are treated as vectors in a high-dimensional vector space in which each dimension corresponds to some feature of a document.”) and includes (Please note this claim limitation has been construed in light of Figure 4A 406 which shows the ‘words’ as being the column heading and not an element within the matrix itself, and Paragraph [0051] which describes a word-document matrix wherein the value is the frequency that a given word appears in a given document) words of the document as the terms of the document in the term-by-document matrix (Holt, Column 6, lines 16-19 “The text data collection is represented by a term-by-document matrix having a plurality of entries with each entry representing the frequency of occurrence of a term in a respective document”), wherein the third matrix is a relatively lower dimension representation of the documents (Holt, Column 11, lines 51-53 “the matrix A is projected into a lower dimensional subspace.”) and includes the features as the class represents related terms of the document (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”; Column 12, lines 1-5 “The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features.”).

With regard to claims 3 and 13, the proposed combination further teaches 
determining, subsequent to performing the text mining (Holt, Column 9, lines 47-50 “If so, the logic moves to block 118 for performance of a text mining operation, namely, an information retrieval operation as depicted in FIG. 6.”; Column 19, lines 33-35 “A display 66 is provided for viewing text mining data, and interacting with a user interface to request text mining operations.”), whether the elements of the third matrix as the entire subspace (Holt, Column 12, lines 13-18 “As will be described hereinafter, the entire subspace representation Ak need not always be determined. Instead, only those portions, i.e., those rows, of the subspace representation Ak that correspond to the terms included within the query must be determined”) exceed as only the portions necessary (Id) a predetermined number of elements as the rows that correspond to the terms that must be determined (Id), and in response to a determination that the elements of the third matrix do not exceed the predetermined number of elements: increasing a threshold used for performing the word clustering as expanding the document subspace (Holt, Column 12, lines 37-41 “Next, in block 172, the existing term subspace Uk is augmented with the normalized residual, which is orthogonal to the original term subspace, and the document subspace, Vk, is expanded by adding a small identity matrix accordingly.”), updating the clusters and re-generating the third matrix based on the updated clusters  (Column 12, lines 27 “Still referring to FIG. 4, the logic then moves to block 164 where a new subspace representation is determined by updating the existing subspace with new documents and terms”).

With regard to claims 4 and 14, the proposed combination further teaches wherein performing text mining (Holt, Column 9, lines 47-50 “If so, the logic moves to block 118 for performance of a text mining operation, namely, an information retrieval operation as depicted in FIG. 6.”; Column 19, lines 33-35 “A display 66 is provided for viewing text mining data, and interacting with a user interface to request text mining operations.”) using the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”) includes running a text mining program as the text mining operation (Holt, Column 9, lines 47-50 “If so, the logic moves to block 118 for performance of a text mining operation, namely, an information retrieval operation as depicted in FIG. 6.”; Column 19, lines 33-35 “A display 66 is provided for viewing text mining data, and interacting with a user interface to request text mining operations.”) on the relatively lower dimension representation of the documents (Holt, Column 11, lines 51-53 “the matrix A is projected into a lower dimensional subspace.”).

With regard to claims 5 and 15, the proposed combination further teaches wherein elements of the first matrix indicate a frequency that a given word of the words extracted from the documents (Holt, Column 9, lines 59-62 “the logic of generating a term list. The logic of FIG. 2 moves from a start block to block 130 where terms are tokenized according to a tokenizing policy”) appears in the documents (Holt, Column 6, lines 16-19 “The text data collection is represented by a term-by-document matrix having a plurality of entries with each entry representing the frequency of occurrence of a term in a respective document”).

With regard to claims 6 and 16, the proposed combination further teaches wherein elements of the second matrix as working matrix A (Holt, Column 3, lines 22-28 “The term-by-document matrix can then be preprocessed to define a working matrix A by normalizing the columns of the term-by-document matrix D to have a unit sum, stabilizing the variance of the term frequencies via a nonlinear function and then centering the term frequencies with respect to the mean vector of the columns.”) indicate a frequency that a given word of the words extracted from the documents (Holt, Column 9, lines 59-62 “the logic of generating a term list. The logic of FIG. 2 moves from a start block to block 130 where terms are tokenized according to a tokenizing policy”) appears as the normalized and stabilized term frequencies (Id) in the deduplication chunks as the sets (Holt, Column 11, lines 66 – Column 12, line 5 “Statistically, the effect of the TURV is to combine the original large set of variables into a smaller set of more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”; Please note this claim limitation has been construed in light of ¶46 which recites “These groups are used as features, which are used to represent a document instead of words, with lower dimension.”, wherein the chunks are used instead of the original words themselves).
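The preprocessing quoted from Holt above (unit-sum column normalization, a variance-stabilizing nonlinear function, then centering on the mean vector of the columns) can be sketched in plain Python. Holt, as quoted, does not name the nonlinear function, so the square root used here is an assumption, as are the function name `preprocess` and the sample matrix `D`.

```python
import math


def preprocess(D, f=math.sqrt):
    """Compute A = f(D) - c e^T for a term-by-document matrix D (a list of term rows).

    Assumption: `f` defaults to a square root; the quoted passage only says
    "a nonlinear function".
    """
    n_terms, n_docs = len(D), len(D[0])
    # 1. Normalize each column (document) to unit sum.
    col_sums = [sum(D[i][j] for i in range(n_terms)) for j in range(n_docs)]
    normalized = [[D[i][j] / col_sums[j] if col_sums[j] else 0.0
                   for j in range(n_docs)] for i in range(n_terms)]
    # 2. Apply the variance-stabilizing function element-wise.
    F = [[f(x) for x in row] for row in normalized]
    # 3. Center: c is the mean of the columns, so c[i] is the mean of term row i.
    c = [sum(row) / n_docs for row in F]
    return [[F[i][j] - c[i] for j in range(n_docs)] for i in range(n_terms)]


# Hypothetical 3-term by 3-document frequency matrix.
D = [[2, 0, 1],
     [1, 1, 0],
     [0, 3, 1]]
A = preprocess(D)
```

After centering, each term row of `A` averages to zero across documents, so the entries are relative scores rather than raw frequencies.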


With regard to claims 8 and 18, the proposed combination further teaches wherein generating the second matrix based on deduplication chunks includes: 
determining, for each chunk as the set (Holt, Column 11, lines 66 – Column 12, line 5 “Statistically, the effect of the TURV is to combine the original large set of variables into a smaller set of more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”), a frequency that a first word occurs (Holt, Column 3, lines 22-28 “The term-by-document matrix can then be preprocessed to define a working matrix A by normalizing the columns of the term-by-document matrix D to have a unit sum, stabilizing the variance of the term frequencies via a nonlinear function and then centering the term frequencies with respect to the mean vector of the columns.”; Baker, ¶92 “TF/IDF algorithm”); 
determining a total count of the chunks such as for example, 5 unigrams (Holt, Column 7, lines 25-29 “each term is weighted by determining an inverse one-norm of the term, i.e., the inverse of the sum of the absolute values of the entries of the row of the subspace representation Ak corresponding to the term.”; Baker, ¶31 “The sentence "The dog chased the cat" has 5 unigrams: "the", "dog", "chased", "the", "cat", that is each individual word token”); and
determining a count of the chunks that the first word occurs in as mean and average calculations involve summing the total and dividing by the number of occurrences, for example the unigram “the” has a count of 2 in the example ‘The dog chased the cat’ (Holt, Column 3, lines 25-31 “stabilizing the variance of the term frequencies via a nonlinear function and then centering the term frequencies with respect to the mean vector of the columns. This preprocessing is denoted as A=f(D)-ce^T in which c is the mean of the columns of f(D) and e is a d-vector whose components are all 1, so that the average of the columns of A is now”; Baker, ¶32 “From the previous example, the sentence "The dog chased the cat" may be represented as a vector of the following form: [2, 1, 1, 1]. This vector has 4 dimensions.”), wherein the second matrix includes as working matrix Ak (Holt, Column 11, lines 53-60; Please see the 112(a) rejection above regarding claim interpretation.  This claim limitation has been interpreted in light of Equation 4 recited in Paragraph [0059] of the original specification) the determined frequencies as f(D) (Holt, Column 11, lines 33-37 “The preprocessing can be mathematically represented by A=f(D) - ce^T in which c is the mean vector and e is a d-vector whose components are all 1 so that the average of the columns of A is now zero. As such, each ijth entry in A is a score indicating the relative occurrence of the ith term in the jth document.”) and counts as the occurrence of the term (Id).
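The figures cited from Baker for the sentence "The dog chased the cat" (5 unigrams, and the 4-dimensional vector [2, 1, 1, 1]) can be reproduced with a short Python sketch. The lowercase whitespace tokenizer and the function names are assumptions made for illustration and are not Baker's implementation.

```python
from collections import Counter


def unigrams(sentence):
    """Lowercase whitespace tokenization (an assumed, simplified tokenizer)."""
    return sentence.lower().split()


def count_vector(tokens):
    """Counts per distinct token, in order of first appearance."""
    counts = Counter(tokens)  # preserves insertion order
    return list(counts.keys()), list(counts.values())


tokens = unigrams("The dog chased the cat")
vocabulary, vector = count_vector(tokens)
print(len(tokens))   # 5
print(vocabulary)    # ['the', 'dog', 'chased', 'cat']
print(vector)        # [2, 1, 1, 1]
```

The total of 5 counts every token, while the count of 2 belongs to the single distinct unigram "the", which is why the sentence has 5 unigrams but only 4 dimensions.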

With regard to claims 9 and 19, the proposed combination further teaches causing the documents to be stored into a deduplication storage (Holt, Column 19, lines 30-31 “The computer also includes nonvolatile storage 64, such as a hard disk drive, where data is stored”), wherein content of the documents is split into the deduplication chunks as the sets (Holt, Column 11, lines 66 – Column 12, line 5 “Statistically, the effect of the TURV is to combine the original large set of variables into a smaller set of more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”) during the documents being stored into the deduplication storage (Holt, Column 19, lines 30-31 “The computer also includes nonvolatile storage 64, such as a hard disk drive, where data is stored”).

With regard to claim 10, the proposed combination further teaches wherein data normalization is not performed to generate matrixes (Holt, Column 10, lines 19-20 “This policy may be not to perform any term normalization, thereby making this an optional step”).

With regard to claim 11, the proposed combination teaches A computer program product, the computer program product comprising a computer readable storage medium (Holt, Column 19, lines 27-30 “The computer 50 includes a processing unit 60 and a system memory 62 which includes random access memory (RAM) and read-only memory (ROM)”) having program instructions (Holt, Column 19, lines 46-51 “computer program instructions may be loaded onto the computer or other programmable apparatus to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions specified in the block diagram, flowchart or control flow block(s) or step(s).”) embodied therewith, the program instructions readable and/or executable by a computer to cause the computer to:
Generate, by the computer (Holt, Column 19, line 21 “general purpose computer 50”), a first matrix (Holt, Column 6, lines 16-19 “The text data collection is represented by a term-by-document matrix having a plurality of entries with each entry representing the frequency of occurrence of a term in a respective document”) based on words extracted (Holt, Column 9, lines 59-62 “the logic of generating a term list. The logic of FIG. 2 moves from a start block to block 130 where terms are tokenized according to a tokenizing policy”) from documents (Holt, Column 9, lines 5-8 “If so, the logic moves to block 104 where a term list is generated from the initial document collection. Generating a term list from the initial document collection is illustrated in detail FIG. 2,”);
generate, by the computer (Holt, Column 19, line 21 “general purpose computer 50”), a second matrix as working matrix Ak(Holt, Column 11, 53-60 “For example, the working matrix A can be projected into a k dimensional subspace, thereby defining the subspace representation Ak. While the working matrix A can be projected into the subspace according to a variety of techniques including a variety of orthogonal decompositions, the projection of A into the subspace is typically performed via a two-sided orthogonal matrix decomposition”) based on [[ as sets (Holt, Column 11, lines 66 – Column 12, line 5 “Statistically, the effect of the TURV is to combine the original large set of variables into a smaller set of more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”), wherein the [[ as terms of the document (Holt, Column 12, lines 3-5 “more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”);
perform, by the computer (Holt, Column 19, line 21 “general purpose computer 50”), word clustering (Holt, Column 18, lines 26-30 “In this regard and as with any classification method, there is a training phase where a training sample is used to determine a classifier and a classification phase that uses this classifier to determine the manner in which new documents will be classified into classes.”) based on results of as the document being classified into classes (Holt, Column 18, lines 26-31 “In this regard and as with any classification method, there is a training phase where a training sample is used to determine a classifier and a classification phase that uses this classifier to determine the manner in which new documents will be classified into classes”) an analysis as the classification (Id; Holt, Column 18, lines 36-41 “A transformation for generating a subspace representation of the classes is then generated from the matrix by using a two-sided orthogonal decomposition, analogous to the indexing of a term-by-document matrix D for information retrieval”) performed on the second matrix as the subspace representation Ak (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”), wherein clusters of the words (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class”) represent features as the class represents related terms of the document (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”; Column 12, lines 1-5 “The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features.” Please note this claim limitation has been interpreted in light of paragraph [0062] which recites “a feature defines a plurality of words that have a relatively high correlation with one another in one or more of the documents”) of at least one of the documents as the document (Id), wherein the analysis comprises calculating, based on word vectors, distances between the words (Column 17 line 66 - Column 18 line 2, “The similarity between the query vector and the document vectors is then determined by measuring the distance there between”);
generate, by the computer (Holt, Column 19, line 21 “general purpose computer 50”), a third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class”) based on the first matrix as the information stored in the term-by-document matrix is the term to document frequencies used to generate the term-by-class matrix (Holt, Column 18, lines 35-37 “The entries of this matrix are the frequencies of the terms in the documents that belong to a given class.”; Column 6, lines 16-19) and the clusters as the class (Holt, Column 18, lines 15-17 “documents can be classified into none, one or more of a plurality of predefined classes as shown in FIGS. 7”), wherein the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”) includes elements of the third matrix that define rows and columns of the third matrix as a matrix with the row and columns being the term and the assigned class, and the entry being the frequency there between (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class.”), wherein the elements of the third matrix indicate frequencies that the features appear in the at least one of the documents as the frequency of occurrence of that term in the documents (Id); and
perform, by the computer (Holt, Column 19, line 21 “general purpose computer 50”), text mining (Holt, Column 9, lines 47-50 “If so, the logic moves to block 118 for performance of a text mining operation, namely, an information retrieval operation as depicted in FIG. 6.”; Column 19, lines 33-35 “A display 66 is provided for viewing text mining data, and interacting with a user interface to request text mining operations.”) using the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”).
Holt does not explicitly teach deduplication chunks.  Baker teaches deduplication chunks (Baker, ¶40 “Once this has been done, for each pair of articles, deduplication module 210 may extract the uni grams, bi grams and trigrams from each pair of preprocessed bodies of text and converted into sets of tokens.”; ¶46 “the article that is contained by the superset article is classified as a duplicate and removed”).
It would have been obvious to one of ordinary skill in the art to which said subject matter pertains at the time the invention was filed to have implemented the projection of the data into sets of a reduced number of dimensions using the deduplication analysis taught by Baker, as it yields the predictable result of reducing the storage space for the article (Baker, ¶35 “Deduplication module 210 may first use the titles of each article in the set as a filtering stage to reduce the search space to be explored for article deduplication”).
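For readers tracing the mapping above, the relationship between the cited term-by-document matrix and the cited term-by-class matrix can be illustrated with a minimal sketch. This is not Holt's implementation: the whitespace tokenizer, the toy documents, the class labels, and the function names `term_by_document` and `term_by_class` are assumptions made for illustration only, and the subspace projection step is omitted.

```python
from collections import Counter

def term_by_document(docs):
    """Return (terms, matrix); matrix[i][j] is the count of terms[i] in docs[j]."""
    # Naive lowercase/whitespace tokenizer (an assumption, not Holt's tokenizing policy).
    counts = [Counter(doc.lower().split()) for doc in docs]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix

def term_by_class(matrix, doc_classes, classes):
    """Sum document columns per class: entry [i][k] is the frequency of term i
    across all documents assigned to classes[k]."""
    return [
        [sum(f for f, cls in zip(row, doc_classes) if cls == c) for c in classes]
        for row in matrix
    ]

# Hypothetical toy data: three documents assigned to two predefined classes.
docs = ["matrix rank matrix", "rank subspace", "query vector query"]
doc_classes = ["algebra", "algebra", "retrieval"]

terms, A = term_by_document(docs)
C = term_by_class(A, doc_classes, ["algebra", "retrieval"])

for term, row in zip(terms, C):
    print(term, row)
```

In this sketch every entry of the term-by-class matrix is derived from the term-by-document counts plus the class assignments, which is the dependency the mapping relies on for the "based on the first matrix and the clusters" limitation.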

With regard to claim 20, the proposed combination teaches A system, comprising:
a processor (Holt, Column 19, lines 27-30 “The computer 50 includes a processing unit 60 and a system memory 62 which includes random access memory (RAM) and read-only memory (ROM)”); and
logic (Holt, Column 19, lines 46-51 “computer program instructions may be loaded onto the computer or other programmable apparatus to produce a machine, such that the instructions which execute on the computer or other programmable apparatus create means for implementing the functions specified in the block diagram, flowchart or control flow block(s) or step(s).”) integrated with the processor, executable by the processor, or integrated with and executable by the processor, the logic being configured to:
Generate, a first matrix (Holt, Column 6, lines 16-19 “The text data collection is represented by a term-by-document matrix having a plurality of entries with each entry representing the frequency of occurrence of a term in a respective document”) based on words extracted (Holt, Column 9, lines 59-62 “the logic of generating a term list. The logic of FIG. 2 moves from a start block to block 130 where terms are tokenized according to a tokenizing policy”) from documents (Holt, Column 9, lines 5-8 “If so, the logic moves to block 104 where a term list is generated from the initial document collection. Generating a term list from the initial document collection is illustrated in detail FIG. 2,”);
generate, a second matrix as working matrix Ak (Holt, Column 11, lines 53-60 “For example, the working matrix A can be projected into a k dimensional subspace, thereby defining the subspace representation Ak. While the working matrix A can be projected into the subspace according to a variety of techniques including a variety of orthogonal decompositions, the projection of A into the subspace is typically performed via a two-sided orthogonal matrix decomposition”) based on [[ as sets (Holt, Column 11, line 66 – Column 12, line 5 “Statistically, the effect of the TURV is to combine the original large set of variables into a smaller set of more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”), wherein the [[ as terms of the document (Holt, Column 12, lines 3-5 “more semantically significant features. The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features”);
perform, word clustering (Holt, Column 18, lines 26-30 “In this regard and as with any classification method, there is a training phase where a training sample is used to determine a classifier and a classification phase that uses this classifier to determine the manner in which new documents will be classified into classes.”) based on results of as the document being classified into classes (Holt, Column 18, lines 26-31 “In this regard and as with any classification method, there is a training phase where a training sample is used to determine a classifier and a classification phase that uses this classifier to determine the manner in which new documents will be classified into classes”) an analysis as the classification (Id; Holt, Column 18, lines 36-41 “A transformation for generating a subspace representation of the classes is then generated from the matrix by using a two-sided orthogonal decomposition, analogous to the indexing of a term-by-document matrix D for information retrieval”) performed on the second matrix as the subspace representation Ak (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”), wherein clusters of the words (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class”) represent features as the class represents related terms of the document (Holt, Column 18, lines 49-51 “Those portions of the subspace representation Ak of the term-by-class matrix that relate to the terms of the document to be classified”; Column 12, lines 1-5 “The coordinates of the projected data in the reduced number of dimensions can be used to characterize the documents, and therefore represent the effect of thousands or tens of thousands of terms in a few hundred or more significant features.” Please note this claim limitation has been interpreted in light of paragraph [0062] which recites “a feature defines a plurality of words that have a relatively high correlation with one another in one or more of the documents”) of at least one of the documents as the document (Id), wherein the analysis comprises calculating, based on word vectors, distances between the words (Column 17 line 66 - Column 18 line 2, “The similarity between the query vector and the document vectors is then determined by measuring the distance there between”);
generate, a third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class”) based on the first matrix as the information stored in the term-by-document matrix is the term to document frequencies used to generate the term-by-class matrix (Holt, Column 18, lines 35-37 “The entries of this matrix are the frequencies of the terms in the documents that belong to a given class.”; Column 6, lines 16-19) and the clusters as the class (Holt, Column 18, lines 15-17 “documents can be classified into none, one or more of a plurality of predefined classes as shown in FIGS. 7”), wherein the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”) includes elements of the third matrix that define rows and columns of the third matrix as a matrix with the row and columns being the term and the assigned class, and the entry being the frequency there between (Holt, Column 6, lines 60-63 “A term-by-class matrix is formed from this training set having a plurality of entries with each entry representing the frequency of occurrence of a term in all the documents assigned to a class.”), wherein the elements of the third matrix indicate frequencies that the features appear in the at least one of the documents as the frequency of occurrence of that term in the documents (Id); and
perform, text mining (Holt, Column 9, lines 47-50 “If so, the logic moves to block 118 for performance of a text mining operation, namely, an information retrieval operation as depicted in FIG. 6.”; Column 19, lines 33-35 “A display 66 is provided for viewing text mining data, and interacting with a user interface to request text mining operations.”) using the third matrix (Holt, Column 6, lines 60-63 “A term-by-class matrix”).
Holt does not explicitly teach deduplication chunks.  Baker teaches deduplication chunks (Baker, ¶40 “Once this has been done, for each pair of articles, deduplication module 210 may extract the uni grams, bi grams and trigrams from each pair of preprocessed bodies of text and converted into sets of tokens.”; ¶46 “the article that is contained by the superset article is classified as a duplicate and removed”).
It would have been obvious to one of ordinary skill in the art to which said subject matter pertains at the time the invention was filed to have implemented the projection of the data into sets of a reduced number of dimensions using the deduplication analysis taught by Baker, as it yields the predictable result of reducing the storage space for the article (Baker, ¶35 “Deduplication module 210 may first use the titles of each article in the set as a filtering stage to reduce the search space to be explored for article deduplication”).
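The deduplication analysis cited from Baker (¶40, ¶46) can be illustrated roughly as follows. This is a sketch under stated assumptions, not Baker's implementation: the quoted passages do not give the exact containment criterion, so a strict subset test over n-gram sets is assumed here, and the names `ngram_set` and `drop_contained` and the sample articles are hypothetical.

```python
def ngram_set(text, max_n=3):
    """Return the set of unigrams, bigrams and trigrams in text, as token tuples."""
    tokens = text.lower().split()  # naive preprocessing (assumption)
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }

def drop_contained(articles):
    """Keep only articles whose n-gram set is not contained in another article's set.

    Assumed rule: an article is a duplicate if its set is a proper subset of
    another article's set, or equal to the set of an earlier article.
    """
    sets = [ngram_set(a) for a in articles]
    keep = []
    for i, s in enumerate(sets):
        contained = any(
            j != i and (s < t or (s == t and j < i))
            for j, t in enumerate(sets)
        )
        if not contained:
            keep.append(articles[i])
    return keep

# Hypothetical articles: the second is fully contained in the first.
articles = [
    "stocks fell sharply on monday",
    "stocks fell sharply",
    "rain is expected on tuesday",
]
unique = drop_contained(articles)
print(unique)
```

Here the shorter article is removed because every one of its n-grams also appears in the longer article, which corresponds to the "contained by the superset article" language quoted from ¶46.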

Response to Arguments
Applicant's arguments filed January 2, 2026 have been fully considered but they are not persuasive.  All the arguments regarding the newly added limitations are addressed in the above rejections.

With regard to claims 1-6 and 8-10, Applicant argues that Holt disparages incorporation of calculating distances between word vectors.  Applicant cites Holt, Column 4, line 29 and Column 2, line 65.
In response, it is noted that the issues Holt raises arise specifically when the query is treated as a pseudo-document (Column 4, lines 55-58), because this may result in a query that has zero components.  Holt does state that treating a query as a pseudo-document is certainly a viable technique, and recites some advantages (Holt, Column 4, lines 50-51) as well as pointing out difficulties that may be encountered.  The recited problems are not specific to the distance calculation, but to the situation where the query is treated as a pseudo-document.  Furthermore, Holt details the solution to this problem (e.g. treating the zero components as irrelevant during the comparison, see Holt, Column 4, lines 63-65).  This does not disparage the comparison itself, but instead details how to weight the terms during the comparison.  In fact, Holt details a traditional technique, known to address this problem, which applies global weights (Column 5, lines 25-32).  Holt then details issues with the weights being global (Holt, Column 5, lines 32-46) and therefore introduces the improvement that Holt is presenting, individually weighted terms (Holt, Column 5, lines 47-55).
Applicant’s argument that Holt does not teach the comparison itself rests on the recited issues with queries being treated as pseudo-documents and with the use of global weights.  Those issues do not disparage the use of the comparison itself.  The device taught by Holt explicitly compares the vectors (Holt, Column 17 line 66 - Column 18 line 2), and introduces the idea of individually weighted terms to address the issues raised.
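The distinction drawn above, between the distance comparison itself and the handling of zero query components, can be illustrated with a small sketch. It is an assumption-laden illustration rather than Holt's method: Euclidean distance is assumed as the measure, "treating the zero components as irrelevant" is modeled by simply skipping dimensions where the query is zero, and the vectors and function names are hypothetical.

```python
import math

def full_distance(query, doc):
    """Plain Euclidean distance over every dimension."""
    return math.sqrt(sum((q - d) ** 2 for q, d in zip(query, doc)))

def masked_distance(query, doc):
    """Euclidean distance over only the dimensions where the query is nonzero.

    Assumed reading of treating zero query components as irrelevant.
    """
    return math.sqrt(sum((q - d) ** 2 for q, d in zip(query, doc) if q != 0))

# Hypothetical vectors: the query mentions only the first term.
query = [1.0, 0.0, 0.0]
doc_a = [1.0, 3.0, 4.0]  # matches the query term, but contains other terms too
doc_b = [0.0, 0.0, 0.0]  # does not contain the query term at all

print(full_distance(query, doc_a), full_distance(query, doc_b))      # 5.0 1.0
print(masked_distance(query, doc_a), masked_distance(query, doc_b))  # 0.0 1.0
```

In both variants a distance between vectors is still computed. Only the weighting of components changes, which is the point the response makes about Holt.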

Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to AMANDA WILLIS whose telephone number is (571)270-7691. The examiner can normally be reached Monday-Friday 8am-2pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ajay Bhatia can be reached at 571-272-3906. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/AMANDA L WILLIS/           Primary Examiner, Art Unit 2156
Prosecution Timeline

Aug 25, 2022
Application Filed
Oct 25, 2023
Response after Non-Final Action
Oct 06, 2025
Non-Final Rejection mailed — §103, §112
Jan 02, 2026
Response Filed
Feb 25, 2026
Final Rejection mailed — §103, §112
Mar 12, 2026
Response after Non-Final Action
May 20, 2026
Request for Continued Examination
May 22, 2026
Response after Non-Final Action
Precedent Cases

Applications granted by this same examiner with similar technology

16/678,984
Patent 12639369
Dynamic Audio File Generation
6y 6m to grant Granted May 26, 2026
18/732,241
Patent 12639306
DATABASE OPERATOR CLAUSE VARIABLE CALCULATION IN DISTRIBUTED SYSTEMS
1y 11m to grant Granted May 26, 2026
18/603,392
Patent 12619635
METHODS AND SYSTEMS FOR SUPPLY CHAIN ANALYTICS USING VISUALIZATIONS AND STANDARDIZATION CONSTRUCTS
2y 1m to grant Granted May 05, 2026
15/132,638
Patent 12608395
EXTRACTION OF AUDIT TRAILS
10y 0m to grant Granted Apr 21, 2026
17/380,905
Patent 12602380
SUBSUMPTION OF VIEWS AND SUBQUERIES
4y 8m to grant Granted Apr 14, 2026
Study what changed to get past this examiner. Based on 5 most recent grants.
Prosecution Projections

2-3
Expected OA Rounds
36%
Grant Probability
62%
With Interview (+26.4%)
4y 9m (~11m remaining)
Median Time to Grant
Moderate
PTA Risk
Based on 348 resolved cases by this examiner. Grant probability derived from career allowance rate.