Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
DETAILED ACTION
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea) without significantly more. Independent claims 1, 8, and 15 are directed to the clustering of data, such as to determine similarity, for example of unstructured text, and as such are considered an abstract idea, such as in the form of a concept performed in the human mind or the transformation of first data into second data. The claims do not include additional elements that are sufficient to amount to significantly more than the judicial exception because the recited computational structures are stated at a high level of generality and are considered well-known, routine, and conventional. Dependent claims 2-7, 9-14, and 16-20 do not remedy this deficiency and are similarly rejected. Applicant’s amendments to the independent claims filed 10/3/25 do not recite elements sufficient to amount to significantly more than the judicial exception of turning data into additional data, and as such the claims remain rejected.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries set forth in Graham v. John Deere Co., 383 U.S. 1, 148 USPQ 459 (1966), that are applied for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Saleh (20230252478, hereinafter Sal) in view of Osuala (20220374598, hereinafter Osu).
Regarding claim 1
Sal teaches:
A processor, comprising: one or more circuits to:
for each of a plurality of clusters of data records (Sal: Abstract; ¶ 12, 14, 17, etc.: system operates to utilize a DNN, embeddings thereof, passed thereto, etc. to determine similarities and relationships among clustered features vectors of a plurality of data records), determine one or more data structures representing intra-cluster similarity and one or more data structures representing inter-cluster similarity (Sal: Abstract; ¶ 19, 39, 54, etc.; Fig 2: system uses a silhouette score which is considered to comprise a measure of cluster self-similarity or intra cluster similarity and compare this to other clusters or inter cluster similarity; see additionally Silhouette Wikipedia page; the communication of vectors between layers of the figure 2 DNN and the provision thereto of the silhouette score are considered to encompass the recited data structures) based, at least in part, on a vector representation for each data record generated by one or more neural networks (Sal: ¶ 47-49: an embedding layer of a neural network operable for deep clustering in concert with a silhouette score based on vectorized input)
cause one or more performance metrics corresponding to one or more data clustering algorithms used to produce the plurality of clusters, to be generated based, at least in part, on the data structures representing intra-cluster and inter-cluster similarity (Sal: ¶ 14, 19, etc.: silhouette score used for cluster performance evaluation) to identify one or more of:
cluster coverage with respect to an amount of unclustered data records (Sal: Abstract; ¶ 57; Fig 3: such as shown in the figure which details cluster membership, coverage, etc. and includes unclustered data within a clustering domain),
an amount of duplicate clusters,
a degree of cluster noise (Sal: Abstract; ¶ 14, 19, 54, 57: a silhouette score is a measure of cluster noise in as much as it provides a metric for cluster consistency which is considered to provide a measure of noise within a cluster in the form of a measure of how well particular data point(s) integrate distance wise with the overall cluster), or
an amount of sub-clusters; and
update the one or more data clustering algorithms based, at least in part, on the one or more performance metrics (Sal: ¶ 13, 18, 19, 34, 39, 54: the silhouette score is used in training a DNN, RNN, etc. to optimize a loss function with respect to a clustering algorithm and to monitor performance thereof; recurrent neural network (RNN) used to predict relationships among input data, an RNN is considered to update weight vectors, array/matrices thereof and upon each timestep, cadences of timesteps etc.).
However, Sal does not explicitly discuss processing a plurality of clusters of text strings, nor does Sal explicitly discuss determination of structures, scores, etc. based, at least in part, on a vector representation for each text string generated by one or more neural networks to represent a semantic meaning of each respective text string.
In a related field of endeavor, Osu teaches a system and method for determining contextual word embeddings from a plurality of documents matching topic clusters in a first document (Osu: Abstract), wherein a clustering module processes each of a plurality of clusters of text strings (Osu: Abstract; ¶ 4, 59-62; Fig 1, 2: clustering module accumulates contextual word embeddings into a first set of clusters representing topics in reference and comparison documents); determines vectors representing semantic meaning by a neural network and subsequently operates based, at least in part, on a vector representation for each text string generated by one or more neural networks to represent a semantic meaning of each respective text string (Osu: ¶ 8, 26, 62, 63; Figs 2-4: system generates vectors for embeddings of a deep neural network to generate contextual word embedding vectors to realize meaning in the form of semantic relations among the text, vectors, clusters, etc. and to generate comparison scores among clusters) to thereby update the one or more data clustering algorithms based, at least in part, on the one or more performance metrics (Osu: ¶ 27, 61, 67; Fig 4: such as by providing iterative analysis of document comparisons such as by back propagation of BCE loss). It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant application to utilize the Sal taught clustering techniques to operate on text based contextual semantic embeddings representative of semantic meaning in the manner taught or suggested by Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
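For illustration only, the silhouette score relied upon above can be sketched as follows. This is the generic textbook computation and is not taken from Sal, Osu, or the instant application; the function names, the Euclidean distance, and the sample vectors are assumptions chosen for the example. For each record, a is the mean distance to the other members of its own cluster (intra-cluster) and b is the lowest mean distance to the members of any other cluster (inter-cluster); the score is (b - a) / max(a, b).

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_scores(vectors, labels):
    """Return one silhouette value per vector (requires at least two clusters)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(euclidean(vectors[i], vectors[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(vectors[i], vectors[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return scores


# Hypothetical embedding vectors forming two well separated clusters.
sample_vectors = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
sample_labels = [0, 0, 1, 1]
sample_scores = silhouette_scores(sample_vectors, sample_labels)
```

Values near 1 indicate a record that sits well inside its own cluster and far from others; values near 0 or below indicate overlap between clusters.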
Regarding claim 2
Sal in view of Osu teaches or suggests:
The processor of claim 1, wherein the vector representations generated by the one or more neural networks are applied to indicate data not grouped with the plurality of clusters of data (Sal: ¶ 57, etc.; Fig 3: system identifies data unaffiliated or associated with a single cluster such as that of account D); (Osu: ¶ 75, 76; fig 9A, 10A: system identifies outlying embedding excluded from clusters). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
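As a non-authoritative illustration of a coverage metric with respect to unclustered records, the sketch below computes the fraction of records assigned to any cluster. The use of -1 as the marker for an unclustered record is an assumption borrowed from common density-based clustering conventions; neither Sal nor Osu is cited for that convention, and the function name is hypothetical.

```python
def cluster_coverage(labels, unclustered_label=-1):
    """Fraction of records that were assigned to some cluster."""
    if not labels:
        return 0.0
    clustered = sum(1 for label in labels if label != unclustered_label)
    return clustered / len(labels)


# Hypothetical assignment: three records clustered, one left unclustered.
sample_coverage = cluster_coverage([0, 0, 1, -1])
```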
Regarding claim 3
Sal in view of Osu teaches or suggests:
The processor of claim 1, wherein the vectors to be generated by the one or more neural networks are indicative of noisy data among members of one or more clusters of data (Sal: Abstract; ¶ 14, 19, 54, 57: a silhouette score is a measure of cluster noise in as much as it provides a metric for cluster consistency which is considered to provide a measure of noise within a cluster in the form of a measure of how well particular data point(s) integrate distance wise with the overall cluster); (Osu: ¶ 75, 76; fig 9A, 10A: system identifies noisy data in the form of outlying embedding excluded from clusters). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 4
Sal in view of Osu teaches or suggests:
The processor of claim 1, wherein the vectors to be generated by the one or more neural networks indicative of duplicate clusters of data among members of one or more clusters of data (Sal: ¶ 39: silhouette score is considered a measure of redundancy as clusters which are not well separated, insufficiently distinguished, etc. are increasingly duplicates of one another); (Osu: ¶ 4, 61, 74: clusters with high similarity values considered duplicate clusters such as with relation to an evaluation threshold). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
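The notion of treating highly similar clusters as duplicates relative to an evaluation threshold can be sketched as below. This is a minimal example under stated assumptions and does not reproduce the method of either reference: comparing cluster centroids by cosine similarity, the threshold value of 0.95, and all names are hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def duplicate_cluster_pairs(clusters, threshold=0.95):
    """Return pairs of cluster names whose centroids meet the threshold."""
    centroids = {name: centroid(vectors) for name, vectors in clusters.items()}
    return [
        (p, q)
        for p, q in combinations(sorted(centroids), 2)
        if cosine(centroids[p], centroids[q]) >= threshold
    ]


# Hypothetical clusters: "a" and "b" point in nearly the same direction.
sample_clusters = {
    "a": [(1.0, 0.0), (0.9, 0.1)],
    "b": [(1.0, 0.05), (0.95, 0.0)],
    "c": [(0.0, 1.0), (0.1, 0.9)],
}
sample_duplicates = duplicate_cluster_pairs(sample_clusters)
```

The count of returned pairs gives one possible "amount of duplicate clusters" figure; a different similarity measure or threshold would change which pairs qualify.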
Regarding claim 5
Sal in view of Osu teaches or suggests:
The processor of claim 1, further comprising generating the one or more clusters of data using the one or more data clustering algorithms, wherein the one or more clusters of data is based, at least in part, on unstructured textual data (Sal: Abstract: clustering of embeddings based on silhouette scores); (Osu: Abstract; ¶ 4, 12, 58, 59, etc.: system operates to perform natural language processing upon input text or text generated from documents, audio, etc. to thereby parse unstructured text such as by classification, clustering, etc. and operable to generate contextual word embeddings from a plurality of input documents; the system generates semantic structure upon unstructured input or derived text). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 6
Sal in view of Osu teaches or suggests:
The processor of claim 1, wherein the vectors to be generated by the one or more neural networks indicative of a semantic relationship between members of the one or more clusters of data (Sal: Abstract; ¶ 14: clustering of embeddings based on silhouette scores allows for determination of similarities and relationships among input data); (Osu: Abstract; ¶ 4, 58-60, etc.: system operates to perform natural language processing upon input text or text generated from documents, audio, etc. to thereby parse unstructured text such as by classification, clustering, etc. and operable to generate contextual word embeddings from a plurality of input documents such as for comparing semantic, topical, etc. similarity thereof). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 7
Sal in view of Osu teaches or suggests:
The processor of claim 1, wherein the one or more performance metrics corresponding to the one or more data clustering algorithms are to be generated based, at least in part, on one or more performance metrics indicative of at least one of an amount of unclustered data (Sal: ¶ 57, etc.; Fig 3: system identifies data unaffiliated or associated with a single cluster such as that of account D); (Osu: ¶ 75, 76; fig 9A, 10A: system identifies outlying embedding excluded from clusters), noisy data (Sal: Abstract; ¶ 14, 19, 54, 57: a silhouette score is a measure of cluster noise in as much as it provides a metric for cluster consistency which is considered to provide a measure of noise within a cluster in the form of a measure of how well particular data point(s) integrate distance wise with the overall cluster); (Osu: ¶ 75, 76; fig 9A, 10A: system identifies noisy data in the form of outlying embedding excluded from clusters), duplicate clusters (Sal: ¶ 39: silhouette score is considered a measure of redundancy as clusters which are not well separated, insufficiently distinguished, etc. are increasingly duplicates of one another); (Osu: ¶ 4, 61, 74: clusters with high similarity values considered duplicate clusters such as with relation to an evaluation threshold) or sub-clusters to be merged, among the members of the one or more clusters of data. The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claims 8, 15—the claims recite substantially similar subject matter to that of claim 1 supra and are similarly rejected
Regarding claim 9—the claim recites substantially similar subject matter to that of claim 2 supra and is similarly rejected.
Regarding claim 10—the claim recites substantially similar subject matter to that of claim 3 supra and is similarly rejected.
Regarding claim 11—the claim recites substantially similar subject matter to that of claim 4 supra and is similarly rejected.
Regarding claim 12—the claim recites substantially similar subject matter to that of claim 5 supra and is similarly rejected.
Regarding claims 13, 17—the claims recite substantially similar subject matter to that of claim 6 supra and are similarly rejected.
Regarding claims 14, 16—the claims recite substantially similar subject matter to that of claim 7 supra and are similarly rejected.
Regarding claim 18
Sal in view of Osu teaches or suggests:
The method of claim 15, further comprising: updating parameters of one or more data clustering algorithms based, at least in part on, the one or more performance metrics (Sal: ¶ 13, 28, 35, etc.: recurrent neural network (RNN) used to predict relationships among input data, an RNN is considered to update weight vectors, array/matrices thereof and upon each timestep, cadences of timesteps etc.); (Osu: ¶ 27, 61, 67; Fig 4: iterative analysis of document comparisons such as by back propagation of BCE loss). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
Regarding claim 19
Sal in view of Osu teaches or suggests:
The method of claim 15, further comprising generating a similarity matrix based, at least in part on the vectors generated by the one or more neural networks. Examiner takes official notice that utilizing data structures such as similarity matrices to store or buffer vector data for comparison, clustering, processing by a neural network, etc. was well known in the art before the effective filing date of the instant invention and would have comprised an obvious inclusion such as for managing the dimensionality of data in a computationally efficient manner; one of ordinary skill in the art would have expected only predictable results therefrom. The claim is thus considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
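By way of example only, a similarity matrix of the kind referred to in the official notice above can be built from a set of vectors as sketched below. The choice of cosine similarity and the function names are assumptions for illustration; the sketch is not drawn from Sal, Osu, or the instant application.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(vectors):
    """Symmetric n x n matrix of pairwise cosine similarities."""
    n = len(vectors)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Similarity is symmetric, so each pair is computed once.
            matrix[i][j] = matrix[j][i] = cosine(vectors[i], vectors[j])
    return matrix


# Hypothetical embedding vectors.
sample_matrix = similarity_matrix([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
```

Entry [i][j] holds the similarity between record i and record j, so intra-cluster and inter-cluster comparisons reduce to reading the corresponding rows and columns.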
Regarding claim 20
Sal in view of Osu teaches or suggests:
The method of claim 19, further comprising generating one or more performance metrics related to an amount of unclustered data (Sal: ¶ 57, etc.; Fig 3: system identifies data unaffiliated or associated with a single cluster such as that of account D); (Osu: ¶ 75, 76; fig 9A, 10A: system identifies outlying embedding excluded from clusters), noisy data (Sal: Abstract; ¶ 14, 19, 54, 57: a silhouette score is a measure of cluster noise in as much as it provides a metric for cluster consistency which is considered to provide a measure of noise within a cluster in the form of a measure of how well particular data point(s) integrate distance wise with the overall cluster); (Osu: ¶ 75, 76; fig 9A, 10A: system identifies noisy data in the form of outlying embedding excluded from clusters), duplicate clusters (Sal: ¶ 39: silhouette score is considered a measure of redundancy as clusters which are not well separated, insufficiently distinguished, etc. are increasingly duplicates of one another); (Osu: ¶ 4, 61, 74: clusters with high similarity values considered duplicate clusters such as with relation to an evaluation threshold) or sub-clusters to be merged, among the members of the one or more clusters of data; based, at least in part, on the similarity matrix (please see claim 19 supra: the utility of operating with respect to a similarity matrix/matrices is considered obvious based thereon). The claim is considered obvious over Sal as modified by Osu as addressed in the base claim as it would have been obvious to apply the further teaching of Sal and/or Osu to the modified device of Sal and Osu; one of ordinary skill in the art would have expected only predictable results therefrom.
Response to Arguments
Applicant’s arguments in concert with claim amendments, see Remarks and Claims, filed 10/3/25, with respect to the rejection(s) of claim(s) 1-20 under 35 USC 103 over Deshpande in view of Lehal have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Saleh in view of Osuala.
Conclusion
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to PAUL C MCCORD whose telephone number is (571)270-3701. The examiner can normally be reached 7:30-6:30, M-F.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, CAROLYN EDWARDS can be reached at (571) 270-7136. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/PAUL C MCCORD/Primary Examiner, Art Unit 2692