DETAILED ACTION
This Office action is in response to Applicant’s reply filed 09/17/2025.
Claims 1 and 7-10 are pending.
Claims 2-6 are canceled. Claims 9-10 are new.
Claims 1 and 7-10 are rejected.
Notice of AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Statutory Review under 35 USC § 101
Claims 1 and 9-10 are directed toward a system and have been reviewed.
Claims 1 and 9-10 appear to be statutory, as the system includes hardware (at least one processor) as disclosed in ¶ 0126 of the applicant’s specification.
Claims 1 and 9-10 also now appear to be directed to patent-eligible subject matter, as claim 1 has been amended to incorporate the patent-eligible subject matter of at least claim 3.
Claim 7 is directed towards a method and has been reviewed.
Claim 7 also now appears to be directed to patent-eligible subject matter, as claim 7 has been amended to incorporate the patent-eligible subject matter of at least claim 3.
Claim 8 is directed toward an article of manufacture and has been reviewed.
Claim 8 initially appears to be statutory, as the article of manufacture excludes transitory signals.
Claim 8 also now appears to be directed to patent-eligible subject matter, as claim 8 has been amended to incorporate the patent-eligible subject matter of at least claim 3.
Response to Amendments - 35 USC § 101
Claims 1 and 6-8 were previously rejected under 35 U.S.C. 101 because the claimed invention was directed to an abstract idea without significantly more.
Independent claims 1 and 7-8 have been amended to include at least the subject matter of dependent claims 2-3.
Claims 1 and 7-8 are now directed to patent-eligible subject matter as the judicial exception is integrated into a practical application as per (Revised) Step 2A, Prong Two of the patent subject matter eligibility determination.
Specifically, the claims recite additional elements demonstrating that the claim as a whole integrates the exception into a practical application, and the claims have been evaluated to ensure that they reflect the disclosed improvement. The claims are drawn to calculating an evaluation index (a mathematical concept and thus an abstract idea), but doing so on a cluster comprising a plurality of embeddings that have all been generated from a plurality of training data pieces including the same label. Adjusting the calculation of the evaluation index based on a cluster of embeddings generated from identically-labeled training data pieces demonstrates an improved calculation.
These additional claim elements improve the functioning of a computer or another technology or technical field, thus integrating the abstract idea into a practical application.
The rejection of claims 1 and 7-8 is withdrawn.
Dependent claim 6 is canceled, and the rejection of claim 6 is now moot.
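For illustration only, the occupancy-condition calculation discussed above can be sketched as follows. The function name, the purity threshold, and the example data are hypothetical and do not appear in the claims or the record; the sketch merely shows one way a cluster could be "regarded as occupied" by identically-labeled embeddings and a ratio computed over the clustering result.

```python
# Illustrative sketch only (hypothetical names, threshold, and data;
# the claims do not specify a purity value or clustering algorithm).
from collections import Counter

def occupancy_ratio(cluster_ids, labels, purity=0.9):
    """Fraction of embeddings that fall in clusters satisfying an
    occupancy condition: at least `purity` of a cluster's members
    share the same label."""
    members = {}
    for cid, label in zip(cluster_ids, labels):
        members.setdefault(cid, []).append(label)
    occupied = 0
    for labs in members.values():
        top_count = Counter(labs).most_common(1)[0][1]
        if top_count / len(labs) >= purity:
            occupied += len(labs)
    return occupied / len(labels)

# Example: two single-label clusters and one mixed cluster, six embeddings.
cluster_ids = [0, 0, 1, 1, 2, 2]
labels = ["A", "A", "B", "B", "A", "B"]
print(occupancy_ratio(cluster_ids, labels))  # 4 of 6 embeddings, ~0.667
```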
Response to Arguments
35 U.S.C. 101
Applicant’s arguments, see pp. 6-8, filed 09/17/2025, with respect to the 35 U.S.C. 101 rejection of claims 1 and 6-8 have been fully considered and are persuasive. The 35 U.S.C. 101 rejection of claims 1 and 6-8 has been withdrawn.
35 U.S.C. 112
Applicant’s arguments, see p. 8, filed 09/17/2025, with respect to the 35 U.S.C. 112(b) rejection of claim 5 have been fully considered; due to the cancellation of claim 5, the 35 U.S.C. 112(b) rejection of claim 5 has been withdrawn.
35 U.S.C. 103
Applicant’s arguments, see pp. 8-11, filed 09/17/2025, with respect to the rejection of claims 1-8 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground of rejection is made.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 7, and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Cao et al., U.S. Patent Application Publication No. 2018/0032897 (hereinafter Cao) in view of Cheng et al., U.S. Patent Application Publication No. 2025/0124026 (filed October 11, 2023, prior to the priority date of October 17, 2023 for the instant application; hereinafter Cheng) in further view of Li et al., U.S. Patent Application Publication No. 2023/0237769 (previously utilized in the rejection of dependent claim 2; published July 27, 2023, prior to the priority date of October 17, 2023 for the instant application; hereinafter Li) in further view of Osuala et al., U.S. Patent Application Publication No. 2022/0374598 (hereinafter Osuala).
Regarding claim 1, Cao teaches:
An evaluation apparatus, comprising at least one processor, the at least one processor carrying out: (Cao FIG. 7, ¶ 0037: The method in one embodiment is performed by at least one hardware processor)
an acquisition process of acquiring embeddings for natural language sentences that are respectively included in a plurality of training data pieces, the embeddings having been generated with use of … a language processing model; (Cao FIG. 1, ¶ 0017-0018: generates document embedding in one embodiment of the present disclosure. A word is represented by a vector, called word embedding, and a cluster is represented by the sum embeddings of its words. The word embeddings are trained by the skip-gram model using news documents in one embodiment of the present disclosure ... To compute the embedding for a document, the document may be received, for example, from a computer network 102. The document may be preprocessed at 104. Examples of preprocessing may include word segmentation, training of word embedding, and filtering out stop words)
a clustering process of carrying out clustering of the embeddings… (Cao FIG. 1, ¶ 0034: At 120, clustering may be performed on the documents encoded with embedded representation. The method in one embodiment of the present disclosure clusters the documents, whose representations are embeddings, by the k-means method)
a calculation process of, for each result of the clustering, identifying, with reference to labels included in the plurality of training data pieces, one or more clusters that satisfy an occupancy condition, the occupancy condition being a condition for regarding a cluster as being occupied by embeddings … (Cao FIG. 1, ¶ 0035: At 124, cluster quality evaluation may be performed. For instance, a measurement method such as S_Dbw, a popular clustering validation metric may be computed; see then Cao FIG. 6, ¶ 0037-0041: for a given word in the document, a cosine similarity between a word embedding of the given word and the word embeddings of words in an existing cluster may be determined. For instance, the word embeddings of words in a cluster may be summed to represent the cluster, and the cluster's word embedding representation may be compared with the word embedding of the given word. Responsive to determining that the cosine similarity between the word embedding of the given word and the word embeddings of the existing cluster meets a defined threshold, the given word is placed in the existing cluster)
an evaluation process of evaluating quality of the embedding… (Cao FIG. 1, ¶ 0035: At 124, cluster quality evaluation may be performed. For instance, a measurement method such as S_Dbw, a popular clustering validation metric may be computed. The smaller S_Dbw is, the better clusters are. The processing at 124 evaluates the quality of document embeddings learned by a method of the present disclosure in one embodiment … If the documents have better embeddings, better clusters will result)
Cao does not expressly disclose use of an embedding layer included in a language processing model.
Cao further does not expressly disclose carrying out clustering of the embeddings a plurality of times while varying a number of clusters.
Cao further does not expressly disclose embeddings generated from training data pieces including the same label.
Cao further does not expressly disclose:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Cheng addresses at least some of these limitations by teaching:
the embeddings having been generated with use of an embedding layer included in a language processing model; (Cheng ¶ 0025: The embedding generator 310 generates, using the selected text embedding model 220S, the text embedding 312 for each requested data element 152; Cheng ¶ 0030: the model selector 210 may freeze the weights of one or more early layers in the pre-trained selected text embedding model 220S, add any required task-specific output layers, and train the output layers using a relatively small domain-specific training data set pulled from the data store 150)
Cheng also teaches:
an evaluation process of evaluating quality of the embedding… (Cheng ¶ 0029: The model selector 210 may evaluate the parameters 222 and the thresholds 202 to select the text embedding model 220 based on the thresholds 202 (e.g., the cost threshold 202A and/or the quality threshold 202B). For example, the model selector 210 uses a cost function or the like to evaluate the thresholds 202 and the parameters 222)
Cheng also teaches the embedding layer. (Cheng ¶ 0030: the model selector 210 may freeze the weights of one or more early layers in the pre-trained selected text embedding model 220S, add any required task-specific output layers, and train the output layers using a relatively small domain-specific training data set pulled from the data store 150)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the embedding and models of Cao with the embedding and models of Cheng.
In addition, both references (Cao and Cheng) are analogous art directed to the same field of endeavor, namely the determination of embedding quality.
The motivation to do so would be to improve Cao's evaluation of embedding quality with Cheng's similar evaluation of quality, with the added improvement of utilizing cost or quality thresholds.
Further motivation lies in Cheng's teaching, suggestion, or motivation for a person of ordinary skill in the art to balance scalability, embedding quality, and cost based on the user's needs, desires, and available resources (Cheng ¶ 0035).
Cao in view of Cheng does not expressly disclose carrying out clustering of the embeddings a plurality of times while varying a number of clusters.
Cao in view of Cheng further does not expressly disclose embeddings generated from training data pieces including the same label.
Cao in view of Cheng further does not expressly disclose:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao in view of Cheng further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Li teaches a clustering process of carrying out clustering of the embeddings a plurality of times while varying the number of clusters. (Li ¶ 0036: to vary the precision of the classification module, the number of clusters 322C of each of the plurality of categories may be varied into a new number of clusters 322C. When the number of clusters 322C are varied, the plurality of feature vectors of each of the plurality of categories may be re-clustered into the new number of clusters 322C ... after the number of centroids, K, is varied or adjusted, the plurality of feature vectors of each of the plurality of categories may be re-clustered into new clusters 322C accordingly. Based on the new number of clusters 322C, new clustered centroids may be determined, and the classification layer 342 in the image recognition model 330 may be re-generated based on the new clustered centroids)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Li.
The motivation to do so would be to improve Cao as modified, which uses models to generate embeddings, with Li's similar use of models, with the added improvement of adjustable precision.
Further motivation lies in Li's teaching, suggestion, or motivation for a person of ordinary skill in the art to implement improved recognition model accuracy and improved classification layer accuracy (Li ¶ 0034, ¶ 0036).
Cao in view of Cheng and Li does not expressly disclose embeddings generated from training data pieces including the same label.
Cao in view of Cheng and Li further does not expressly disclose:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao in view of Cheng and Li further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Osuala addresses this by teaching the following:
Osuala teaches embeddings generated from training data pieces including the same label. (Osuala ¶ 0067: the server computer 102 trains (or fine tunes) a neural network classifier that predicts if an embedding pair is from the same cluster or if the two embeddings are from the same (or, based on probability predictions, different) clusters by, for example, considering loss (e.g., binary cross entropy). After training, the neural network classifier has learned a reference text-specific function that describes the semantic similarity between the cluster distributions of the compared texts)
Osuala further teaches:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering; (Osuala FIG. 2, ¶ 0077: cluster weights w.sub.A of c.sub.A and w.sub.B of c.sub.B are the ratio of the number of embeddings in the respective cluster c.sub.A (or c.sub.B) compared to the total number of clustered embeddings of A (or B))
an evaluation process of evaluating quality of the embedding layer based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion. (Osuala FIG. 2, FIG. 12, ¶ 0077-0079: Exemplary cluster weights, relative weight proportions, summary scores, and weighted summary scores are shown combined according to aspects of the present invention in explanation 1202 of FIG. 12 to provide an exemplary comparison document similarity value S.sub.BA. In the example 1202 shown in FIG. 12, the S.sub.BA is 0.84 which exceeds an exemplary sufficiency threshold of 0.6)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Osuala.
The motivation to do so would be to improve the similarity determination of Cao as modified with Osuala's similar similarity determination, with the added improvement of adjustable similarity thresholds (Osuala ¶ 0077-0079).
Further motivation lies in Osuala's teaching, suggestion, or motivation for a person of ordinary skill in the art to improve comparison accuracy while increasing computational efficiency (Osuala ¶ 0009-0010).
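For illustration only, the cluster-weight ratio mapped above from Osuala ¶ 0077 (the number of embeddings in a cluster divided by the total number of clustered embeddings) and the exemplary threshold comparison of ¶ 0079 can be sketched as follows; the function name and the example cluster sizes are hypothetical and are not drawn from the reference.

```python
# Illustrative sketch of Osuala-style cluster weights (¶ 0077):
# each cluster's weight is its embedding count over the total number
# of clustered embeddings. Names and counts are hypothetical.
def cluster_weights(cluster_sizes):
    total = sum(cluster_sizes)
    return [size / total for size in cluster_sizes]

sizes = [42, 30, 28]            # hypothetical membership counts per cluster
weights = cluster_weights(sizes)
print(weights)                  # [0.42, 0.3, 0.28]; weights sum to 1

# Exemplary threshold comparison per Osuala ¶ 0079: a similarity value
# of 0.84 exceeds the exemplary sufficiency threshold of 0.6.
print(0.84 > 0.6)
```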
Regarding claim 7, Cao teaches:
An evaluation method, comprising: an acquisition process in which at least one processor acquires embeddings for natural language sentences that are respectively included in a plurality of training data pieces, the embeddings having been generated with use of … a language processing model; (Cao FIG. 1, ¶ 0017-0018: generates document embedding in one embodiment of the present disclosure. A word is represented by a vector, called word embedding, and a cluster is represented by the sum embeddings of its words. The word embeddings are trained by the skip-gram model using news documents in one embodiment of the present disclosure ... To compute the embedding for a document, the document may be received, for example, from a computer network 102. The document may be preprocessed at 104. Examples of preprocessing may include word segmentation, training of word embedding, and filtering out stop words)
a clustering process in which the at least one processor carries out clustering of the embeddings… (Cao FIG. 1, ¶ 0034: At 120, clustering may be performed on the documents encoded with embedded representation. The method in one embodiment of the present disclosure clusters the documents, whose representations are embeddings, by the k-means method)
a calculation process in which, for each result of the clustering, the at least one processor identifies, with reference to labels included in the plurality of training data pieces, one or more clusters that satisfy an occupancy condition, the occupancy condition being a condition for regarding a cluster as being occupied by embeddings … (Cao FIG. 1, ¶ 0035: At 124, cluster quality evaluation may be performed. For instance, a measurement method such as S_Dbw, a popular clustering validation metric may be computed; see then Cao FIG. 6, ¶ 0037-0041: for a given word in the document, a cosine similarity between a word embedding of the given word and the word embeddings of words in an existing cluster may be determined. For instance, the word embeddings of words in a cluster may be summed to represent the cluster, and the cluster's word embedding representation may be compared with the word embedding of the given word. Responsive to determining that the cosine similarity between the word embedding of the given word and the word embeddings of the existing cluster meets a defined threshold, the given word is placed in the existing cluster)
an evaluation process in which the at least one processor evaluates quality of the embedding… (Cao FIG. 1, ¶ 0035: At 124, cluster quality evaluation may be performed. For instance, a measurement method such as S_Dbw, a popular clustering validation metric may be computed. The smaller S_Dbw is, the better clusters are. The processing at 124 evaluates the quality of document embeddings learned by a method of the present disclosure in one embodiment … If the documents have better embeddings, better clusters will result)
Cao does not expressly disclose use of an embedding layer included in a language processing model.
Cao further does not expressly disclose carrying out clustering of the embeddings a plurality of times while varying a number of clusters.
Cao further does not expressly disclose embeddings generated from training data pieces including the same label.
Cao further does not expressly disclose:
calculates, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Cheng addresses at least some of these limitations by teaching:
the embeddings having been generated with use of an embedding layer included in a language processing model; (Cheng ¶ 0025: The embedding generator 310 generates, using the selected text embedding model 220S, the text embedding 312 for each requested data element 152; Cheng ¶ 0030: the model selector 210 may freeze the weights of one or more early layers in the pre-trained selected text embedding model 220S, add any required task-specific output layers, and train the output layers using a relatively small domain-specific training data set pulled from the data store 150)
Cheng also teaches:
an evaluation process in which the at least one processor evaluates quality of the embedding… (Cheng ¶ 0029: The model selector 210 may evaluate the parameters 222 and the thresholds 202 to select the text embedding model 220 based on the thresholds 202 (e.g., the cost threshold 202A and/or the quality threshold 202B). For example, the model selector 210 uses a cost function or the like to evaluate the thresholds 202 and the parameters 222)
Cheng also teaches the embedding layer. (Cheng ¶ 0030: the model selector 210 may freeze the weights of one or more early layers in the pre-trained selected text embedding model 220S, add any required task-specific output layers, and train the output layers using a relatively small domain-specific training data set pulled from the data store 150)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the embedding and models of Cao with the embedding and models of Cheng.
In addition, both references (Cao and Cheng) are analogous art directed to the same field of endeavor, namely the determination of embedding quality.
The motivation to do so would be to improve Cao's evaluation of embedding quality with Cheng's similar evaluation of quality, with the added improvement of utilizing cost or quality thresholds.
Further motivation lies in Cheng's teaching, suggestion, or motivation for a person of ordinary skill in the art to balance scalability, embedding quality, and cost based on the user's needs, desires, and available resources (Cheng ¶ 0035).
Cao in view of Cheng does not expressly disclose carrying out clustering of the embeddings a plurality of times while varying a number of clusters.
Cao in view of Cheng further does not expressly disclose embeddings generated from training data pieces including the same label.
Cao in view of Cheng further does not expressly disclose:
calculates, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao in view of Cheng further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Li teaches a clustering process in which the at least one processor carries out clustering of the embeddings a plurality of times while varying the number of clusters. (Li ¶ 0036: to vary the precision of the classification module, the number of clusters 322C of each of the plurality of categories may be varied into a new number of clusters 322C. When the number of clusters 322C are varied, the plurality of feature vectors of each of the plurality of categories may be re-clustered into the new number of clusters 322C ... after the number of centroids, K, is varied or adjusted, the plurality of feature vectors of each of the plurality of categories may be re-clustered into new clusters 322C accordingly. Based on the new number of clusters 322C, new clustered centroids may be determined, and the classification layer 342 in the image recognition model 330 may be re-generated based on the new clustered centroids)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Li.
The motivation to do so would be to improve Cao as modified, which uses models to generate embeddings, with Li's similar use of models, with the added improvement of adjustable precision.
Further motivation lies in Li's teaching, suggestion, or motivation for a person of ordinary skill in the art to implement improved recognition model accuracy and improved classification layer accuracy (Li ¶ 0034, ¶ 0036).
Cao in view of Cheng and Li does not expressly disclose embeddings generated from training data pieces including the same label.
Cao in view of Cheng and Li further does not expressly disclose:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao in view of Cheng and Li further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Osuala addresses this by teaching the following:
Osuala teaches embeddings generated from training data pieces including the same label. (Osuala ¶ 0067: the server computer 102 trains (or fine tunes) a neural network classifier that predicts if an embedding pair is from the same cluster or if the two embeddings are from the same (or, based on probability predictions, different) clusters by, for example, considering loss (e.g., binary cross entropy). After training, the neural network classifier has learned a reference text-specific function that describes the semantic similarity between the cluster distributions of the compared texts)
Osuala further teaches:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering; (Osuala FIG. 2, ¶ 0077: cluster weights w.sub.A of c.sub.A and w.sub.B of c.sub.B are the ratio of the number of embeddings in the respective cluster c.sub.A (or c.sub.B) compared to the total number of clustered embeddings of A (or B))
an evaluation process in which the at least one processor evaluates quality of the embedding layer based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion. (Osuala FIG. 2, FIG. 12, ¶ 0077-0079: Exemplary cluster weights, relative weight proportions, summary scores, and weighted summary scores are shown combined according to aspects of the present invention in explanation 1202 of FIG. 12 to provide an exemplary comparison document similarity value S.sub.BA. In the example 1202 shown in FIG. 12, the S.sub.BA is 0.84 which exceeds an exemplary sufficiency threshold of 0.6)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Osuala.
The motivation to do so would be to improve the similarity determination of Cao as modified with Osuala's similar similarity determination, with the added improvement of adjustable similarity thresholds (Osuala ¶ 0077-0079).
Further motivation lies in Osuala's teaching, suggestion, or motivation for a person of ordinary skill in the art to improve comparison accuracy while increasing computational efficiency (Osuala ¶ 0009-0010).
Regarding claim 8, Cao teaches:
A non-transitory storage medium storing a program for causing a computer to function as an evaluation apparatus, the program causing the computer to carry out: (Cao ¶ 0053: The present invention may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention)
an acquisition process of acquiring embeddings for natural language sentences that are respectively included in a plurality of training data pieces, the embeddings having been generated with use of … a language processing model; (Cao FIG. 1, ¶ 0017-0018: generates document embedding in one embodiment of the present disclosure. A word is represented by a vector, called word embedding, and a cluster is represented by the sum embeddings of its words. The word embeddings are trained by the skip-gram model using news documents in one embodiment of the present disclosure ... To compute the embedding for a document, the document may be received, for example, from a computer network 102. The document may be preprocessed at 104. Examples of preprocessing may include word segmentation, training of word embedding, and filtering out stop words)
a clustering process of carrying out clustering of the embeddings; (Cao FIG. 1, ¶ 0034: At 120, clustering may be performed on the documents encoded with embedded representation. The method in one embodiment of the present disclosure clusters the documents, whose representations are embeddings, by the k-means method)
a calculation process in which, for each result of the clustering, the computer identifies, with reference to labels included in the plurality of training data pieces, one or more clusters that satisfy an occupancy condition, the occupancy condition being a condition for regarding a cluster as being occupied by embeddings … (Cao FIG. 1, ¶ 0035: At 124, cluster quality evaluation may be performed. For instance, a measurement method such as S_Dbw, a popular clustering validation metric may be computed; see then Cao FIG. 6, ¶ 0037-0041: for a given word in the document, a cosine similarity between a word embedding of the given word and the word embeddings of words in an existing cluster may be determined. For instance, the word embeddings of words in a cluster may be summed to represent the cluster, and the cluster's word embedding representation may be compared with the word embedding of the given word. Responsive to determining that the cosine similarity between the word embedding of the given word and the word embeddings of the existing cluster meets a defined threshold, the given word is placed in the existing cluster)
an evaluation process in which the computer evaluates quality of the embedding… (Cao FIG. 1, ¶ 0035: At 124, cluster quality evaluation may be performed. For instance, a measurement method such as S_Dbw, a popular clustering validation metric may be computed. The smaller S_Dbw is, the better clusters are. The processing at 124 evaluates the quality of document embeddings learned by a method of the present disclosure in one embodiment … If the documents have better embeddings, better clusters will result)
Cao does not expressly disclose use of an embedding layer included in a language processing model.
Cao further does not expressly disclose carrying out clustering of the embeddings a plurality of times while varying a number of clusters.
Cao further does not expressly disclose embeddings generated from training data pieces including the same label.
Cao further does not expressly disclose:
calculates, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Cheng addresses at least some of these limitations by teaching:
the embeddings having been generated with use of an embedding layer included in a language processing model; (Cheng ¶ 0025: The embedding generator 310 generates, using the selected text embedding model 220S, the text embedding 312 for each requested data element 152; Cheng ¶ 0030: the model selector 210 may freeze the weights of one or more early layers in the pre-trained selected text embedding model 220S, add any required task-specific output layers, and train the output layers using a relatively small domain-specific training data set pulled from the data store 150)
Cheng also teaches:
an evaluation process in which the computer evaluates quality of the embedding… (Cheng ¶ 0029: The model selector 210 may evaluate the parameters 222 and the thresholds 202 to select the text embedding model 220 based on the thresholds 202 (e.g., the cost threshold 202A and/or the quality threshold 202B). For example, the model selector 210 uses a cost function or the like to evaluate the thresholds 202 and the parameters 222)
Cheng also teaches the embedding layer. (Cheng ¶ 0030: the model selector 210 may freeze the weights of one or more early layers in the pre-trained selected text embedding model 220S, add any required task-specific output layers, and train the output layers using a relatively small domain-specific training data set pulled from the data store 150)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the embeddings and models of Cao with the embeddings and models of Cheng.
In addition, both references (Cao and Cheng) are analogous art and are directed to the same field of endeavor, namely the determination of embedding quality.
Motivation to do so would be to improve the functioning of Cao performing an evaluation of quality with the functioning of Cheng also performing an evaluation of quality but with the improvement of utilizing cost or quality thresholds.
Motivation to do so would also be the teaching, suggestion, or motivation for a person of ordinary skill in the art to balance scalability, embedding quality, and cost based on the user's needs, desires, and available resources as seen in Cheng (¶ 0035).
Cao in view of Cheng does not expressly disclose carrying out clustering of the embeddings a plurality of times while varying a number of clusters.
Cao in view of Cheng further does not expressly disclose embeddings generated from training data pieces including the same label.
Cao in view of Cheng further does not expressly disclose:
calculates, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao in view of Cheng further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Li teaches a clustering process of carrying out clustering of the embeddings a plurality of times while varying the number of clusters. (Li ¶ 0036: to vary the precision of the classification module, the number of clusters 322C of each of the plurality of categories may be varied into a new number of clusters 322C. When the number of clusters 322C are varied, the plurality of feature vectors of each of the plurality of categories may be re-clustered into the new number of clusters 322C ... after the number of centroids, K, is varied or adjusted, the plurality of feature vectors of each of the plurality of categories may be re-clustered into new clusters 322C accordingly. Based on the new number of clusters 322C, new clustered centroids may be determined, and the classification layer 342 in the image recognition model 330 may be re-generated based on the new clustered centroids)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Li.
Motivation to do so would be to improve the functioning of Cao as modified using models (to generate embeddings) with the similar functioning of Li also using models but with the improved adjustment of precision.
Motivation to do so would also be the teaching, suggestion, or motivation for a person of ordinary skill in the art to implement improved recognition model accuracy and also to implement improved classification layer accuracy as seen in Li ¶ 0034 and ¶ 0036.
Cao in view of Cheng and Li does not expressly disclose embeddings generated from training data pieces including the same label.
Cao in view of Cheng and Li further does not expressly disclose:
calculates, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering;
Cao in view of Cheng and Li further does not expressly disclose evaluating quality based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion.
However, Osuala addresses this by teaching the following:
Osuala teaches embeddings generated from training data pieces including the same label. (Osuala ¶ 0067: the server computer 102 trains (or fine tunes) a neural network classifier that predicts if an embedding pair is from the same cluster or if the two embeddings are from the same (or, based on probability predictions, different) clusters by, for example, considering loss (e.g., binary cross entropy). After training, the neural network classifier has learned a reference text-specific function that describes the semantic similarity between the cluster distributions of the compared texts)
Osuala further teaches:
calculating, for the result, a ratio obtained by dividing a total number of embeddings included in the identified clusters by a total number of the embeddings subjected to the clustering; (Osuala FIG. 2, ¶ 0077: cluster weights w.sub.A of c.sub.A and w.sub.B of c.sub.B are the ratio of the number of embeddings in the respective cluster c.sub.A (or c.sub.B) compared to the total number of clustered embeddings of A (or B))
an evaluation process in which the computer evaluates quality of the embedding layer based on whether a sequence of the ratios obtained for respective numbers of clusters satisfies a predetermined determination criterion. (Osuala FIG. 2, FIG. 12, ¶ 0077-0079: Exemplary cluster weights, relative weight proportions, summary scores, and weighted summary scores are shown combined according to aspects of the present invention in explanation 1202 of FIG. 12 to provide an exemplary comparison document similarity value S.sub.BA. In the example 1202 shown in FIG. 12, the S.sub.BA is 0.84 which exceeds an exemplary sufficiency threshold of 0.6)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Osuala.
Motivation to do so would be to improve the functioning of Cao as modified performing a similarity determination with the similar functioning of Osuala also performing similarity determination but with the improvement of inclusion of adjustable similarity thresholds (Osuala ¶ 0077-0079).
Motivation to do so would also be the teaching, suggestion, or motivation for a person of ordinary skill in the art to improve comparison accuracy while increasing computational efficiency as seen in Osuala ¶ 0009-0010.
Claims 9-10 are rejected under 35 U.S.C. 103 as being unpatentable over Cao in view of Cheng in further view of Li in further view of Osuala in further view of Dong, U.S. Patent Application Publication No. 2020/0285898 (hereinafter Dong).
Regarding claim 9, Cao in view of Cheng and Li and Osuala teaches all the features with respect to claim 1 above including:
wherein the at least one processor is further configured to obtain a first evaluation result of the quality of the embedding layer by performing the acquisition, clustering, calculation, and evaluation processes recited in claim 1 using a first language processing model, (Osuala ¶ 0001-0003: Computer systems that use Natural Language Processing (NLP) and similar Artificial Intelligence (AI) methods can produce summaries of various documents; Osuala shows acquisition and clustering in FIG. 2, ¶ 0062: The server computer 102 receives, at block 202, for a reference document 106, contextual word embeddings e.sub.A arranged into a first set of clusters a1,a2 (for example, as shown in FIGS. 6A. 7A, and 8A), each corresponding to a semantic topic within the reference document and represented schematically in vector space by a representative embedding c.sub.A; Osuala shows involvement of a neural network model classifier in ¶ 0067-0069; Osuala shows calculation and evaluation in ¶ 0077: cluster weights w.sub.A of c.sub.A and w.sub.B of c.sub.B are the ratio of the number of embeddings in the respective cluster c.sub.A (or c.sub.B) compared to the total number of clustered embeddings of A (or B) ... Exemplary cluster weights, relative weight proportions, summary scores, and weighted summary scores are shown combined according to aspects of the present invention in explanation 1202 of FIG. 12 to provide an exemplary comparison document similarity value S.sub.BA. In the example 1202 shown in FIG. 12, the S.sub.BA is 0.84 which exceeds an exemplary sufficiency threshold of 0.6)
obtaining a second evaluation result by performing the processes recited in claim 1 using a second language processing model that is different from the first language processing model… (Osuala ¶ 0068-0069: other prediction models selected by one skilled in this field could also suffice. For example, according to aspects of the present invention, the neural network classifier model may be exchanged by prediction model that include decision trees, Support Vector Machines (SVMs), random forest methods, and so forth)
when … the first … evaluation results fails to satisfy the predetermined determination criterion, (Osuala ¶ 0077-0080: The server computer 102, at block 216, determines by said computer for each comparison document, a sufficiency rating based, at least in part, on at least one of the cluster similarity values. In an embodiment, the sufficiency rating is based on a sum of the cluster similarity values)
Cao in view of Cheng and Li and Osuala does not expressly disclose using a second language processing model … while using the same training data set.
Cao in view of Cheng and Li and Osuala further does not expressly disclose:
when each of the first and second evaluation results fails to satisfy the predetermined determination criterion, evaluate that quality of the training data set does not satisfy the criterion.
However, Dong addresses this by teaching:
…using a second language processing model that is different from the first language processing model while using the same training data set, and, (Dong ¶ 0049: Different embodiments may use different techniques to group the training data. For example, one or more mixture models (e.g., a Bayesian Gaussian Mixture Model, etc.) and/or one or more clustering algorithms (e.g., a K-means clustering algorithm, etc.) may be used by the data clustering module 306 to group the training data in the initial training data set 330 into multiple clusters based on the attribute representations; Dong ¶ 0052: Different embodiments may use different techniques to determine the threshold ratio; Dong ¶ 0061: The nodes 516 and 518 may include different algorithms and/or different weights assigned to the data variables from the nodes 508-514 such that the nodes 516-518 may produce different values based on the same input values received from the nodes 508-514)
when each of the first and second evaluation results fails to satisfy the predetermined determination criterion, evaluate that quality of the training data set does not satisfy the criterion. (Dong ¶ 0018-0020: Having a corresponding ratio lower than the threshold ratio may indicate that at least some of the training data included in the cluster is either irrelevant for classifying data, mislabeled, or both; Dong ¶ 0053-0058: A cluster having a corresponding classification ratio below the threshold ratio indicates that the training data within the cluster is either irrelevant to the data classification and/or that at least some of the training data within the cluster is mislabeled)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Dong.
Motivation to do so would be to improve the functioning of Cao as modified using clustering models with the similar functioning of Dong also using clustering models but with the improved performance in grouping and clustering based on similarity as seen in Dong ¶ 0049.
Motivation to do so would also be the teaching, suggestion, or motivation for a person of ordinary skill in the art to implement improved service performance and quality for legitimate users and improved quality of training data for training classification models as seen in Dong ¶ 0001-0003.
Regarding claim 10, Cao in view of Cheng and Li and Osuala teaches all the features with respect to claim 1 above.
Cao teaches the occupancy condition. (Cao FIG. 6, ¶ 0037-0041: Responsive to determining that the cosine similarity between the word embedding of the given word and the word embeddings of the existing cluster meets a defined threshold, the given word is placed in the existing cluster)
Osuala teaches:
determining whether, from a plurality of clustering results, a relationship between (i) a number of clusters and (ii) a ratio of a total number of embeddings included in clusters that satisfy the … condition to a total number of the embeddings [that] satisfies a predetermined criterion, (Osuala FIG. 2, FIG. 9B, FIG. 10B, blocks 214-216, ¶ 0075: The server computer 102, at block 214, modifies the similarity value by multiplying the cluster similarity values by a modification value. According to aspects of the invention, the modification value is a cluster weight value based on a number of embeddings [relevant to number of clusters] (e.g., as shown in FIGS. 9A & 10A and discussed in explanations 902, 1002 shown in FIGS. 9B & 10B); Osuala FIG. 2, FIG. 12, ¶ 0077-0078: Exemplary cluster weights, relative weight proportions, summary scores, and weighted summary scores are shown combined according to aspects of the present invention in explanation 1202 of FIG. 12 to provide an exemplary comparison document similarity value S.sub.BA. In the example 1202 shown in FIG. 12, the S.sub.BA is 0.84 which exceeds an exemplary sufficiency threshold of 0.6 ... The server computer 102, at block 216, determines by said computer for each comparison document, a sufficiency rating based, at least in part, on at least one of the cluster similarity values [shown above to have had a modification value based on a number of embeddings applied to it])
and, based on the decision, outputs guidance… (Osuala FIG. 2, ¶ 0079-0080: The server computer 102, at block 218, responsive to the sufficiency rating exceeding a sufficiency threshold, determining the comparison document is a sufficient representation (e.g., a summary) of the reference document. The server computer 102, at block 220, generates document similarity values for a plurality of comparison documents. The server computer 102 uses the document similarity values to generate a ranked list of comparison documents that is ordered, at least in part, by those document similarity values)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the similarity determination of Cao as modified with the similarity determination of Osuala.
Motivation to do so would be to improve the functioning of Cao as modified performing cluster-based similarity determination with the similar functioning of Osuala also performing cluster-based similarity determination but with the improvement of the inclusion of adjustable similarity thresholds (Osuala ¶ 0077-0079).
Cao in view of Cheng and Li and Osuala does not expressly disclose improving at least one of a training data set, a pre-trained model, or hyperparameters.
However, Dong addresses this by teaching:
based on the decision … improving at least one of a training data set, a pre-trained model, or hyperparameters. (Dong ¶ 0018-0020: after the one or more clusters having the corresponding ratios below the threshold ratio are identified, at least a portion of the training data included in the identified one or more clusters are removed from the training set; Dong ¶ 0042: the training data set that was obtained initially (e.g., from the account database 136) may include noisy data, such as transaction requests that were mislabeled and/or transaction requests that were irrelevant in classifying future transaction requests. As such, the model generation module 204 of some embodiments may pre-process the training data set before using the training data set to train the classification model 202; Dong ¶ 0053-0058: the process 400 may remove irrelevant and/or mislabeled training data from the training data set multiple times, using a different threshold ratio each time, before training a classification model using the modified training data set)
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine the clustering of Cao as modified with the clustering of Dong.
Motivation to do so would be to improve the functioning of Cao as modified using clustering models with the similar functioning of Dong also using clustering models but with the improved performance in grouping and clustering based on similarity as seen in Dong ¶ 0049.
Motivation to do so would also be the teaching, suggestion, or motivation for a person of ordinary skill in the art to implement improved service performance and quality for legitimate users and improved quality of training data for training classification models as seen in Dong ¶ 0001-0003.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Boss et al., U.S. Patent Application Publication No. 2021/0365854, "Probabilistic Optimization for Testing Efficiency"; see Boss FIGs. 4-5, 7-11, ¶ 0028, ¶ 0075, and ¶ 0088 discussing embeddings and clusters above a threshold number/ratio, relevant to at least the independent claim limitations involving calculating a ratio obtained by performing division with numbers of embeddings
Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JEDIDIAH P FERRER whose telephone number is (571)270-7695. The examiner can normally be reached Monday-Friday 12:00pm-8:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kavita Stanley can be reached at (571)272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.P.F/Examiner, Art Unit 2153 January 5, 2026
/KRIS E MACKES/Primary Examiner, Art Unit 2153