Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statements (IDS) submitted on 06/05/2024 and 07/25/2025 were filed before the mailing date of the first office action. The submissions are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101. Claims 1-10 are directed to a method, claims 11-16 are directed to a system, and claims 17-20 are directed to a separate method; therefore, claims 1-20 fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter). However, claims 1-20 recite the judicial exception of an abstract idea, specifically the abstract ideas of “Mental Processes” (including observation, evaluation, and opinion) and “Mathematical Concepts” (including mathematical calculations and relationships).
Claim 1:
Claim 1 is directed to a method; therefore, the claim does fall within one of the four statutory categories (i.e., process, machine, manufacture, or composition of matter).
Claim 1 recites the following abstract ideas:
determining, within the embedding space, a pairwise embedding relationship between the first encoded point and the second encoded point (mental step directed to observation, evaluation – a person could determine a pairwise embedding relationship between observed encoded points in an embedding space in their mind);
generating a set of output object pairs by identifying pairs of objects that correspond to pairs of encoded points within the embedding space having the pairwise embedding relationship (mental step directed to observation, evaluation – a person could identify pairs of objects that have a pairwise embedding relationship in their mind and generate a set of output object pairs from these identified pairs in their mind, potentially assisted by pen and paper (see MPEP 2106.04(a)(2)(III)));
Claim 1 recites the following additional elements:
generating a first encoded point in an embedding space by encoding a first object utilizing a machine-learning model, wherein the embedding space includes encoded points based on a set of input data; generating a second encoded point in the embedding space by encoding a second object utilizing the machine-learning model; and providing the set of output object pairs to a client device.
Generating encoded points in an embedding space by encoding objects utilizing a machine-learning model is interpreted as merely implementing an abstract idea using a generic computer component, as the claims do not further define the machine-learning model nor distinguish the way this machine-learning model encodes object data from the way a person could encode objects as points in an embedding space in their mind, potentially assisted by pen and paper (see MPEP 2106.04(a)(2)(III)). Providing the set of output pairs to a client device is interpreted as transmitting data over a network. These additional elements do not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea (see MPEP 2106.05(d) and MPEP 2106.05(f)).
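Examiner notes, for illustration only, that the claim 1 flow of encoding two objects, determining a pairwise embedding relationship, and generating output object pairs could be sketched as follows. The toy encoder, the use of cosine similarity as the pairwise relationship, and the 0.99 threshold are assumptions made for this sketch and are not drawn from Applicant's disclosure.

```python
import math

def encode(obj):
    # Hypothetical stand-in for the machine-learning encoder: maps an
    # object (here, a string) to a point in a 2-D embedding space.
    return (float(len(obj)), float(sum(map(ord, obj)) % 10))

def cosine_similarity(p, q):
    # Pairwise embedding relationship, assumed here to be cosine similarity.
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.hypot(*p) * math.hypot(*q)
    return dot / norm if norm else 0.0

def output_pairs(objects, threshold=0.99):
    # Identify pairs of objects whose encoded points exhibit the pairwise
    # embedding relationship, i.e., cosine similarity >= threshold.
    points = {obj: encode(obj) for obj in objects}
    return [(a, b)
            for i, a in enumerate(objects)
            for b in objects[i + 1:]
            if cosine_similarity(points[a], points[b]) >= threshold]
```

Any comparable encoder and similarity measure would serve equally well for this illustration.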
Claim 11 is a system claim and its limitation is included in claim 1. The only difference is that claim 11 requires a system comprising a processor and a memory, which are interpreted as generic computer components merely used to implement the claimed abstract ideas (see MPEP 2106.05(f)). Therefore, claim 11 is rejected for the same reasons as claim 1.
Claim 17 is a method claim and its limitation is included in claim 11. The only difference is that claim 17 requires a method. Therefore, claim 17 is rejected for the same reasons as claim 11.
The independent claims are not patent eligible.
Dependent claims 2-10, 12-16, and 18-20 when analyzed as a whole are held to be patent ineligible under 35 U.S.C. 101 because the additional recited limitations fail to establish that the claims are not directed to an abstract idea, as they recite further embellishment of the judicial exception.
Claim 2 recites determining a first input object relationship between the first object and the second object; and determining that the embedding space of the machine-learning model preserves the first input object relationship by identifying the first input object relationship between each of the pairs of objects in the set of output object pairs.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could determine a relationship between first and second observed objects in their mind and identify in their mind that relationships between input object pairs and output object pairs are preserved in an observed embedding space after a machine learning model has been utilized. Examiner notes that this interpretation is supported by at least paragraph [0042] of Applicant’s specification, which states “the embeddings relationship system provides the output object pairs to a user, who makes determinations regarding the embedding space and relationship types” and paragraph [0071], which states “the embeddings relationship system 202 provides the output object pairs to a client device where a user is able to infer or determine whether the machine-learning model preserves a given relationship between the input object pair and the output object pairs”.
Claim 3 recites providing the first object to the machine-learning model to generate the first encoded point; identifying an encoded point within the embedding space that is close to the first encoded point according to a predefined distance metric; determining that the encoded point corresponds to a second object; generating an object pair that includes the first object and the second object; and utilizing the object pair as object inputs to the machine-learning model before generating the pairwise embedding relationship between the first encoded point and the second encoded point.
Identifying that a second encoded point is close to a first encoded point according to a predefined distance metric, and generating an object pair from the first and second objects are both interpreted as mental steps directed to observation, evaluation – a person could identify object pairs from observed encoded points in their mind based on a predefined mentally calculated distance metric. Providing an object to a machine learning model and utilizing an object pair as input to a machine learning model are both interpreted as additional elements directed to transmitting data over a network, which does not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea (see MPEP 2106.05(d)).
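Examiner notes, for illustration only, that the nearest-point step of claim 3 could be sketched as follows, assuming Euclidean distance as the predefined distance metric (the metric choice is an assumption, not Applicant's disclosure).

```python
import math

def nearest_encoded_point(query, candidates):
    # Identify the encoded point closest to the query point according to
    # a predefined distance metric (Euclidean distance assumed here).
    return min(candidates, key=lambda point: math.dist(query, point))
```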
Claim 4 recites determining one or more relationship types preserved in the embedding space from the set of input data based on analyzing the set of output object pairs. This limitation is interpreted as a mental step directed to observation, evaluation – a person could determine types of relationships preserved in an embedding space from observed input data in their mind based on analyzing observed output object pairs in their mind. Examiner notes that this interpretation is supported by at least paragraph [0042] of Applicant’s specification, which states “the embeddings relationship system provides the output object pairs to a user, who makes determinations regarding the embedding space and relationship types” and paragraph [0071], which states “the embeddings relationship system 202 provides the output object pairs to a client device where a user is able to infer or determine whether the machine-learning model preserves a given relationship between the input object pair and the output object pairs”.
Claim 5 recites generating pairwise relationship features for each point pair in a set of point pairs in the embedding space; clustering a group of point pairs having pairwise relationship features within a cluster distance threshold or a cluster density threshold; and providing object pairs corresponding to the group of point pairs as the set of output object pairs.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could generate pairwise relationship features for observed points in point pairs in their mind, and cluster a group of point pairs in their mind based on an observed or mentally determined cluster distance threshold or cluster density threshold. Providing object pairs as an output is interpreted as an additional element directed to transmitting data over a network, which does not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea (see MPEP 2106.05(d)).
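Examiner notes, for illustration only, that the clustering step of claim 5 could be sketched as follows; the offset-vector feature and the greedy single-pass, distance-threshold clustering strategy are illustrative assumptions, not Applicant's disclosed method.

```python
import math

def pair_features(p, q):
    # Pairwise relationship feature for a point pair: the offset vector
    # from the first point to the second (an illustrative choice).
    return (q[0] - p[0], q[1] - p[1])

def cluster_point_pairs(point_pairs, cluster_distance=0.5):
    # Greedy single-pass clustering: a point pair joins the first cluster
    # whose representative feature lies within the cluster distance
    # threshold; otherwise it starts a new cluster.
    clusters = []  # list of (representative_feature, member_pairs)
    for p, q in point_pairs:
        feature = pair_features(p, q)
        for representative, members in clusters:
            if math.dist(representative, feature) <= cluster_distance:
                members.append((p, q))
                break
        else:
            clusters.append((feature, [(p, q)]))
    return clusters
```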
Claim 6 recites obtaining a set of anchor object pairs that shares a given object relationship between paired objects within each anchor object pair, wherein the set of anchor object pairs includes the first object in an anchor pair with the second object; generating pairs of encoded anchor points from the set of anchor object pairs utilizing the machine-learning model; and generating an anchor embedding relationship metric based on determining the pairwise embedding relationship between the pairs of encoded anchor points.
Generating pairs of anchor points and generating an anchor embedding relationship metric are both interpreted as mental steps directed to observation, evaluation – a person could generate pairs of encoded anchor points in their mind and generate an anchor embedding relationship metric in their mind based on determining a pairwise embedding relationship between observed anchor point pairs in their mind. Obtaining a set of anchor object pairs that share a relationship is interpreted as an additional element directed to receiving data over a network, which does not integrate the abstract idea into a practical application or amount to significantly more than the abstract idea (see MPEP 2106.05(d)).
Claim 7 recites generating a set of non-anchor object pairs by randomly replacing one of the paired objects in each anchor object pair within the set of anchor object pairs; generating pairs of encoded non-anchor points from the set of non-anchor object pairs utilizing the machine-learning model; and generating a non-anchor embedding relationship metric based on determining an additional pairwise embedding relationship between the pairs of encoded non-anchor points.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could generate a set of non-anchor object pairs in their mind by randomly replacing one of the objects in the anchor pair, generate pairs of non-anchor points in their mind, and generate a non-anchor embedding relationship in their mind by determining an additional pairwise embedding relationship between non-anchor points in their mind.
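Examiner notes, for illustration only, that the random-replacement step of claim 7 could be sketched as follows; the seeded generator and the random choice of which paired object to keep are illustrative assumptions.

```python
import random

def make_non_anchor_pairs(anchor_pairs, object_pool, seed=0):
    # Generate non-anchor pairs by randomly replacing one of the paired
    # objects in each anchor pair with an object drawn from the pool.
    rng = random.Random(seed)  # seeded for reproducibility
    non_anchor_pairs = []
    for first, second in anchor_pairs:
        kept = rng.choice((first, second))
        non_anchor_pairs.append((kept, rng.choice(object_pool)))
    return non_anchor_pairs
```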
Claim 8 recites generating a given object relationship metric based on comparing the anchor embedding relationship metric to the non-anchor embedding relationship metric; and determining that the machine-learning model preserves the given object relationship based on the given object relationship metric satisfying a relationship strength metric.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could compare an anchor and a non-anchor embedding relationship metric in their mind, and determine that the machine-learning model preserves the given object relationship in their mind based on an observed or determined relationship strength metric.
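Examiner notes, for illustration only, that the comparison step of claim 8 could be sketched as follows; the mean aggregation and the 0.2 strength threshold are illustrative assumptions, not Applicant's disclosed metrics.

```python
def embedding_relationship_metric(pairwise_similarities):
    # Aggregate per-pair similarities into a single relationship metric
    # (the mean, as an illustrative choice).
    return sum(pairwise_similarities) / len(pairwise_similarities)

def preserves_relationship(anchor_sims, non_anchor_sims, strength_threshold=0.2):
    # The model is taken to preserve the given object relationship when
    # anchor pairs score sufficiently higher than randomized non-anchor pairs.
    gap = (embedding_relationship_metric(anchor_sims)
           - embedding_relationship_metric(non_anchor_sims))
    return gap >= strength_threshold
```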
Claim 9 recites generating a classifier to detect the anchor object pairs as positive and the non-anchor object pairs as negative; and determining that the machine-learning model preserves the given object relationship based on a performance of the classifier.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could classify, or detect, observed anchor object pairs as positive and non-anchor pairs as negative in their mind, and determine that the machine-learning model preserves the given object relationship in their mind based on the mental classification. Examiner notes that this interpretation is supported by at least paragraph [0042] of Applicant’s specification, which states “the embeddings relationship system provides the output object pairs to a user, who makes determinations regarding the embedding space and relationship types” and paragraph [0071], which states “the embeddings relationship system 202 provides the output object pairs to a client device where a user is able to infer or determine whether the machine-learning model preserves a given relationship between the input object pair and the output object pairs”.
Claim 10 recites determining that the machine-learning model has been modified; generating a modified given object relationship metric; and determining that the machine-learning model preserves the given object relationship based on the modified given object relationship metric satisfying the relationship strength metric.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could determine whether a machine learning model has been modified in their mind, generate a modified object relationship metric in their mind, and determine whether an object relationship has been preserved by the machine learning model based on determining in their mind whether the mentally determined modified object relationship metric satisfies the relationship strength metric. Examiner notes that this interpretation is supported by at least paragraph [0042] of Applicant’s specification, which states “the embeddings relationship system provides the output object pairs to a user, who makes determinations regarding the embedding space and relationship types” and paragraph [0071], which states “the embeddings relationship system 202 provides the output object pairs to a client device where a user is able to infer or determine whether the machine-learning model preserves a given relationship between the input object pair and the output object pairs”.
Claim 12 is a system claim and its limitation is included in claim 2. Claim 12 is rejected for the same reasons as claim 2.
Claim 13 recites determining, within the embedding space, an additional pairwise embedding relationship between the first encoded point and the second encoded point, wherein the additional pairwise embedding relationship differs from the pairwise embedding relationship; identifying additional pairs of encoded points within the embedding space that have the additional pairwise embedding relationship based on satisfying a threshold; and generating an additional set of output object pairs that includes additional pairs of objects that correspond to the additional pairs of encoded points within the embedding space.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could determine an additional pairwise embedding relationship between encoded points in their mind, identify additional pairs of encoded points that have the mentally determined additional relationship in their mind, and generate an additional set of output pairs with the mentally determined additional pairs in their mind.
Claim 14 recites determining a second input object relationship between the first object and the second object; and determining that the embedding space of the machine-learning model preserves the second input object relationship by identifying the second input object relationship between each of the additional pairs of objects in the additional set of output object pairs.
These limitations are interpreted as mental steps directed to observation, evaluation – a person could determine a second relationship between observed objects in their mind and a person could identify pairs of objects that have this second relationship preserved in an embedding space of the machine learning model in their mind. Examiner notes that this interpretation is supported by at least paragraph [0042] of Applicant’s specification, which states “the embeddings relationship system provides the output object pairs to a user, who makes determinations regarding the embedding space and relationship types” and paragraph [0071], which states “the embeddings relationship system 202 provides the output object pairs to a client device where a user is able to infer or determine whether the machine-learning model preserves a given relationship between the input object pair and the output object pairs”.
Claim 15 recites wherein generating the set of output object pairs includes ranking the pairs of objects within the set of output object pairs based on a relationship strength metric. This limitation is interpreted as a mental step directed to observation, evaluation – a person could rank observed object pairs based on an observed or mentally determined relationship strength metric in their mind.
Claim 16 recites determining the relationship strength metric for an object pair in the set of output object pairs based on a combination of vector distance and vector angle. This limitation is interpreted as a mental step directed to observation, evaluation – a person could determine the relationship strength metric for an observed or mentally determined object pair based on an observed or mentally determined combination of vector distance and vector angle in their mind, potentially assisted by pen and paper (see MPEP 2106.04(a)(2)(III)).
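Examiner notes, for illustration only, that the claim 16 combination of vector distance and vector angle could be sketched as follows; the equal weights and the 1/(1+d) distance normalization are illustrative assumptions, not Applicant's disclosed formula.

```python
import math

def relationship_strength(p, q, w_distance=0.5, w_angle=0.5):
    # Combine vector distance and vector angle into one strength score.
    # The weights and the 1/(1+d) normalization are illustrative choices.
    dot = sum(a * b for a, b in zip(p, q))
    norms = math.hypot(*p) * math.hypot(*q)
    cos_angle = dot / norms if norms else 0.0   # angle term, in [-1, 1]
    closeness = 1.0 / (1.0 + math.dist(p, q))   # distance term, in (0, 1]
    return w_distance * closeness + w_angle * cos_angle
```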
Claim 18 is a method claim and its limitation is included in claim 5. Claim 18 is rejected for the same reasons as claim 5.
Claim 19 is a method claim and its limitation is included in claim 2. Claim 19 is rejected for the same reasons as claim 2.
Claim 20 is a method claim and its limitation is included in claim 15. Claim 20 is rejected for the same reasons as claim 15.
Viewed as a whole, these additional claim elements do not provide meaningful limitations to transform the abstract idea into a patent eligible application of the abstract idea such that the claims amount to significantly more than the abstract idea itself. Therefore, the claims are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter.
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
(a)(2) the claimed invention was described in a patent issued under section 151, or in an application for patent published or deemed published under section 122(b), in which the patent or application, as the case may be, names another inventor and was effectively filed before the effective filing date of the claimed invention.
Claims 1, 3, 11, and 17 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Baek et al* (US 20220197961 A1, herein Baek).
*this document was cited in the IDS dated 06/05/2024
Regarding claim 1, Baek teaches a computer-implemented method (para. [0005] recites “Embodiments of the present disclosure provide benefits and/or solve one or more of the foregoing or other problems in the art with systems, non-transitory computer readable media, and methods that utilize machine-learning models to generate identifier embeddings from digital content identifiers and then leverage these identifier embeddings to determine digital connections between digital content items”), comprising:
generating a first encoded point in an embedding space by encoding a first object utilizing a machine-learning model, wherein the embedding space includes encoded points based on a set of input data; generating a second encoded point in the embedding space by encoding a second object utilizing the machine-learning model (para. [0055] recites “the identifier embedding system 104 may identify a first identifier embedding for the first digital content item and a second identifier embedding for a second identifier (e.g., embeddings generated utilizing a trained embedding machine-learning model)”. Para. [0079] recites “the identifier embedding system 104 can use a variety of encoding approaches” (i.e., generating at least a first and second encoded point in an embedding space using a machine learning model));
determining, within the embedding space, a pairwise embedding relationship between the first encoded point and the second encoded point (para. [0065] recites “the identifier embedding system 104 can utilize embedding machine learning models to generate identifier embeddings and determine digital connections”. Para. [0114] recites “the trained machine-learning model 508 processes the training identifier embeddings 506 (e.g., two at a time in pair-wise fashion). In particular, the trained machine-learning model 508 processes a first training identifier embedding 506a and a second training identifier embedding 506b” (i.e., determining pairwise relationships between at least a first and second embedding));
generating a set of output object pairs by identifying pairs of objects that correspond to pairs of encoded points within the embedding space having the pairwise embedding relationship (para. [0024] recites “the identifier embedding system can process a pair of training identifiers (corresponding to a pair of digital content items) utilizing the trained machine-learning model to generate a digital similarity prediction between the pair of digital content items”. Para. [0026] recites “the trained machine-learning model can generate a prediction that a pair of digital content items have other types of relationships” (i.e., outputting identified pairs of content items, or objects that have been encoded into embeddings using the process from at least fig. 5A, that have been determined to have a pair-wise relationship));
and providing the set of output object pairs to a client device (para. [0035] recites “the identifier embedding system can utilize a content management machine learning model to process identifier embeddings together with file extension embeddings, user activity embeddings, context data embeddings, or other available contextual information to flexibly generate classifications, predictions, or suggestions”. Para. [0050] recites “As shown in FIG. 1, the environment 100 includes server(s) 102, client devices 106a-106n (collectively, client devices 106), and a network 110”. Para. [0055] recites “the server(s) 102 can provide, for display within a user interface of the client applications 108 on the client devices 106, one or more suggestions based on the digital connections” (i.e., outputs, including the relationship information between identified output pairs, can be sent to a client device)).
Regarding claim 3, Baek teaches the computer-implemented method of claim 1, further comprising: providing the first object to the machine-learning model to generate the first encoded point; identifying an encoded point within the embedding space that is close to the first encoded point according to a predefined distance metric; determining that the encoded point corresponds to a second object (Baek para. [0033] recites “upon receiving a selection from a client device of a first digital content item, the disclosed systems can utilize machine learning models and identifier embeddings to generate a digital suggestion that includes a related digital content item”. Baek para. [0155] recites “the comparison model may determine a cosine similarity between the first identifier embedding 618 and the second identifier embedding 620 for the respective first and second digital content items. Additionally, the comparison model may determine a cosine similarity between a pair of encodings for the context data embeddings 628 (e.g., a first and second folder path encoding) that correspond to the respective first and second digital content items” (i.e., determining that a second content object is close, or similar, to a first content object based on a predefined cosine similarity, or distance metric));
generating an object pair that includes the first object and the second object (Baek para. [0130] recites “At an act 604, the identifier embedding system 104 identifies a first identifier 606 for the first digital content item”. Baek para. [0131] recites “at the act 604, the identifier embedding system 104 identifies a second identifier 608 for a second digital content item”. Baek para. [0026] recites “the trained machine-learning model can generate a prediction that a pair of digital content items have other types of relationships” (i.e., generating an object pair from a first and second content item));
and utilizing the object pair as object inputs to the machine-learning model before generating the pairwise embedding relationship between the first encoded point and the second encoded point (Baek para. [0135] recites “as shown in FIG. 6B, the identifier embedding system 104 performs a series of acts to generate suggestions based on digital connections determined between digital content items. As shown, at an act 622, the identifier embedding system 104 generates one or more features as inputs to the content management model 214. These input features may include identifier embeddings (e.g., the first identifier embedding 618 and the second identifier embedding 620 as discussed above)” (i.e., inputting the object pair embeddings to a machine learning model to determine a pairwise relationship between the first and second points)).
Claim 11 is a system claim and its limitation is included in claim 1. The only difference is that claim 11 requires a system (Baek para. [0005] recites “Embodiments of the present disclosure provide benefits and/or solve one or more of the foregoing or other problems in the art with systems, non-transitory computer readable media, and methods that utilize machine-learning models to generate identifier embeddings from digital content identifiers and then leverage these identifier embeddings to determine digital connections between digital content items”). Therefore, claim 11 is rejected for the same reasons as claim 1.
Claim 17 is a method claim and its limitation is included in claim 11. The only difference is that claim 17 requires a method (Baek para. [0005] recites “Embodiments of the present disclosure provide benefits and/or solve one or more of the foregoing or other problems in the art with systems, non-transitory computer readable media, and methods that utilize machine-learning models to generate identifier embeddings from digital content identifiers and then leverage these identifier embeddings to determine digital connections between digital content items”). Therefore, claim 17 is rejected for the same reasons as claim 11.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 2, 4-6, 12-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Baek et al* (US 20220197961 A1, herein Baek) in view of Syeda-Mahmood (US 20230274098 A1, herein Syeda-Mahmood).
Regarding claim 2, Baek teaches the computer-implemented method of claim 1, further comprising: determining a first input object relationship between the first object and the second object (Baek para. [0142] recites “at the act 630, the content suggestion machine learning model may process the first identifier embedding 618 and the second identifier embedding 620 to generate a predicted digital connection between the first and second digital content items” (i.e., determining a relationship between a first and second object)).
However, Baek does not explicitly teach determining that the embedding space of the machine-learning model preserves the first input object relationship by identifying the first input object relationship between each of the pairs of objects in the set of output object pairs.
Syeda-Mahmood teaches determining that the embedding space of the machine-learning model preserves the first input object relationship by identifying the first input object relationship between each of the pairs of objects in the set of output object pairs (para. [0048] recites “The similarity list data structure(s) are used to perform a training of the machine learning computer model, e.g., neural network model, to produce embeddings using multi-label group contrastive loss”. Para. [0050] recites “The learning of this embedding is a complex learning problem since ultimately, pairwise similarity comparisons need to be made for all words, each of which is a high-dimensional vector”. Para. [0065] recites “A sense and similarity preserving embedding is then learned for the similarity sets using a novel contrastive loss and an adaptive batching strategy designed for efficiently learning very large numbers of labels” (i.e., using a machine learning model to preserve pairwise relationships between pairs of objects, or words)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by applying the method of determining whether pairwise relationships are preserved between embedding pairs from Syeda-Mahmood to the embedding machine-learning model from Baek. Baek and Syeda-Mahmood are both directed to systems which can generate and analyze embedding vectors. One of ordinary skill would be motivated to determine whether the embedding pairs generated by Baek could be preserved using the method from Syeda-Mahmood.
Regarding claim 4, Baek teaches the computer-implemented method of claim 3.
However, Baek does not explicitly teach determining one or more relationship types preserved in the embedding space from the set of input data based on analyzing the set of output object pairs.
Syeda-Mahmood teaches determining one or more relationship types preserved in the embedding space from the set of input data based on analyzing the set of output object pairs (para. [0048] recites “The similarity list data structure(s) are used to perform a training of the machine learning computer model, e.g., neural network model, to produce embeddings using multi-label group contrastive loss”. Para. [0050] recites “The learning of this embedding is a complex learning problem since ultimately, pairwise similarity comparisons need to be made for all words, each of which is a high-dimensional vector”. Para. [0065] recites “A sense and similarity preserving embedding is then learned for the similarity sets using a novel contrastive loss and an adaptive batching strategy designed for efficiently learning very large numbers of labels”. Para. [0067] recites “This embedding is used to cluster words and generate similarity lists through the similarity list formation engine 230” (i.e., using a machine learning model to preserve clusters, or relationships between pairs of objects, or words)).
See claim 2 for motivation to combine.
Regarding claim 5, Baek teaches the computer-implemented method of claim 1, further comprising: generating pairwise relationship features for each point pair in a set of point pairs in the embedding space (Baek para. [0114] recites “FIG. 5B illustrates a schematic diagram of the trained machine-learning model 508 used in training the embedding machine-learning model 302c in accordance with one or more embodiments. As shown, the trained machine-learning model 508 processes the training identifier embeddings 506 (e.g., two at a time in pair-wise fashion). In particular, the trained machine-learning model 508 processes a first training identifier embedding 506a and a second training identifier embedding 506b” (i.e., generating pairwise relationship features for point pairs. Examiner notes that at least paragraph [0124] provides a non-limiting example of the different point pairs and corresponding relationships that can be determined for a given data set));
and providing object pairs corresponding to the group of point pairs as the set of output object pairs (Baek para. [0035] recites “the identifier embedding system can utilize a content management machine learning model to process identifier embeddings together with file extension embeddings, user activity embeddings, context data embeddings, or other available contextual information to flexibly generate classifications, predictions, or suggestions”. Para. [0050] recites “As shown in FIG. 1, the environment 100 includes server(s) 102, client devices 106a-106n (collectively, client devices 106), and a network 110”. Para. [0055] recites “the server(s) 102 can provide, for display within a user interface of the client applications 108 on the client devices 106, one or more suggestions based on the digital connections” (i.e., outputs, including the relationship information between identified output pairs, can be sent to a client device)).
However, Baek does not explicitly teach clustering a group of point pairs having pairwise relationship features within a cluster distance threshold or a cluster density threshold.
Syeda-Mahmood teaches clustering a group of point pairs having pairwise relationship features within a cluster distance threshold or a cluster density threshold (para. [0047] recites “given a vocabulary ontology data structure, the ontology is traversed using the hypernym and synonym relationships, and those terms that are within a specified depth distance, e.g., 2, and which have a threshold similarity score, e.g., a WUP (i.e., Wu-Palmer score) of 0.8, are selected for inclusion in the initial similarity list, which is represented as a similarity list data structure. The similarity list data structure(s) for a given term, or multi-word term, may be referred to herein as a "similarity cluster"” (i.e., clustering points, such as the paired points from Baek, within a cluster distance threshold)).
See claim 2 for motivation to combine.
Regarding claim 6, Baek teaches the computer-implemented method of claim 1.
However, Baek does not explicitly teach obtaining a set of anchor object pairs that shares a given object relationship between paired objects within each anchor object pair, wherein the set of anchor object pairs includes the first object in an anchor pair with the second object; generating pairs of encoded anchor points from the set of anchor object pairs utilizing the machine-learning model; and generating an anchor embedding relationship metric based on determining the pairwise embedding relationship between the pairs of encoded anchor points.
Syeda-Mahmood teaches obtaining a set of anchor object pairs that shares a given object relationship between paired objects within each anchor object pair, wherein the set of anchor object pairs includes the first object in an anchor pair with the second object; generating pairs of encoded anchor points from the set of anchor object pairs utilizing the machine-learning model (para. [0051] recites “for a contrastive learning formulation, a batch for learning would have the target word paired with a positive pair and the remaining being negative. Since there are a large number of pairings to learn, a batch is formed consisting of multiple pairs of positive and negative pairs coming from multiple target terms, or multi-word terms. A positive pair is a target term, or multi-word term, and some member of its similarity cluster/list”. Para. [0078] recites “The embedding learned by the encoder-decoder network of the illustrative embodiments pulls together all members of the similarity list of an anchor word (or synset) as positive samples and pushes apart the other words in the vocabulary as negative examples using a contrastive loss function designed for this purpose” (i.e., generating object pairs from the encoded anchor points using the machine learning model));
and generating an anchor embedding relationship metric based on determining the pairwise embedding relationship between the pairs of encoded anchor points (Syeda-Mahmood para. [0075] recites “The raw similarity lists may then be filtered or ranked using a similarity metric, such as the Wu-Palmer (WUP) similarity metric, a Word2Vec similarity score, or the like, and a given threshold specifying a minimum level of similarity metric/score to indicate similar words, e.g., a threshold of 0.8 for the similarity metric/score”. Syeda-Mahmood para. [0091] recites “To produce a ranked list of similar words, a cosine distance was used on word vectors formed from the trained encoder-decoder network 240. The results rank the matches to the sense and meaning of words higher and there is significant differentiation between the scores of similarity list members and other members, as is illustrated by the mean average precision (MAP) metrics shown in Table 2. The MAP metric measures, on average, how often the matches to a query term/word, that are in the top K matches, are in fact similar in meaning and sense to the query term/word. Thus, the MAP metric implicitly captures the correctness of the match in terms of meaning and sense, with a higher MAP implying a better model performance” (i.e., ranking objects, the anchor object pairs from at least para. [0078] of Syeda-Mahmood, based on their similarity, or relationship strength)).
See claim 2 for motivation to combine.
Claim 12 is a system claim and its limitation is included in claim 2. Claim 12 is rejected for the same reasons as claim 2.
Regarding claim 13, the combination of Baek and Syeda-Mahmood teaches the system of claim 12, further comprising instructions that, when executed by the at least one processor, cause the system to carry out operations comprising: determining, within the embedding space, an additional pairwise embedding relationship between the first encoded point and the second encoded point, wherein the additional pairwise embedding relationship differs from the pairwise embedding relationship (Baek para. [0144] recites “the content suggestion machine-learning model may additionally analyze at least one other pair of embeddings for the first and second digital content items” (i.e., an additional pairwise relationship can be determined for a first and second object));
identifying additional pairs of encoded points within the embedding space that have the additional pairwise embedding relationship based on satisfying a threshold (Examiner notes that the non-limiting example from at least para. [0121] of Baek shows wherein the same kind of pairwise parent-child relationship can be determined between different pairs of files, or points in the embedding space. This relationship is different from the sibling pairwise or “no direct file” pairwise relationships as described in at least para. [0120] of Baek);
and generating an additional set of output object pairs that includes additional pairs of objects that correspond to the additional pairs of encoded points within the embedding space (Baek para. [0142] recites “at the act 630, the content suggestion machine learning model may process the first identifier embedding 618 and the second identifier embedding 620 to generate a predicted digital connection between the first and second digital content items. In turn, the identifier embedding system 104 can generate a similar prediction for a variety of digital content items”. Baek para. [0144] recites “the identifier embedding system 104 analyzes multiple inputs to determine a digital connection between digital content items” (i.e., generating additional pairs of objects in the embedding space)).
Regarding claim 14, the combination of Baek and Syeda-Mahmood teaches the system of claim 13, further comprising instructions that, when executed by the at least one processor, cause the system to carry out operations comprising: determining a second input object relationship between the first object and the second object (Examiner notes that the non-limiting example from at least para. [0121] of Baek shows wherein a second kind of pairwise parent-child relationship can be determined between a pair of files, or objects, different from a sibling pairwise or “no direct file” pairwise relationships as described in at least para. [0120] of Baek);
and determining that the embedding space of the machine-learning model preserves the second input object relationship by identifying the second input object relationship between each of the additional pairs of objects in the additional set of output object pairs (Syeda-Mahmood para. [0048] recites “The similarity list data structure(s) are used to perform a training of the machine learning computer model, e.g., neural network model, to produce embeddings using multi-label group contrastive loss”. Syeda-Mahmood para. [0050] recites “The learning of this embedding is a complex learning problem since ultimately, pairwise similarity comparisons need to be made for all words, each of which is a high-dimensional vector”. Syeda-Mahmood para. [0065] recites “A sense and similarity preserving embedding is then learned for the similarity sets using a novel contrastive loss and an adaptive batching strategy designed for efficiently learning very large numbers of labels” (i.e., using a machine learning model to preserve pairwise relationships between pairs of objects, or words)).
Regarding claim 15, Baek teaches the system of claim 11.
However, Baek does not explicitly teach wherein generating the set of output object pairs includes ranking the pairs of objects within the set of output object pairs based on a relationship strength metric.
Syeda-Mahmood teaches wherein generating the set of output object pairs includes ranking the pairs of objects within the set of output object pairs based on a relationship strength metric (para. [0075] recites “The raw similarity lists may then be filtered or ranked using a similarity metric, such as the Wu-Palmer (WUP) similarity metric, a Word2Vec similarity score, or the like, and a given threshold specifying a minimum level of similarity metric/score to indicate similar words, e.g., a threshold of 0.8 for the similarity metric/score”. Para. [0091] recites “To produce a ranked list of similar words, a cosine distance was used on word vectors formed from the trained encoder-decoder network 240. The results rank the matches to the sense and meaning of words higher and there is significant differentiation between the scores of similarity list members and other members, as is illustrated by the mean average precision (MAP) metrics shown in Table 2. The MAP metric measures, on average, how often the matches to a query term/word, that are in the top K matches, are in fact similar in meaning and sense to the query term/word. Thus, the MAP metric implicitly captures the correctness of the match in terms of meaning and sense, with a higher MAP implying a better model performance” (i.e., ranking objects, such as the paired objects from Baek, based on their similarity, or relationship strength)).
See claim 2 for motivation to combine.
Regarding claim 16, the combination of Baek and Syeda-Mahmood teaches the system of claim 15, further comprising instructions that, when executed by the at least one processor, cause the system to carry out operations comprising determining the relationship strength metric for an object pair in the set of output object pairs based on a combination of vector distance and vector angle (Syeda-Mahmood para. [0088] recites “Since the embedding is a vector space, the trained encoder can search for a nearest vector either using a distance evaluation, such as using Euclidean distance or cosine distance, for example. The cosine distance, for example, which measures the angle of separation of two vectors, may be used for nearest neighbor searches. The embedding performed by the trained encoder may be used to compute the distance between its embedding vectors and all other embedding vectors corresponding to the entire vocabulary. The top K closest vectors may then be returned as matches. Assuming the embedding is performed correctly, then these nearest embedding vectors should correspond to the nearest in meaning/sense words”. Syeda-Mahmood para. [0091] recites “To produce a ranked list of similar words, a cosine distance was used on word vectors formed from the trained encoder-decoder network 240. The results rank the matches to the sense and meaning of words higher and there is significant differentiation between the scores of similarity list members and other members, as is illustrated by the mean average precision (MAP) metrics shown in Table 2. The MAP metric measures, on average, how often the matches to a query term/word, that are in the top K matches, are in fact similar in meaning and sense to the query term/word. Thus, the MAP metric implicitly captures the correctness of the match in terms of meaning and sense, with a higher MAP implying a better model performance” (i.e., using a vector distance and angle to determine the strength of the relationship between two objects, or words)).
Claim 18 is a method claim and its limitation is included in claim 5. Claim 18 is rejected for the same reasons as claim 5.
Claim 19 is a method claim and its limitation is included in claim 2. Claim 19 is rejected for the same reasons as claim 2.
Claim 20 is a method claim and its limitation is included in claim 15. Claim 20 is rejected for the same reasons as claim 15.
Claims 7-10 are rejected under 35 U.S.C. 103 as being unpatentable over Baek et al. (US 20220197961 A1, herein Baek) in view of Syeda-Mahmood (US 20230274098 A1, herein Syeda-Mahmood), and further in view of Sohn (US 20170228641 A1, herein Sohn).
Regarding claim 7, the combination of Baek and Syeda-Mahmood teaches the computer-implemented method of claim 6, further comprising: generating pairs of encoded non-anchor points from the set of non-anchor object pairs utilizing the machine-learning model; and generating a non-anchor embedding relationship metric based on determining an additional pairwise embedding relationship between the pairs of encoded non-anchor points (Syeda-Mahmood para. [0075] recites “The raw similarity lists may then be filtered or ranked using a similarity metric, such as the Wu-Palmer (WUP) similarity metric, a Word2Vec similarity score, or the like, and a given threshold specifying a minimum level of similarity metric/score to indicate similar words, e.g., a threshold of 0.8 for the similarity metric/score”. Syeda-Mahmood para. [0078] recites “The embedding learned by the encoder-decoder network of the illustrative embodiments pulls together all members of the similarity list of an anchor word (or synset) as positive samples and pushes apart the other words in the vocabulary as negative examples using a contrastive loss function designed for this purpose” (i.e., relationships between non-anchor points, such as the paired points from Baek, can be determined and evaluated using the ranked similarity metrics)).
However, the combination of Baek and Syeda-Mahmood does not explicitly teach generating a set of non-anchor object pairs by randomly replacing one of the paired objects in each anchor object pair within the set of anchor object pairs.
Sohn teaches generating a set of non-anchor object pairs by randomly replacing one of the paired objects in each anchor object pair within the set of anchor object pairs (para. [0063] recites “At step 610, receive N pairs of training examples and class labels for the training examples that correspond to a plurality of classes. Each of the N pairs includes a respective anchor example and further includes a respective non-anchor example capable of being a positive training example or a negative training example. In an embodiment, each of the N pairs of the training examples can correspond to a different one of the plurality of classes. In an embodiment, the plurality of classes, can be randomly selected as a subset from a set of classes, wherein the set of classes, includes the plurality of classes and one or more other classes” (i.e., a pair of training examples, including the anchor pairs, can be chosen randomly)).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine these teachings by adapting the method of randomly selecting training example pairs from Sohn to randomly select training pairs from Syeda-Mahmood (which modifies Baek). Sohn and Syeda-Mahmood are both directed to methods of generating and analyzing anchor embedding pairs. One of ordinary skill in the art would be motivated to apply Sohn’s random selection method to Syeda-Mahmood to avoid hard negative mining, as described in at least paragraph [0105] of Sohn.
Regarding claim 8, the combination of Baek, Syeda-Mahmood, and Sohn teaches the computer-implemented method of claim 7, further comprising: generating a given object relationship metric based on comparing the anchor embedding relationship metric to the non-anchor embedding relationship metric (Syeda-Mahmood para. [0075] recites “The raw similarity lists may then be filtered or ranked using a similarity metric, such as the Wu-Palmer (WUP) similarity metric, a Word2Vec similarity score, or the like, and a given threshold specifying a minimum level of similarity metric/score to indicate similar words, e.g., a threshold of 0.8 for the similarity metric/score”. Syeda-Mahmood para. [0078] recites “The embedding learned by the encoder-decoder network of the illustrative embodiments pulls together all members of the similarity list of an anchor word (or synset) as positive samples and pushes apart the other words in the vocabulary as negative examples using a contrastive loss function designed for this purpose” (i.e., determining the relationship metric for an anchor embedding vs a non-anchor embedding));
and determining that the machine-learning model preserves the given object relationship based on the given object relationship metric satisfying a relationship strength metric (Syeda-Mahmood para. [0048] recites “The similarity list data structure(s) are used to perform a training of the machine learning computer model, e.g., neural network model, to produce embeddings using multi-label group contrastive loss”. Syeda-Mahmood para. [0050] recites “The learning of this embedding is a complex learning problem since ultimately, pairwise similarity comparisons need to be made for all words, each of which is a high-dimensional vector”. Syeda-Mahmood para. [0065] recites “A sense and similarity preserving embedding is then learned for the similarity sets using a novel contrastive loss and an adaptive batching strategy designed for efficiently learning very large numbers of labels” (i.e., using a machine learning model to preserve pairwise relationships between pairs of objects, or words, the strength of which can be shown by the ranked similarity metrics taught by at least para. [0075] and [0091] of Syeda-Mahmood)).
Regarding claim 9, the combination of Baek, Syeda-Mahmood, and Sohn teaches the computer-implemented method of claim 7, further comprising: generating a classifier to detect the anchor object pairs as positive and the non-anchor object pairs as negative (Syeda-Mahmood para. [0078] recites “The embedding learned by the encoder-decoder network of the illustrative embodiments pulls together all members of the similarity list of an anchor word (or synset) as positive samples and pushes apart the other words in the vocabulary as negative examples using a contrastive loss function designed for this purpose” (i.e., determining anchor objects, such as the pairs from Baek, as positive, and non-anchor objects as negative));
and determining that the machine-learning model preserves the given object relationship based on a performance of the classifier (Syeda-Mahmood para. [0048] recites “The similarity list data structure(s) are used to perform a training of the machine learning computer model, e.g., neural network model, to produce embeddings using multi-label group contrastive loss”. Syeda-Mahmood para. [0050] recites “The learning of this embedding is a complex learning problem since ultimately, pairwise similarity comparisons need to be made for all words, each of which is a high-dimensional vector”. Syeda-Mahmood para. [0065] recites “A sense and similarity preserving embedding is then learned for the similarity sets using a novel contrastive loss and an adaptive batching strategy designed for efficiently learning very large numbers of labels” (i.e., using a machine learning model to preserve pairwise relationships between pairs of objects, or words)).
Regarding claim 10, the combination of Baek, Syeda-Mahmood, and Sohn teaches the computer-implemented method of claim 8, further comprising: determining that the machine-learning model has been modified; generating a modified given object relationship metric (Syeda-Mahmood para. [0105] recites “Each training operation for each batch may be performed iteratively with adjustments of the operational parameters of the encoder-decoder network to reduce the contrastive loss function until an acceptable level of loss is reached (convergence) or a predetermined number of epochs are reached. This process may be repeated for each batch such that the encoder-decoder network is trained across all batches and learns a similarity embedding based on the similarity lists and negative examples of each batch (step 614)” (i.e., the steps of training the model, which include determining a relationship strength metric as described in at least para. [0075] and [0091] of Syeda-Mahmood, can be repeated iteratively with a model that has been modified with new or different parameters));
and determining that the machine-learning model preserves the given object relationship based on the modified given object relationship metric satisfying the relationship strength metric (Syeda-Mahmood para. [0048] recites “The similarity list data structure(s) are used to perform a training of the machine learning computer model, e.g., neural network model, to produce embeddings using multi-label group contrastive loss”. Syeda-Mahmood para. [0050] recites “The learning of this embedding is a complex learning problem since ultimately, pairwise similarity comparisons need to be made for all words, each of which is a high-dimensional vector”. Syeda-Mahmood para. [0065] recites “A sense and similarity preserving embedding is then learned for the similarity sets using a novel contrastive loss and an adaptive batching strategy designed for efficiently learning very large numbers of labels” (i.e., using a machine learning model to preserve pairwise relationships between pairs of objects, or words, the strength of which can be shown by the ranked similarity metrics taught by at least para. [0075] and [0091] of Syeda-Mahmood)).
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20210012116 A1 (Urtasun et al) teaches a method for identifying anchor points within a batch of feature embeddings.
US 20220374648 A1 (Mizobuchi) teaches a method for generating pairs of embedded vectors using a clustering method.
US 20210056168 A1 (Bull et al) teaches a method for generating a vector embedding space model for natural language processing.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to LEAH M FEITL whose telephone number is (571) 272-8350. The examiner can normally be reached on M-F 0900-1700 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Viker Lamardo can be reached on (571) 270-5871. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/L.M.F./ Examiner, Art Unit 2147 /VIKER A LAMARDO/Supervisory Patent Examiner, Art Unit 2147