DETAILED ACTION
This final Office action is responsive to the amendments filed February 11, 2026. Claims 1 and 12 have been amended. Claims 6 and 17 have been cancelled. Claims 1-5, 7-16, and 18-20 are presented for examination.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Response to Arguments
Applicant’s arguments, see pages 9-13, filed 2/11/26, with respect to claims 1-20 have been fully considered and are persuasive. The 35 USC 101 rejection of 11/17/25 has been withdrawn.
Applicant's arguments regarding claim rejections under 35 USC 103 filed 2/11/26 have been fully considered but they are not persuasive.
On pages 13-16 of the remarks, Applicant argues that the cited prior art does not disclose the claimed limitations. Beginning on page 13 of the remarks, Applicant addresses the claimed “Brand Registry – Clustering Brand Content Vectors to Determine Brand Identifiers”. Specifically, on page 14 of the remarks, Applicant argues that Cavallari’s “coherent cluster” is not analogous to the claimed “form clusters from brand content vectors to discover brands; compute a centroid of a cluster to represent a brand; or create and store a brand identifier including both a centroid and associated brand indicators”. Examiner respectfully disagrees and asserts that the clusters of Cavallari are not “incidental” but essential to the claimed method of labeling content. Per cited Figure 5 below, Cavallari discloses the claimed analysis of determined brand content vectors to identify clusters. As seen in Figure 5, the unlabeled vector is boxed amongst primary vectors, showing a primary nearest neighbor set in the form of a cluster. While Applicant argues that the claims recite “compute a centroid of a cluster to represent a brand”, Examiner notes that the claim merely recites “determining a brand identifier for the identified brand including a centroid of the identified cluster and the brand indicators associated with the associated brand content vectors included in the cluster”. Because the claims do not recite the argued method of computing a centroid of the cluster, the cited nearest neighbor method is applicable. Examiner further asserts that, within Figure 5, the application of Label A to the Unlabeled Vector due to the proximity of Primary Vectors 1-4 is analogous to the claimed “determining as a most similar brand the brand identifier stored in the brand registry having a closest centroid of the cluster to the representative vectors of the unknown content representation”.
Finally, the newly labeled content is stored within the database per cited paragraphs [0020] and [0030] and Figure 1. Applicant’s arguments are not persuasive.
Continuing on page 14 of the remarks, Applicant argues that “a nearest neighbor parameter is not a centroid. A centroid is a vector computed as an aggregate (e.g., mean) of vectors forming a cluster.” Examiner respectfully disagrees and asserts, as stated above, that the present claim does not recite the argued detailed method of calculating a centroid of the cluster. Therefore, the nearest neighbor parameter of Cavallari is applicable. Additionally, cited paragraph [0028] discloses that the tunable parameter “is determined based on a number of vectors whose similarity scores satisfy a similarity threshold. For example, if five vectors have similarity scores that meet the similarity threshold, then vector selection module 280 includes five vectors in set”. Therefore, the tunable parameter forming the cluster (i.e., set) of vectors is analogous to the claimed centroid. Applicant’s arguments are not persuasive.
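For illustration only, the distinction Applicant draws can be sketched in a few lines of Python: a centroid is computed as the mean of the vectors forming a cluster, whereas the selection described in cited paragraph [0028] keeps every vector whose similarity score satisfies a threshold. All vector values, names, and the choice of cosine similarity below are hypothetical assumptions appearing in neither the claims nor the reference.

```python
import math

# Hypothetical brand content vectors forming one cluster (values illustrative).
cluster = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0)]

def centroid(vectors):
    # A centroid in Applicant's sense: the element-wise mean of the vectors.
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def cosine(a, b):
    # Cosine similarity, assumed here as the similarity score.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def select_neighbors(query, vectors, threshold):
    # Selection per the cited paragraph [0028]: keep every vector whose
    # similarity score to the query satisfies the similarity threshold.
    return [v for v in vectors if cosine(query, v) >= threshold]

c = centroid(cluster)                                    # (2.0, 3.0)
neighbors = select_neighbors((2.0, 3.0), cluster, 0.99)  # all three vectors
```

Both operations act on the same set of clustered vectors; the parties dispute whether the threshold-selected neighbor set can stand in for the aggregate centroid.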
On pages 14-15 of the provided remarks, Applicant argues that Leddy does not cure the argued deficiencies of Cavallari. However, this argument is moot as Leddy was not cited to disclose the argued “analyzing brand content vectors to identify clusters corresponding to brands; determining a centroid for each cluster; or defining a brand identifier comprising a centroid and associated representative indicators derived from clustered brand content”. Applicant’s arguments are not persuasive.
On page 15 of the remarks, Applicant argues, regarding “Classification Based on Closest Centroid of Cluster”, that “because the cited reference combination does not disclose generating or storing brand centroids as identifiers, it cannot disclose selecting a brand based on proximity to such a centroid”. Examiner begins by questioning where the argued “generating or storing brand centroids as identifiers” is claimed, as the limitation “storing in the brand registry the determined brand identifier” is not specific to the centroid and “determining a brand identifier for the identified brand including a centroid” does not recite the generation of a centroid. Additionally, as stated above, the cited nearest neighbor method of Cavallari is analogous to the claimed limitations regarding classification of content. Therefore, Applicant’s arguments are not persuasive.
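For illustration only, the claimed closest-centroid selection can be sketched as follows. The registry layout, brand names, indicator strings, vector values, and the use of Euclidean distance are all hypothetical assumptions, not drawn from the claims or the cited references.

```python
import math

# Hypothetical brand registry: each brand identifier pairs a cluster
# centroid with its associated brand indicators (all values illustrative).
registry = {
    "brand_a": {"centroid": (0.0, 0.0), "indicators": ["logo_a", "domain_a"]},
    "brand_b": {"centroid": (5.0, 5.0), "indicators": ["logo_b", "domain_b"]},
}

def most_similar_brand(vector, registry):
    # Select the stored brand identifier whose cluster centroid lies
    # closest (Euclidean distance) to the representative vector of the
    # unknown content representation.
    return min(registry, key=lambda b: math.dist(registry[b]["centroid"], vector))

brand = most_similar_brand((1.0, 1.0), registry)  # "brand_a"
```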
On page 16 of the remarks, Applicant argues, regarding “Risk Score Based on Comparison Vector Tied to Clustered Brand Indicators”, that “because the cited references do not disclose brand indicators stored in association with clustered brand content vectors forming a centroid-based brand identifier, the subsequent comparison vector (defined relative to those clustered brand indicators) is similarly not disclosed.” Examiner respectfully disagrees and begins by asserting, as stated above, that cited Cavallari does disclose brand indicators stored in association with clustered brand content vectors forming a centroid-based brand identifier. Additionally, as cited below, Cavallari discloses the generation of a comparison vector defined relative to those clustered brand indicators. Further, while Applicant argues that cited Leddy “is not predicated on (i) clustering brand content, (ii) deriving a centroid-based brand identifier, and (iii) comparing representative indicators tied to that clustered brand structure”, Examiner respectfully disagrees and asserts that the argued “filter outputs and signatures” is a generalization of Leddy’s cited method of applying vector filter rules to generate a scam vector that is compared against stored vectors to identify potential scams. Therefore, the 35 USC 103 rejection is maintained. Applicant’s arguments are not persuasive.
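For illustration only, determining a risk score from a comparison vector and advanced features can be sketched as below; a simple logistic function stands in for the claimed risk model, and the feature values, weights, bias, and 0.5 threshold are hypothetical assumptions.

```python
import math

# Hypothetical inputs (values illustrative).
comparison_vector = [0.9, 0.1]  # e.g., indicator match / mismatch scores
advanced_features = [0.7]       # e.g., a feature derived from an extracted URL

def risk_score(comparison_vector, advanced_features, weights, bias):
    # Apply a logistic model to the concatenated comparison vector and
    # advanced features; a stand-in for the claimed machine learning model.
    x = comparison_vector + advanced_features
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

score = risk_score(comparison_vector, advanced_features,
                   weights=[1.0, -1.0, 2.0], bias=-1.0)
fake = score >= 0.5  # classified fake -> access would be blocked
```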
Claim Objections
Claims 1 and 12 are objected to because of the following informalities: the limitation beginning "based on the classification" recites "the classification" which lacks antecedent basis and should recite "the identification". Appropriate correction is required.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1-5, 7-16, and 18-20 are rejected under 35 U.S.C. 103 as being unpatentable over Cavallari (U.S. 2022/0075961 A1) in view of Leddy (U.S. 2020/0067861 A1).
Claims 1 and 12
Regarding Claim 1, Cavallari discloses the following:
A computer system for generating a brand registry and classifying content based on the brand registry, the computer system comprising: memory comprising a non-transitory computer readable medium and storing the brand registry, an embedding machine learning algorithm, an encoder machine learning algorithm, and a model; processor circuitry configured to [see at least Paragraph 0012 for reference to techniques being disclosed for automatically propagating labels across large sets of unlabeled content; Paragraph 0017 for reference to the embedding module embeds unlabeled content to generate unlabeled vector; Paragraph 0017 for reference to the embedding module including various machine learning models including GOOGLE’s Universal Sentence Encoder; Paragraph 0018 for reference to an encoder included in the embedding module attempts to understand the semantic meaning of content and provides a representation of the content based on a determined meaning; Paragraph 0019 for reference to the computer system maintaining a vector index storing vector representations of labeled content stored in database; Paragraph 0065 for reference to the processing unit containing an on-board memory; Paragraph 0066 for reference to the storage subsystem being implemented using various forms of physical memory media; Paragraph 0068 for reference to various articles of manufacture storing instructions and optionally data executable by a computing system to implement techniques disclosed wherein the articles include non-transitory computer-readable memory media; Figure 1 and related text regarding the computer system configured to assign new labels to unlabeled content including item 150 ‘database’; Figure 2 and related text regarding the computer system configured to retrieve sets of labeled content for responding to queries including item 120 ‘embedding module’ and item 150 ‘database’; Figure 4 and related text regarding item 430 ‘Confidence Score Module’ and item 432 ‘Confidence Score(s)’]
generate a representation of content including representative vectors and representative indicators comprising: receiving the content [see at least Paragraph 0048 for reference to the NLP System receiving a query; Paragraph 0052 for reference to a computer system receiving unlabeled content which could include receiving a query which could be a document, sentence, a set of terms, etc.; Figure 6 and related text regarding item 602 ‘User Query’; Figure 7 and related text regarding item 710 ‘Receive unlabeled content’]
identifying the representative indicators from the received content [see at least Paragraph 0049 for reference to the system generating a response to the query by classifications provided by trained machine learning classifier for content of user query; Figure 6 and related text regarding item 620 ‘Query Handler Module’ and item 622 ‘Unlabeled Content’]
extracting content indicators from the received content [see at least Paragraph 0049 for reference to the query handler module extracting unlabeled user-generated content from the query; Figure 6 and related text regarding item 620 ‘Query Handler Module’ and item 622 ‘Unlabeled Content’]
splitting the content indicators into visual indicators and textual indicators [see at least Paragraph 0017 for reference to unlabeled content including a particular portion of content generated by a user; Paragraph 0017 for reference to particular portions of user-generated content may be referred to as utterances; Paragraph 0026 for reference to the query specifying a particular unlabeled utterance; Paragraph 0032 for reference to the training module sending a query that specifies a sentence, utterance, symbol, image, document, etc.; Paragraph 0032 for reference to sentences may then be used to train machine learning classifiers to recognize user-generated content that has similar semantic meanings to the labeled sentences or documents; Paragraph 0033 for reference to the generated trained machine learning classifiers being used to classify user queries relating to products in order to automatically provide appropriate responses; Figure 3 and related text regarding item 332 ‘machine learning classifier’; Figure 6 and related text regarding item 632 ‘trained machine learning classifier’]
for each of the content indicators, generating a vector as an embedding of the indicator by applying the embedding machine learning algorithm to the content indicator [see at least Paragraph 0017 for reference to the embedding module embeds unlabeled content to generate unlabeled vector; Paragraph 0018 for reference to embedding module generates a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
for each of the generated vectors, generating as one of the representative vectors a reduced vector by applying the encoder machine learning algorithm to reduce a dimension of the generated vector [see at least Paragraph 0017 for reference to the embedding module including various machine learning models including GOOGLE’s Universal Sentence Encoder; Paragraph 0018 for reference to an encoder included in the embedding module attempts to understand the semantic meaning of content and provides a representation of the content based on a determined meaning; Paragraph 0018 for reference to embedding module generates a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
generate the brand registry comprising: receiving brand content for multiple brands [see at least Paragraph 0048 for reference to the NLP System receiving a query; Paragraph 0052 for reference to a computer system receiving unlabeled content which could include receiving a query which could be a document, sentence, a set of terms, etc.; Figure 6 and related text regarding item 602 ‘User Query’; Figure 7 and related text regarding item 710 ‘Receive unlabeled content’]
for each piece of the received brand content, determining brand content vectors and brand indicators associated with the brand content vectors by: generating as a brand content representation the representation of the piece of brand content [see at least Paragraph 0018 for reference to an encoder included in the embedding module attempts to understand the semantic meaning of content and provides a representation of the content based on a determined meaning; Paragraph 0018 for reference to embedding module generates a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Paragraph 0019 for reference to the computer system using the embedding module to generate vector representations of various labeled content stored in database; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
including in the brand content vectors the representative vectors of the generated brand content representation [see at least Paragraph 0018 for reference to embedding module generates a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
including in the brand indicators the representative indicators of the generated brand content representation in association with the representative vectors of the generated brand content representation [see at least Paragraph 0040 for reference to primary vectors being given a proposed label based on a determined propagation score; Paragraph 0041 for reference to the propagation score module using equations to calculate the propagation score for a proposed label; Figure 4 and related text regarding item 440 ‘propagation score module’ and item 442 ‘propagation score’; Examiner notes the brand indicator as analogous to the ‘proposed label’]
analyzing the determined brand content vectors to identify clusters in the determined brand content vectors, wherein each of the identified clusters is associated with the brand content vectors forming the cluster [see at least Paragraph 0047 for reference to the unlabeled vector being shown with five primary nearest neighbors which each have a label A; Paragraph 0047 for reference to the set of primary vectors being referred to as a coherent cluster because the vectors belong to the same class; Figure 5 and related text regarding the example nearest neighbor]
for each of the identified clusters: identifying the cluster as a brand [see at least Paragraph 0047 for reference to the unlabeled vector being shown with five primary nearest neighbors which each have a label A; Paragraph 0047 for reference to the set of primary vectors being referred to as a coherent cluster because the vectors belong to the same class; Figure 5 and related text regarding the example nearest neighbor; Examiner notes the identified class as analogous to the identified brand]
determining a brand identifier for the identified brand including a centroid of the identified cluster and the brand indicators associated with the associated brand content vectors included in the cluster [see at least Paragraph 0047 for reference to Primary vector 5 is shown with four secondary nearest neighbors 504 (secondary vectors 1-4). Secondary vectors 1 and 2 have label D, secondary vector 3 has label A, and secondary vector 4 has a label C. Primary vector 5 may be associated with a low confidence score since three of its nearest neighbor vectors are associated with labels other than label A. As a consequence, even though primary vector 5 is associated with class A, it is identified as a poor representation of classification A. Because the low confidence score for primary vector 5 is used in calculating the propagation score for label A (i.e., whether label A should be assigned to unlabeled vector 122), this low confidence score may produce a low propagation score. Based on the low propagation score, the disclosed system may decide not to label unlabeled vector 122 with label A; Figure 5 and related text regarding the example nearest neighbor]
storing in the brand registry the determined brand identifier [see at least Paragraph 0020 for reference to computer system stores the newly-labeled content in the database; Paragraph 0030 for reference to the newly labeled content being stored in the database; Figure 1 and related text regarding item 112 ‘newly-labeled content’ and item 150 ‘database’; Figure 2 and related text regarding item 252 ‘Set of Labeled Content’ and item 150 ‘database’; Figure 7 and related text regarding item 750 ‘Store the newly labeled content in the database’]
classify unknown content as real or fake comprising: receiving the unknown content [see at least Paragraph 0048 for reference to the NLP System receiving a query; Paragraph 0052 for reference to a computer system receiving unlabeled content which could include receiving a query which could be a document, sentence, a set of terms, etc.; Figure 6 and related text regarding item 602 ‘User Query’; Figure 7 and related text regarding item 710 ‘Receive unlabeled content’]
generating as an unknown content representation the representation of the unknown content [see at least Paragraph 0018 for reference to an encoder included in the embedding module attempts to understand the semantic meaning of content and provides a representation of the content based on a determined meaning; Paragraph 0018 for reference to embedding module generates a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Paragraph 0019 for reference to the computer system using the embedding module to generate vector representations of various labeled content stored in database; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
generating advanced features based on basic raw data features extracted from the unknown content [see at least Paragraph 0049 for reference to the query handler module extracting unlabeled user-generated content from the query; Figure 6 and related text regarding item 620 ‘Query Handler Module’ and item 622 ‘Unlabeled Content’]
determining as a most similar brand the brand identifier stored in the brand registry having a closest centroid of the cluster to the representative vectors of the unknown content representation [see at least Paragraph 0027 for reference to similarity module operating the distance module to determine similarity scores for various labeled vectors obtained from vector index based on comparing these vectors with unlabeled vector; Paragraph 0036 for reference to the vector similarity module determining for a given vector included in a set, a secondary set of vectors that include k-nearest neighbor vectors to the given vector, where k is a tunable parameter; Paragraph 0047 for reference to the unlabeled vector being shown with five primary nearest neighbors which each have a label A; Paragraph 0047 for reference to the set of primary vectors being referred to as a coherent cluster because the vectors belong to the same class; Figure 2 and related text regarding item 260 ‘Similarity Module’, item 270 ‘Distance Module’, item 272 ‘Similarity Scores’; Figure 4 and related text regarding item 420 ‘Vector Similarity Module’; Figure 5 and related text regarding the example nearest neighbor; Examiner notes ‘k nearest neighbor tunable parameter’ as analogous to claimed ‘centroid’]
generating a comparison vector based on a comparison between the brand indicators for the most similar brand and the representative indicators for the unknown content representation [see at least Paragraph 0014 for reference to the use of vector space allowing for comparison of vectors; Paragraph 0021 for reference to automatically infer labels for unlabeled content based on the semantic meaning of labeled content by comparing labeled and unlabeled content in the vector space; Paragraph 0036 for reference to the Vector Similarity Module determining one or more secondary sets of vectors that match vectors including in set of matching labeled vectors; Paragraph 0036 for reference to the vector similarity module determining, for a given vector, a second set of vectors that include k-nearest neighbor vectors to the given vector; Figure 4 and related text regarding item 424 ‘Secondary Set(s) of Matching Vectors’]
determining a score by applying as the model a machine learning algorithm to the generated comparison vector and the generated advanced features [see at least Paragraph 0038 for reference to the confidence score module generating confidence scores for each vector in the set; Paragraph 0039 for reference to the equation used by the Confidence score module to calculate a confidence score for a given vector; Figure 4 and related text regarding item 430 ‘Confidence Score Module’ and item 432 ‘Confidence Score(s)’]
identifying the unknown content [see at least Paragraph 0038 for reference to confidence scores indicating whether the labels of vectors in set are accurate representations of their respective classes (i.e., whether they have been accurately labeled); Paragraph 0038 for reference to confidence scores may advantageously allow the disclosed system to determine whether labeled vectors stored in database have been erroneously labeled (e.g., by a human annotator of some other labeling system)]
While Cavallari discloses the limitations above, it does not disclose a computer system for classifying content as real or fake based on the brand registry; determining a risk score by applying as the risk model a machine learning algorithm to the generated comparison vector and the generated advanced features; and identifying the unknown content as real or fake based on the determined risk score; and based on the classification of the unknown content as real or fake: when the unknown content is classified as fake, blocking access to the unknown content by instructing network hardware to prevent network access to a location associated with the unknown content or by security software executing on an endpoint computing device; when the unknown content is classified as real, allowing access to the unknown content by not blocking access to the unknown content.
However, Leddy discloses the following:
A computer system for generating a brand registry and classifying content as real or fake based on the brand registry, the computer system comprising: memory comprising a non-transitory computer readable medium and storing the brand registry, and a risk model [see at least Paragraph 0054 for reference to the invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored and/or provided by a memory coupled to a processor; Paragraph 0056 for reference to the system configured to pre-validate electronic communications before they are seen by users and protect users against evolving scams; Paragraph 0061 for reference to the dynamic filter updating system is an alternative view of the scam evaluation system; Paragraph 0251 for reference to the repositories are implemented as database tables or markers on a record indicating how results are to be segmented; Paragraph 1159 for reference to the physical components of the system including a memory and central processing unit; Paragraph 1614 for reference to the system maintaining a database of CDN providers utilized by each brand; Paragraph 1768 for reference to the platform including a content database that includes a collection of terms associated with various authoritative entities; Figure 1A-1C and related text for reference to a system for dynamic filter updating; Figure 12 and related text regarding item 1201 ‘memory’]
generate the brand registry comprising: receiving brand content for multiple brands [see at least Paragraph 1611 for reference to a spoofed communication appearing to come from a well-known brand such as Amazon.com™ may contain content copied from a legitimate communication sent by that brand; Paragraph 1768 for reference to the platform including a content database that includes a collection of terms associated with various authoritative entities; Figure 16 and related text regarding item 1608 ‘Content Evaluation Engine’ and item 1616 ‘Content Database’; Figure 18 and related text regarding item 1802 ‘Receive electronic communication’]
for each piece of the received brand content, determining brand content vectors and brand indicators associated with the brand content vectors by: generating as a brand content representation the representation of the piece of brand content [see at least Paragraphs 0154-0159 for reference to the normalization process of content to a sorted list of components; Paragraph 0278 for reference to the Image filter detecting images that have been previously detected in scam messages by utilizing an Image Filter Rule for each image which contains an image identifier, or simplified/vector representation of the image; Paragraph 1202 for reference to the system performing vectorization of message content; Paragraph 1434 for reference to vector representation of the message being determined by the system]
including in the brand content vectors the representative vectors of the generated brand content representation [see at least Paragraph 0100 for reference to the system utilizing the vector filter rule to determine correspondence to a whitelisted brand; Paragraph 0159 for reference to the normalized and sorted list of components is compared to all similarly sorted lists associated with (a) trusted entities, (b) common brands, (c) special words; Paragraph 0278 for reference to the Image filter detecting images that have been previously detected in scam messages by utilizing an Image Filter Rule for each image which contains an image identifier, or simplified/vector representation of the image]
classify unknown content as real or fake comprising: receiving the unknown content [see at least Paragraph 1780 for reference to the process for classifying communications beginning with receiving electronic communications; Figure 18 and related text regarding item 1802 ‘Receive electronic communication’]
generating as an unknown content representation the representation of the unknown content [see at least Paragraph 1202 for reference to the system performing vectorization of message content; Paragraph 1434 for reference to vector representation of the message being determined by the system; Paragraph 1653 for reference to machine learning filters, such as the Vector Filter and Storyline Filter, that are not high-cost, based on a configuration that creates a representation that fits in cache]
generating advanced features based on basic raw data features extracted from the unknown content [see at least Paragraph 0063 for reference to the system Filter Engine filtering messages including parsing incoming messages and extracting components/features/elements (e.g., phrases, URLs, IP addresses, etc.); Paragraph 0219 for reference to the system extracting URLs for analysis to determine various characteristics; Paragraph 0334 for reference to the image filter using OCR to extract text from embedded/attached images; Paragraph 0359 for reference to document filter extracting the text content to process for suspicious words; Paragraphs 1435-1438 for reference to the processing of messages including the use of a reference counter to increment the signature record; Paragraph 1485 for reference to a Scam Vector Hash Table contains an entry for each scam message that resulted from the training and pruning]
generating a comparison vector based on a comparison between the brand indicators for the most similar brand and the representative indicators for the unknown content representation [see at least Paragraph 0066 for reference to the system applying a filter to compare data in the headers and content portion of a scrutinized email to data associated with trusted brands and headers; Paragraph 0100 for reference to the system utilizing the vector filter rule to determine correspondence to a whitelisted brand; Paragraph 0159 for reference to the normalized and sorted list of components is compared to all similarly sorted lists associated with (a) trusted entities, (b) common brands, (c) special words; Paragraph 1122 for reference to the Vector Filter creating a scam vector and comparing it to the previously stored scam message vectors to find a suitable match; Paragraph 1922 for reference to comparison can also be made each time a Boolean value is set to true by determining if the vector in which this Boolean value is an element is all true, and if so, output "match" and conclude the processing of the message]
determining a risk score by applying as the risk model a machine learning algorithm to the generated comparison vector and the generated advanced features [see at least Paragraph 0136 for reference to obtained messages/communications being used as training/test data upon which authored rules are trained and refined; Paragraphs 1121-1122 for reference to the Vector filters being configured to identify previously seen scam messages or close variations based on training methods; Paragraph 1251 for reference to the system calculating a score for a training message based on the number and length of the returned matching signatures; Paragraph 1653 for reference to machine learning filters including vector filters; Paragraph 1695 for reference to filter and rule combinations being run in succession to determine a cumulative scam score to be evaluated against a threshold]
identifying the unknown content as real or fake based on the determined risk score [see at least Paragraph 0173 for reference to a high relative similarity, in many contexts, corresponding to a high risk since that could be very deceptive, and therefore a high similarity score will commonly translate to a high scam score, which is also referred to as a high risk score; Paragraph 0594 for reference to each sender's identifier, such as an email address or phone number, being associated with a score so that trusted associates and known good senders can be identified; Paragraph 0905 for reference to the system computing a scam score and finding that it exceeds a threshold, thus determining the high chance of threat; Paragraph 1512 for reference to the scam score indicating the likelihood of scam and being used with other information about the message, such as the sender’s reputation or attachment information, to make a decision]
based on the classification of the unknown content as real or fake: when the unknown content is classified as fake, blocking access to the unknown content by instructing network hardware to prevent network access to a location associated with the unknown content or by security software executing on an endpoint computing device [see at least Paragraphs 0076-0077 for reference to, if the content is determined to be deceptive, the system performing the action of blocking; Paragraph 0133 for reference to messages determined to be high-risk causing the message to be blocked; Paragraph 0152 for reference to the blocking being conditional based on content; Paragraphs 0311-0313 for reference to the system temporarily or permanently blocking the sender, the channel identifier, or informing the user that their access has been permanently or temporarily blocked]
when the unknown content is classified as real, allowing access to the unknown content by not blocking access to the unknown content [see at least Paragraph 0276 for reference to messages being classified as red, yellow, and green, and messages classified as green not being retained in storage; Paragraph 0594 for reference to each sender's identifier, such as an email address or phone number, being associated with a score so that trusted associates and known good senders can be identified]
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to modify the unknown content classification of Cavallari to include the classification of real or fake; subsequent blocking or acceptance of unknown content; and risk score determination of Leddy. Doing so would provide an automated adaptive system that can protect users against evolving scams, as stated by Leddy (Paragraph 0056).
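Examiner notes, for illustration only, that the threshold-based classification and blocking discussed in the limitations above may be sketched as follows; the function names, the 0.5 threshold value, and the returned action strings are illustrative and are not drawn from either cited reference:

```python
# Illustrative sketch of classifying unknown content as real or fake
# based on a determined risk score, and selecting the resulting action
# (blocking vs. allowing access). All names and values are illustrative.

def classify_and_act(risk_score: float, threshold: float = 0.5) -> str:
    """Return the action taken for content with the given risk score."""
    if risk_score >= threshold:
        # Fake: block access, e.g., by instructing network hardware
        # or endpoint security software.
        return "fake: block access"
    # Real: allow access by not blocking.
    return "real: allow access"

print(classify_and_act(0.9))  # high risk score -> "fake: block access"
print(classify_and_act(0.1))  # low risk score -> "real: allow access"
```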
Regarding claim 12, the claim recites limitations already addressed by the rejection of claim 1. Regarding claim 12, Cavallari teaches a method for generating a brand registry and classifying content as real or fake based on the brand registry using a computer system [Paragraph 0051 & Figure 7]. Therefore, claim 12 is rejected as being unpatentable over the combination of Cavallari and Leddy.
Claims 2 and 13
While the combination of Cavallari and Leddy discloses the limitations above, Cavallari does not disclose wherein the generating of the brand registry further comprises, for each of the clusters identified as a brand: determining a brand name for the brand identifier determined for the identified brand; and including the determined brand name in the brand identifier stored in the brand registry for the brand.
Regarding Claim 2, Leddy discloses the following:
wherein the generating of the brand registry further comprises, for each of the clusters identified as a brand: determining a brand name for the brand identifier determined for the identified brand [see at least Paragraph 0159 for reference to after the normalization is performed, the normalized and sorted list of components is compared to all similarly sorted lists associated with (a) trusted entities, (b) common brands, and (c) special words; Paragraph 0219 for reference to extracted URLs being analyzed to determine if its associated with a known brand]
including the determined brand name in the brand identifier stored in the brand registry for the brand [see at least Paragraph 1768 for reference to the contents of database can be provided by multiple providers (e.g., authoritative entities can make use of APIs or other mechanisms to submit collections of terms and/or media associated with their respective brands/identities); Paragraph 1768 for reference to the platform including a content database that includes a collection of terms associated with various authoritative entities; Figure 16 and related text regarding item 1608 ‘Content Evaluation Engine’ and item 1616 ‘Content Database’]
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to modify the comparison vector method of Cavallari to include the brand name identification of Leddy. Doing so helps to avoid false positives (i.e., blocking that should not have taken place) due to many very similar Rules triggering on one and the same message, as stated by Leddy (Paragraph 0668).
Regarding claim 13, the claim recites limitations already addressed by the rejection of claim 2.
Claims 3 and 14
While the combination of Cavallari and Leddy discloses the limitations above, Cavallari does not disclose the comparison vector is a Boolean vector; and each element of the Boolean vector indicates whether an indicator of the brand indicators for the most similar brand matches a same indicator of the representative indicators for the unknown content representation.
Regarding Claim 3, Leddy discloses the following:
the comparison vector is a Boolean vector [see at least Paragraph 1922 for reference to each rule being represented as a vector of Boolean values, where the vector has the same length as the associated rule contains words; Paragraph 1922 for reference to a comparison also being made each time a Boolean value is set to true by determining if the vector in which this Boolean value is an element is all true and, if so, outputting "match" and concluding the processing of the message]
each element of the Boolean vector indicates whether an indicator of the brand indicators for the most similar brand matches a same indicator of the representative indicators for the unknown content representation [see at least Paragraph 1922 for reference to, if a word matches a term fully, then all Boolean values that are pointed to by the pointers associated with the term that the word matches being set to true; Paragraph 1922 for reference to, if a full match is achieved, then the Boolean values associated with the pointers of this term being set to true; Paragraph 1922 for reference to, after the entire document has been parsed in this manner, the system determining whether any of the vectors of Boolean values is all true, and if this is so, then the algorithm outputs that there is a match]
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to modify the comparison vector method of Cavallari to include the comparison Boolean vector of Leddy. Doing so provides detection capabilities in situations where information is dispersed over multiple related messages, which causes the thread of messages to be considered dangerous, as stated by Leddy (Paragraph 1922).
Regarding claim 14, the claim recites limitations already addressed by the rejection of claim 3.
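Examiner notes, for illustration only, that the claimed Boolean comparison vector discussed above may be sketched as follows; the function name and the example indicator strings are illustrative and are not drawn from either cited reference:

```python
# Illustrative sketch of a Boolean comparison vector: each element
# indicates whether a brand indicator matches the corresponding
# indicator extracted from the unknown content representation.

def comparison_vector(brand_indicators, content_indicators):
    """Element-wise equality comparison of two indicator lists."""
    return [b == c for b, c in zip(brand_indicators, content_indicators)]

brand = ["example.com", "valid-cert", "logo-hash-abc"]    # registered brand
content = ["examp1e.com", "valid-cert", "logo-hash-abc"]  # lookalike domain
vec = comparison_vector(brand, content)
print(vec)       # [False, True, True]
print(all(vec))  # False -> not a full match with the registered brand
```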
Claims 4 and 15
While the combination of Cavallari and Leddy discloses the limitations above, regarding Claim 4, Cavallari discloses the following:
wherein the most similar brand is determined by finding the stored brand representation having the centroid of the cluster with a smallest cosine distance to the representative vectors of the unknown content representation [see at least Paragraph 0027 for reference to the similarity module operating the distance module to determine similarity scores for various labeled vectors obtained from the vector index based on comparing these vectors with the unlabeled vector; Paragraph 0027 for reference to the distance module comparing labeled vectors to the unlabeled vector using any of various similarity algorithms, including Euclidean distance, cosine similarity, Jaccard similarity, Minkowski distance, etc.; Paragraph 0036 for reference to the vector similarity module determining, for a given vector included in a set, a secondary set of vectors that include k-nearest neighbor vectors to the given vector, where k is a tunable parameter; Paragraph 0047 for reference to the unlabeled vector being shown with five primary nearest neighbors which each have a label A; Paragraph 0047 for reference to the set of primary vectors being referred to as a coherent cluster because the vectors belong to the same class; Figure 2 and related text regarding item 260 ‘Similarity Module’, item 270 ‘Distance Module’, item 272 ‘Similarity Scores’; Figure 4 and related text regarding item 420 ‘Vector Similarity Module’; Figure 5 and related text regarding the example nearest neighbor; Examiner notes ‘k nearest neighbor tunable parameter’ as analogous to claimed ‘centroid’]
Regarding claim 15, the claim recites limitations already addressed by the rejection of claim 4.
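Examiner notes, for illustration only, that the claimed determination of a most similar brand by smallest cosine distance to a cluster centroid may be sketched as follows; the function names, brand names, and two-dimensional example vectors are illustrative and are not drawn from either cited reference:

```python
# Illustrative sketch: selecting the stored brand whose cluster centroid
# has the smallest cosine distance to a query (representative) vector.
import math

def cosine_distance(u, v):
    """Cosine distance = 1 - cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def most_similar_brand(centroids, query):
    """Return the brand name with the closest centroid to the query."""
    return min(centroids, key=lambda name: cosine_distance(centroids[name], query))

centroids = {"BrandA": [1.0, 0.0], "BrandB": [0.0, 1.0]}  # illustrative clusters
print(most_similar_brand(centroids, [0.9, 0.1]))  # BrandA
```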
Claims 5 and 16
While the combination of Cavallari and Leddy discloses the limitations above, Cavallari does not disclose wherein the content indicators extracted from the received content include at least one of a security certificate associated with the received content or a domain identifier of the received content.
Regarding Claim 5, Leddy discloses the following:
wherein the content indicators extracted from the received content include at least one of a security certificate associated with the received content or a domain identifier of the received content [see at least Paragraph 0066 for reference to an example filter applied to content being a filter that detects deceptive email addresses, display names, or domain names; Paragraph 0150 for reference to the message being parsed and the URL being extracted; Paragraph 0151 for reference to the message being placed in the yellow bin due to the domain name; Paragraph 0152 for reference to the system detecting deceptive strings, such as display names, email addresses, domains, or URLs]
Before the effective filing date, it would have been obvious to one of ordinary skill in the art to modify the extraction of content indicators of Cavallari to include the domain identification of Leddy. Doing so would provide an automated adaptive system that can protect users against evolving scams, as stated by Leddy (Paragraph 0056).
Regarding claim 16, the claim recites limitations already addressed by the rejection of claim 5.
Claims 7 and 18
Regarding Claim 7, Cavallari discloses the following:
wherein the embedding machine learning algorithm includes at least one of a vision model or a natural language processing model [see at least Paragraph 0017 for reference to the embedding module embedding unlabeled content to generate an unlabeled vector; Paragraph 0018 for reference to the term "embedding" referring to a set of natural language processing techniques which include mapping user-generated content to vectors of real numbers; Paragraph 0018 for reference to the embedding module generating a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
Regarding claim 18, the claim recites limitations already addressed by the rejection of claim 7.
Claims 8 and 19
Regarding Claim 8, Cavallari discloses the following:
wherein the visual indicators for the received content comprise at least one of a rendering of the content, a favicon of the content, or an image included in the content [see at least Paragraph 0012 for reference to user-generated content including images; Paragraph 0032 for reference to training module sending a query that specifies an image]
Regarding claim 19, the claim recites limitations already addressed by the rejection of claim 8.
Claim 9
While the combination of Cavallari and Leddy discloses the limitations above, regarding Claim 9, Cavallari discloses the following:
wherein the embedding machine learning algorithm applied to the visual indicators is a hidden layer of a convolutional neural network (CNN) [see at least Paragraph 0017 for reference to the embedding module embedding unlabeled content to generate an unlabeled vector; Paragraph 0018 for reference to the term "embedding" referring to a set of natural language processing techniques which include mapping user-generated content to vectors of real numbers; Paragraph 0018 for reference to the embedding module generating a vector representation (unlabeled vector) of unlabeled content after determining a semantic meaning associated with the content; Figure 1 and related text regarding item 120 ‘Embedding Module’; Figure 7 and related text regarding item 720 ‘Embed, using a machine learning module, the unlabeled content, where the embedding generates an unlabeled vector’]
Claims 10 and 20
While the combination of Cavallari and Leddy discloses the limitations above, regarding Claim 10, Cavallari discloses the following:
wherein the textual indicators for the received content comprise at least one of domain information for the content, all text from the content, or a copyright notice from the content [see at least Paragraph 0012 for reference to user-generated content including a phrase spoken by the user, text from an email, personal messages, text entered into a search engine, etc.; Paragraph 0031 for reference to training module sending a query that specifies a sentence, utterance, etc.]
Regarding claim 20, the claim recites limitations already addressed by the rejection of claim 10.
Claim 11
While the combination of Cavallari and Leddy discloses the limitations above, regarding Claim 11, Cavallari discloses the following:
wherein the brand content and the unknown content include at least one of emails or webpages [see at least Paragraph 0012 for reference to user-generated content including a phrase spoken by the user, text from an email, personal messages, text entered into a search engine, etc.]
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Geng, Guang‐Gang, Xiao‐Dong Lee, and Yan‐Ming Zhang. "Combating phishing attacks via brand identity and authorization features." Security and communication networks 8.6 (2015): 888-898.
DOCUMENT ID       INVENTOR(S)      TITLE
US 10,354,273 B2  Jin et al.       Systems And Methods For Tracking Brand Reputation And Market Share
US 11,501,120 B1  Peterson et al.  Indicator centroids for malware handling
THIS ACTION IS MADE FINAL. Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to KRISTIN ELIZABETH GAVIN whose telephone number is (571)270-7019. The examiner can normally be reached M-F 7:30-4:30 PM EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jerry O'Connor can be reached at 571-272-6787. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KRISTIN E GAVIN/Primary Examiner, Art Unit 3624