Last updated: May 29, 2026

Application No. 18/898,306

DETECTING PHISHING WEBPAGES VIA TEXTUAL ANALYSIS FROM SCREENSHOTS

Non-Final OA §103 §112

Filed

Sep 26, 2024

Priority

Sep 26, 2019 — CIP of 16/583,707 +1 more

Examiner

DO, KHANG D

Art Unit

2492

Tech Center

2400 — Computer Networks

Assignee

Fortinet Inc.

OA Round

1 (Non-Final)

Interview Optional

+45.0% interview lift. The examiner has a relatively high allowance rate (81%), so a written response may suffice.

Based on 336 resolved cases, 2023–2026

Examiner Intelligence

DO, KHANG D

Grants 81% — above average

Career Allowance Rate

271 granted / 336 resolved

+22.7% vs TC avg

Strong +45% interview lift

Interview Lift: +45.0% (resolved cases with interview vs. without)

Typical timeline

2y 7m

Avg Prosecution

10 currently pending

Career history

346

Total Applications

across all art units

Statute-Specific Performance

§101: 2.6% (-37.4% vs TC avg)

§103: 85.6% (+45.6% vs TC avg)

§102: 2.2% (-37.8% vs TC avg)

§112: 3.9% (-36.1% vs TC avg)

Based on career data from 336 resolved cases

Office Action

§103 §112

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

DETAILED ACTION
This non-final action is responsive to the application filed on 09/26/2024. Claims 1-6 are pending, with claims 1, 5 and 6 being independent.

Priority
This application is a continuation-in-part of U.S. Patent App. No. 18/125,916 filed 03/24/2023, which is a continuation-in-part of U.S. Patent App. No. 16/583,707 filed 09/26/2019.

Information Disclosure Statement
The information disclosure statements (IDS) submitted on 09/26/2024, 01/02/2025 and 05/02/2025 are in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statements are being considered by the examiner.

Claim Objections
Claims 1 and 5-6 are objected to because of the following informalities:
For claims 1 and 5-6:
the claims recite the abbreviation “OCR” without defining the term;
“the snapshot” should read “the screenshot”;
there is insufficient antecedent basis for “the web page text”.
For claim 4, there is insufficient antecedent basis for “the probability estimation”.
For claim 6:
there is insufficient antecedent basis for “the WLAN”;
the claim recites the abbreviation “WLAN” without defining the term.
Appropriate correction is required.

Drawings
The drawings are objected to because there are typographical errors in elements 220, 420, 430 and 520-540. Also, the description in element 550 is not consistent with the description in paragraph 44. Corrected drawing sheets in compliance with 37 CFR 1.121(d) are required in reply to the Office action to avoid abandonment of the application. Any amended replacement drawing sheet should include all of the figures appearing on the immediate prior version of the sheet, even if only one figure is being amended. The figure or figure number of an amended drawing should not be labeled as “amended.” If a drawing figure is to be canceled, the appropriate figure must be removed from the replacement sheet, and where necessary, the remaining figures must be renumbered and appropriate changes made to the brief description of the several views of the drawings for consistency. Additional replacement sheets may be necessary to show the renumbering of the remaining figures. Each drawing sheet submitted after the filing date of an application must be labeled in the top margin as either “Replacement Sheet” or “New Sheet” pursuant to 37 CFR 1.121(d). If the changes are not accepted by the examiner, the applicant will be notified and informed of any required corrective action in the next Office action. The objection to the drawings will not be held in abeyance.

Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b)  CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.


The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.


Claim 5 is rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor, or for pre-AIA the applicant regards as the invention.
The claim recites “A non-transitory computer-readable medium in a network device for web site phishing detection using machine learning of keywords, the method comprising” (emphasis added). The claim is a medium claim but recites “the method comprising,” so it is not clear whether the claim is a medium claim or a method claim. It appears that this is a medium claim implementing the method.

Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains.  Patentability shall not be negated by the manner in which the invention was made.


Claims 1, 3 and 5-6 are rejected under 35 U.S.C. 103 as being unpatentable over Geng et al. (US 2015/0200963, published Jul. 16, 2015) and Pratt et al. (US 2021/0203691, filed Dec. 27, 2019).
As per claim 1, Geng discloses:
detecting a web page responsive to a web page request (Geng par. 33, web pages at the website under investigation are crawled); 
determining if the text is on keyword list (Geng par. 34, The character string of the title is next matched with phishing sensitive words (step 205)… Phishing sensitive words refer to a category of words that phishing websites often use as website keywords… Phishing sensitive words in a particular example can include: bank, payment, log on, lottery prize, securities, group purchase, official websites, Taobao™, Tencent™, and so on); 
text is on keyword list (Geng Fig. 1, There Are Phishing Category Words In The Title Of The Page For Y at 205);
responsive to a suspicious web page, generating web search results from the keywords (Geng Fig. 1 and par. 35, a search is conducted in a search engine using the title of the web page as the query keywords to obtain a search result (step 300)); 
responsive to the suspicious web page not appearing within top web search results (Geng Fig. 1, Determine Target URL Not Appear In The Search Result at 305), flagging the suspicious web page as a phishing web page (Geng Fig. 1, Determine Website Is A Phishing Website for N at 505).
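The flow the examiner attributes to Geng (title keyword match at step 205, search at step 300, result check at steps 305/505) can be sketched roughly as follows. This is an illustrative sketch only, not Geng's implementation: the `search` callable, the `top_n` cutoff, the comparison by hostname, and the example URLs are all assumptions introduced here, and the word list is a subset of the examples quoted from Geng par. 34.

```python
from typing import Callable, Iterable, List
from urllib.parse import urlparse

# Subset of the phishing sensitive words quoted from Geng par. 34.
SENSITIVE_WORDS = ("bank", "payment", "log on", "lottery prize", "securities")


def title_has_sensitive_word(title: str, words: Iterable[str] = SENSITIVE_WORDS) -> bool:
    """Step 205: match the page title against the sensitive-word list."""
    lowered = title.lower()
    return any(word in lowered for word in words)


def host_of(url: str) -> str:
    """Hostname of a URL, lowercased (comparison by hostname is an assumption)."""
    return (urlparse(url).hostname or "").lower()


def is_phishing(url: str, title: str,
                search: Callable[[str], List[str]], top_n: int = 10) -> bool:
    """Flag a page whose sensitive title does not lead back to its own host."""
    if not title_has_sensitive_word(title):
        return False  # title is not suspicious under this check
    results = search(title)[:top_n]           # step 300: query with the title
    hosts = {host_of(r) for r in results}
    return host_of(url) not in hosts          # steps 305 and 505


def fake_search(query: str) -> List[str]:
    """Hypothetical stand-in for a search engine; returns fixed result URLs."""
    return ["https://www.examplebank.com/login", "https://en.wikipedia.org/wiki/Bank"]


print(is_phishing("http://examplebank.secure-login.example/", "ExampleBank log on", fake_search))  # True
print(is_phishing("https://www.examplebank.com/", "ExampleBank log on", fake_search))              # False
```

Passing the search function in as a parameter keeps the sketch independent of any particular search engine API.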
Geng does not explicitly disclose:
generating text from a screenshot of the web page and a feature vector describing the text, wherein an OCR process identifies the text of the snapshot; 
if text is on keyword list, determining if web page is suspicious for phishing by inputting features of the web page text in a keyword feature model trained from keyword features of known phishing web pages and/or known legitimate web pages; 
taking a security action against the phishing web page.  
Pratt teaches:
generating text from a screenshot of the web page and a feature vector describing the text, wherein an OCR process identifies the text of the snapshot (from claim 3 of examined application, the feature vector comprises at least one of keyword list, keyword itself, size, and position; Pratt, par. 58-59, the screenshot analysis engine can perform optical character recognition (OCR) on the screenshot of the suspect webpage to extract suspect text… the screenshot analysis engine can determine whether certain keywords found in the suspect text like login, username, password, etc. are indicators of a malicious page (e.g., a phishing determination)); 
determining if web page is suspicious for phishing by inputting features of the web page text in a keyword feature model trained from keyword features of known phishing web pages and/or known legitimate web pages (Pratt par. 59, For text elements, the screenshot analysis engine can determine whether certain keywords found in the suspect text… The screenshot analysis engine can use a machine learning model (MLL) where screenshots of legitimate webpages can be fed to the MLL in order to learn features like color scheme, shape of the buttons, location of the elements on the legitimate webpages. The MLL can then be used to detect similar looking pages that are received by the ingestion module, such as where a high degree of similarity of a suspect webpage can indicate a high likelihood of phishing. The result from the screenshot analysis engine can be a decision indicating whether the web page is a phishing web page, a score (e.g., a confidence indication) of the decision, and supporting data such as indications of malicious portions of the OCR-ed suspect text and/or features of the suspect webpage); 
taking a security action against the phishing web page (Pratt par. 24, A malware and phishing detection and mediation (MAPDAM) platform can be used to detect, investigate, and/or perform mitigate actions to prevent and/or mitigate various malware attacks and phishing websites).  
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to modify the method of Geng with the teaching of Pratt to incorporate additional phishing analysis. One of ordinary skill in the art would have been motivated because it offers the advantage of improving accuracy of the phishing detection.
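To make the claim language concrete, the sketch below shows one way a feature vector could be built from OCR output (keyword presence, size, position, per claim 3) and scored against a model fitted on known phishing and legitimate pages. It is a minimal sketch under stated assumptions: OCR is assumed to have already run, the `OcrWord` type is hypothetical, the nearest-centroid classifier is a simple technique swapped in for illustration (neither the application nor Pratt is stated to use it), and the training vectors are made-up toy values.

```python
from dataclasses import dataclass
from typing import List, Sequence

KEYWORDS = ("login", "username", "password")  # examples named in Pratt par. 58-59


@dataclass
class OcrWord:
    """Hypothetical OCR result for one recognized word."""
    text: str
    height: float  # glyph height in pixels, standing in for "size"
    y: float       # offset from the top of the screenshot, standing in for "position"


def feature_vector(words: Sequence[OcrWord], page_height: float) -> List[float]:
    """[keyword fraction, mean keyword height ratio, mean keyword y ratio]."""
    hits = [w for w in words if w.text.lower() in KEYWORDS]
    if not words or not hits:
        return [0.0, 0.0, 0.0]
    return [
        len(hits) / len(words),
        sum(w.height for w in hits) / len(hits) / page_height,
        sum(w.y for w in hits) / len(hits) / page_height,
    ]


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a set of feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class KeywordFeatureModel:
    """Nearest-centroid classifier (illustrative substitute, see lead-in)."""

    def fit(self, phishing, legitimate):
        self.phish_c = centroid(phishing)
        self.legit_c = centroid(legitimate)
        return self

    def is_suspicious(self, vector) -> bool:
        return sq_dist(vector, self.phish_c) < sq_dist(vector, self.legit_c)


# Toy training vectors, invented purely for illustration.
model = KeywordFeatureModel().fit(
    phishing=[[0.5, 0.05, 0.4], [0.4, 0.06, 0.5]],
    legitimate=[[0.05, 0.02, 0.1], [0.0, 0.0, 0.0]],
)
page = [OcrWord("Username", 40, 300), OcrWord("Password", 40, 400),
        OcrWord("Welcome", 20, 100), OcrWord("Login", 40, 500)]
print(model.is_suspicious(feature_vector(page, page_height=1000)))  # True
```

A production system would more likely use a trained classifier from a machine learning library; the centroid version is used here only because it runs with the standard library alone.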
As indicated above, Geng as modified discloses performing A (determining the text is on keyword list) then B (generating web search results from the keywords) and C (inputting features of the web page text in a keyword feature model), but does not explicitly disclose the order of these steps. However, there are only a finite number of orders in which these steps can be performed: A then B then C, A then C then B, C then A then B. It would have been obvious to one of ordinary skill in the art at the time of the effective filing date of the claimed invention to try these alternatives in an attempt to determine the efficiency of the system.
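The three orders listed above are consistent with enumerating every permutation of A, B and C and keeping those in which A precedes B. That constraint is an inference from the phrase "performing A ... then B" and is not stated as a rule in the Office action. A short check under that assumption:

```python
from itertools import permutations

STEPS = ("A", "B", "C")  # A: keyword-list check, B: web search, C: keyword feature model


def valid_orders(steps=STEPS):
    """Orderings in which A (keyword check) comes before B (search on the keywords)."""
    return [" then ".join(p) for p in permutations(steps) if p.index("A") < p.index("B")]


print(valid_orders())
# ['A then B then C', 'A then C then B', 'C then A then B']
```

Of the six possible permutations, three satisfy the constraint, matching the finite set the examiner relies on for the "obvious to try" rationale.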

As per claim 3, Geng as modified discloses the method of claim 1, wherein the feature vector comprises at least one of keyword list, keyword itself (Pratt, par. 58-59, the screenshot analysis engine can perform optical character recognition (OCR) on the screenshot of the suspect webpage to extract suspect text… the screenshot analysis engine can determine whether certain keywords found in the suspect text like login, username, password, etc. are indicators of a malicious page (e.g., a phishing determination)), size, and position. The same rationale as in claim 1 applies.

Claim 5 does not teach or further define over the limitations in claim 1. As such, claim 5 is rejected for the same reasons as set forth in claim 1.

Claim 6 does not teach or further define over the limitations in claim 1. As such, claim 6 is rejected for the same reasons as set forth in claim 1.
 
Claim 2 is rejected under 35 U.S.C. 103 as being unpatentable over Geng et al. (US 2015/0200963, published Jul. 16, 2015), Pratt et al. (US 2021/0203691, filed Dec. 27, 2019) and N (US 2021/0144174, filed Nov. 7, 2019).
As per claim 2, Geng as modified discloses the method of claim 1, but does not explicitly disclose wherein the network device comprises one or more of a gateway, an access point, a station, and a browser app.  
N teaches:
the network device comprises one or more of a gateway (N par. 50, gateway 108 may have a phishing website detection engine), an access point, a station, and a browser app.  
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify the method of Geng with the teaching of N so that the network device comprises one or more of a gateway. One of ordinary skill in the art would have been motivated because it offers the advantage of allowing the gateway to detect phishing attempts.

Claim 4 is rejected under 35 U.S.C. 103 as being unpatentable over Geng et al. (US 2015/0200963, published Jul. 16, 2015), Pratt et al. (US 2021/0203691, filed Dec. 27, 2019) and Mathieu (US 2016/0314350, published Oct. 27, 2016).
As per claim 4, Geng as modified discloses the method of claim 1, but does not explicitly disclose the probability estimation is based at least in part on a Hamming distance.  
Mathieu teaches:
the probability estimation is based at least in part on a Hamming distance (Mathieu par. 112, Hamming distance distribution may include two well defined regions (id and di), such that a threshold (Th) can be selected that provides a very high probability of a correct match based on the Hamming distance being less than the threshold).  
It would have been obvious to one skilled in the art before the effective filing date of the claimed invention to further modify the method of Geng with the teaching of Mathieu to incorporate a technique relating to Hamming distance. One of ordinary skill in the art would have been motivated because it offers the advantage of comparing between screenshots.
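For reference, the Hamming distance between two equal-length binary codes is the number of bit positions in which they differ, and the threshold test quoted from Mathieu par. 112 treats a distance below Th as a match. The sketch below assumes the codes are held as Python integers; the bit patterns and the threshold value are hypothetical and are not taken from Mathieu or the application.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two codes stored as non-negative integers."""
    return bin(a ^ b).count("1")


def is_match(a: int, b: int, threshold: int) -> bool:
    """Match when the Hamming distance is less than the threshold (Th)."""
    return hamming_distance(a, b) < threshold


code_a = 0b10110100  # hypothetical 8-bit code
code_b = 0b10010110  # differs from code_a in two bit positions

print(hamming_distance(code_a, code_b))       # 2
print(is_match(code_a, code_b, threshold=3))  # True
```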

Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US 20100186088 A1; Automated Identification Of Phishing, Phony And Malicious Web Sites
A method and system for automated identification of phishing, phony, and malicious web sites are disclosed. According to one embodiment, a computer implemented method, comprises receiving a first input, the first input including a universal resource locator (URL) for a webpage. A second input is received, the second input including feedback information related to the webpage, the feedback information including an indication designating the webpage as safe or unsafe. A third input is received from a database, the third input including reputation information related to the webpage. Data is extracted from the webpage. A safety status is determined for the webpage, including whether the webpage is hazardous by using a threat score for the webpage and the second input, wherein calculating the threat score includes analyzing the extracted data from the webpage. The safety status for the webpage is reported.
US 20190068638 A1; Discovering Website Phishing Attacks
A method, computer system, and a computer program product for identifying a phishing attack is provided. The present invention may include receiving an alert of a suspicious URL. The present invention may include making an HTTP request to the suspicious URL. The present invention may include downloading and rendering the suspicious URL content. The present invention may include producing a screenshot of the rendered suspicious URL content. The present invention may include making an HTTP request to a domain landing page. The present invention may include downloading and rendering the domain landing page URL content. The present invention may include producing a screenshot of the rendered domain landing page URL content. The present invention may include generating a score based on comparing the produced first screenshot and the produced second screenshot.
US 20190104154 A1; Phishing Attack Detection
A computerized method for analyzing a subject URL to determine whether the subject URL is associated with a phishing attack is disclosed. The method includes steps of detecting keypoints within a screenshot of a webpage corresponding to the subject URL and determining a set of confidences based on an analysis of the detected keypoints with a model. Each confidence within the set of confidences is assigned to feature vector within a set of training feature vectors representing a training set of URLs used in generating the model. The method comprises performing an image comparison between the screenshot and a screenshot corresponding to a feature vector within the set of training feature vectors, the feature vector being assigned a highest confidence. Responsive to determining the image comparison result exceeds a predefined threshold, transmitting an alert indicating that the subject URL is associated with the phishing attack.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KHANG DO whose telephone number is (571)270-7837. The examiner can normally be reached Monday-Friday 8:00 - 5:00 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, RUPAL DHARIA can be reached at (571) 272-3880. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/KHANG DO/Primary Examiner, Art Unit 2492


Prosecution Timeline

Sep 26, 2024

Application Filed

Dec 18, 2025

Non-Final Rejection mailed — §103, §112 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/425,212

Patent 12641118

PERFORMING AUTOMATED DETECTION OF PHISHING WEB SITES USING EMBEDDED TRACKING ELEMENT

2y 3m to grant Granted May 26, 2026

17/981,743

Patent 12627708

SYSTEMS, METHODS, AND APPARATUSES FOR DETECTION OF DATA MISAPPROPRIATION ATTEMPTS ACROSS ELECTRONIC COMMUNICATION PLATFORMS

3y 6m to grant Granted May 12, 2026

18/030,907

Patent 12609954

ATTACK SCENARIO GENERATION APPARATUS, RISK ANALYSIS APPARATUS, METHOD, AND COMPUTER READABLE MEDIA

3y 0m to grant Granted Apr 21, 2026

17/973,794

Patent 12603884

ACCESSING AN ENCRYPTED PLATFORM

3y 5m to grant Granted Apr 14, 2026

18/477,658

Patent 12603918

SECURITY SYSTEM FOR DETECTING MALICIOUS ACTOR'S OBSERVATION

2y 6m to grant Granted Apr 14, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

1-2

Expected OA Rounds

81%

Grant Probability

99%

With Interview (+45.0%)

2y 7m (~11m remaining)

Median Time to Grant

Low

PTA Risk

Based on 336 resolved cases by this examiner. Grant probability derived from career allowance rate.