DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Pub. No. US 2022/0188699 A1 to Matlick et al. (hereafter referenced as Matlick) in view of Pub. No. US 2022/0030029 A1 to Kagan et al. (hereafter referenced as Kagan).
Regarding claim 1, Matlick discloses “a method, comprising: crawling a plurality of webpages for a corresponding open directory” (website downloads the webpages, along with CCM tags 110, to user computers (e.g., computer 230 of FIG. 2) [par.0051]); “determining that a source code archive included in a first open directory” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file (if any) [par.0213]).
Matlick does not explicitly disclose “associated with a first webpage of the plurality of webpages is a phishing kit source code archive using a machine learning model; and performing one or more actions in response to determining that the source code archive included in the first open directory associated with the first webpage of the plurality of webpages is the phishing kit source code archive.”
However, Kagan in an analogous art discloses “associated with a first webpage of the plurality of webpages is a phishing kit source code archive using a machine learning model” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]); “and performing one or more actions in response to determining that the source code archive included in the first open directory associated with the first webpage of the plurality of webpages is the phishing kit source code archive” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Matlick’s machine learning technique with Kagan’s phishing protection methods in order to provide additional security. One of ordinary skill in the art would have been motivated to combine because incorporating Kagan’s machine learning-based phishing detection techniques into Matlick’s system would predictably enhance Matlick’s ability to determine that a source code archive associated with a webpage or open directory is a phishing kit source code archive. Such a combination would further enable automated performance of one or more responsive actions (such as flagging, blocking, or remediation) upon determining that the source code archive included in the open directory associated with the first webpage is a phishing kit. The combination merely applies a known machine learning technique from Kagan to the known phishing detection framework of Matlick to improve detection accuracy and operational efficiency, yielding predictable results representing a routine optimization. Both references are from the same field of endeavor.
Regarding claim 2 in view of claim 1, the references combined disclose “further comprising identifying all files included in the first open directory” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file, Matlick [par.0213]).
Regarding claim 3 in view of claim 2, the references combined disclose “further comprising filtering the files to generate a set of archive files” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file, Matlick [par.0213]).
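For illustration only, the filtering limitation quoted for claim 3 can be sketched as keeping those file names in a directory listing that carry an archive suffix. The suffix list and the function name `filter_archive_files` are hypothetical; neither the claim language nor the cited paragraphs of Matlick or Kagan enumerate archive types or prescribe this approach.

```python
# Hypothetical archive suffixes, assumed for this sketch only; they are not
# taken from the claims or from the cited references.
ARCHIVE_SUFFIXES = (".zip", ".tar", ".tar.gz", ".tgz", ".rar", ".7z")


def filter_archive_files(file_names):
    """Return the names that end with an archive suffix (case-insensitive).

    The input order is preserved and the original spelling of each name is kept.
    """
    return [
        name for name in file_names
        if name.lower().endswith(ARCHIVE_SUFFIXES)
    ]


# Example listing of files identified in an open directory (made-up names).
listing = ["index.php", "kit.ZIP", "logo.png", "backup.tar.gz", "readme.txt"]
archives = filter_archive_files(listing)
```

A suffix check is only one possible filter; an implementation could equally inspect file signatures or MIME types.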
Regarding claim 4 in view of claim 3, the references combined disclose “wherein unnecessary files are removed from the files to generate the set of archive files” (anonymization is a type of information sanitization technique that removes personal, sensitive, and/or confidential data from data or datasets so that the person or information described or indicated by the data/datasets remain anonymous, Matlick [par.0059]).
Regarding claim 5 in view of claim 3, the references combined disclose “further comprising extracting one or more features from the set of archive files” (extract features, Matlick [Fig.18/item 1806]).
Regarding claim 6 in view of claim 5, the references combined disclose “wherein the extracted features include presence of credential exfiltration, cloaking artifacts, geolocation APIs, obfuscation APIs, variables, and/or suspicious filenames/folders” (a separate neural network model may be configured to recognize suspicious input types. A suspicious interactive component may be indicated, for example, by proximity of the interactive component to text or an input field requesting user identification, or by an interactive component that when clicked leads to a new URL, Kagan [par.0047]).
Regarding claim 7 in view of claim 1, the references combined disclose “wherein the machine learning model is trained using supervised learning, unsupervised learning, semi-supervised, or reinforcement learning” (The machine learning model may be a neural network (NN) model. Recording the phishing attack further may include adding the unauthenticated web content to the phishing content and re-generating the machine learning model, Kagan [par.0008]).
Regarding claim 8 in view of claim 1, the references combined disclose “wherein the machine learning model is a random forest model” (resource classifier 1640 feeds training data 2220 that includes SS vectors 2224 and the associated known site classifications 2226 into an ML model 2228. For example, ML model 2228 may be a logistic regression (LR) model or Random Forest model, Matlick [par.0289]).
Regarding claim 9 in view of claim 1, the references combined disclose “wherein determining that the source code archive included in the first open directory associated with the first webpage is the phishing kit source code archive using the machine learning model includes providing one or more extracted features to the machine learning model” (known phishing website and phishing kit, Kagan [Fig.1/item 74]).
Regarding claim 10 in view of claim 1, the references combined disclose “wherein the one or more actions include storing the set of archive files in a phishing kit database” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file, Matlick [par.0213]).
Regarding claim 11 in view of claim 1, the references combined disclose “wherein the one or more actions include storing indicators of compromise extracted from the set of archive files in a phishing kit database” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]).
Regarding claim 12 in view of claim 1, the references combined disclose “wherein the one or more actions include adding to a blacklist the first webpage associated with the first open directory” (see whitelist/blacklist, Kagan [Fig.1/item 31]).
Regarding claim 13 in view of claim 1, the references combined disclose “wherein the one or more actions include determining paths from which the phishing kit source code archive is potentially launched” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]).
Regarding claim 14, Matlick discloses “a system, comprising: a communication interface configured to crawl a plurality of webpages for a corresponding open directory” (website downloads the webpages, along with CCM tags 110, to user computers (e.g., computer 230 of FIG. 2) [par.0051]); “and a processor coupled to the communication interface and configured to: determine that a source code archive included in a first open directory” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file (if any) [par.0213]).
Matlick does not explicitly disclose “associated with a first webpage of the plurality of webpages is a phishing kit source code archive using a machine learning model; and perform one or more actions in response to determining that the source code archive included in the first open directory associated with the first webpage of the plurality of webpages is the phishing kit source code archive.”
However, Kagan in an analogous art discloses “associated with a first webpage of the plurality of webpages is a phishing kit source code archive using a machine learning model” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]); “and perform one or more actions in response to determining that the source code archive included in the first open directory associated with the first webpage of the plurality of webpages is the phishing kit source code archive” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Matlick’s machine learning technique with Kagan’s phishing protection methods in order to provide additional security. One of ordinary skill in the art would have been motivated to combine because incorporating Kagan’s machine learning-based phishing detection techniques into Matlick’s system would predictably enhance Matlick’s ability to determine that a source code archive associated with a webpage or open directory is a phishing kit source code archive. Such a combination would further enable automated performance of one or more responsive actions (such as flagging, blocking, or remediation) upon determining that the source code archive included in the open directory associated with the first webpage is a phishing kit. The combination merely applies a known machine learning technique from Kagan to the known phishing detection framework of Matlick to improve detection accuracy and operational efficiency, yielding predictable results representing a routine optimization. Both references are from the same field of endeavor.
Regarding claim 15 in view of claim 14, the references combined disclose “wherein the processor is configured to identify all files included in the first open directory” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file, Matlick [par.0213]).
Regarding claim 16 in view of claim 15, the references combined disclose “wherein the processor is configured to filter the files to generate a set of the archive files” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file, Matlick [par.0213]).
Regarding claim 17 in view of claim 16, the references combined disclose “wherein the processor is configured to extract one or more features from the set of archive files” (extract features Matlick[Fig.18/item 1806]).
Regarding claim 18 in view of claim 17, the references combined disclose “wherein the machine learning model is configured to determine that the source code archive included in the first open directory associated with the first webpage is the phishing kit source code archive based on the one or more extracted features” (The machine learning model may be a neural network (NN) model. Recording the phishing attack further may include adding the unauthenticated web content to the phishing content and re-generating the machine learning model, Kagan [par.0008]).
Regarding claim 19 in view of claim 14, the references combined disclose “wherein the one or more actions include storing the set of archive files in a phishing kit database” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file, Matlick [par.0213]).
Regarding claim 20, Matlick discloses “a computer program product embodied in a non-transitory computer readable medium and comprising computer instructions for: crawling a plurality of webpages for a corresponding open directory” (website downloads the webpages, along with CCM tags 110, to user computers (e.g., computer 230 of FIG. 2) [par.0051]); “determining that a source code archive included in a first open directory” (the InOb 1744 is an archive file or a file path/directory, and each node 1748 is a file contained inside the archive file or file path/directory including the content of each file (if any) [par.0213]).
Matlick does not explicitly disclose “associated with a first webpage of the plurality of webpages is a phishing kit source code archive using a machine learning model; and performing one or more actions in response to determining that the source code archive included in the first open directory associated with the first webpage of the plurality of webpages is the phishing kit source code archive.”
However, Kagan in an analogous art discloses “associated with a first webpage of the plurality of webpages is a phishing kit source code archive using a machine learning model” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]); “and performing one or more actions in response to determining that the source code archive included in the first open directory associated with the first webpage of the plurality of webpages is the phishing kit source code archive” (known phishing sites and phishing kit, Kagan [Fig.1/item 75]).
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to modify Matlick’s machine learning technique with Kagan’s phishing protection methods in order to provide additional security. One of ordinary skill in the art would have been motivated to combine because incorporating Kagan’s machine learning-based phishing detection techniques into Matlick’s system would predictably enhance Matlick’s ability to determine that a source code archive associated with a webpage or open directory is a phishing kit source code archive. Such a combination would further enable automated performance of one or more responsive actions (such as flagging, blocking, or remediation) upon determining that the source code archive included in the open directory associated with the first webpage is a phishing kit. The combination merely applies a known machine learning technique from Kagan to the known phishing detection framework of Matlick to improve detection accuracy and operational efficiency, yielding predictable results representing a routine optimization. Both references are from the same field of endeavor.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MICHAEL D ANDERSON whose telephone number is (571)270-5159. The examiner can normally be reached Mon-Fri 9am-6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Jeffrey Pwu, can be reached at (571) 272-6798. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MICHAEL D ANDERSON/Examiner, Art Unit 2433
/JEFFREY C PWU/Supervisory Patent Examiner, Art Unit 2433