Last updated: May 29, 2026

Application No. 17/993,131

Systems and Methods for Automatic URL Identification From Data

Non-Final OA §101 §103

Filed

Nov 23, 2022

Priority

Nov 23, 2021 — provisional 63/282,212

Examiner

ADAMS, CHARLES D

Art Unit

2152

Tech Center

2100 — Computer Architecture & Software

Assignee

Insurance Services Office Inc.

OA Round

5 (Non-Final)

This examiner grants 44% of resolved cases, with a +44.1% interview lift. A telephonic interview to clarify the technical implementation could significantly improve the outcome.

Based on 425 resolved cases, 2023–2026

Examiner Intelligence

ADAMS, CHARLES D

Grants 44% of resolved cases

Career Allowance Rate

189 granted / 425 resolved

-10.5% vs TC avg

Strong +44% interview lift

Interview lift: +44.1% (resolved cases with interview vs. without)

Typical timeline

4y 11m

Avg Prosecution

21 currently pending

Career history

456

Total Applications

across all art units

Statute-Specific Performance

§101

6.0%

-34.0% vs TC avg

§103

89.9%

+49.9% vs TC avg

§102

3.1%

-36.9% vs TC avg

§112

0.7%

-39.3% vs TC avg

Based on career data from 425 resolved cases

Office Action

§101 §103

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.  Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114.  Applicant's submission filed on 22 January 2026 has been entered.
 
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.


Claims 8-10 and 12-14 are rejected under 35 U.S.C. 101 because the claimed invention is directed to non-statutory subject matter.  
The claims do not fall within at least one of the four categories of patent eligible subject matter because the claims are directed towards a system, yet it is not clear that the system contains hardware. If the system does not contain hardware, then it is not a machine, and is thus not one of the four categories of patent eligible subject matter. 
The system does contain a “processor.” In paragraph [0024] of the published application (US Pre-Grant Publication 2023/0161831), the specification describes “a URL identification processor” that “could” include a hardware processor “such as a computer system, computer server, cloud processing service, mobile device, etc.” 
This definition of the “URL identification processor” does not require nor guarantee that a hardware processor is present in the system of claim 8. Rather, the system only “could” include hardware. 
Because hardware is not clearly present in the claims, the claims cannot be a machine. The claims are also not a process, manufacture, or composition of matter. As such, the claims are not clearly directed towards patent eligible subject matter. 

Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-3, 5-10, 12-17, and 19-21 are rejected under 35 U.S.C. 103 as being unpatentable over Goodman et al. (US Pre-Grant Publication 2005/0022031) in view of Chandra et al. (US Pre-Grant Publication 2013/0173639), in view of Pogrebezky et al. (US Pre-Grant Publication 2020/0242170), and further in view of Pugh et al. (US Pre-Grant Publication 2008/0201401).

As to claim 1, Goodman et al. teaches a method for automatic identification of a Uniform Resource Locator (URL) from data, comprising the steps of: 
receiving a data item at a processor (see paragraph [0030]. The system of Goodman may receive a message. A message is a data item); 
processing the data item to identify at least one URL present in the data item (see paragraph [0030]. The system of Goodman extracts features from the message. URLs are among the features extracted. Examiner notes that the claimed embodiment, in which a URL is “present in the data item,” appears to differ from the embodiments set forth on pages 5-6 of the specification, in which the data item appears to store merchant data that is used as part of a query to discover URLs); 
processing the at least one URL using at least one of a fuzzy matching algorithm or an exact matching algorithm to validate the at least one URL (see paragraph [0032]. The system of Goodman is able to examine URLs extracted from messages. This is a “processing” step. The system is attempting to determine if a message is good or spam. The inspection of the URLs of the message validates the URLs to aid in this determination, see paragraph [0041]. As noted in paragraph [0032], the system uses machine learning techniques and match based techniques); and 
if the at least one URL is validated by the fuzzy matching algorithm or the exact matching algorithm, generating and transmitting an output file that includes the at least one URL (see paragraph [0041]. Messages with URLs that have been validated are marked as “good” and passed on to the recipient). 
Goodman does not explicitly teach: 
wherein the at least one URL is validated by the processor: 
(i) establishing communication with a search engine via an application programming interface (API), 
(ii) retrieving a URL list from the search engine, 
(iii) automatically scraping URL content from a main content page and a contact page of each website listed in the URL list, and 
(iv) validating metadata relating to the at least one URL against the scraped URL content using the fuzzy matching algorithm or the exact matching algorithm, and 
(v) validating the metadata against a plurality of IP addresses stored in a database; and
wherein the steps of automatically scraping URL content from the main content page and the contact page, validating the metadata relating to the at least one URL against scraped URL content, and validating the metadata against the plurality of IP addresses are performed in parallel by the processor to increase processing speed.
Chandra teaches: 
wherein the at least one URL is validated by the processor: 
(i) establishing communication with a search engine via an application programming interface (API) (see Chandra paragraphs [0044]-[0045]. A request is sent to a search engine via an API), 
(ii) retrieving a URL list from the search engine (see Chandra paragraphs [0044]-[0045]. A list of results is returned from the search engine. The list includes URLs),
(iii) automatically scraping URL content from a main content page … each website listed in the URL list (see Chandra paragraphs [0045]-[0047]. URL content is scraped from each website to filter and identify authoritative websites), and 
(iv) validating metadata relating to the at least one URL against the scraped URL content using the fuzzy matching algorithm or the exact matching algorithm (see Chandra paragraphs [0045]-[0047]). 
It would have been obvious to one of ordinary skill in the art before the earliest filing date of the invention to have modified Goodman by the teachings of Chandra because both references are directed towards analyzing and verifying URLs and Chandra merely provides additional methods to filter URLs. This additional analysis and filtering step will help a validation system be certain that a URL is verified.
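
For readers unfamiliar with the distinction between the exact and fuzzy matching recited in element (iv), the following is a minimal Python sketch. It is not taken from the application or from any cited reference. The function names, the sliding-window approach, and the 0.8 similarity threshold are all assumptions chosen for illustration, and the standard-library difflib module stands in for whatever matching algorithm an actual system might use.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def exact_match(metadata_value: str, scraped_text: str) -> bool:
    """Exact matching: the normalized value must appear verbatim in the scraped text."""
    return normalize(metadata_value) in normalize(scraped_text)


def fuzzy_match(metadata_value: str, scraped_text: str, threshold: float = 0.8) -> bool:
    """Fuzzy matching: compare the value against each same-width word window
    of the scraped text and accept if any window is similar enough.
    The threshold is a hypothetical tuning value, not a figure from the source."""
    needle = normalize(metadata_value)
    words = normalize(scraped_text).split()
    width = len(needle.split())
    if width == 0:
        return False
    for start in range(max(1, len(words) - width + 1)):
        window = " ".join(words[start:start + width])
        if SequenceMatcher(None, needle, window).ratio() >= threshold:
            return True
    return False


if __name__ == "__main__":
    page = "Welcome to Acme Hardware store"
    print(exact_match("Acme Hardware", page))   # verbatim hit
    print(fuzzy_match("Acme Hardwear", page))   # misspelling tolerated
```

Exact matching rejects the misspelled "Acme Hardwear", while the fuzzy variant accepts it. That difference in tolerance is what separates the two claimed alternatives in this sketch.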
Pogrebezky teaches: 
automatically scraping URL content from a main content page and a contact page of each website listed in the URL list (see Pogrebezky paragraphs [0123]-[0124]. A web crawler may crawl the home webpage for each company (“main content page”) and may also crawl the “contact us page” for additional information. It is noted that this is performed for each “company” (see paragraphs [0119]-[0121]), starting with the website for each respective company. Thus, this processing is done for “each website listed in the URL list”); 
wherein the steps of automatically scraping URL content from the main content page and the contact page, validating the metadata relating to the at least one URL against scraped URL content, and validating the metadata against the plurality of IP addresses are performed in parallel by the processor to increase processing speed (see Pogrebezky paragraphs [0120]-[0124] for a process of scraping URL content from a main content page and a contact page and validating the metadata. These steps are described as being a part of the workflow of Figure 4 of Pogrebezky. In paragraph [0121], Pogrebezky states that steps may be performed simultaneously. Thus, Pogrebezky teaches performing data analysis steps in parallel. It is noted that Pugh, cited below, teaches the additional data analysis step of “validating the metadata against the plurality of IP addresses.” However, because this is merely another data processing step of crawling, analyzing, and validating URL content, it would be obvious to one of ordinary skill in the art in view of Pogrebezky paragraph [0121] to perform this data analysis step with the other data analysis steps “simultaneously”).
It would have been obvious to one of ordinary skill in the art before the earliest filing date of the invention to have modified Goodman by the teachings of Pogrebezky because both references are directed towards analyzing URLs and Pogrebezky merely provides additional data extraction methods and targets. This additional data analysis will help the validation system of Goodman collect additional data about entities to more completely validate any information target. 
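
The "performed in parallel" limitation discussed above can be illustrated with a short Python sketch. It is not drawn from the application or from Pogrebezky. The three step functions are hypothetical placeholders that return canned results instead of doing real network or database work. Treating the three steps as independent tasks is also an assumption made only to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical stand-ins for the three claimed steps. Real versions would
# perform network and database I/O; these only inspect the URL string.
def scrape_pages(url: str) -> bool:
    return url.startswith("https://")


def check_metadata(url: str) -> bool:
    return "example" in url


def check_ip_addresses(url: str) -> bool:
    return not url.endswith(".invalid")


STEPS: Dict[str, Callable[[str], bool]] = {
    "scrape": scrape_pages,
    "metadata": check_metadata,
    "ip": check_ip_addresses,
}


def validate_in_parallel(url: str) -> Dict[str, bool]:
    """Submit every step to a thread pool at once, then collect each result."""
    with ThreadPoolExecutor(max_workers=len(STEPS)) as pool:
        futures = {name: pool.submit(step, url) for name, step in STEPS.items()}
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    results = validate_in_parallel("https://example.com")
    print(results, "valid" if all(results.values()) else "rejected")
```

A thread pool suits steps that mostly wait on I/O. Whether concurrency of this kind meets the claim language "in parallel by the processor" is a claim-construction question that the sketch does not address.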
Pugh teaches: 
wherein the at least one URL is validated by the processor: 
…
(ii) retrieving a URL list … (see Pugh paragraph [0095]. Pugh retrieves a URL from an active window);
(iii) automatically scraping URL content from a main content page … of each website listed in the URL list (see Pugh paragraph [0095]. The hostname of the URL is scraped from the URL), and 
(iv) validating metadata relating to the at least one URL against the scraped URL content using the fuzzy matching algorithm or the exact matching algorithm (see paragraph [0095]-[0096]. The hostname from the URL is validated against hostnames and IP addresses stored in a client or server database), and 
(v) validating the metadata against a plurality of IP addresses stored in a database (see paragraph [0095]-[0096]. The hostname is validated against IP addresses stored in a client or server database). 
It would have been obvious to one of ordinary skill in the art before the earliest filing date of the invention to have modified Goodman by the teachings of Pugh because both references are directed towards analyzing content to verify the validity of URLs and Pugh merely provides additional methods to corroborate the data contained within a URL. This additional verification will help a validation system be certain that a URL is verified.
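
The hostname-against-IP-address check that the Examiner attributes to Pugh can be sketched in Python as follows. This is an illustrative assumption, not Pugh's implementation. The AUTHENTICATED table, its contents (reserved documentation addresses), and the function names are hypothetical, and the resolved IP address is passed in as an argument so the example needs no DNS lookup.

```python
import ipaddress
from typing import Dict, Set
from urllib.parse import urlsplit

# Hypothetical database mapping hostnames to authenticated IP addresses.
# The addresses come from ranges reserved for documentation.
AUTHENTICATED: Dict[str, Set[str]] = {
    "example.com": {"192.0.2.10", "192.0.2.11"},
}


def hostname_of(url: str) -> str:
    """Extract the lowercase hostname from a URL, or '' if there is none."""
    return (urlsplit(url).hostname or "").lower()


def is_authenticated(url: str, resolved_ip: str,
                     database: Dict[str, Set[str]] = AUTHENTICATED) -> bool:
    """Return True only if the URL's hostname is known and the supplied IP
    address is one of the addresses stored for that hostname."""
    try:
        ip = ipaddress.ip_address(resolved_ip)  # rejects malformed input
    except ValueError:
        return False
    return str(ip) in database.get(hostname_of(url), set())


if __name__ == "__main__":
    print(is_authenticated("https://Example.com/contact", "192.0.2.10"))
    print(is_authenticated("https://Example.com/contact", "198.51.100.7"))
```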

As to claim 2, Goodman teaches the method of Claim 1, further comprising matching one or more non-URL data items to ensure that only valid URLs are identified (see Goodman paragraph [0041]. Various features other than URLs can be considered and used. Also see Goodman paragraphs [0042] and [0046]). 

As to claim 3, Goodman teaches the method of Claim 1, further comprising pre-processing the data item to perform at least one of sorting the data item according to a client-level sorting or standardizing columns in the data item (see Goodman paragraph [0066]. URLs may be sorted with the order of the URLs recorded in a sorted order of appearance). 

As to claim 5, Goodman as modified by Chandra teaches the method of Claim 1, further comprising cross-checking the at least one URL based on one or more of a business name or an e-mail address (see Chandra paragraphs [0045]-[0048]. URLs are cross-checked via names associated with businesses). 

As to claim 6, Goodman teaches the method of Claim 1, further comprising validating the at least one URL utilizing at least one matching criteria including one or more of a business name, a merchant name, a doing business as (DBA) name, a transacting business as (T/A) name, a street address, a city, a postal code, a state, a province, a telephone number, an e-mail address, a name, a business description, or a county (see Goodman paragraph [0066]. A URL and name such as “Microsoft” may be used to validate URLs).  

As to claim 7, Goodman as modified by Chandra teaches the method further comprising scoring the at least one URL to determine relevancy of the at least one URL to metadata associated with a merchant (see Chandra paragraphs [0045]-[0048]. Relevancy may be determined by metadata related to a merchant, such as Netflix or imdb).

As to claims 8 and 15, see the rejection of claim 1. 
As to claims 9 and 16, see the rejection of claim 2. 
As to claims 10 and 17, see the rejection of claim 3. 
As to claims 12 and 19, see the rejection of claim 5. 
As to claims 13 and 20, see the rejection of claim 6. 
As to claims 14 and 21, see the rejection of claim 7. 

Response to Arguments
Applicant's arguments filed 22 January 2026 have been fully considered but they are not persuasive. 

Applicant argues that “none of the cited portions of Chandra et al., Chen, and Pugh et al. actually disclose the foregoing features. While paragraphs 0044-0047 of Chandra et al. disclose displaying search results in a display screen 300 in connection with a source-specific search (paragraph 0044), filtering the results using a filtering component 230 that only keeps URLs potentially related to an entity type of interest (paragraph 0045), and various patterns that can be used for filtration of the results (paragraphs 0046 and 0047), there is no discussion in paragraphs 0044-0047 of validating metadata relating to a URL against scraped URL content as required by independent Claims 1, 8, and 15.”
In response to this argument, it is noted that filtering irrelevant results out of a list of results, by comparing the results to an extracted URL pattern to ensure that they match the pattern, constitutes “validating metadata relating to a URL against scraped URL content.” 
Examiner reminds Applicants that unclaimed features from the specification, such as any particular steps undertaken during a validation process, receive no patentable weight until claimed. 

Applicant continues, arguing that “Finally, paragraphs 0095-0096 of Pugh et al. also fail to cure the deficiencies of the cited portions of Chandra et al. and Chen, in that paragraphs 0095-0096 of Pugh et al. only discuss comparing host names against authenticated IP addresses but say nothing about validating metadata relating to a URL against scraped URL content as required by independent Claims 1, 8, and 15. As such, Applicant submits that none of the cited references, taken alone or in any combination, teach or suggest what is currently recited in independent Claims 1, 8, and 15 and their associated dependent claims.”
In response to this argument, the hostname in the cited portions of Pugh is compared to a database of authenticated IP addresses and a search is made for authenticated IP addresses to authenticate a hostname. This authentication is “validating metadata relating to a URL (a hostname) against scraped URL content (authenticated IP addresses).” 
Examiner reminds Applicants that unclaimed features from the specification, such as any particular steps undertaken during a validation process, receive no patentable weight until claimed. 

Applicant’s remaining arguments with respect to claims have been considered but are moot because the new ground of rejection does not rely on any reference applied in the prior rejection of record for any teaching or matter specifically challenged in the argument.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to CHARLES D ADAMS whose telephone number is (571)272-3938. The examiner can normally be reached M-F, 9-5:30 EST.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Neveen Abel-Jalil, can be reached at 571-270-0474. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





/CHARLES D ADAMS/Primary Examiner, Art Unit 2152

Prosecution Timeline

Jan 30, 2025

Request for Continued Examination

Feb 05, 2025

Response after Non-Final Action

Feb 24, 2025

Non-Final Rejection mailed — §101, §103

Aug 19, 2025

Response Filed

Oct 22, 2025

Final Rejection mailed — §101, §103

Jan 22, 2026

Request for Continued Examination

Jan 29, 2026

Response after Non-Final Action

Feb 24, 2026

Non-Final Rejection mailed — §101, §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/318,382

Patent 12639334

DATA STRUCTURE SYNCHRONIZATION WITH WEBHOOKS

3y 0m to grant Granted May 26, 2026

18/472,925

Patent 12639175

DEVICE AND METHOD FOR MULTI-SOURCE RECOVERY OF ITEMS

2y 8m to grant Granted May 26, 2026

17/670,896

Patent 12602392

SCALABLE METADATA-DRIVEN DATA INGESTION PIPELINE

4y 2m to grant Granted Apr 14, 2026

18/200,736

Patent 12591595

ADAPATIVE SYSTEM FOR PROCESSING DISTRIBUTED DATA FILES AND A METHOD THEREOF

2y 10m to grant Granted Mar 31, 2026

18/313,463

Patent 12572546

METHODS AND SYSTEMS FOR DISTRIBUTED DATA ANALYSIS

2y 10m to grant Granted Mar 10, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

5-6

Expected OA Rounds

44%

Grant Probability

89%

With Interview (+44.1%)

4y 11m (~1y 5m remaining)

Median Time to Grant

High

PTA Risk

Based on 425 resolved cases by this examiner. Grant probability derived from career allowance rate.
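
The projection figures above appear consistent with simple arithmetic on the career data. The short Python sketch below reproduces them under one assumption: that the interview lift is added to the career allowance rate in percentage points. The page does not state how the "With Interview" figure is computed, so this additive model and the function names are hypothetical.

```python
def allowance_rate(granted: int, resolved: int) -> float:
    """Career allowance rate as a percentage of resolved cases."""
    return 100.0 * granted / resolved


def with_interview(rate_pct: float, lift_points: float) -> float:
    """Assumed model: the lift is added as percentage points, capped at 100."""
    return min(100.0, rate_pct + lift_points)


if __name__ == "__main__":
    base = allowance_rate(189, 425)        # 189 granted / 425 resolved
    boosted = with_interview(base, 44.1)   # +44.1% interview lift
    print(f"career rate: {base:.1f}%")
    print(f"with interview: {boosted:.1f}%")
```

Under this assumption, 189 granted out of 425 resolved comes to about 44.5%, and adding 44.1 points gives about 88.6%, which rounds to the 89% shown.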