Last updated: May 29, 2026

Application No. 16/668,544

EXTRACTING UNSTRUCTURED DEMOGRAPHIC INFORMATION FROM A DATA SOURCE IN A STRUCTURED MANNER

Final Rejection §103

Filed: Oct 30, 2019
Examiner: STORK, KYLE R
Art Unit: 2128
Tech Center: 2100 (Computer Architecture & Software)
Assignee: Hi Insights Inc.
OA Round: 10 (Final)

Interview Optional

+28.1% interview lift. An interview has already been conducted in this application's prosecution history, and this examiner has a 64% grant rate. Because an interview has already been tried, a written response with narrowed claims, guided by precedent claim-evolution patterns, is recommended.

Based on 869 resolved cases, 2023–2026

Examiner Intelligence

STORK, KYLE R

Grants 64% of resolved cases

Career Allowance Rate

555 granted / 869 resolved

+8.9% vs TC avg

Strong interview lift: +28.1% (resolved cases with an interview vs. without)

Typical timeline: 3y 11m average prosecution; 34 currently pending

Career history: 920 total applications across all art units

Statute-Specific Performance

§101: 4.5% (-35.5% vs TC avg)
§103: 84.8% (+44.8% vs TC avg)
§102: 3.3% (-36.7% vs TC avg)
§112: 0.6% (-39.4% vs TC avg)

Tech Center averages are estimates. Based on career data from 869 resolved cases.

Office Action

§103

DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This final office action is in response to the amendment filed 22 December 2025.
Claims 1-2, 4-12, and 14-24 are pending. Claims 1 and 11 are independent claims.


Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.


This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary.  Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.

Claims 1-2, 4-12, 14-20, 22, and 24 are rejected under 35 U.S.C. 103 as being unpatentable over Smith et al. (US 2016/0147899, published 26 May 2016, hereafter Smith) in view of Apps et al. (US 2007/0094060, published 26 April 2007, hereafter Apps), Tosik et al. (US 11138249, filed 22 August 2018, hereafter Tosik), Viau (US 2014/0330832, published 6 November 2014), and Fujimaki et al. (US 2018/0025072, published 25 January 2018, hereafter Fujimaki).
As per independent claim 1, Smith discloses a method of identifying demographic information in a marked up document, comprising:
(a) detecting, based on crawling a website, a plurality of fields representing demographic information in a marked up document (paragraphs 0029-0032)
(b) extracting a set of features of the marked up document based on the detected demographic information (paragraphs 0029-0030: Here, data is extracted relating to sender, recipients, and contents of a message/document)
(c) based at least in part on the set of features, determining whether the plurality of fields represent demographic information of a single provider using an intelligent system (paragraph 0035) according to other marked up documents (paragraph 0037: Here, appropriate records and associated data are identified)
(d) determining whether the plurality of fields represent demographic information of a single healthcare provider (paragraphs 0035 and 0037)
(e) when the plurality of fields are determined to represent demographic information of a single provider, associating the plurality of fields to represent demographic information of the single provider (paragraph 0006: Here, a contact profile of an address book is updated based on received information)
Additionally, Smith fails to specifically disclose:
representing the marked up document in a document object model including a plurality of interconnected nodes 
determining where, in the document object model, each of the plurality of fields is located 
calculating number of hops as a distance between the respective locations of the plurality of fields in the rendered marked-up document, wherein the distance is identified between a pair of the plurality of fields selected between a name at a leaf node of the document object model and another type of the detected demographic information represented in the document object model
wherein the determining (d) occurs based on the number of hops 
However, Apps discloses:
representing the marked up document in a document object model including a plurality of interconnected nodes (paragraph 0079)
determining where, in the document object model, each of the plurality of fields is located (paragraph 0079)
wherein the determining (d) occurs based on the number of hops (paragraph 0079)
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Apps with Smith, with a reasonable expectation of success, as it would have enabled calculating distances based upon rendered information. This would have allowed for identification of whether contents are properly associated within the document.
Further, Tosik, which is analogous to the claimed invention because it is directed toward calculating a distance based upon hops, discloses calculating number of hops starting from a leaf node and ending upon another leaf node as a distance between the respective locations of the plurality of fields in the rendered marked-up document, wherein the distance is identified between a pair of the plurality of fields selected between a name at a leaf node of the document object model and another type of the detected demographic information represented in the document object model (column 6, lines 9-22: Here, a degree of affinity (relatedness) is calculated between a concept (name) and an item (another type of detected demographic information) based upon a number of hops, measured as a distance, along the path traversed between nodes).
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Tosik with Smith-Apps, with a reasonable expectation of success, as it would have allowed determining affinity/relatedness between items based upon a tree (Tosik: column 6, lines 9-22). This would have provided the advantage of improving the speed and reducing the processing power required to determine relatedness of content (Tosik: column 6, lines 28-34).
Further, Viau, which is analogous to the claimed invention because it is directed toward calculating hops between different types of detected demographic information (paragraph 0085: Here, a distance is calculated based on a number of hops), discloses wherein the distance is identified between a pair of the plurality of items selected between a type of the detected demographic information at a leaf node of the document object model and another type of the detected demographic information at another leaf node of the document object model (Figure 19; paragraphs 0084-0085 and 0097: Here, the objects and their relationships are represented in a concept graph. Each of these objects includes a type (paragraph 0084) and a relationship score may be calculated based on a distance represented by a number of hops (paragraph 0085)). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Viau with Smith-Apps-Tosik, with a reasonable expectation of success, as it would have allowed for determining the distance between different nodes having different demographic information (Viau: paragraphs 0085 and 0097).
Finally, Fujimaki, which is analogous to the claimed invention because it is directed toward storing demographic information, discloses wherein the type of the detected demographic information is stored at the leaf node of the tree model and the other type of the detected demographic information is stored at the other leaf node of the tree model (paragraphs 0078 and 0083: Here, a tree structure is comprised of a plurality of leaf nodes. Each of these leaf nodes stores demographic information and the classifier clusters nodes with associated demographic information stored as leaf nodes. Each of these leaves represents features (demographics) associated with particular stores (paragraph 0081)). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Fujimaki with Smith-Apps-Tosik-Viau, with a reasonable expectation of success, as it would have allowed for classifying and clustering data and storing demographic information (Fujimaki: paragraph 0081).
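For orientation only, the hop-count limitation discussed for claim 1 (a distance measured as the number of hops between two leaf nodes of a document object model) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation and not the method of any cited reference: the `Node`, `TreeBuilder`, `find_leaf`, and `hop_distance` names and the sample markup are hypothetical, and the parser assumes well-formed markup with no unclosed void elements.

```python
from html.parser import HTMLParser


class Node:
    """One node of a simplified document tree (hypothetical helper)."""

    def __init__(self, tag, parent=None, text=""):
        self.tag = tag
        self.parent = parent
        self.text = text
        self.children = []
        if parent is not None:
            parent.children.append(self)


class TreeBuilder(HTMLParser):
    """Builds a Node tree from well-formed markup; text runs become leaf nodes."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        self.current = Node(tag, self.current)

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        if data.strip():
            Node("#text", self.current, data.strip())


def ancestors(node):
    """Return the path from a node up to the root, starting with the node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def hop_distance(a, b):
    """Count edges on the tree path between a and b via their lowest common ancestor."""
    hops_from_a = {id(n): i for i, n in enumerate(ancestors(a))}
    for hops_from_b, n in enumerate(ancestors(b)):
        if id(n) in hops_from_a:
            return hops_from_a[id(n)] + hops_from_b
    raise ValueError("nodes are not in the same tree")


def find_leaf(root, text):
    """Depth-first search for the text leaf whose content equals `text`."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.tag == "#text" and node.text == text:
            return node
        stack.extend(node.children)
    return None


# Hypothetical sample page: one provider block with a name and phone, then a second provider.
builder = TreeBuilder()
builder.feed(
    "<div><p><b>Dr. Jane Doe</b></p><p>555-0100</p></div>"
    "<div><p>Dr. John Roe</p></div>"
)
name = find_leaf(builder.root, "Dr. Jane Doe")
phone = find_leaf(builder.root, "555-0100")
other = find_leaf(builder.root, "Dr. John Roe")

print(hop_distance(name, phone))  # name and phone share a <div>: 5 hops
print(hop_distance(name, other))  # names in different <div>s meet only at the root: 7 hops
```

In this toy page the name and phone number in the same block are fewer hops apart than the two names in different blocks, which is the kind of signal a hop-based determination of "single provider" would rely on.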
	As per dependent claim 4, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Apps discloses:
(e) determining how many of a plurality of fields representing demographic information of a particular type are in a marked up document (Figure 7; paragraph 0080)
	wherein determining occurs based at least in part on the number of fields of the particular type determined in (e) (Figure 7; paragraph 0080)
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Apps with Smith, with a reasonable expectation of success, as it would have allowed for extracting and identifying contents based upon a field set. This would have enabled the extraction and identification of data based upon a complete set of data being received.
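As a rough illustration of the claim 4 limitation (counting how many fields of a particular type appear in a marked up document, and using that count as an input to the determination), the sketch below counts phone-number-like strings. The regular expression, the `count_fields` and `count_feature` names, and the sample markup are all assumptions made for illustration; neither the application nor Apps is known to use this pattern.

```python
import re

# Hypothetical pattern for strings shaped like 571-555-0100; an assumption for illustration.
PHONE_PATTERN = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")
# Crude tag stripper; adequate only for the simple sample markup below.
TAG_PATTERN = re.compile(r"<[^>]+>")


def count_fields(markup, pattern):
    """Count non-overlapping matches of `pattern` in the text content of `markup`."""
    text = TAG_PATTERN.sub(" ", markup)
    return len(pattern.findall(text))


def count_feature(markup):
    """Return the per-type field count as a feature for a downstream decision step."""
    return {"phone_count": count_fields(markup, PHONE_PATTERN)}


page_one = "<p>Dr. Jane Doe</p><p>571-555-0100</p>"
page_two = "<li>571-555-0100</li><li>571-555-0101</li><li>571-555-0102</li>"

print(count_feature(page_one))  # {'phone_count': 1}
print(count_feature(page_two))  # {'phone_count': 3}
```

A page with a single phone number is more plausibly a single provider's record than a directory page listing several, which is why a per-type count can be a useful feature.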
As per dependent claim 5, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 4, and the same rejection is incorporated herein. Smith discloses wherein the particular type is at least one of an address or phone number (paragraph 0032).
As per dependent claim 7, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 6, and the same rejection is incorporated herein. Apps discloses identifying fields in the sample set of pages based on known tags (paragraph 0209). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Apps with Smith, with a reasonable expectation of success, as it would have allowed for extracting and identifying contents based upon a learning module. This would have enabled the extraction and identification of data to evolve as more contents are ingested.
As per dependent claim 9, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Apps discloses wherein the extracting the set of features comprises identifying a distance between two or more types of demographic information from among the plurality of types of demographic information (paragraphs 0070 and 0209). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Apps with Smith, with a reasonable expectation of success, as it would have allowed for extracting and identifying contents based upon a learning module. This would have enabled the extraction and identification of data to evolve as more contents are ingested.
As per dependent claim 10, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Smith discloses wherein the extracting the set of features comprises identifying a number of pairs of demographic information on a given site of a data source (paragraph 0056).
With respect to claims 11, 14-15, and 17-20, the applicant discloses the limitations substantially similar to those in claims 1, 4-5, and 7-10, respectively. Claims 11, 14-15, and 17-20 are similarly rejected.
	As per dependent claim 22, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Smith discloses wherein the determining further comprises:
	parse the demographic information to identify a type of demographic information (paragraphs 0029-0032)
	extracting features of the identified type of demographic information (paragraphs 0029-0032: Here, data such as an email is parsed to identify contact information. This information is extracted and used to create and populate a contact record based on extracting contents such as name, company name, job title, and phone number).
	With respect to claim 24, the applicant discloses the limitations substantially similar to those in claim 22. Claim 24 is similarly rejected.

Claims 2, 6, 8, 12, 16, and 18 are rejected under 35 U.S.C. 103 as being unpatentable over Smith, Apps, Tosik, Viau, and Fujimaki and further in view of Singh et al. (US 2021/0034858, filed 12 September 2019, hereafter Singh).
As per dependent claim 2, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Smith fails to specifically disclose using a machine learning model based on the determining in (c). However, Singh, which is analogous to the claimed invention because it is directed toward a machine learning model to identify related content, discloses using a machine learning model trained according to supervised machine learning algorithms and other documents (paragraph 0026) to use geometric distances between the plurality of medical form documents (paragraph 0017), the plurality of fields to determine whether any two or more fields of the demographic information represent a cluster (paragraphs 0029 and 0038). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Singh with Smith-Apps-Tosik, with a reasonable expectation of success, as it would have allowed for grouping of contents based upon a trained machine learning algorithm. This would have provided a user with the advantage of more accurately grouping components based upon ingested documents. 
As per dependent claim 6, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Smith fails to specifically disclose using a sample set of pages and corresponding location of identified fields. 
However, Singh discloses training the machine learning model using a sample set of pages and corresponding location of identified fields (paragraphs 0029 and 0038). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Singh with Smith, with a reasonable expectation of success, as it would have allowed for grouping of contents based upon a trained machine learning algorithm. This would have provided a user with the advantage of more accurately grouping components based upon ingested documents.
As per dependent claim 8, Smith, Apps, Tosik, Viau, Fujimaki, and Singh disclose the limitations similar to those in claim 2, and the same rejection is incorporated herein. Apps discloses wherein the training the model comprises training the model using one or more of a support vector machine algorithm, linear regression algorithm, logistic regression algorithm, naïve Bayes algorithm, linear discriminant analysis algorithm, decision tree algorithm, k-nearest neighbor algorithm, neural networks algorithm, and a similarity learning algorithm (paragraphs 0040 and 0070). It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Apps with Smith, with a reasonable expectation of success, as it would have allowed for extracting and identifying contents based upon a learning module. This would have enabled the extraction and identification of data to evolve as more contents are ingested.
With respect to claims 12, 16, and 18, the applicant discloses the limitations substantially similar to those in claims 2, 6, and 8, respectively. Claims 12, 16, and 18 are similarly rejected.

Claims 21 and 23 are rejected under 35 U.S.C. 103 as being unpatentable over Smith, Apps, Tosik, Viau, and Fujimaki and further in view of Harris et al. (US 11526502, filed 25 September 2019, hereafter Harris).
As per dependent claim 21, Smith, Apps, Tosik, Viau, and Fujimaki disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Smith fails to specifically disclose:
generating a list of tasks for each of the plurality of data sources
assigning a task from the list of tasks to a corresponding data extractor
causing the corresponding data extractor to navigate the corresponding data source to the respective site and extract the demographic information from the respective site
However, Harris, which is analogous to the claimed invention because it is directed toward extracting contents, discloses:
generating a list of tasks for each of the plurality of data sources (Figure 5; column 12, lines 35-59)
assigning a task from the list of tasks to a corresponding data extractor (Figure 5; column 12, lines 35-59 and column 13, line 42-column 14, line 10: Here, a set of SQL-like queries exist for extracting data. These queries act as filters and identify the specific queries required to extract the data from the data source, including delimiters used to separate the data)
causing the corresponding data extractor to navigate the corresponding data source to the respective site and extract the demographic information from the respective site (Figures 5 and 9; column 20, lines 17-42)
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Harris with Smith-Apps-Tosik, with a reasonable expectation of success, as it would have allowed for customizing the extractor to identify data of specific data sets. This would have provided the user with clean data and would have limited inappropriate data in the result set.
With respect to claim 23, the applicant discloses the limitations similar to those in claim 21. Claim 23 is similarly rejected.
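The three limitations recited for claim 21 (generating a task list per data source, assigning each task to a corresponding extractor, and having that extractor visit the site and extract data) can be pictured with the sketch below. It is a hypothetical outline only: the `Task`, `generate_tasks`, `run_tasks`, and `stub_extractor` names and the example URLs are assumptions, the extractor is a stub that performs no network access, and nothing here is taken from Harris or from the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    source: str  # name of the data source the task belongs to
    site: str    # site within that source the extractor should visit


def generate_tasks(sources: Dict[str, List[str]]) -> Dict[str, List[Task]]:
    """Build one task list per data source, one task per site."""
    return {name: [Task(name, site) for site in sites] for name, sites in sources.items()}


def run_tasks(
    task_lists: Dict[str, List[Task]],
    extractors: Dict[str, Callable[[Task], str]],
) -> List[str]:
    """Hand each task to the extractor registered for its source and collect results."""
    results = []
    for source, tasks in task_lists.items():
        extractor = extractors[source]
        for task in tasks:
            results.append(extractor(task))
    return results


def stub_extractor(task: Task) -> str:
    """Stand-in for a real crawler: reports what it would visit instead of fetching it."""
    return f"extracted from {task.site} ({task.source})"


# Hypothetical sources and sites, for illustration only.
sources = {
    "directory_a": ["https://example.com/providers/1", "https://example.com/providers/2"],
    "directory_b": ["https://example.org/staff"],
}
task_lists = generate_tasks(sources)
extractors = {name: stub_extractor for name in sources}
results = run_tasks(task_lists, extractors)
print(len(results))
print(results[0])
```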


Response to Arguments
Applicant’s arguments have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Smith, Apps, Tosik, Viau, and Fujimaki.


Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure:
Bettstetter et al. (Hop Distances in Homogeneous Ad Hoc Networks, 2003): Discloses determining the minimum number of hops between two nodes in a tree (Abstract)

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).
A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.
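The reply periods described above are simple month arithmetic from the mailing date. The sketch below works them out assuming the mailing date is the Jan 21, 2026 entry shown in the prosecution timeline; that assumption, and the `add_months` helper, are illustrative. It does not model the advisory-action adjustment, extension fees, or dates that fall on weekends or holidays, so the official dates should be confirmed against the Office's records.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


mailing_date = date(2026, 1, 21)  # assumed from the timeline entry for this final rejection

two_month_mark = add_months(mailing_date, 2)     # first-reply window relevant to the advisory-action rule
shortened_period = add_months(mailing_date, 3)   # THREE MONTHS shortened statutory period
statutory_maximum = add_months(mailing_date, 6)  # SIX MONTHS outer limit

print(two_month_mark)     # 2026-03-21
print(shortened_period)   # 2026-04-21
print(statutory_maximum)  # 2026-07-21
```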

Any inquiry concerning this communication or earlier communications from the examiner should be directed to KYLE R STORK whose telephone number is (571)272-4130. The examiner can normally be reached 8am - 2pm; 4pm - 6pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Omar Fernandez Rivas, can be reached at 571/272-2589. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/KYLE R STORK/Primary Examiner, Art Unit 2128

Prosecution Timeline

Jul 17, 2024: Non-Final Rejection mailed (§103)
Jan 17, 2025: Response Filed
Mar 05, 2025: Final Rejection mailed (§103)
Sep 05, 2025: Request for Continued Examination
Sep 10, 2025: Response after Non-Final Action
Sep 23, 2025: Non-Final Rejection mailed (§103)
Dec 22, 2025: Response Filed
Jan 21, 2026: Final Rejection mailed (§103, current)

Precedent Cases

Applications granted by this same examiner with similar technology

17/172,519 (Patent 12585935): EXECUTION BEHAVIOR ANALYSIS TEXT-BASED ENSEMBLE MALWARE DETECTOR. 5y 1m to grant; granted Mar 24, 2026.

17/317,435 (Patent 12585937): SYSTEMS AND METHODS FOR DEEP LEARNING ENHANCED GARBAGE COLLECTION. 4y 10m to grant; granted Mar 24, 2026.

18/668,376 (Patent 12585869): RECOMMENDATION PLATFORM FOR SKILL DEVELOPMENT. 1y 10m to grant; granted Mar 24, 2026.

17/126,091 (Patent 12579454): PROVIDING EXPLAINABLE MACHINE LEARNING MODEL RESULTS USING DISTRIBUTED LEDGERS. 5y 3m to grant; granted Mar 17, 2026.

17/538,539 (Patent 12579412): SPIKE NEURAL NETWORK CIRCUIT INCLUDING SELF-CORRECTING CONTROL CIRCUIT AND METHOD OF OPERATION THEREOF. 4y 3m to grant; granted Mar 17, 2026.

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

Expected OA Rounds: 11-12
Grant Probability: 64%
With Interview (+28.1%): 92%
Median Time to Grant: 3y 11m (~0m remaining)
PTA Risk: High

Based on 869 resolved cases by this examiner. Grant probability derived from career allowance rate.
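The headline figures can be reproduced from the counts reported on this page. The short check below assumes that the "With Interview" figure is the career allowance rate plus the 28.1-point interview lift; the page does not state how that figure is derived, so the additive model and the variable names are assumptions.

```python
granted = 555                 # "555 granted" from the career allowance rate
resolved = 869                # "869 resolved"
interview_lift_points = 28.1  # reported interview lift, treated here as percentage points

allowance_rate = 100 * granted / resolved
with_interview = allowance_rate + interview_lift_points  # assumed additive model

print(round(allowance_rate))   # 64
print(round(with_interview))   # 92
```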