Prosecution Insights
Last updated: April 19, 2026
Application No. 18/734,488

SYSTEMS AND METHODS FOR PROVIDING CURATED DATASETS ACCORDING TO DATA FROM DISPARATE DATA SOURCES

Status: Non-Final OA (§103)
Filed: Jun 05, 2024
Examiner: GANGER, LAUREN ZANNAH
Art Unit: 2156
Tech Center: 2100 — Computer Architecture & Software
Assignee: Wells Fargo Bank N.A.
OA Round: 3 (Non-Final)

Grant Probability: 82% (Favorable)
Expected OA Rounds: 3-4
Time to Grant: 2y 9m
Grant Probability with Interview: 94%

Examiner Intelligence

Career Allow Rate: 82% (above average; 221 granted / 271 resolved; +26.5% vs TC avg)
Interview Lift: +12.0% (moderate), among resolved cases with interview
Typical Timeline: 2y 9m average prosecution; 7 applications currently pending
Career History: 278 total applications across all art units

Statute-Specific Performance

§101: 10.8% (-29.2% vs TC avg)
§103: 44.1% (+4.1% vs TC avg)
§102: 22.5% (-17.5% vs TC avg)
§112: 9.7% (-30.3% vs TC avg)

Tech Center averages are estimates. Based on career data from 271 resolved cases.
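
The per-statute deltas above are simple percentage-point differences from a Tech Center baseline. Notably, every displayed delta is consistent with a TC average of about 40% for each statute. A minimal sketch of that arithmetic (the 40% baselines below are back-solved from the deltas shown, not independently sourced):

```python
# Recompute the "vs TC avg" deltas from the examiner's per-statute allowance
# rates. The TC averages are assumptions back-solved from the displayed deltas.
examiner_rates = {"101": 10.8, "103": 44.1, "102": 22.5, "112": 9.7}
tc_averages = {"101": 40.0, "103": 40.0, "102": 40.0, "112": 40.0}

def delta_vs_tc(statute: str) -> float:
    """Percentage-point difference between this examiner and the TC average."""
    return round(examiner_rates[statute] - tc_averages[statute], 1)

for s in ("101", "103", "102", "112"):
    print(f"§{s}: {delta_vs_tc(s):+.1f} pts vs TC avg")
```

The outputs match the four deltas displayed above exactly, which is what suggests a common ~40% baseline across statutes.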

Office Action

§103
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Response to Amendment

The amendment filed 1/26/2026 has been entered. Claims 1, 9, and 17 stand amended. Claims 1-20 are currently pending.

Claim Rejections - 35 USC § 103

In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claim(s) 1-20 is/are rejected under 35 U.S.C. 103 as being unpatentable over Hudetz et al. in US Patent Application Publication № 2024/0370479, hereinafter called Hudetz, in combination with Gentilcore et al. in US Patent Application Publication № 2022/0269703, hereinafter called Gentilcore.
In regard to claim 1, Hudetz teaches a method comprising:

scraping, by a first computing system, one or more first data sources of the first computing system, and one or more second data sources of one or more external computing systems, to compile a first dataset (“Examples of data sources 302 include without limitation databases, web scraping, sensors and Internet of Things (IoT) devices, image and video cameras, audio devices, text generators, publicly available databases, private databases, and many other data sources 302. The data sources 302 may be remote from the artificial intelligence architecture 300 and accessed via a network, local to the artificial intelligence architecture 300 and accessed via a network interface, or may be a combination of local and remote data sources 302.” Paragraph 0098);

standardizing, by the first computing system, the first dataset to generate a standardized dataset (“The document manager 120 may process a document container 128 to generate a document image 140. The document image 140 is a unified or standard file format for an electronic document used by a given EDMP implemented by the system 100.” Paragraph 0072);

applying, by the first computing system, a first artificial intelligence (AI) algorithm to assign labels to data entries of the standardized dataset (“This can be useful for tasks such as document content classification or sentiment analysis, where the search model 704 assigns a label or score to a portion of a document or the entire document based on its content” Paragraph 0140; “One or more of the information blocks 710 and/or the document vectors 726 may optionally include block labels assigned using a machine learning model, such as a classifier.” Paragraph 0164);

compiling, by the first computing system, the standardized dataset having the labels assigned to the respective data entries in a database (“A corpus can include a variety of document types such as web pages, books, news articles, social media posts, scientific papers, and more. The corpus may be created for a specific domain or purpose, and it may be annotated with metadata or labels to facilitate analysis. Document corpora are commonly used in research and industry to train machine learning models and to develop NLP applications.” Paragraph 0120);

establishing, by the first computing system in response to one or more credentials associated with a user (“The client device 112 may have utilized various work flows to identify the signers and associated network addresses (e.g., email address, short message service, multimedia message service, chat message, social message, etc.). For example, the client 134 may utilize workflows to identify multiple parties to the lease including bankers, landlord, and tenant. Further, the client 134 may utilize workflows to identify network addresses (e.g., email address) for each of the signers.” Paragraph 0079), a session between the first computing system and a computing device associated with the user (i.e. workflow, “Further for example, the client 134 may utilize workflows to configure communication of the document image 140 in parallel to multiple parties including the first party, second party, third party, and so forth, to obtain the signatures of each of the parties irrespective of any temporal order of their signatures.” Paragraph 0079);

providing, by the first computing system during the session, an AI interface for display on the computing device (i.e. a GUI, “The method may also include receiving the search query from a search box of a graphical user interface (GUI) on a web page or a click event on a GUI element on the web page.” Paragraph 0362; note that this GUI meets the broadest reasonable interpretation of an AI interface in at least that the interface includes AI features, “In one aspect, a method includes receiving a search query for information within an electronic document in a natural language representation, generating a contextualized embedding for the search query to form a search vector, retrieving a set of candidate document vectors that are semantically similar to the search vector from a document index of contextualized embeddings for the electronic document, sending a request to a generative artificial intelligence (AI) model for an abstractive summary of document content for a subset of candidate document vectors, the abstractive summary to comprise a natural language representation, and receiving a response with the abstractive summary from the generative AI model.” Paragraph 0361);

receiving, by the first computing system via the AI interface of the first computing system, a query from a computing device (“The search manager 124 may receive a search query 144, encode it to a contextualized embedding in real-time, and leverage vector search to retrieve search results 146 with semantically similar document content within an electronic document 706.” Paragraph 0144);

processing, by the first computing system, the query, wherein processing the query comprises tokenizing one or more words included in the query (“In one embodiment, for example, the search model 704 may implement a BERT based encoder. BERT is a transformer-based neural network architecture that is widely used for generating contextualized embeddings in natural language processing tasks. The main components of the BERT model are the encoders, which are responsible for generating the contextualized embeddings for each token in the input sequence” Paragraph 0138); and

generating, by the first computing system, a response to the query for delivering via the AI interface to the computing device (“The search process may produce a set of search results 146.” Paragraph 0148).

In regard to claims 9 and 17, they are substantially similar to claim 1 and accordingly are rejected under similar reasoning. However, while Hudetz does teach establishing, by the first computing system in response to one or more credentials associated with a user, a session between the first computing system and a computing device associated with the user, and providing, by the first computing system during the session, an AI interface for display on the computing device, he fails to expressly teach determining, by the first computing system in response to authenticating one or more credentials associated with a user, access rights using an identifier within the one or more credentials; or providing, by the first computing system during the session, an AI interface for display on the computing device according to the access rights of the user.
Gentilcore teaches determining, by the first computing system in response to authenticating one or more credentials associated with a user, access rights using an identifier within the one or more credentials (“The networked computing environment 100 may provide access to protected resources (e.g., networks, servers, storage devices, files, and computing applications) based on access rights (e.g., read, write, create, delete, or execute rights) that are tailored to particular users of the computing environment (e.g., a particular employee or a group of users that are identified as belonging to a particular group or classification). An access control system may perform various functions for managing access to resources including authentication, authorization, and auditing. Authentication may refer to the process of verifying that credentials provided by a user or entity are valid or to the process of confirming the identity associated with a user or entity (e.g., confirming that a correct password has been entered for a given username). Authorization may refer to the granting of a right or permission to access a protected resource or to the process of determining whether an authenticated user is authorized to access a protected resource. Auditing may refer to the process of storing records” Paragraph 0043); or providing, by the first computing system during the session, an AI interface for display on the computing device according to the access rights of the user (“In some cases, an access control system may manage access to a protected resource by requiring authentication information or authenticated credentials (e.g., a valid username and password) before granting access to the protected resource. For example, an access control system may allow a remote computing device (e.g., a mobile phone) to search or access a protected resource, such as a file, webpage, application, or cloud-based application, via a web browser if valid credentials can be provided to the access control system” Paragraph 0043).

It would have been obvious to one of ordinary skill in the art before the effective filing date of the instant invention to modify the AI document search system taught by Hudetz to include authentication of credentials to determine access rights to access via a web browser, as taught by Gentilcore. It would have been obvious because it represents the application of a known technique (i.e. the authentication of user credentials to determine access rights, especially to a search system web interface, as taught by Gentilcore in at least paragraph 0043) to a known system (i.e. the AI-based document search system, which includes a web page search GUI, as taught by Hudetz in at least paragraph 0219) ready for improvement to yield predictable results (i.e. the web page search system will use authentication of user credentials to determine access privileges). One would have been motivated to do so in order to ensure compliance with regulations, as taught by Gentilcore (“In some cases, a particular set of data may be associated with an ACL that determines which users within an organization may access the particular set of data. In one example, to ensure compliance with data security and retention regulations, the particular set of data may comprise sensitive or confidential information that is restricted to viewing by only a first group of users. In another example, the particular set of data may comprise source code and technical documentation for a particular product that is restricted to viewing by only a second group of users.” Paragraph 0056).

In regard to claim 2, Hudetz further teaches that the query comprises an inquiry for information relating to an enterprise (“In some embodiments, the document corpus 508 may be proprietary and confidential in nature and associated with a particular defined entity, such as an individual, a business, a business unit, a company, an organization, an enterprise, or other defined legal or business structure.” Paragraph 0130), and wherein the response includes values for a plurality of fields relating to the enterprise (the fields would relate to the enterprise if the documents from which they are drawn relate to the enterprise) [Examiner’s note: this dependent claim appears to be directed entirely towards an intended use of the claimed invention, and accordingly this claim does not appear to carry patentable weight]. In regard to claims 10 and 20, they are substantially similar to claim 2 and accordingly are rejected under similar reasoning.

In regard to claim 3, Hudetz further teaches that the one or more first data sources comprise a customer relationship management (CRM) platform and a document database (“In some cases, the document corpus may be associated with a particular entity, such as a customer or client of the electronic document management company, and may therefore contain proprietary, strategic and valuable business information.” Paragraph 0046). In regard to claims 11 and 19, they are substantially similar to claim 3 and accordingly are rejected under similar reasoning.

In regard to claim 4, Hudetz further teaches that generating the response comprises:

generating, by the first computing system, a plurality of tokens representing the query (i.e. each value in an embedding token, “The search manager 124 may receive a search query 144, encode it to a contextualized embedding in real-time, and leverage vector search to retrieve search results 146 with semantically similar document content within an electronic document 706.” Paragraph 0144); wherein the plurality of tokens are generated by the tokenizing of the one or more words included in the query (“The search manager 124 may use the search model 704 to generate a contextualized embedding for the search query 144 to form a search vector. As previously discussed, a contextualized embedding may comprise a vector representation of a sequence of words in the search query 144 that includes contextual information for the sequence of words.” Paragraph 0145; note that the tokenizing is expressly taught in Paragraph 0138, “The main components of the BERT model are the encoders, which are responsible for generating the contextualized embeddings for each token in the input sequence.”);

encoding, by the first computing system, each token into a corresponding encoded token (“The search manager 124 may use the search model 704 to generate a contextualized embedding for the search query 144 to form a search vector. As previously discussed, a contextualized embedding may comprise a vector representation of a sequence of words in the search query 144 that includes contextual information for the sequence of words.” Paragraph 0145);

applying, by the first computing system, the encoded tokens to an AI model, to determine a context associated with the query (“Additionally, or alternatively, the search query 144 may be modified or expanded using context information 734. The context information 734 may be any information that provides some context for the search query 144. For example, the context information 734 may comprise a previous search query 144 by the same user, a search query 144 submitted by other users, or prior search results 146 from a previous search query 144.” Paragraph 0147);

requesting, by the first computing system, one or more data entries from the database and/or from the one or more first data sources or the one or more second data sources, the one or more data entries requested according to the determined context (“The search manager 124 may search a document index 730 of contextualized embeddings for the electronic document 706 with the search vector, which is itself a contextualized embedding of the same type as those stored in the document index 730. Each contextualized embedding may comprise a vector representation of a sequence of words in the electronic document that includes contextual information for the sequence of words.” Paragraph 0148);

applying, by the first computing system, data corresponding to the one or more data entries and the encoded tokens to the AI model (“The search manager 124 may search a document index 730 of contextualized embeddings for the electronic document 706 with the search vector, which is itself a contextualized embedding of the same type as those stored in the document index 730.” Paragraph 0148); and

generating, by the first computing system, the response based on an output from the AI model (“The search process may produce a set of search results 146. The search results 146 may include a set of P candidate document” Paragraph 0148).

In regard to claims 12 and 18, they are substantially similar to claim 4 and accordingly are rejected under similar reasoning.
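
The query-processing pipeline mapped above (tokenize the query, encode it to an embedding, then retrieve semantically similar document vectors) can be illustrated with a deliberately toy sketch. The hashing "encoder" below is only a stand-in for the BERT-style contextualized encoder Hudetz describes; it exists purely to show the tokenize, embed, and cosine-search flow end to end.

```python
import math

def tokenize(text: str) -> list[str]:
    """Naive lowercase/whitespace tokenizer (stand-in for a real BERT tokenizer)."""
    return text.lower().split()

def encode(tokens: list[str], dim: int = 16) -> list[float]:
    """Toy bag-of-tokens embedding: hash each token into a fixed-size vector.
    A real system would produce dense, contextualized vectors from a model."""
    vec = [0.0] * dim
    for tok in tokens:
        vec[hash(tok) % dim] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0

def search(query: str, docs: dict[str, str]) -> list[tuple[str, float]]:
    """Rank documents by cosine similarity between query and document vectors."""
    qv = encode(tokenize(query))
    scored = [(name, cosine(qv, encode(tokenize(text)))) for name, text in docs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A query whose text appears verbatim in one document ranks that document first with similarity 1.0, while unrelated documents score lower; the scores fall in [0, 1] for these non-negative toy vectors.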
In regard to claim 6, Hudetz further teaches that the first computing system generates the response to the query using at least a portion of the data from the database populated with the standardized dataset (“In some embodiments, as with the document vectors 726, the candidate document vectors 718 may include or make reference to text components 606 for an electronic document 706. Alternatively, the text components 606 may be encoded into a different format other than a vector, such as text strings, for example.” Paragraph 0149, wherein “The search model 704 can then aggregate the embeddings of the document tokens using an attention mechanism to weight the importance of each token based on its relevance to the query. Specifically, the search model 704 can compute the attention scores between the query embedding and each document token embedding using the dot product or the cosine similarity” Paragraph 0150). In regard to claim 14, it is substantially similar to claim 6 and accordingly is rejected under similar reasoning.

In regard to claim 7, Hudetz further teaches: training, by the first computing system, the first AI algorithm, using a training dataset including a plurality of standardized data entries and corresponding labels associated with respective data entries (“A supervised algorithm is a type of machine learning algorithm that uses labeled data to train a machine learning model. In supervised learning, the machine learning algorithm is given a set of input data and corresponding output data, which are used to train the model to make predictions or classifications” Paragraph 0094, wherein “The ML algorithm 326 of the artificial intelligence architecture 300 may be implemented using various types of ML algorithms including supervised algorithms, unsupervised algorithms, semi-supervised algorithms, or a combination thereof.” Paragraph 0097); and deploying, by the first computing system (“The model evaluator 206 may be communicatively coupled to a model inferencer 208. The model inferencer 208 provides AI/ML model inference output (e.g., predictions or decisions). Once the ML model 312 is trained and evaluated, it can be deployed in a production environment where it can be used to make predictions on new” Paragraph 0104), the first AI algorithm responsive to the first AI algorithm satisfying a training criteria (“The model trainer 204 may be communicatively coupled to a model evaluator 206. After an ML model 312 is trained, the ML model 312 needs to be evaluated to assess its performance. This is done using various metrics such as accuracy, precision, recall, and F1 score” Paragraph 0103, wherein “The training process involves feeding the pre-processed data 318 into the ML algorithm 326 to produce or optimize an ML model 312. The training process adjusts its parameters until it achieves an initial level of satisfactory performance” Paragraph 0102). In regard to claim 15, it is substantially similar to claim 7 and accordingly is rejected under similar reasoning.

In regard to claim 8, Hudetz further teaches that the first computing system generates the response to the query by applying data corresponding to the query to a second AI algorithm, the second AI algorithm configured to generate the response to the query using data from the database (i.e. another trained algorithm, “In particular, the search manager 124 may train, evaluate, revise and deploy AI/ML algorithms to assist in receiving and understanding a search query 144 using NLU techniques, semantically searching for relevant information within electronic documents 142 to produce a set of search results 146, and summarizing the search results 146 in a natural language representation for better understanding and consumption by a human reader. System 200 illustrates an AI/ML infrastructure and environment suitable for deploying AI/ML algorithms to support operations for the search manager 124.” Paragraph 0088; alternatively or additionally, “The model inferencer 208 may also perform model monitoring and maintenance, which involves continuously monitoring performance of the search model 704 in the production environment and making any necessary updates or modifications to maintain its accuracy and effectiveness.” Paragraph 0104). In regard to claim 16, it is substantially similar to claim 8 and accordingly is rejected under similar reasoning.

In regard to claim 5, Gentilcore further teaches scrubbing (i.e. filtering), by the first computing system, the response to the query based on a user of the first computing device (“The query handler 216 may then query the search index 204 with a filter that restricts the retrieved set of relevant documents such that the ACLs for the retrieved documents permit the user to access or view each of the retrieved set of relevant documents” Paragraph 0065). In regard to claim 13, it is substantially similar to claim 5 and accordingly is rejected under similar reasoning.

Response to Arguments

Applicant’s arguments, see pages 8-10, filed 1/26/2026, with respect to the rejection(s) of claim(s) 1-20 under 35 U.S.C. 103 have been fully considered and are persuasive. Therefore, the rejection has been withdrawn. However, upon further consideration, a new ground(s) of rejection is made in view of Hudetz and Gentilcore.
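
The attention-style aggregation quoted from Hudetz in the claim 6 discussion (score each document-token embedding against the query embedding by dot product, then weight tokens by relevance) can be sketched as follows. The softmax normalization is a standard way to turn raw scores into weights; the vectors here are hand-made toys, not model output.

```python
import math

def dot(a: list[float], b: list[float]) -> float:
    """Dot-product relevance score between two vectors."""
    return sum(x * y for x, y in zip(a, b))

def softmax(scores: list[float]) -> list[float]:
    """Normalize raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query: list[float], token_vecs: list[list[float]]) -> list[float]:
    """Aggregate token embeddings, weighted by dot-product relevance to the query."""
    weights = softmax([dot(query, t) for t in token_vecs])
    return [sum(w * t[i] for w, t in zip(weights, token_vecs))
            for i in range(len(query))]
```

With query [1, 0] and token vectors [1, 0] and [0, 1], the query-aligned token receives the larger softmax weight, so the aggregated vector leans toward it.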
For more information please refer to the relevant sections above.

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Lauren Z Ganger whose telephone number is (571) 272-0270. The examiner can normally be reached 10:00 AM - 7:30 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Ajay Bhatia, can be reached at (571) 272-3906. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/AJAY M BHATIA/
Supervisory Patent Examiner, Art Unit 2156

Prosecution Timeline

Jun 05, 2024
Application Filed
Mar 20, 2025
Non-Final Rejection — §103
Jun 13, 2025
Interview Requested
Jun 24, 2025
Applicant Interview (Telephonic)
Jun 26, 2025
Response Filed
Jun 27, 2025
Examiner Interview Summary
Oct 21, 2025
Final Rejection — §103
Jan 12, 2026
Interview Requested
Jan 26, 2026
Request for Continued Examination
Jan 31, 2026
Response after Non-Final Action
Mar 10, 2026
Non-Final Rejection — §103 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12602395
APPARATUS AND METHOD FOR FILTERING VISUALIZATIONS FROM OR ACROSS DIFFERENT ANALYTICS PLATFORMS
2y 5m to grant (granted Apr 14, 2026)
Patent 12596678
HYPERGRAPH DATA STORAGE METHOD AND APPARATUS WITH TEMPORAL CHARACTERISTIC AND HYPERGRAPH DATA QUERY METHOD AND APPARATUS WITH TEMPORAL CHARACTERISTIC
2y 5m to grant (granted Apr 07, 2026)
Patent 12561341
REAL-TIME REPLICATION OF DATABASE MANAGEMENT SYSTEM TRANSACTIONS INTO A DATA LAKEHOUSE
2y 5m to grant (granted Feb 24, 2026)
Patent 12547639
ENRICHING EVENT STREAMS WITH ENTITY DATA
2y 5m to grant (granted Feb 10, 2026)
Patent 12541547
PROFILE-ENRICHED EXPLANATIONS OF DATA-DRIVEN MODELS
2y 5m to grant (granted Feb 03, 2026)
Study what changed to get past this examiner. Based on 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 82% (94% with interview, +12.0%)
Median Time to Grant: 2y 9m
PTA Risk: High

Based on 271 resolved cases by this examiner. Grant probability derived from career allow rate.
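
The projection figures follow directly from the examiner's career numbers shown above: 221 grants out of 271 resolved cases is roughly 82%, and adding the +12.0-point interview lift yields the 94% figure. A minimal sketch of that arithmetic (the additive-lift model is an assumption inferred from the displayed numbers, not the tool's documented method):

```python
granted, resolved = 221, 271   # examiner's career record (from above)
interview_lift_pts = 12.0      # observed interview lift, in percentage points

def grant_probability(with_interview: bool = False) -> int:
    """Career allow rate, optionally bumped by the interview lift, as a whole percent."""
    base = 100.0 * granted / resolved            # 221/271 ≈ 81.5%
    prob = base + interview_lift_pts if with_interview else base
    return min(round(prob), 100)                 # cap at 100%
```

Rounded to whole percentages, this reproduces the 82% baseline and 94% with-interview figures displayed above.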
