Last updated: May 29, 2026

Application No. 19/088,095

PROXY SERVERS FOR MANAGING QUERIES TO LARGE LANGUAGE MODELS

Non-Final OA §102

Filed

Mar 24, 2025

Priority

Jan 19, 2024 — continuation of 12/261,827

Examiner

WU, YICUN

Art Unit

2153

Tech Center

2100 — Computer Architecture & Software

Assignee

Auradine, Inc.

OA Round

1 (Non-Final)

Interview Optional

Examiner has a relatively high allowance rate (81%) and a +17.1% interview lift. A written response may suffice.

Based on 603 resolved cases, 2023–2026

Examiner Intelligence

WU, YICUN

Grants 81% — above average

Career Allowance Rate

491 granted / 603 resolved

+26.4% vs TC avg

Strong +17% interview lift

+17.1%

Interview Lift

resolved cases with interview

Typical timeline

3y 3m

Avg Prosecution

7 currently pending

Career history

614

Total Applications

across all art units

Statute-Specific Performance

§101: 1.7% (-38.3% vs TC avg)

§103: 69.7% (+29.7% vs TC avg)

§102: 27.6% (-12.4% vs TC avg)

§112: 0.4% (-39.6% vs TC avg)

Based on career data from 603 resolved cases

Office Action

§102

Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.


Information Disclosure Statement
The information disclosure statement (IDS) submitted on 9/17/2025 is in compliance with the provisions of 37 CFR 1.97.  Accordingly, the information disclosure statement is being considered by the examiner.

Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees.  See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and, In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent is shown to be commonly owned with this application.  See 37 CFR 1.130(b).
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer.  A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).

Independent claims 1 and 11 of the instant application are rejected under the judicially created doctrine of double patenting over claims 1 and 12 (respectively) of Xu et al.  (U.S. Patent No. 12261827) since the claims, if allowed, would improperly extend the "right to exclude" already granted in the patent.

INSTANT APPLICATION
A method performed by one or more computers for managing network traffic to and from a server, wherein the server is configured to:
(i) receive, from a client device, a query in a natural language, and
(ii) generate a response to the query in the natural language using a large language model, and
wherein the method comprises:
receiving, from the client device via a network connection, a network message comprising a new query for the server, wherein the one or more computers are communicatively coupled to the server;
processing the new query, using a text encoder, to generate an embedding vector of the new query;
identifying, from amongst a plurality of entries of a vector database, a particular entry based on a similarity metric between: (i) the embedding vector of the new query, and (ii) an embedding vector of a particular query stored in the particular entry, wherein each of the plurality of entries comprises: (i) an embedding vector of a respective query, and (ii) a corresponding response to the respective query;
determining whether the similarity metric is greater than a threshold similarity value;
based on determining that the similarity metric is greater than the threshold similarity value, retrieving, from the particular entry, a cached response to the particular query, wherein the cached response was generated by the LLM and stored in association with the particular query prior to the network message being received; and
sending the cached response to the client device.

Xu et al. (U.S. Patent No. 12261827)
1. A method performed by one or more computers for managing network traffic to and from a server configured to: (i) receive, from a client device, a query in a natural language, and (ii) generate a response to the query in the natural language, the method comprising: receiving, from the client device via a network connection, a network message comprising a new query for the server, wherein the one or more computers are communicatively coupled to the server; processing the new query, using a text encoder, to generate an embedding vector of the new query; identifying, from amongst a plurality of entries of a vector database, a particular entry based on a similarity metric between: (i) the embedding vector of the new query, and (ii) an embedding vector of a particular query stored in the particular entry, wherein each of the plurality of entries comprises: (i) an embedding vector of a respective query, and (ii) a corresponding response to the respective query; determining whether the similarity metric is greater than a threshold similarity value; based on determining that the similarity metric is greater than the threshold similarity value, sampling, from a distribution of random numbers, a random number; determining whether the random number satisfies a threshold condition; and based on determining that the random number satisfies the threshold condition, transmitting, to the server, the new query, receiving, from the server, a response to the new query, processing the response to the new query and the response corresponding to the particular query, using the text encoder, to generate embedding vectors of the response to the new query and the response corresponding to the particular query, calculating a second similarity metric between: (i) the embedding vector of the response to the new query and (ii) the embedding vector of the response corresponding to the particular query, determining whether the second similarity metric is greater than a second threshold similarity value, and based on determining that the second similarity metric is greater than the second threshold similarity value, sending the response to the new query or the response corresponding to the particular query to the client device.




As shown above, claims 1 and 12 of Xu et al. (U.S. Patent No. 12261827) contain at least the elements of claims 1 and 11 of the instant application and, as such, anticipate claims 1 and 11 of the instant application.

“A later application claim is not patentably distinct from an earlier patent claim if the later claim is obvious over, or anticipated by, the earlier claim.  In re Longi, 759 F.2d at 896, 225 USPQ at 651 (affirming a holding of obviousness-type double patenting because the claims at issue were obvious over claims in four prior art patents); In re Berg, 140 F.3d at 1437, 46 USPQ2d at 1233 (Fed. Cir. 1998) (affirming a holding of obviousness-type double patenting where a patent application claim to a genus is anticipated by a patent claim to a species within that genus).”  ELI LILLY AND COMPANY  v  BARR LABORATORIES, INC., United States Court of Appeals for the Federal Circuit, ON PETITION FOR REHEARING EN BANC (DECIDED: May 30, 2001).

Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –

(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.

Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by U.S. Patent Application Publication No. 2023/0135179 to Mielke et al.
As to claim 1, Mielke discloses a method performed by one or more computers for managing network traffic to and from a server, wherein the server is configured to:
(i) receive, from a client device, a query in a natural language (nlu [0006]), and
(ii) generate a response to the query in the natural language using a large language model (LLM) (nlu [0006]), and
wherein the method comprises:
receiving, from the client device via a network connection, a network message comprising a new query for the server, wherein the one or more computers are communicatively coupled to the server (fig. 1,2);
processing the new query, using a text encoder, to generate an embedding vector of the new query (vector [0091]);
identifying, from amongst a plurality of entries of a vector database, a particular entry based on a similarity metric between: (i) the embedding vector of the new query, and (ii) an embedding vector of a particular query stored in the particular entry (similarity metric [0331]), wherein each of the plurality of entries comprises: (i) an embedding vector of a respective query, and (ii) a corresponding response to the respective query ([0331]);
determining whether the similarity metric is greater than a threshold similarity value (threshold [0348]);
based on determining that the similarity metric is greater than the threshold similarity value (threshold [0348]), retrieving, from the particular entry, a cached response to the particular query (data cache [0353]), wherein the cached response was generated by the LLM and stored in association with the particular query prior to the network message being received (data cache [0353]); and
sending the cached response to the client device (data cache [0353]).
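
For orientation, the cache-hit path recited in claim 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the applicant's or Mielke's implementation: the toy `embed` encoder (a hashed bag-of-words), the `Entry` type, and the 0.9 threshold are all hypothetical stand-ins introduced here.

```python
import math
from dataclasses import dataclass


def embed(text: str, dim: int = 64) -> list[float]:
    """Toy text encoder (assumption): hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket per token; a real encoder would be a learned model.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class Entry:
    query_vec: list[float]  # embedding vector of a previously seen query
    response: str           # response previously generated for that query


def lookup(new_query: str, entries: list[Entry], threshold: float = 0.9):
    """Return the cached response if the best match exceeds the threshold, else None."""
    if not entries:
        return None
    q = embed(new_query)
    best = max(entries, key=lambda e: cosine(q, e.query_vec))
    if cosine(q, best.query_vec) > threshold:
        return best.response
    return None
```

A caller would send the returned string to the client on a hit and fall back to the server when `lookup` returns `None`.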

As to claim 2, Mielke discloses a method of claim 1, comprising:
sampling, from a distribution of random numbers, a random number ([0272]); and
determining that the random number satisfies a threshold condition ([0272]),
wherein retrieving the cached response is based on the random number satisfying the threshold condition ([0272]).

As to claim 3, Mielke discloses a method of claim 2, wherein each of the plurality of entries further comprises a respective hit rate characterizing a frequency at which the corresponding response of the entry is retrieved ([0261][0317]).

As to claim 4, Mielke discloses a method of claim 3, comprising, based on determining that the similarity metric is greater than the threshold similarity value, and before determining that the random number satisfies the threshold condition:
updating a hit rate for the particular entry ([0261][0317]); and
generating a threshold number corresponding to the threshold condition based on the hit rate for the particular entry ([0261][0317]).

As to claim 5, Mielke discloses a method of claim 4, wherein generating the threshold number is performed such that a probability of the random number satisfying the threshold condition is more likely as the hit rate increases ([0261][0317]).
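
The random-number gate of claims 2 through 5 can be illustrated with a short sketch. The claims do not specify how the threshold number is derived from the hit rate, so the linear mapping, the `floor` parameter, and the definition of hit rate as hits divided by total queries are assumptions made here only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class CacheEntry:
    response: str
    hits: int = 0          # times this entry matched above the similarity threshold
    hit_rate: float = 0.0  # hits / total queries seen (assumed definition)


def update_hit_rate(entry: CacheEntry, total_queries: int) -> None:
    """Record one more hit and recompute the entry's hit rate."""
    entry.hits += 1
    entry.hit_rate = entry.hits / max(total_queries, 1)


def threshold_number(hit_rate: float, floor: float = 0.5) -> float:
    """Assumed mapping: grows linearly from `floor` to 1.0 as the hit rate rises."""
    clamped = min(max(hit_rate, 0.0), 1.0)
    return floor + (1.0 - floor) * clamped


def serve_cached(entry: CacheEntry, total_queries: int,
                 rng: random.Random, floor: float = 0.5) -> bool:
    """Update the hit rate, then sample r in [0, 1) and test r < threshold."""
    update_hit_rate(entry, total_queries)
    r = rng.random()
    return r < threshold_number(entry.hit_rate, floor)
```

Because the threshold increases with the hit rate and the sample is uniform on [0, 1), the probability of satisfying the condition rises with the hit rate, which is the relationship claim 5 recites.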

As to claim 6, Mielke discloses a method of claim 1, wherein the similarity metric comprises a cosine similarity or an inverse distance metric ([0261][0317]).
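
The two metric families named in claim 6 can be written out as follows. The claim does not fix a formula for the inverse distance metric; the `1 / (1 + d)` form over Euclidean distance used here is one common choice and is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of the vector norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def inverse_distance(a: list[float], b: list[float]) -> float:
    """Assumed form: 1 / (1 + Euclidean distance), so identical vectors score 1.0."""
    d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + d)
```

Both functions return larger values for more similar vectors, so either can be compared against a threshold similarity value in the same way.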

As to claim 7, Mielke discloses a method of claim 1, wherein the plurality of entries are organized in the vector database based on inter-entry query similarities, and wherein identifying the particular entry comprises iteratively evaluating neighboring entries in the vector database (fig. 3).

As to claim 8, Mielke discloses a method of claim 1, wherein identifying the particular entry comprises:
performing, with respect to the embedding vector of the new query, a vector search on the embedding vectors of the queries stored in the plurality of entries ([0261][0317]); and
identifying, from the vector search, the particular entry as the respective entry having the similarity metric with a greatest respective value ([0261][0317]).

As to claim 9, Mielke discloses a method of claim 8, wherein the vector search comprises a k-nearest-neighbors search (nearest [0096]).
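
The vector search of claims 8 and 9 can be sketched as a brute-force k-nearest-neighbors scan, with the top-ranked result taken as the particular entry. The exhaustive scan, the use of cosine similarity, and the function names are assumptions for illustration; a production vector database would typically use an approximate index instead.

```python
import heapq
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def knn(query_vec: list[float], entry_vecs: list[list[float]], k: int = 3):
    """Return up to k (similarity, index) pairs, most similar first."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(entry_vecs)]
    return heapq.nlargest(k, scored)


def best_entry(query_vec: list[float], entry_vecs: list[list[float]]):
    """Index of the entry with the greatest similarity, or None if empty."""
    top = knn(query_vec, entry_vecs, k=1)
    return top[0][1] if top else None
```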

As to claim 10, Mielke discloses a method of claim 1, further comprising, upon determining that a second similarity metric corresponding to a second new query is not greater than the threshold similarity value:
transmitting, to the server, the second new query (fig. 2);
receiving, from the server, a response to the second new query, the response to the second new query being generated by the LLM (fig. 2);
storing, in a new entry of the vector database, (i) an embedding vector of the second new query, and (ii) the response to the second new query ([0261][0317]) (fig. 2); and
transmitting, to the client device via the network connection, a network message comprising the response to the second new query (fig. 2).
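
The cache-miss path of claim 10 (forward the query, store the new entry, return the response) can be sketched as below. The `SemanticCacheProxy` class and its `encoder` and `llm` callables are hypothetical names introduced here; the sketch assumes cosine similarity and an in-memory list in place of a vector database.

```python
import math
from typing import Callable


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCacheProxy:
    def __init__(self, encoder: Callable[[str], list[float]],
                 llm: Callable[[str], str], threshold: float = 0.9):
        self.encoder = encoder      # stand-in for the text encoder
        self.llm = llm              # stand-in for the server running the LLM
        self.threshold = threshold
        self.entries: list[tuple[list[float], str]] = []  # (query vector, response)

    def handle(self, query: str) -> str:
        vec = self.encoder(query)
        best = max(self.entries, key=lambda e: cosine(vec, e[0]), default=None)
        if best is not None and cosine(vec, best[0]) > self.threshold:
            return best[1]          # hit: serve the cached response
        response = self.llm(query)  # miss: forward the query to the server
        self.entries.append((vec, response))  # store the new entry
        return response             # return the fresh response to the client
```

In this sketch a repeated query is answered from `entries` without a second call to `llm`, while a dissimilar query triggers a call and adds a new entry.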

As to claims 11-20, the limitations of these claims have been noted in the rejection above. They are therefore rejected as set forth above.

Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Yicun Wu whose telephone number is 571-272-4087. The examiner can normally be reached from 8:00 am to 4:30 pm, Monday-Friday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Kavita Stanley, can be reached at 571-272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2100.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system:
"http://portal.uspto.gov/external/portal/pair"
Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.





Yicun Wu
Patent Examiner
Technology Center 2100
/YICUN WU/
Primary Examiner, Art Unit 2153

Prosecution Timeline

Mar 24, 2025

Application Filed

Jan 26, 2026

Non-Final Rejection mailed — §102 (current)

Precedent Cases

Applications granted by this same examiner with similar technology

18/936,451

Patent 12625846

CONTROLLING ACTIONS IN A FILE SYSTEM ENVIRONMENT USING BUCKETS CORRESPONDING TO PRIORITY

1y 6m to grant Granted May 12, 2026

17/706,145

Patent 12602351

Methods and Systems for Archiving File System Data Stored by a Networked Storage System

4y 0m to grant Granted Apr 14, 2026

18/541,123

Patent 12547643

UNIFIED CONTEXT-AWARE CONTENT ARCHIVE SYSTEM

2y 1m to grant Granted Feb 10, 2026

17/973,322

Patent 12541693

GENERATING AND UPGRADING KNOWLEDGE GRAPH DATA STRUCTURES

3y 3m to grant Granted Feb 03, 2026

19/007,602

Patent 12536239

METHODS AND SYSTEMS FOR REFRESHING CURRENT PAGE INFORMATION

1y 0m to grant Granted Jan 27, 2026

Study what changed to get past this examiner. Based on 5 most recent grants.

Prosecution Projections

1-2

Expected OA Rounds

81%

Grant Probability

98%

With Interview (+17.1%)

3y 3m (~2y 1m remaining)

Median Time to Grant

Low

PTA Risk

Based on 603 resolved cases by this examiner. Grant probability derived from career allowance rate.