Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Information Disclosure Statement
The information disclosure statement (IDS) submitted on 9/17/2025 is in compliance with the provisions of 37 CFR 1.97. Accordingly, the information disclosure statement is being considered by the examiner.
Double Patenting
The nonstatutory double patenting rejection is based on a judicially created doctrine grounded in public policy (a policy reflected in the statute) so as to prevent the unjustified or improper timewise extension of the "right to exclude" granted by a patent and to prevent possible harassment by multiple assignees. See In re Goodman, 11 F.3d 1046, 29 USPQ2d 2010 (Fed. Cir. 1993); In re Longi, 759 F.2d 887, 225 USPQ 645 (Fed. Cir. 1985); In re Van Ornum, 686 F.2d 937, 214 USPQ 761 (CCPA 1982); In re Vogel, 422 F.2d 438, 164 USPQ 619 (CCPA 1970); and, In re Thorington, 418 F.2d 528, 163 USPQ 644 (CCPA 1969).
A timely filed terminal disclaimer in compliance with 37 CFR 1.321(c) may be used to overcome an actual or provisional rejection based on a nonstatutory double patenting ground provided the conflicting application or patent is shown to be commonly owned with this application. See 37 CFR 1.130(b).
Effective January 1, 1994, a registered attorney or agent of record may sign a terminal disclaimer. A terminal disclaimer signed by the assignee must fully comply with 37 CFR 3.73(b).
Independent claims 1 and 11 of the instant application are rejected under the judicially created doctrine of double patenting over claims 1 and 12 (respectively) of Xu et al. (U.S. Patent No. 12261827) since the claims, if allowed, would improperly extend the "right to exclude" already granted in the patent.
INSTANT APPLICATION
A method performed by one or more computers for managing network traffic to and from a server, wherein the server is configured to:
(i) receive, from a client device, a query in a natural language, and
(ii) generate a response to the query in the natural language using a large language model, and
wherein the method comprises:
receiving, from the client device via a network connection, a network message comprising a new query for the server, wherein the one or more computers are communicatively coupled to the server;
processing the new query, using a text encoder, to generate an embedding vector of the new query;
identifying, from amongst a plurality of entries of a vector database, a particular entry based on a similarity metric between: (i) the embedding vector of the new query, and (ii) an embedding vector of a particular query stored in the particular entry, wherein each of the plurality of entries comprises: (i) an embedding vector of a respective query, and (ii) a corresponding response to the respective query;
determining whether the similarity metric is greater than a threshold similarity value;
based on determining that the similarity metric is greater than the threshold similarity value, retrieving, from the particular entry, a cached response to the particular query, wherein the cached response was generated by the LLM and stored in association with the particular query prior to the network message being received; and
sending the cached response to the client device.
Xu et al. (U.S. Patent No. 12261827)
1. A method performed by one or more computers for managing network traffic to and from a server configured to: (i) receive, from a client device, a query in a natural language, and (ii) generate a response to the query in the natural language, the method comprising:
receiving, from the client device via a network connection, a network message comprising a new query for the server, wherein the one or more computers are communicatively coupled to the server;
processing the new query, using a text encoder, to generate an embedding vector of the new query;
identifying, from amongst a plurality of entries of a vector database, a particular entry based on a similarity metric between: (i) the embedding vector of the new query, and (ii) an embedding vector of a particular query stored in the particular entry, wherein each of the plurality of entries comprises: (i) an embedding vector of a respective query, and (ii) a corresponding response to the respective query;
determining whether the similarity metric is greater than a threshold similarity value;
based on determining that the similarity metric is greater than the threshold similarity value, sampling, from a distribution of random numbers, a random number;
determining whether the random number satisfies a threshold condition; and
based on determining that the random number satisfies the threshold condition, transmitting, to the server, the new query, receiving, from the server, a response to the new query, processing the response to the new query and the response corresponding to the particular query, using the text encoder, to generate embedding vectors of the response to the new query and the response corresponding to the particular query, calculating a second similarity metric between: (i) the embedding vector of the response to the new query and (ii) the embedding vector of the response corresponding to the particular query, determining whether the second similarity metric is greater than a second threshold similarity value, and based on determining that the second similarity metric is greater than the second threshold similarity value, sending the response to the new query or the response corresponding to the particular query to the client device.
As shown above, claims 1 and 12 of Xu et al. (U.S. Patent No. 12261827) contain at least the elements of claims 1 and 11 of the instant application and as such anticipate claims 1 and 11 of the instant application.
“A later application claim is not patentably distinct from an earlier patent claim if the later claim is obvious over, or anticipated by, the earlier claim. In re Longi, 759 F.2d at 896, 225 USPQ at 651 (affirming a holding of obviousness-type double patenting because the claims at issue were obvious over claims in four prior art patents); In re Berg, 140 F.3d at 1437, 46 USPQ2d at 1233 (Fed. Cir. 1998) (affirming a holding of obviousness-type double patenting where a patent application claim to a genus is anticipated by a patent claim to a species within that genus).” ELI LILLY AND COMPANY v BARR LABORATORIES, INC., United States Court of Appeals for the Federal Circuit, ON PETITION FOR REHEARING EN BANC (DECIDED: May 30, 2001).
Claim Rejections - 35 USC § 102
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-20 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by U.S. Patent Application Publication No. 20230135179 to Mielke et al.
As to claim 1, Mielke discloses a method performed by one or more computers for managing network traffic to and from a server, wherein the server is configured to:
(i) receive, from a client device, a query in a natural language (nlu [0006]), and
(ii) generate a response to the query in the natural language using a large language model (LLM) (nlu [0006]), and
wherein the method comprises:
receiving, from the client device via a network connection, a network message comprising a new query for the server, wherein the one or more computers are communicatively coupled to the server (figs. 1 and 2);
processing the new query, using a text encoder, to generate an embedding vector of the new query (vector [0091]);
identifying, from amongst a plurality of entries of a vector database, a particular entry based on a similarity metric between: (i) the embedding vector of the new query, and (ii) an embedding vector of a particular query stored in the particular entry (similarity metric [0331]), wherein each of the plurality of entries comprises: (i) an embedding vector of a respective query, and (ii) a corresponding response to the respective query ([0331]);
determining whether the similarity metric is greater than a threshold similarity value (threshold [0348]);
based on determining that the similarity metric is greater than the threshold similarity value (threshold [0348]), retrieving, from the particular entry, a cached response to the particular query (data cache [0353]), wherein the cached response was generated by the LLM and stored in association with the particular query prior to the network message being received (data cache [0353]); and
sending the cached response to the client device (data cache [0353]).
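For illustration only, the lookup steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch and forms no part of the record: the names `CacheEntry`, `toy_encoder`, `cosine_similarity`, and `lookup` are hypothetical, a hashed word-count vector stands in for the claimed text encoder, and an in-memory list stands in for the claimed vector database.

```python
import math
from dataclasses import dataclass


@dataclass
class CacheEntry:
    """One entry: the embedding of a stored query and its cached response."""
    query_embedding: list[float]
    response: str


def toy_encoder(text: str, dims: int = 8) -> list[float]:
    """Hypothetical stand-in for a text encoder (hashed word counts)."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    return vec


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookup(new_query: str, entries: list[CacheEntry], threshold: float):
    """Return the cached response of the most similar entry, or None.

    None means the similarity metric did not exceed the threshold, so the
    caller would have to obtain a response some other way.
    """
    if not entries:
        return None
    embedding = toy_encoder(new_query)
    best = max(entries, key=lambda e: cosine_similarity(embedding, e.query_embedding))
    if cosine_similarity(embedding, best.query_embedding) > threshold:
        return best.response
    return None
```

In this sketch the threshold comparison is strict ("greater than"), matching the claim language; the value of the threshold itself is left to the caller.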
As to claim 2, Mielke discloses the method of claim 1, comprising:
sampling, from a distribution of random numbers, a random number ([0272]); and
determining that the random number satisfies a threshold condition ([0272]),
wherein retrieving the cached response is based on the random number satisfying the threshold condition ([0272]).
As to claim 3, Mielke discloses the method of claim 2, wherein each of the plurality of entries further comprises a respective hit rate characterizing a frequency at which the corresponding response of the entry is retrieved ([0261][0317]).
As to claim 4, Mielke discloses the method of claim 3, comprising, based on determining that the similarity metric is greater than the threshold similarity value, and before determining that the random number satisfies the threshold condition:
updating a hit rate for the particular entry ([0261][0317]); and
generating a threshold number corresponding to the threshold condition based on the hit rate for the particular entry ([0261][0317]).
As to claim 5, Mielke discloses the method of claim 4, wherein generating the threshold number is performed such that a probability of the random number satisfying the threshold condition is more likely as the hit rate increases ([0261][0317]).
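For illustration only, the random-number gate of claims 2-5 can be sketched as below. This is a sketch under stated assumptions, not a description of the application or of Mielke: the hit rate is assumed to be the fraction of all queries that matched the entry, the threshold condition is assumed to be "sample is less than the threshold number", and the linear mapping with a `floor` of 0.5 is a hypothetical choice. All names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Entry:
    response: str
    matches: int = 0       # times this entry was the best match
    hit_rate: float = 0.0  # assumed definition: matches / total queries seen


def update_hit_rate(entry: Entry, total_queries: int) -> float:
    """Record one more match and recompute the entry's hit rate."""
    entry.matches += 1
    entry.hit_rate = entry.matches / max(total_queries, 1)
    return entry.hit_rate


def threshold_number(hit_rate: float, floor: float = 0.5) -> float:
    """Map a hit rate in [0, 1] to a threshold number in [floor, 1].

    The mapping is increasing, so a higher hit rate makes the condition
    `sample < threshold` more likely to hold.
    """
    clamped = min(max(hit_rate, 0.0), 1.0)
    return floor + (1.0 - floor) * clamped


def should_serve_cached(hit_rate: float, rng: random.Random) -> bool:
    """Sample uniformly from [0, 1) and test the threshold condition."""
    sample = rng.random()
    return sample < threshold_number(hit_rate)
```

Because the sample is uniform on [0, 1), the probability of serving the cached response in this sketch equals the threshold number itself, which grows with the hit rate.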
As to claim 6, Mielke discloses the method of claim 1, wherein the similarity metric comprises a cosine similarity or an inverse distance metric ([0261][0317]).
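For illustration only, the two similarity metrics named in claim 6 can be written as below. The function names are hypothetical, and the form 1 / (1 + d) used for the inverse distance metric is an assumption; it is one common way to turn a Euclidean distance d into a bounded similarity and is not taken from the claims or from Mielke.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def inverse_distance(a: list[float], b: list[float]) -> float:
    """Assumed form 1 / (1 + Euclidean distance); equals 1.0 for identical vectors."""
    return 1.0 / (1.0 + math.dist(a, b))
```

Both functions increase as the vectors become more alike, so either can be compared against a threshold similarity value in the same way.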
As to claim 7, Mielke discloses the method of claim 1, wherein the plurality of entries are organized in the vector database based on inter-entry query similarities, and
wherein identifying the particular entry comprises iteratively evaluating neighboring entries in the vector database (fig. 3).
As to claim 8, Mielke discloses the method of claim 1, wherein identifying the particular entry comprises:
performing, with respect to the embedding vector of the new query, a vector search on the embedding vectors of the queries stored in the plurality of entries ([0261][0317]); and
identifying, from the vector search, the particular entry as the respective entry having the similarity metric with a greatest respective value ([0261][0317]).
As to claim 9, Mielke discloses the method of claim 8, wherein the vector search comprises a k-nearest-neighbors search (nearest [0096]).
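For illustration only, the vector search of claims 8 and 9 can be sketched as an exhaustive k-nearest-neighbors search that scores every stored embedding and keeps the k best. This is a sketch under stated assumptions: the names `cosine_similarity` and `k_nearest` are hypothetical, cosine similarity is assumed as the metric, and a production vector database would typically use an approximate index rather than a full scan.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def k_nearest(query_vec: list[float], stored: list[list[float]], k: int):
    """Return up to k (similarity, index) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(stored)]
    return heapq.nlargest(k, scored)
```

The first pair returned is the entry with the greatest similarity value, which corresponds to the "particular entry" selected in claim 8.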
As to claim 10, Mielke discloses the method of claim 1, further comprising, upon determining that a second similarity metric corresponding to a second new query is not greater than the threshold similarity value:
transmitting, to the server, the second new query (fig. 2);
receiving, from the server, a response to the second new query, the response to the second new query being generated by the LLM (fig. 2);
storing, in a new entry of the vector database, (i) an embedding vector of the second new query, and (ii) the response to the second new query ([0261][0317]) (fig. 2); and
transmitting, to the client device via the network connection, a network message comprising the response to the second new query (fig. 2).
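For illustration only, the cache-miss path of claim 10 can be sketched as below, combined with the hit path of claim 1 so the two branches can be compared. This is a minimal sketch and not part of the record: `toy_encode`, `cosine`, and `handle_query` are hypothetical names, a letter-count vector stands in for the text encoder, a list of (embedding, response) pairs stands in for the vector database, and `ask_server` is a hypothetical callable standing in for the server that runs the LLM.

```python
import math
from typing import Callable


def toy_encode(text: str) -> list[float]:
    """Hypothetical stand-in for a text encoder: counts of letters a-z."""
    vec = [0.0] * 26
    for c in text.lower():
        if "a" <= c <= "z":
            vec[ord(c) - ord("a")] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def handle_query(query: str, cache: list, ask_server: Callable[[str], str],
                 threshold: float) -> str:
    """Serve from the cache on a hit; otherwise forward, store, and return."""
    embedding = toy_encode(query)
    best = max(cache, key=lambda entry: cosine(embedding, entry[0]), default=None)
    if best is not None and cosine(embedding, best[0]) > threshold:
        return best[1]                    # hit: cached response
    response = ask_server(query)          # miss: forward the query to the server
    cache.append((embedding, response))   # store a new entry for later queries
    return response
```

After a miss, a repeat of the same query in this sketch is answered from the new entry without calling `ask_server` again.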
As to claims 11-20, the limitations of these claims have been noted in the rejection above. They are therefore rejected as set forth above.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Yicun Wu whose telephone number is 571-272-4087. The examiner can normally be reached from 8:00 am to 4:30 pm, Monday-Friday.
If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Kavita Stanley, can be reached at 571-272-8352. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Any inquiry of a general nature or relating to the status of this application or proceeding should be directed to the receptionist whose telephone number is 571-272-2100.
Information regarding the status of an application may be obtained from the Patent Application Information Retrieval (PAIR) system. Status information for published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system:
"http://portal.uspto.gov/external/portal/pair"
Should you have questions on access to the Private PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).
If you would like assistance from a USPTO Customer Service Representative or access to the automated information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
Yicun Wu
Patent Examiner
Technology Center 2100
/YICUN WU/
Primary Examiner, Art Unit 2153