Prosecution Insights
Last updated: April 19, 2026
Application No. 18/772,080

ADVANCED SEMANTIC CACHING WITH CDN FOR RAG-BASED LLM APPLICATIONS

Non-Final OA (§101)
Filed: Jul 12, 2024
Examiner: PATEL, SHREYANS A
Art Unit: 2659
Tech Center: 2600 — Communications
Assignee: Microsoft Technology Licensing, LLC
OA Round: 1 (Non-Final)
Grant Probability: 89% (Favorable)
Expected OA Rounds: 1-2
Time to Grant: 2y 3m
With Interview: 96%

Examiner Intelligence

Career Allow Rate: 89% (359 granted / 403 resolved; +27.1% vs TC avg, above average)
Interview Lift: +7.4% across resolved cases with interview (moderate)
Avg Prosecution: 2y 3m (typical timeline; 46 currently pending)
Total Applications: 449 across all art units (career history)

Statute-Specific Performance

§101: 21.3% (-18.7% vs TC avg)
§103: 36.0% (-4.0% vs TC avg)
§102: 22.6% (-17.4% vs TC avg)
§112: 8.8% (-31.2% vs TC avg)
Tech Center averages are estimates. Based on career data from 403 resolved cases.

Office Action

§101
DETAILED ACTION

Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101. Claims 1, 9 and 17 are directed to an abstract idea of organizing, comparing, and distributing information based on semantic similarity and access permissions. The claimed steps of receiving a natural language prompt, matching the prompt to a semantically similar cached response, determining whether a user has access permission, and sending a response upon satisfaction of that condition constitute mental processes and methods of organizing human activity that could be performed conceptually by a human or with pen-and-paper equivalents. In particular, semantically comparing a request to previously stored information and deciding whether to provide that information based on authorization criteria are longstanding information-management practices that fall squarely within abstract concepts such as information retrieval, classification, and conditional dissemination.

The claims do not integrate the identified abstract idea into a practical application. Although the claim recites implementation in a "content delivery network (CDN)" and references a "RAG-based large language model," these elements are invoked only as generic computing environments in which the abstract idea is executed. The claim does not recite any specific improvement to CDN operation, cache architecture, access-control mechanisms, or large language model technology.
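As a rough illustration of the flow the rejection characterizes (receive a prompt, match it against semantically similar cached entries, and release the cached answer only if the user may access its source documents), the sketch below uses a toy bag-of-characters embedding. Every name and the 0.9 threshold are illustrative assumptions, not claim language:

```python
import math

def embed(text):
    # Stand-in for a real sentence encoder: bag-of-characters counts.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def lookup(prompt, cache, user_doc_perms, threshold=0.9):
    """Serve a cached answer for a semantically similar prompt, or None on miss."""
    q = embed(prompt)
    best = max(cache, key=lambda e: cosine(q, e["embedding"]), default=None)
    if best is None or cosine(q, best["embedding"]) < threshold:
        return None  # miss: the request would be forwarded to the RAG pipeline
    # Gate on access to the source documents used to generate the cached answer.
    if not best["source_docs"] <= user_doc_perms:
        return None  # withhold the cached answer from unauthorized users
    return best["answer"]
```

A paraphrased query then hits the cached entry only for users whose permission set covers all of the entry's source documents; any other user falls through as a miss.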
The steps are functionally described at a high level and amount to applying the abstract idea using conventional computer components to achieve predictable results, rather than effecting a technological improvement. The claims do not include additional elements sufficient to amount to significantly more than the judicial exception because they are (i) mere instructions to implement the idea on a computer, and/or (ii) recitations of generic computer structure that performs generic computer functions that are well-understood, routine, and conventional activities previously known to the pertinent industry. Viewed as a whole, these additional claim elements do not provide meaningful limitations transforming the abstract idea into a patent-eligible application such that the claims amount to significantly more than the abstract idea itself. Therefore, the claims are rejected under 35 U.S.C. 101 as being directed to non-statutory subject matter. There is further no improvement to the computing device.

Dependent claims 2-8, 10-16 and 18-20 further recite an abstract idea performable by a human and do not amount to significantly more than the abstract idea, as they do not provide steps other than what is conventionally known in data retrieval systems.

Claims 2, 10 and 18 are directed to the abstract idea of conditional information routing based on access permissions, which constitutes an organizing and decision-making process applied using generic computer components without a technological improvement.

Claims 3, 11 and 19 merely refine the abstract idea by specifying withholding cached information while forwarding a request, which remains a mental process for controlling information flow and does not add significantly more than Claim 1.
Claims 4, 12 and 20 are directed to the abstract idea of associating information with classification tags to control access, a fundamental data organization technique implemented using a generic database.

Claims 5 and 13 recite the abstract idea of administrative control over stored information via purge commands, which reflects routine information lifecycle management rather than a technological improvement.

Claims 6 and 14 are directed to the abstract idea of labeling information with document identifiers to track provenance, which is a longstanding information categorization practice performed on generic computing infrastructure.

Claims 7 and 15 apply the abstract idea by removing stored information based on document classifications and control commands, which is a conventional data governance operation lacking an inventive concept.

Claims 8 and 16 are directed to the abstract idea of selectively responding to a request using previously stored information instead of recomputing it, which is a basic caching and decision-making practice implemented on generic computer technology.

Allowable Subject Matter

Claims 1-20 would be allowable if the Applicant can overcome the § 101 abstract-idea rejection set forth above. The following is a statement of reasons for the indication of allowable subject matter:

Dang et al. teaches a semantic caching system for question-answering in which natural-language user queries are received, converted into embedding representations, and compared against a cached question-answer dataset to identify semantically similar prior questions, after which the corresponding cached answers are returned without re-executing the underlying generation pipeline. Dang further discloses that cached answers are generated offline by identifying candidate documents from a document corpus and applying a transformer-based model to extract or generate answers from those documents, and that the cached mappings are stored in a database for efficient reuse.
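The tagging, provenance, and purge operations recited in the dependent claims (classification tags, document identifiers, and purge commands) can be sketched as a small cache store. The class and field names below are illustrative assumptions, not language from the claims:

```python
class SemanticCache:
    """Cache entries tagged with source-document IDs and a classification."""

    def __init__(self):
        self.entries = []

    def add(self, question, answer, doc_ids, classification):
        self.entries.append({
            "question": question,
            "answer": answer,
            "doc_ids": set(doc_ids),           # provenance tracking (claims 6/14 style)
            "classification": classification,  # access-control tag (claims 4/12/20 style)
        })

    def purge_by_document(self, doc_id):
        # Control command removing every entry derived from a given document
        # (claims 5/13 and 7/15 style lifecycle management).
        self.entries = [e for e in self.entries if doc_id not in e["doc_ids"]]

    def purge_by_classification(self, classification):
        self.entries = [e for e in self.entries
                        if e["classification"] != classification]
```

An administrator purging a revoked document or an entire classification would invalidate every cached answer derived from it, without touching unrelated entries.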
Tewari et al. teaches a secure content delivery system architecture using cached content. A user request is associated with user authentication or authorization data (e.g., secure URLs or hash-based tokens), and a content server determines whether the requesting user is authorized to access the requested cached content before delivering it. Tewari therefore teaches CDN-level request handling, association of requests with user-specific authorization information, and conditional delivery of cached content based on access permission.

Neither Dang nor Tewari discloses evaluating user access rights to the underlying source documents prior to returning a cached answer; Tewari limits authorization checks to the requested cached content rather than to the provenance documents used during response generation. Additionally, neither reference explicitly teaches a "RAG-based LLM."

Conclusion

The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. Fu et al. ("GPTCache: An Open-Source Semantic Cache for LLM Applications Enabling Faster Answers and Cost Savings"; 2023) – The rise of ChatGPT has led to the development of artificial intelligence (AI) applications, particularly those that rely on large language models (LLMs). However, recalling LLM APIs can be expensive, and the response speed may slow down during LLMs' peak times, causing frustration among developers. Potential solutions to this problem include using better LLM models or investing in more computing resources. However, these options may increase product development costs and decrease development speed. GPTCache is an open-source semantic cache that stores LLM responses to address this issue. When integrating an AI application with GPTCache, user queries are first sent to GPTCache for a response before being sent to LLMs like ChatGPT. If GPTCache has the answer to a question, it quickly returns the answer to the user without having to query the LLM.
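The hash-based-token authorization Tewari is cited for can be illustrated with a minimal HMAC check: the origin signs a per-user token for a cached resource, and the edge verifies it before serving. The shared key and function names are assumptions for illustration, not details from the reference:

```python
import hashlib
import hmac

SECRET = b"key-shared-by-origin-and-edge"  # assumed out-of-band provisioning

def sign_url(path, user_id):
    """Origin-side: derive a per-user token for a cached resource."""
    msg = f"{path}|{user_id}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def authorized(path, user_id, token):
    """Edge-side: verify the token before serving the cached content."""
    return hmac.compare_digest(sign_url(path, user_id), token)
```

Note that, as the examiner observes, a check of this kind authorizes the requested cached object itself; it says nothing about the user's rights to the source documents from which a cached answer was generated.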
This approach saves costs on API recalls and makes response times much faster. For instance, integrating GPTCache with the GPT service offered by OpenAI can increase response speed 2-10 times when the cache is hit. Moreover, network fluctuations will not affect GPTCache's response time, making it highly stable. This paper presents GPTCache and its architecture, how it functions and performs, and the use cases for which it is most advantageous.

Any inquiry concerning this communication or earlier communications from the examiner should be directed to SHREYANS A PATEL whose telephone number is (571) 270-0689. The examiner can normally be reached Monday-Friday 8am-5pm PST. Examiner interviews are available via telephone, in person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Pierre Desir, can be reached at 571-272-7799. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

SHREYANS A. PATEL
Primary Examiner
Art Unit 2659

/SHREYANS A PATEL/
Examiner, Art Unit 2659

Prosecution Timeline

Jul 12, 2024
Application Filed
Feb 02, 2026
Non-Final Rejection — §101
Mar 11, 2026
Examiner Interview Summary
Mar 11, 2026
Applicant Interview (Telephonic)

Precedent Cases

Applications granted by this same examiner with similar technology

Patent 12586597
ENHANCED AUDIO FILE GENERATOR
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12586561
TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM, AND A METHOD OF CALCULATING AN EXPRESSIVITY SCORE
Granted Mar 24, 2026 (2y 5m to grant)
Patent 12548549
ON-DEVICE PERSONALIZATION OF SPEECH SYNTHESIS FOR TRAINING OF SPEECH RECOGNITION MODEL(S)
Granted Feb 10, 2026 (2y 5m to grant)
Patent 12548583
ACOUSTIC CONTROL APPARATUS, STORAGE MEDIUM AND ACOUSTIC CONTROL METHOD
Granted Feb 10, 2026 (2y 5m to grant)
Patent 12536988
SPEECH SYNTHESIS METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM
Granted Jan 27, 2026 (2y 5m to grant)
Study what changed to get past this examiner, based on the 5 most recent grants.


Prosecution Projections

Expected OA Rounds: 1-2
Grant Probability: 89%
With Interview: 96% (+7.4%)
Median Time to Grant: 2y 3m
PTA Risk: Low
Based on 403 resolved cases by this examiner. Grant probability derived from career allow rate.
