DETAILED ACTION
Notice of Pre-AIA or AIA Status
1. The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
2. A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant’s submissions filed on 11/17/2025 have been entered.
3. Claims 1–20 are pending for examination in the request for continued examination filed on 11/17/2025.
Claim Rejections - 35 USC § 112
The following is a quotation of the first paragraph of 35 U.S.C. 112(a):
(a) IN GENERAL.—The specification shall contain a written description of the invention, and of the manner and process of making and using it, in such full, clear, concise, and exact terms as to enable any person skilled in the art to which it pertains, or with which it is most nearly connected, to make and use the same, and shall set forth the best mode contemplated by the inventor or joint inventor of carrying out the invention.
4. Claims 1–20 are rejected under 35 U.S.C. 112(a) as failing to comply with the written description requirement. The claims contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor or a joint inventor, at the time the application was filed, had possession of the claimed invention.
5. As to independent claims 1, 8, and 15, they recite the limitation of “wherein the preconfigured default limited processing capacity and the preconfigured default limited number of requests are set as a tenant-level quota independent of a number of connections associated with the tenant.” However, this limitation is not supported, as the specification fails to sufficiently disclose that the “preconfigured default limited number of requests” is imposed on each tenant irrespective of the “number of connections associated with the tenant” (the specification does not reasonably disclose “connections” associated with each tenant).
Therefore, claims 1–20 contain subject matter which was not described in the specification in such a way as to reasonably convey to one skilled in the relevant art that the inventor(s), at the time the application was filed, had possession of the claimed inventions.
Examiner’s Remarks
6. Examiner refers to and explicitly cites particular pages, sections, figures, paragraphs or columns and lines in the references as applied to Applicant’s claims to the extent practicable to streamline prosecution.
Although the cited portions of the references are representative of the best teachings in the art and are applied to meet the specific limitations of the claims, other uncited but related teachings of the references may be equally applicable as well. It is respectfully requested that, in preparing responses to the rejections, the Applicant fully consider not only the cited portions of the references, but also the references in their entirety, as potentially teaching, suggesting, or rendering obvious one or more aspects of the claimed invention.
Abbreviations
7. Where appropriate, the following abbreviations will be used when referencing Applicant’s submissions and specific teachings of the reference(s):
i. figure / figures: Fig. / Figs.
ii. column / columns: Col. / Cols.
iii. page / pages: p. / pp.
References Cited
8. (A) Brooks et al., US 2019/0037026 A1 (“Brooks”).
(B) Ambekar et al., US 11,477,322 B1 (“Ambekar”).
(C) Veppumthara et al., US 10,944,812 B1 (“Veppumthara”) (newly cited)
(D) Raheja et al., US 2022/0035689 A1 (“Raheja”).
(E) Syed et al., US 11,616,725 B1 (“Syed”).
References (A), (B), (D), and (E) were cited in the previous Office action; reference (C) Veppumthara is newly cited.
Notice re prior art available under both pre-AIA and AIA
9. In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102 of this title, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
A.
10. Claims 1, 4–8, 11–15, and 18–20 are rejected under 35 U.S.C. 103 as being unpatentable over (A) Brooks in view of (B) Ambekar and (C) Veppumthara.
See “References Cited” section, above, for full citations of references.
11. Regarding claim 1, (A) Brooks teaches/suggests the invention substantially as claimed, including:
“A computer-implemented method, the method comprising:
receiving, by a computing device, a request directed to a first tenant of a multi-tenant cloud infrastructure system, the first tenant being granted access to a preconfigured default limited processing capacity to process a preconfigured default limited number of requests concurrently;”
(¶ 9: client-server connections to the server, each connection being capable of transmitting client requests to the server and each connection being configured to support a default, maximum permitted number of concurrent requests; a transaction processing component of a server operable to process client requests and issue responses to the clients that originated the requests;
¶ 12: monitoring in a connection capacity controller how many client-server connections are current and keeping a log of the default, maximum numbers of permitted concurrent requests that each current client-server connection is configured to support;
¶ 73: The default will be the value that was defined when the connection was established;
¶ 74: If overload state is defined directly by comparing the number of concurrent requests being handled and the maximum permitted number, then an overload state could be defined as the maximum number being reached, being approached (e.g. 90 or 95% full capacity) or being exceeded;
¶ 56: i.e. a maximum request limit lower than a default value set when the connection was established;
¶¶ 79–81: It will be understood that embodiments of the present disclosure may be implemented using cloud computing. Specifically one or more of the servers and network resources may be hosted in the cloud … Cloud computing is a model of service delivery for enabling convenient, on-demand network access to a shared pool of configurable computing resources ( e.g., networks, network bandwidth, servers, processing, memory, storage, applications, virtual machines, and services).
¶ 85: the provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to demand);
“determining, by the computing device, whether to throttle the request, or permit the request and grant the first tenant access to additional processing capacity beyond the preconfigured default limited processing capacity to process the request, the determination comprising:”
(¶ 9: monitor how many client-server connections are current and to keep a log of the default, maximum numbers of permitted concurrent requests that each current client-server connection is configured to support; a server capacity monitor operable to monitor loading of the server having regard to a maximum number of permitted concurrent requests the server is configured to support, wherein, through monitoring, when the server capacity monitor detects that the server is in an overloaded state, the server capacity monitor is operable to issue a command to the connection capacity controller to reduce the maximum permitted numbers of concurrent requests on the current client-server connections;
¶ 11: The server capacity monitor monitors loading of the server by comparing how many requests the server is currently processing with the server's maximum permitted number of concurrent requests;
¶ 74: Here an overload is defined as a situation in which the server cannot process the amount of requests that are pending, e.g. there are some queued requests, and an underload or spare capacity is defined as a situation in which the server is able to accept new requests;
Claim 3: wherein the connection capacity controller is configured to reject requests received on a client-server connection when the connection has reached or exceeded its maximum permitted number of concurrent requests).
“determining a total number of requests to the first tenant and a second tenant in the multi-tenant cloud infrastructure system, the total number of requests to the first tenant and the second tenant in the multi-tenant cloud infrastructure system including the received request to the first tenant”
(¶ 11: The server capacity monitor monitors loading of the server by comparing how many requests the server is currently processing with the server's maximum permitted number of concurrent requests;
¶ 9: process client requests and issue responses to the clients that originated the requests);
“determining a stress limit of the multi-tenant cloud infrastructure system by applying a stress factor value to a maximum number of requests the multi-tenant cloud infrastructure system is capable of processing concurrently”
(¶ 74: If overload state is defined directly by comparing the number of concurrent requests being handled and the maximum permitted number, then an overload state could be defined as the maximum number being reached, being approached (e.g. 90 or 95% full capacity) or being exceeded;
¶ 66: reduction in the session limit necessary to reduce the incoming workload by 10%. An example for the first connection, if this has a current high water mark usage of 95 outstanding requests, then a maximum request limit for the first connection would be set which would be 10% less than 95, i.e. 86;
¶ 69: so that the CCC 24 can decide to increase the session limits as appropriate, either back to their default limits, or to some intermediate value between the current reduce limit and the default limit);
“comparing the determined total number of requests to the first tenant and the second tenant in the multi-tenant cloud infrastructure system against the determined stress limit”
(¶ 11: The server capacity monitor monitors loading of the server by comparing how many requests the server is currently processing with the server's maximum permitted number of concurrent requests;
¶ 74: overload state is defined directly by comparing the number of concurrent requests being handled and the maximum permitted number); and
Brooks teaches “a number of connections associated with the tenant”
(¶ 76: connection capacity controller, which then acts to restrict the overall connection capacity on the current connections. It does this by reducing the maximum permitted numbers of concurrent requests on one or more of the current client-server connections. The connection capacity controller applies logic based on analysis of the origin of recent and/or pending requests, i.e. which clients they came from, to decide whether to restrict all current connections proportionally or whether to restrict the current connections disproportionately);
but does not teach “wherein the preconfigured default limited processing capacity and the preconfigured default limited number of requests are set as a tenant-level quota independent of a number of connections associated with the tenant.”
(B) Ambekar, in the context of Brooks’ teachings, however, teaches or suggests implementing:
“a multi-tenant cloud infrastructure system … wherein the preconfigured default limited processing capacity and the preconfigured default limited number of requests are set as a tenant-level quota independent of a number of connections associated with the tenant”
(Col. 2, lines 1–3: prioritizing tenants for a service for a request router in a cloud-based Software as a Service (SaaS) platform contact center;
Col. 2, lines 30–38: Each tier-level has a corresponding quota of service requests from a total number of allowed requests and the corresponding quota of service requests is a number of allowed requests per tenant tier-level and (v) providing the tenant tier-level and a number of allowed requests per tenant tier-level to the request-router, to provide the service to the tenant and other tenants having the determined tier-level;
Col. 12, lines 1–10: provide the service to the tenant and other tenants having the determined tier-level, in a preconfigured time-window, based on the corresponding quota of service requests per tenant tier-level from the total number of allowed requests. Each request to the microservices 450 is responded along time. The corresponding quota of service requests is a number of allowed requests per tenant tier level
see Brooks, ¶ 58: capacity of each of multiple connections can be controlled by the server individually or as a group); and
“a total number of requests to the first tenant and a second tenant in the multi-tenant cloud infrastructure system, the total number of requests to the first tenant and the second tenant in the multi-tenant cloud infrastructure system including the received request to the first tenant”
(Col. 2, lines 1–3: prioritizing tenants for a service for a request router in a cloud-based Software as a Service (SaaS) platform contact center;
Col. 2, lines 30–38: Each tier-level has a corresponding quota of service requests from a total number of allowed requests and the corresponding quota of service requests is a number of allowed requests per tenant tier-level and (v) providing the tenant tier-level and a number of allowed requests per tenant tier-level to the request-router, to provide the service to the tenant and other tenants having the determined tier-level).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of (B) Ambekar with those of (A) Brooks to monitor, track, and throttle (reject) requests from different tenants or clients. The motivation or advantage to do so is to provide different tiers or levels of services to tenants based on their priorities.
Brooks and Ambekar do not teach “permitting, based at least in part on the comparing by the computing device, the first tenant access to the additional processing capacity beyond the preconfigured default limited processing capacity to concurrently process a number of requests greater than the preconfigured default limited number of requests.”
(C) Veppumthara, in the context of Brooks and Ambekar’s teachings, however teaches or suggests:
“permitting, based at least in part on the comparing by the computing device, the first tenant access to the additional processing capacity beyond the preconfigured default limited processing capacity to concurrently process a number of requests greater than the preconfigured default limited number of requests”
(Col. 5, lines 40–55: In some implementations, certain burst scenarios (where traffic exceeds the transaction limit on a node) may be accommodated by a “burst” token bucket that allows the additional transactions during a certain timeframe with a caveat that the number of transactions in future timeframes will need to be correspondingly reduced to allow for recovery of the burst token bucket. To illustrate, consider a node with a limit of 100 transactions per second that is asked to handle 200 transactions during one second. Drawing 100 tokens from the burst token bucket may allow the node to temporarily handle 200 transactions for one second and then zero the next second (to replenish the 100 tokens to the burst token bucket), or 110 transactions for ten seconds and then 0 the next second (to replenish the 100 tokens to the burst token bucket)).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of (C) Veppumthara with those of Brooks and Ambekar to selectively grant (process) additional requests temporarily beyond a server's maximum permitted number of concurrent requests. The motivation or advantage to do so is to provide for the servicing of temporary or unexpected burst requests marginally beyond the default, maximum numbers of permitted concurrent requests that each client-server connection is configured to support (e.g. slightly above a 90% capacity).
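For illustration only (not part of the record), the burst token bucket mechanism described in Veppumthara at Col. 5, lines 40–55, may be sketched as follows; the class and parameter names are hypothetical and chosen to mirror the reference's example of a node limited to 100 transactions per second that temporarily handles 200:

```python
# Illustrative sketch (hypothetical names, not from the record) of the burst
# token bucket described in Veppumthara, Col. 5, lines 40-55: a node may
# temporarily exceed its steady-state transaction limit by drawing on a burst
# bucket, which is replenished by reduced throughput in later timeframes.

class BurstTokenBucket:
    def __init__(self, steady_limit, burst_capacity):
        self.steady_limit = steady_limit    # e.g. 100 transactions/sec
        self.capacity = burst_capacity      # maximum burst tokens held
        self.burst_tokens = burst_capacity  # e.g. 100 extra tokens

    def admit(self, requested):
        """Return the number of transactions admitted this timeframe."""
        if requested <= self.steady_limit:
            # Spare steady capacity replenishes the burst bucket, up to capacity.
            self.burst_tokens = min(
                self.capacity, self.burst_tokens + self.steady_limit - requested)
            return requested
        # Draw on burst tokens to cover the excess, up to what remains.
        drawn = min(requested - self.steady_limit, self.burst_tokens)
        self.burst_tokens -= drawn
        return self.steady_limit + drawn

bucket = BurstTokenBucket(steady_limit=100, burst_capacity=100)
print(bucket.admit(200))  # 200: burst bucket covers the extra 100
print(bucket.admit(200))  # 100: burst bucket exhausted
print(bucket.admit(0))    # 0: an idle timeframe replenishes the 100 tokens
print(bucket.admit(200))  # 200: burst capacity available again
```

This mirrors the reference's caveat that transactions in future timeframes must be correspondingly reduced to allow recovery of the burst token bucket.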
12. Regarding claim 4, Ambekar teaches/suggests:
“determining a class of the first tenant, wherein the multi-tenant cloud infrastructure system comprises a hierarchical class system, wherein the first tenant is associated with a first class and the second tenant is associated with a second class, wherein the first tenant is entitled to more processing capacity than the second tenant based at least in part on being associated with the first class, and wherein determining whether to allow or reject the first tenant access to additional processing capacity to process the request is based at least in part on the first tenant being associated with the first class”
(Col. 2, lines 1–3: prioritizing tenants for a service for a request router in a cloud-based Software as a Service (SaaS) platform contact center;
Col. 2, lines 30–38: Each tier-level has a corresponding quota of service requests from a total number of allowed requests and the corresponding quota of service requests is a number of allowed requests per tenant tier-level and (v) providing the tenant tier-level and a number of allowed requests per tenant tier-level to the request-router, to provide the service to the tenant and other tenants having the determined tier-level;
Col. 12, lines 1–10: provide the service to the tenant and other tenants having the determined tier-level, in a preconfigured time-window, based on the corresponding quota of service requests per tenant tier-level from the total number of allowed requests. Each request to the microservices 450 is responded along time. The corresponding quota of service requests is a number of allowed requests per tenant tier level;
Col. 14, lines 19–23: because higher tier-level may have more bandwidth e.g., more allowed service requests per second and commonly very few tenants are allocated to that tier level;
Figs. 15A to 15C, illustrating tenants prioritization across different tiers).
13. Regarding claim 5, Brooks and Ambekar teach/suggest:
“the request is received at a data plane of the multi-tenant cloud infrastructure system”
(Brooks, ¶ 9: each connection being capable of transmitting client requests to the server and each connection being configured to support a default, maximum permitted number of concurrent requests;
¶ 31: Distributed data processing system 100 may in one example be the Internet with network 102 representing a worldwide collection of networks and gateways that use the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. At the heart of the Internet is a backbone of high-speed data communication lines;
Fig. 3 and ¶ 50: The network connections persist until terminated, so may be long lived. Selected ones of the clients are shown having current network connections to the server, noting that multiple concurrent network connections are illustrated. Each established network connection is managed by sending and receiving messages 30, typically in packet form, from server to client and client to server respectively. Client-to-server messages may contain tasks or requests for the server, and server-to-client messages may contain responses relating to such tasks or requests;
Ambekar, Col. 5, line 65 to Col. 6, line 3: A cloud-based Software as a Service (SaaS) platform contact-center, is a bundle of contact center application services provided to tenants in a cloud environment, over the internet).
14. Regarding claim 6, Brooks teaches/suggests:
“incrementing the total number of requests to the first tenant and a second tenant in the multi-tenant cloud infrastructure system by one in response to permitting the first tenant access to the additional processing capacity”
(Brooks, ¶ 9: monitor how many client-server connections are current and to keep a log of the default, maximum numbers of permitted concurrent requests that each current client-server connection is configured to support;
¶ 11: The server capacity monitor monitors loading of the server by comparing how many requests the server is currently processing with the server's maximum permitted number of concurrent requests).
15. Regarding claim 7, Brooks teaches/suggests:
“wherein the stress factor value is determined by a cloud services provider managing the multi-tenant cloud infrastructure system based on historical performance data of the multi-tenant cloud infrastructure system”
(¶ 74: a server capacity monitor monitors the load state of the server, for example having regard to a maximum number of permitted concurrent requests the server is configured to support … If overload state is defined directly by comparing the number of concurrent requests being handled and the maximum permitted number, then an overload state could be defined as the maximum number being reached, being approached (e.g. 90 or 95% full capacity) or being exceeded;
¶ 76: If yes (i.e. server overload detected), then in Step S66, the server capacity monitor issues a command to the connection capacity controller, which then acts to restrict the overall connection capacity on the current connections. It does this by reducing the maximum permitted numbers of concurrent requests on one or more of the current client-server connections;
¶¶ 63–66: SCM 26 through its monitoring activity becomes aware of this overload condition …. SCM 26 informs CCC 24 to achieve a target total reduction of 10% of combined incoming workload, where the reduction should be allocated proportionally, i.e. 90% to the first connection and 10% to the second connection;
¶ 66: CCC 24 calculates the reduction in the session limit necessary to reduce the incoming workload by 10%. An example for the first connection, if this has a current high water mark usage of 95 outstanding requests, then a maximum request limit for the first connection would be set which would be 10% less than 95, i.e. 86;
¶ 9: a transaction processing server capable of managing multiple, concurrent client connections, the server comprising: a client-server connector operable to establish, maintain and terminate individual client-server connections to the server, each connection being capable of transmitting client requests to the server and each connection being configured to support a default, maximum permitted number of concurrent requests;
Fig. 7 and ¶¶ 79–98: describing implementing embodiments of the disclosure using cloud computing).
16. Regarding claims 8 and 11–14, they are the corresponding system claims reciting similar limitations of commensurate scope as the method of claims 1 and 4–7, respectively. Therefore, they are rejected on the same basis as claims 1 and 4–7 above, including the following rationale:
Brooks teaches “a processor; and a computer-readable medium including instructions that, when executed by the processor, cause the processor to …” (Fig. 2 and ¶¶ 33–35: Data processing system 200 is an example of a computer, such as server 104 or client 110 in FIG. 1, in which computer-usable program code or instructions implementing the processes may be located).
17. Regarding claims 15 and 18–20, they are the corresponding computer program product claims reciting similar limitations of commensurate scope as the method of claims 1 and 4–6, respectively. Therefore, they are rejected on the same basis as claims 1 and 4–6 above.
B.
18. Claims 2, 9, and 16 are rejected under 35 U.S.C. 103 as being unpatentable over (A) Brooks in view of (B) Ambekar and (C) Veppumthara, as applied to claims 1, 8, and 15 above, and further in view of (D) Raheja.
19. Regarding claim 2, Brooks, Ambekar, and Veppumthara do not teach “authenticating the received request by an application programming interface gateway based at least in part on a set of credentials.”
(D) Raheja however teaches or suggests:
“authenticating the received request by an application programming interface gateway based at least in part on a set of credentials”
(¶ 44: use of APIs to facilitate communications between a backend application managing data an external agent wishes to access and the external agent requesting such access;
¶ 84: a call processing availability requirement. The SLA setting may be a percentage value of API calls received at the gateway that successfully act to access resources at the backend application the API services;
¶ 99: In an embodiment, upon receiving an API call request from an external agent, a gateway may transmit a request for routing policies to the SLA monitor 451 or the rate limit service 452. This request may include an identification of the external agent placing the request in an embodiment. In other embodiments, this request may also include authentication credentials for the external agent. The SLA monitor 451 in such an embodiment may transmit the request to the rate limit service 452, which monitors all incoming requests associated with that external agent. If the number of calls associated with that external agent over a preset period of time (e.g., one minute, one hour, 24 hours) exceeds the preset rate limit associated with an individual external agent within the high-level gateway operation policies associated with that API, the rate limit service 452 may transmit an instruction to the SLA monitor 451 or directly to the gateway from which the request was transmitted, to reject the API call requested by the external agent.
¶ 100: In another embodiment, the serverless elastic-scale API gateway management system 450 may include an authentication service 453, which may operate to check the external agent's authentication credentials).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of (D) Raheja with those of (A) Brooks, Ambekar, and Veppumthara to implement an API gateway service (function) to authenticate requests. The motivation or advantage to do so is to facilitate communications between backend servers and client requests and to enforce and manage request rates/limits.
20. Regarding claim 9, it is the corresponding system claim reciting similar limitations of commensurate scope as the method of claim 2. Therefore, it is rejected on the same basis as claim 2 above.
21. Regarding claim 16, it is the corresponding computer program product claim reciting similar limitations of commensurate scope as the method of claim 2. Therefore, it is rejected on the same basis as claim 2 above.
C.
22. Claims 3, 10, and 17 are rejected under 35 U.S.C. 103 as being unpatentable over (A) Brooks in view of (B) Ambekar and (C) Veppumthara, as applied to claims 1, 8, and 15 above, and further in view of (E) Syed.
23. Regarding claim 3, Brooks teaches/suggests:
“wherein the total number of requests in the multi-tenant cloud infrastructure system is less than the stress limit”
(¶ 74: If overload state is defined directly by comparing the number of concurrent requests being handled and the maximum permitted number, then an overload state could be defined as the maximum number being reached, being approached (e.g. 90 or 95% full capacity) or being exceeded;
¶ 66: reduction in the session limit necessary to reduce the incoming workload by 10%. An example for the first connection, if this has a current high water mark usage of 95 outstanding requests, then a maximum request limit for the first connection would be set which would be 10% less than 95, i.e. 86;
¶ 69: so that the CCC 24 can decide to increase the session limits as appropriate, either back to their default limits, or to some intermediate value between the current reduce limit and the default limit).
Brooks, Ambekar, and Veppumthara do not teach “wherein permitting the tenant access to the additional processing capacity to concurrently process the number of requests greater than the limited number of requests comprises retrieving a token from a global bucket, wherein the global bucket comprises a collection of tokens, and wherein each token of the global bucket comprises a unit of processing capacity shared by the first tenant and the second tenant of the multi-tenant cloud infrastructure system.”
(E) Syed however teaches or suggests:
“wherein permitting the tenant access to the additional processing capacity to concurrently process the number of requests greater than the limited number of requests comprises retrieving a token from a global bucket, wherein the global bucket comprises a collection of tokens, and wherein each token of the global bucket comprises a unit of processing capacity shared by the first tenant and the second tenant of the multi-tenant cloud infrastructure system”
(Col. 3, lines 1–10: to implement a global token bucket. In a simple version of this solution, a global token bucket maintains a number of tokens for each key corresponding to the throttle limit for that key. When a service host receives a request for a key, it requests a token from the global token bucket. If the token bucket has enough tokens for that key, it decrements the token count and indicates to the service to process the request; when the number of tokens reaches zero, any subsequent requests are throttled. The token bucket is then refilled every interval to the throttle limit;
Col. 3, lines 38–45: bucket. When a service host exhausts its tokens for a given key, it may request additional tokens from the global token bucket. This facilitates centralized maintenance of the total number of tokens, and allows the number of service hosts to scale up and down without needing to adjust the token buckets of every other service host. This solution can also help to ensure full utilization of resources;
Col. 4, lines 1–15: When a service host receives a request for a key corresponding to a local token bucket that is empty, the service host can request more keys from the global token bucket. When the global cache hosting the global token bucket receives a request for more tokens from an empty bucket, the cache can compare the time of the last refill against the interval length, and if the interval has been exceeded, refill the bucket and dispense tokens to the requesting service host. If the interval has not expired, the cache can refuse the request to dispense tokens, and the service host may enter a throttle state and throttle all requests for that key until the interval has expired;
Col. 8, lines 32–40: A service host 106 that receives a request for access to a resource must first be in possession of one or more tokens associated with the requested resource. In order to acquire these tokens, the service host may request tokens from the global cache 114. By centralizing access to the tokens, the global cache 114 can ensure that no tokens for the resource are dispensed in excess of that resource's maximum utilization).
It would have been obvious to a person of ordinary skill in the art before the effective filing date of the claimed invention to combine the teachings of (D) Syed with those of Brooks, Ambekar, and Veppumthara to implement a global token bucket for managing on-demand access to shared resources. The motivation or advantage to do so is to centralize control over resource requests and access so as to ensure their continuous availability (while preventing their overutilization).
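For illustration only, the interval-refill token-bucket mechanism quoted from Syed above can be sketched in Python as follows. All names and the class structure are illustrative and are not drawn from any cited reference: per-key token counts are refilled once per interval, and requests for a key are throttled once its tokens are exhausted within the interval.

```python
import time

class GlobalTokenBucket:
    """Illustrative sketch of a global token bucket: per-key token counts
    equal to each key's throttle limit, refilled once per interval."""

    def __init__(self, throttle_limits, interval_seconds):
        self.limits = dict(throttle_limits)       # key -> tokens per interval
        self.interval = interval_seconds
        self.tokens = dict(throttle_limits)       # current token counts
        self.last_refill = {k: time.monotonic() for k in throttle_limits}

    def request_token(self, key):
        """Return True if a token was dispensed; False means throttle."""
        now = time.monotonic()
        # Refill only if the interval since the last refill has elapsed.
        if now - self.last_refill[key] >= self.interval:
            self.tokens[key] = self.limits[key]
            self.last_refill[key] = now
        if self.tokens[key] > 0:
            self.tokens[key] -= 1
            return True
        return False  # bucket empty and interval not yet expired
```

As described in the quoted passages, a request that arrives when the bucket is empty and the interval has not expired is refused, and the service host would throttle requests for that key until the next refill.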
24. Regarding claim 10, it is the corresponding system claim reciting similar limitations of commensurate scope as the method of claim 3. Therefore, it is rejected on the same basis as claim 3 above.
25. Regarding claim 17, it is the corresponding computer program product claim reciting similar limitations of commensurate scope as the method of claim 3. Therefore, it is rejected on the same basis as claim 3 above.
Response to Arguments
26. Applicant’s arguments with respect to the claims have been considered but are moot because the arguments do not apply to any of the references or teachings being applied in the current rejection.
In the Remarks, the Applicant also contends the following:
a. Even if one were to attempt to combine Ambekar with Brooks as proposed by the Office, doing so would still leave the system fundamentally tied to Brooks’ per-connection limits. Brooks' connection capacity controller adjusts the maximum permitted number of concurrent requests per connection (see, e.g., paragraphs 63-66, 74, 76), and there is no teaching or suggestion to discard this per-connection model in favor of a tenant-level, connection-independent quota as now claimed.
The Examiner disagrees:
As to (a), incorporating Ambekar’s teachings directed to the use of a tenant- and tier-based quota on the maximum number of allowed requests per tenant at each tier level does NOT require a fundamental change to Brooks’ configuration of the default, maximum number of permitted concurrent requests supported by each connection.
Brooks imposes a limit on the number of client requests per connection based on server load conditions. Brooks in view of Ambekar teaches an additional overall limit (quota) on the number of requests per tenant/client and service tier (with the combined teachings teaching/suggesting implementing and enforcing both limits on a per-connection and a per-tenant/client basis).
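For illustration only, the enforcement of the two independent limits described above (a per-connection cap on concurrent requests and a tenant-level quota independent of the number of connections) can be sketched in Python. All names are illustrative and not drawn from any cited reference:

```python
class RequestGate:
    """Illustrative sketch: admit a request only if BOTH a per-connection
    concurrent-request cap and a tenant-level request quota are satisfied.
    The tenant quota applies regardless of how many connections are open."""

    def __init__(self, per_connection_limit, tenant_quota):
        self.per_connection_limit = per_connection_limit
        self.tenant_quota = tenant_quota
        self.in_flight = {}    # connection_id -> concurrent requests
        self.tenant_used = {}  # tenant_id -> requests counted against quota

    def admit(self, tenant_id, connection_id):
        """Return True only if both limits permit the request."""
        conn = self.in_flight.get(connection_id, 0)
        used = self.tenant_used.get(tenant_id, 0)
        if conn >= self.per_connection_limit:  # per-connection cap
            return False
        if used >= self.tenant_quota:          # tenant-level quota
            return False
        self.in_flight[connection_id] = conn + 1
        self.tenant_used[tenant_id] = used + 1
        return True

    def complete(self, connection_id):
        """Release a connection slot when a request finishes."""
        self.in_flight[connection_id] -= 1
```

In this sketch, a tenant that has exhausted its quota is refused further requests even on a connection that still has capacity, illustrating a quota that operates independently of the number of connections.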
The Examiner further emphasizes that this rejection is based on a combination of references and one cannot show nonobviousness (of the combination) by attacking references individually where the rejections are based on combinations of references. See In re Keller, 642 F.2d 413, 208 USPQ 871 (CCPA 1981); In re Merck & Co., 800 F.2d 1091, 231 USPQ 375 (Fed. Cir. 1986).
Accordingly, Ambekar’s teachings of:
(1) a preconfigured default limited processing capacity and the preconfigured default limited number of requests are set as a tenant-level quota independent of a number of connections associated with the tenant (i.e., a quota of maximum allowed service requests per tenant tier level),
must be incorporated and understood in the context of Brooks’ teachings on the default, maximum numbers of permitted concurrent requests supported by each of the client’s (tenant’s) connections, with a client requesting services over one or multiple connections.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to BENJAMIN C WU whose telephone number is (571)270-5906. The examiner can normally be reached Monday through Friday, 8:30 A.M. to 5:00 P.M.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Aimee J. Li can be reached on (571)272-4169. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/BENJAMIN C WU/Primary Examiner, Art Unit 2195
November 29, 2025