Prosecution Insights
Last updated: April 19, 2026
Application No. 18/448,192

MANAGING INSTANCES OF SERVERLESS FUNCTIONS IN A CLOUD COMPUTING SYSTEM

Non-Final OA: §101, §103
Filed: Aug 11, 2023
Examiner: BLACKBURN, CONNOR IMIOLA
Art Unit: 2194
Tech Center: 2100 (Computer Architecture & Software)
Assignee: International Business Machines Corporation
OA Round: 1 (Non-Final)
Grant Probability: Favorable
OA Rounds: 1-2
To Grant: 3y 3m

Examiner Intelligence

Career Allow Rate: 0% (grants 0% of cases; 0 granted / 0 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal lift, with vs. without interview, among resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline)
Career History: 6 total applications across all art units, 6 currently pending

Statute-Specific Performance

§101: 23.5% (-16.5% vs TC avg)
§103: 58.8% (+18.8% vs TC avg)
§102: 11.8% (-28.2% vs TC avg)
§112: 5.9% (-34.1% vs TC avg)
TC averages are estimates. Based on career data from 0 resolved cases.

Office Action

§101 §103
Notice of Pre-AIA or AIA Status

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.

Step 1: Claim 1 is directed to "A method for managing instances of serverless functions in a cloud computing system, comprising:" a series of steps, and is therefore directed to a process, which is one of the four statutory categories.

Step 2A, Prong One: Claim 1 recites the limitations: obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; obtaining a command queue length for a graphical processing unit (GPU) disposed on each of a plurality of compute nodes in the cloud computing system; obtaining a request queue length of the serverless function; calculating a number of instances of the serverless function to deploy in the cloud computing system, wherein the number of instances is determined based on the service level objective and the request queue length of the serverless function; and identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, compute nodes from the plurality of compute nodes to deploy each of the number of instances of the serverless function; all of which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process.
Accordingly, claim 1 recites a judicial exception (i.e., an abstract idea).

Step 2A, Prong Two: The additional elements recited in claim 1 include: (i) creating an instance of the serverless function on each of the identified compute nodes; (ii) obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; (iii) obtaining a command queue length for a graphical processing unit (GPU) disposed on each of a plurality of compute nodes in the cloud computing system; and (iv) obtaining a request queue length of the serverless function.

Regarding the additional element (i), the limitation recited amounts to insignificant extra-solution activity, as it is merely instructions to apply the judicial exception, which is not indicative of integration into a practical application. See MPEP 2106.04(d) and 2106.05(g). Regarding the additional elements (ii), (iii), and (iv), the limitations recited amount to insignificant extra-solution activity, as they are merely data gathering, which is not indicative of integration into a practical application. See MPEP 2106.05(g). Furthermore, the combination of additional elements results in mere instructions to implement the exception on a computer and outputting the result of the exception, which is insignificant extra-solution activity. This combination of additional elements fails to integrate the judicial exception into a practical application. See MPEP 2106.04(d).

Step 2B: Regarding the additional element (i), the limitation recited is insignificant extra-solution activity which amounts to a computer-implemented process, which has been found by the courts to not be significantly more than an abstract idea (and thus ineligible).
The courts have found adding insignificant extra-solution activity and well-understood, routine and conventional activity is not enough to amount to significantly more than the recited judicial exception. See MPEP 2106.05(a) and 2106.05(g). Regarding the additional elements (ii), (iii), and (iv), the limitations recited are insignificant extra-solution activity which amounts to data gathering, which has been found by the courts to not be significantly more than an abstract idea (and thus ineligible). The courts have found that adding insignificant extra-solution activity and well-understood, routine and conventional activity is not enough to amount to significantly more than the recited judicial exception. See MPEP 2106.05. The combination of these additional elements amounts to a method comprising steps which can be performed mentally, implemented by generic computing components, and comprising a step of insignificant extra-solution and well-understood, routine and conventional activity. Therefore, the additional elements, when considered individually and in combination, fail to add an inventive concept to the claim. Consequently, claim 1 as a whole does not amount to significantly more than the recited judicial exceptions and the claim is not eligible.

Claim 2 is dependent on claim 1, and therefore inherits the same judicial exception recited in claim 1. The only additional element recited in claim 2 is monitoring the request queue length of the serverless function, which is mere data gathering, which is insignificant extra-solution activity. Accordingly, for the same reasons presented with respect to claim 1, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 2 is not eligible.

Claim 3 is dependent on claim 2, and therefore inherits the same judicial exception recited in claim 2.
Further, claim 3 recites (i) determining, based on the request queue length of the serverless function, that the number of compute nodes is insufficient to meet the service level objective; and (ii) identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, an additional compute node from the plurality of compute nodes to deploy an additional instance of the serverless function; which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. The only additional element recited in claim 3 is creating the additional instance of the serverless function on each of the additional compute node, which is merely instructions to apply the exception for the same reasons presented with respect to claim 1. Accordingly, for the same reasons presented with respect to claims 1 and 2, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 3 is not eligible.

Claim 4 is dependent on claim 3, and therefore inherits the same judicial exception recited in claim 3. Further, claim 4 recites the additional compute node is further identified based on an available memory capacity of the GPU of each of the plurality of compute nodes. Since a person would still be able to identify and perform analysis on the data recited by claim 4 in the human mind, through observation, evaluation, judgement and opinion, with the aid of pen and paper, the limitation of claim 4 is still reciting a mental process. Claim 4 does not recite any additional elements beyond those recited in claims 1, 2, or 3.
Accordingly, for the same reasons presented with respect to claims 1, 2 and 3, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 4 is not eligible.

Claim 5 is dependent on claim 1, and therefore inherits the same judicial exception recited in claim 1. Further, claim 5 recites ranking the command queue lengths for the GPU of each of the plurality of compute nodes and selecting the number of lowest ranked compute nodes, which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. Claim 5 does not recite any additional elements beyond those recited in claim 1. Accordingly, for the same reasons presented with respect to claim 1, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 5 is not eligible.

Claim 6 is dependent on claim 1, and therefore inherits the same judicial exception recited in claim 1. Further, claim 6 recites identifying compute nodes based at least in part on one or more of a GPU utilization of each of the plurality of compute nodes and an available memory capacity of the GPU of each of the plurality of compute nodes, which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. Claim 6 does not recite any additional elements beyond those recited in claim 1. Accordingly, for the same reasons presented with respect to claim 1, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 6 is not eligible.
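For orientation, the selection step recited in claim 5 (rank the GPU command queue lengths and take the lowest-ranked nodes) is a simple ranking problem. A minimal sketch follows; the function name and the dict representation of node state are illustrative assumptions, not from the application.

```python
import heapq


def select_nodes(gpu_command_queue_lengths, num_instances):
    """Rank compute nodes by GPU command queue length and return the
    lowest-ranked (shortest-queue) nodes to host new function instances.

    gpu_command_queue_lengths: mapping of node id -> observed command
    queue length for that node's GPU (illustrative representation).
    """
    # nsmallest ranks the node ids by their queue length and keeps
    # the num_instances nodes with the shortest queues.
    return heapq.nsmallest(
        num_instances,
        gpu_command_queue_lengths,
        key=gpu_command_queue_lengths.get,
    )
```

For example, `select_nodes({"node-a": 7, "node-b": 2, "node-c": 5}, 2)` returns `["node-b", "node-c"]`. Claim 6's variant would simply rank on a different key (GPU utilization or available GPU memory) in the same call.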
Claim 7 is dependent on claim 1, and therefore inherits the same judicial exception recited in claim 1. Further, claim 7 recites a calculation obtained by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer, which is mere data gathering, and does not amount to significantly more than the recited judicial exceptions. Claim 7 does not recite any additional elements beyond those recited in claim 1. Accordingly, for the same reasons presented with respect to claim 1, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 7 is not eligible.

Claim 8 is dependent on claim 1, and therefore inherits the same judicial exception recited in claim 1. Further, claim 8 recites a calculation obtained by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer and adding a margin constant that is a positive integer, which is mere data gathering, and does not amount to significantly more than the recited judicial exceptions. Claim 8 does not recite any additional elements beyond those recited in claim 1. Accordingly, for the same reasons presented with respect to claim 1, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 8 is not eligible.
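The instance-count calculation recited in claims 7 and 8 reduces to a ceiling division plus an optional margin constant. A minimal sketch, with illustrative names (the application does not prescribe this signature):

```python
import math


def instances_needed(request_queue_length, slo_max_concurrent, margin=0):
    """Claims 7-8 as arithmetic: divide the function's request queue
    length by the SLO (max concurrent requests per node), round up to
    the next whole integer (claim 7), and optionally add a positive
    margin constant (claim 8)."""
    return math.ceil(request_queue_length / slo_max_concurrent) + margin
```

So a queue of 25 pending requests against an SLO of 10 concurrent requests per node yields `instances_needed(25, 10)` = 3, and `instances_needed(25, 10, margin=2)` = 5.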
Step 1: Claim 9 is directed to "A computing system having a memory having computer readable instructions and one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations comprising:" a series of steps, and is therefore directed to a process, which is one of the four statutory categories.

Step 2A, Prong One: Claim 9 recites the limitations: obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; obtaining a command queue length for a graphical processing unit (GPU) disposed on each of a plurality of compute nodes in the cloud computing system; obtaining a request queue length of the serverless function; calculating a number of instances of the serverless function to deploy in the cloud computing system, wherein the number of instances is determined based on the service level objective and the request queue length of the serverless function; and identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, compute nodes from the plurality of compute nodes to deploy each of the number of instances of the serverless function; all of which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process.

Accordingly, claim 9 recites a judicial exception (i.e., an abstract idea).

Step 2A, Prong Two: The additional elements recited in claim 9 include: (i) creating an instance of the serverless function on each of the identified compute nodes.
(ii) obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; (iii) obtaining a command queue length for a graphical processing unit (GPU) disposed on each of a plurality of compute nodes in the cloud computing system; and (iv) obtaining a request queue length of the serverless function.

Regarding the additional element (i), the limitation recited amounts to insignificant extra-solution activity, as it is merely instructions to apply the judicial exception, which is not indicative of integration into a practical application. See MPEP 2106.04(d) and 2106.05(g). Regarding the additional elements (ii), (iii), and (iv), the limitations recited amount to insignificant extra-solution activity, as they are merely data gathering, which is not indicative of integration into a practical application. See MPEP 2106.05(g). Furthermore, the combination of additional elements results in mere instructions to implement the exception on a computer and outputting the result of the exception, which is insignificant extra-solution activity. This combination of additional elements fails to integrate the judicial exception into a practical application. See MPEP 2106.04(d).

Step 2B: Regarding the additional element (i), the limitation recited is insignificant extra-solution activity which amounts to a computer-implemented process, which has been found by the courts to not be significantly more than an abstract idea (and thus ineligible). The courts have found adding insignificant extra-solution activity and well-understood, routine and conventional activity is not enough to amount to significantly more than the recited judicial exception. See MPEP 2106.05(a) and 2106.05(g).
Regarding the additional elements (ii), (iii), and (iv), the limitations recited are insignificant extra-solution activity which amounts to data gathering, which has been found by the courts to not be significantly more than an abstract idea (and thus ineligible). The courts have found that adding insignificant extra-solution activity and well-understood, routine and conventional activity is not enough to amount to significantly more than the recited judicial exception. See MPEP 2106.05. The combination of these additional elements amounts to a method comprising steps which can be performed mentally, implemented by generic computing components, and comprising a step of insignificant extra-solution and well-understood, routine and conventional activity. Therefore, the additional elements, when considered individually and in combination, fail to add an inventive concept to the claim. Consequently, claim 9 as a whole does not amount to significantly more than the recited judicial exceptions and the claim is not eligible.

Claim 10 is dependent on claim 9, and therefore inherits the same judicial exception recited in claim 9. The only additional element recited in claim 10 is monitoring the request queue length of the serverless function, which is mere data gathering, which is insignificant extra-solution activity. Accordingly, for the same reasons presented with respect to claim 9, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 10 is not eligible.

Claim 11 is dependent on claim 10, and therefore inherits the same judicial exception recited in claim 10.
Further, claim 11 recites (i) determining, based on the request queue length of the serverless function, that the number of compute nodes is insufficient to meet the service level objective; and (ii) identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, an additional compute node from the plurality of compute nodes to deploy an additional instance of the serverless function; which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. The only additional element recited in claim 11 is creating the additional instance of the serverless function on each of the additional compute node, which is merely instructions to apply the exception for the same reasons presented with respect to claim 9. Accordingly, for the same reasons presented with respect to claims 9 and 10, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 11 is not eligible.

Claim 12 is dependent on claim 11, and therefore inherits the same judicial exception recited in claim 11. Further, claim 12 recites the additional compute node is further identified based on an available memory capacity of the GPU of each of the plurality of compute nodes. Since a person would still be able to identify and perform analysis on the data recited by claim 12 in the human mind, through observation, evaluation, judgement and opinion, with the aid of pen and paper, the limitation of claim 12 is still reciting a mental process. Claim 12 does not recite any additional elements beyond those recited in claims 9, 10, or 11.
Accordingly, for the same reasons presented with respect to claims 9, 10 and 11, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 12 is not eligible.

Claim 13 is dependent on claim 9, and therefore inherits the same judicial exception recited in claim 9. Further, claim 13 recites ranking the command queue lengths for the GPU of each of the plurality of compute nodes and selecting the number of lowest ranked compute nodes, which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. Claim 13 does not recite any additional elements beyond those recited in claim 9. Accordingly, for the same reasons presented with respect to claim 9, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 13 is not eligible.

Claim 14 is dependent on claim 9, and therefore inherits the same judicial exception recited in claim 9. Further, claim 14 recites identifying compute nodes based at least in part on one or more of a GPU utilization of each of the plurality of compute nodes and an available memory capacity of the GPU of each of the plurality of compute nodes, which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. Claim 14 does not recite any additional elements beyond those recited in claim 9. Accordingly, for the same reasons presented with respect to claim 9, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 14 is not eligible.
Claim 15 is dependent on claim 9, and therefore inherits the same judicial exception recited in claim 9. Further, claim 15 recites a calculation obtained by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer, which is mere data gathering, and does not amount to significantly more than the recited judicial exceptions. Claim 15 does not recite any additional elements beyond those recited in claim 9. Accordingly, for the same reasons presented with respect to claim 9, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 15 is not eligible.

Claim 16 is dependent on claim 9, and therefore inherits the same judicial exception recited in claim 9. Further, claim 16 recites a calculation obtained by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer and adding a margin constant that is a positive integer, which is mere data gathering, and does not amount to significantly more than the recited judicial exceptions. Claim 16 does not recite any additional elements beyond those recited in claim 9. Accordingly, for the same reasons presented with respect to claim 9, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 16 is not eligible.

Step 1: Claim 17 is directed to "A method for managing instances of serverless functions in a cloud computing system, comprising:" a series of steps, and is therefore directed to a process, which is one of the four statutory categories.
Step 2A, Prong One: Claim 17 recites the limitations: obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; obtaining a command queue length for a graphical processing unit (GPU) disposed on each of a plurality of compute nodes in the cloud computing system; obtaining a request queue length of the serverless function; calculating a number of instances of the serverless function to deploy in the cloud computing system, wherein the number of instances is determined based on the service level objective and the request queue length of the serverless function; and identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, compute nodes from the plurality of compute nodes to deploy each of the number of instances of the serverless function; all of which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process.

Accordingly, claim 17 recites a judicial exception (i.e., an abstract idea).

Step 2A, Prong Two: The additional elements recited in claim 17 include: (i) creating an instance of the serverless function on each of the identified compute nodes.
(ii) obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; (iii) obtaining a command queue length for a graphical processing unit (GPU) disposed on each of a plurality of compute nodes in the cloud computing system; and (iv) obtaining a request queue length of the serverless function.

Regarding the additional element (i), the limitation recited amounts to insignificant extra-solution activity, as it is merely instructions to apply the judicial exception, which is not indicative of integration into a practical application. See MPEP 2106.04(d) and 2106.05(g). Regarding the additional elements (ii), (iii), and (iv), the limitations recited amount to insignificant extra-solution activity, as they are merely data gathering, which is not indicative of integration into a practical application. See MPEP 2106.05(g). Furthermore, the combination of additional elements results in mere instructions to implement the exception on a computer and outputting the result of the exception, which is insignificant extra-solution activity. This combination of additional elements fails to integrate the judicial exception into a practical application. See MPEP 2106.04(d).

Step 2B: Regarding the additional element (i), the limitation recited is insignificant extra-solution activity which amounts to a computer-implemented process, which has been found by the courts to not be significantly more than an abstract idea (and thus ineligible). The courts have found adding insignificant extra-solution activity and well-understood, routine and conventional activity is not enough to amount to significantly more than the recited judicial exception. See MPEP 2106.05(a) and 2106.05(g).
Regarding the additional elements (ii), (iii), and (iv), the limitations recited are insignificant extra-solution activity which amounts to data gathering, which has been found by the courts to not be significantly more than an abstract idea (and thus ineligible). The courts have found that adding insignificant extra-solution activity and well-understood, routine and conventional activity is not enough to amount to significantly more than the recited judicial exception. See MPEP 2106.05. The combination of these additional elements amounts to a method comprising steps which can be performed mentally, implemented by generic computing components, and comprising a step of insignificant extra-solution and well-understood, routine and conventional activity. Therefore, the additional elements, when considered individually and in combination, fail to add an inventive concept to the claim. Consequently, claim 17 as a whole does not amount to significantly more than the recited judicial exceptions and the claim is not eligible.

Claim 18 is dependent on claim 17, and therefore inherits the same judicial exception recited in claim 17. The only additional element recited in claim 18 is monitoring the request queue length of the serverless function, which is mere data gathering, which is insignificant extra-solution activity. Accordingly, for the same reasons presented with respect to claim 17, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 18 is not eligible.

Claim 19 is dependent on claim 18, and therefore inherits the same judicial exception recited in claim 18.
Further, claim 19 recites (i) determining, based on the request queue length of the serverless function, that the number of compute nodes is insufficient to meet the service level objective; and (ii) identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, an additional compute node from the plurality of compute nodes to deploy an additional instance of the serverless function; which can be performed in the human mind through observation, evaluation, judgement and opinion, with the aid of pen and paper, and are therefore reciting a mental process. The only additional element recited in claim 19 is creating the additional instance of the serverless function on each of the additional compute node, which is merely instructions to apply the exception for the same reasons presented with respect to claim 17. Accordingly, for the same reasons presented with respect to claims 17 and 18, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 19 is not eligible.

Claim 20 is dependent on claim 19, and therefore inherits the same judicial exception recited in claim 19. Further, claim 20 recites the additional compute node is further identified based on an available memory capacity of the GPU of each of the plurality of compute nodes. Since a person would still be able to identify and perform analysis on the data recited by claim 20 in the human mind, through observation, evaluation, judgement and opinion, with the aid of pen and paper, the limitation of claim 20 is still reciting a mental process. Claim 20 does not recite any additional elements beyond those recited in claims 17, 18, or 19.
Accordingly, for the same reasons presented with respect to claims 17, 18 and 19, the additional elements are not indicative of integration into a practical application, nor do they amount to significantly more than the recited judicial exceptions. Thus, claim 20 is not eligible.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action: A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-20 are rejected under 35 U.S.C. 103 as being unpatentable over Gunasekaran et al. (Fifer: Tackling Resource Underutilization in the Serverless Era) in view of Ibryam (US 11018965 B1).
Regarding claim 1, Gunasekaran teaches:

A method for managing instances of serverless functions in a cloud computing system, (see e.g., page [003], lines [034]-[038], left column, “In this paper, we present, Fifer, which to the best of our knowledge, is the first work that employs stage-aware container provisioning and management of function chains for serverless platforms.”) comprising:

obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; (see e.g., page [003], lines [033-035], left column, “Leveraging slack allows individual functions to be queued in batches at existing containers without violating the application-level SLOs”) (see e.g., page [005], lines [032-038], right column, “On top of these RM frameworks, one can additionally batch the requests by queuing them at every stage of an application, which we name as Request Batching RM (RBRM). The number of requests which can be queued in a container is defined as the batch size (B_size) of the container. Essentially, B_Size is the length of the processing queue each container.”)

obtaining a command queue length for a unit disposed on each of a plurality of compute nodes in the cloud computing system; (see e.g., page [009], lines [009-011], left column, “Each container has a local queue of length equal to the number of free-slots in the container.”)

obtaining a request queue length of the serverless function; (see e.g., page [006], lines [039-044], right column, “Fifer utilizes a request queue, which holds all the incoming tasks for each stage. We design a load balancer along with a load monitor for efficiently scaling containers for the application.
Since we know the execution time and available slack, the LB can calculate the batch size (B_size) for each stage.”) calculating a number of instances of the serverless function to deploy in the cloud computing system, wherein the number of instances is determined based on the service level objective and the request queue length of the serverless function; (see e.g., page [006], lines [045-052], right column, “To accurately determine the number of containers needed at every stage which is a function of B_size and queue length, we need to periodically measure the queuing delay due to batching of requests. As shown in Algorithm 1a, for a given monitoring interval at every stage, the LM monitors the scheduled requests in the last 10s to determine if there are any SLO violations due to queuing delays.” )(see e.g., page [006], lines [010-021], right column, “by knowing the available slack and execution time at each stage, we can accurately determine the number of requests that can be executed in a batch in one container.”) (see e.g., page [006], lines [037-042], right column, “Fifer utilizes a request queue, which holds all the incoming tasks for each stage 1. We design a load balancer (LB) 2 along with a load monitor that are integrated to each stage (LM) 3 for efficiently scaling containers for the application. Since we know the execution time and available slack, the LB can calculate the batch size (B_size) for each stage.”) identifying [based on the queue for each computer resource] compute nodes from the plurality of compute nodes to deploy each of the number of instances of the serverless function; (see e.g., page [007], lines [010-013], right column, “In Fifer, we design a scheduling policy such that, each stage will submit the request to the container with the least remaining free-slots where the number of free-slots is calculated using the container’s batch-size.”) The examiner would like to note that the container can be considered a type of computing node. 
and creating an instance of the serverless function on each of the identified compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to […]”) (see e.g., pages [008-009], lines [038-002], “by default, [the framework] creates a worker pod for each job, which in turn handles container creation and scheduling of tasks within the job and destroys the containers after job completion.”)

Gunasekaran fails to explicitly teach: based at least in part on the command queue length for the GPU of each of the plurality of compute nodes; and a graphical processing unit (GPU).

However, Ibryam teaches, in the context of the scaling of serverless functions, that the various processor types are functionally interchangeable. (see e.g., paragraph [043], “All the disclosed methods and procedures described herein can be implemented using one or more computer programs or components. […] The instructions may be provided as software or firmware, and/or may be implemented in whole or in part in hardware components such as GPUs, ASICs, or any other similar devices.”) Gunasekaran and Ibryam are considered to be analogous art to the claimed invention as they are reasonably pertinent to the problem faced by the inventor of optimizing instances of serverless functions. 
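For orientation, the identification step mapped above (and elaborated in claim 5) amounts to ranking candidate nodes by GPU command queue length and selecting the least-loaded ones. A minimal sketch, assuming a hypothetical `Node` type and field names that appear in neither the application nor the cited references:

```python
from dataclasses import dataclass

# Illustrative only: Node and gpu_command_queue_len are assumed names,
# not structures from Gunasekaran or Ibryam.

@dataclass
class Node:
    name: str
    gpu_command_queue_len: int  # pending commands in this node's GPU queue


def identify_nodes(nodes, n_instances):
    """Pick the n_instances nodes with the shortest GPU command queues
    (cf. claim 5: rank the queue lengths, select the lowest ranked)."""
    ranked = sorted(nodes, key=lambda n: n.gpu_command_queue_len)
    return ranked[:n_instances]


nodes = [Node("n1", 7), Node("n2", 2), Node("n3", 4)]
# The two shortest GPU queues win the placement.
print([n.name for n in identify_nodes(nodes, 2)])  # ['n2', 'n3']
```

The sketch treats the GPU command queue length as the sole ranking key; claims 4 and 6 would extend the key with GPU utilization or available GPU memory.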
Therefore, it would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date, to attempt to combine the systems for managing serverless functions taught by Gunasekaran with the computer elements taught by Ibryam to run such systems, in order to utilize the strengths of the GPU (being able to run the same instruction on many pieces of different data extremely efficiently) in a setting that might call for it (running a group of commands with various input values repeatedly and rapidly) and to increase the potential batch size or otherwise increase throughput. One aspect that could easily be substituted is the container, which could be replaced with a GPU-based environment; in this situation, a GPU has a request queue analogous to a container's.

Regarding claim 2, Gunasekaran teaches: The method of claim 1, further comprising monitoring the request queue length of the serverless function. (see e.g., page [006], figure 5, an image showing a load monitor for the request queue)

[Figure 5 of Gunasekaran: load monitor for the request queue]

Regarding claim 3, Gunasekaran teaches: The method of claim 2, further comprising: determining, based on the request queue length of the serverless function, that the number of compute nodes is insufficient to meet the service level objective; (see e.g., pages [006-007], lines [050-006], “This is because there are not enough containers to handle all the queued requests. In that case, we estimate the additional containers needed using the Estimate_Containers function. 
By knowing the B_size and number of pending requests in the Queue (PQten), the function can estimate the number of containers”) identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, an additional compute node from the plurality of compute nodes to deploy an additional instance of the serverless function; (see e.g., page [009], lines [040 – 048], left column, “we make modifications to the MostRequestedPriority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. […] We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”) and creating the additional instance of the serverless function on each of the additional compute node. (see e.g., page [006], lines [003-004], left column, “Additional containers would be spawned if the arrival rate [of requests] increases”)

Regarding claim 4, Gunasekaran teaches: The method of claim 3, wherein the additional compute node is further identified based on an available memory capacity of the GPU of each of the plurality of compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to the Most Requested Priority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. For our experiments, each container requires 0.5 CPU-core and memory within 1GB. Hence, we set the CPU limit for all containers to be 0.5. 
We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”)

Regarding claim 5, Gunasekaran teaches: The method of claim 1, wherein the compute nodes are identified from the plurality of compute nodes by ranking the command queue lengths for the GPU of each of the plurality of compute nodes and selecting the number of lowest ranked compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to the Most Requested Priority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. For our experiments, each container requires 0.5 CPU-core and memory within 1GB. Hence, we set the CPU limit for all containers to be 0.5. We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”)

Regarding claim 6, Gunasekaran teaches: The method of claim 1, wherein the compute nodes are identified from the plurality of compute nodes based at least in part on one or more of a GPU utilization of each of the plurality of compute nodes and an available memory capacity of the GPU of each of the plurality of compute nodes. (see e.g., page [007], lines [010-013], right column, “In Fifer, we design a scheduling policy such that, each stage will submit the request to the container with the least remaining free-slots where the number of free-slots is calculated using the container’s batch-size.”)

Regarding claim 7, Gunasekaran teaches: The method of claim 1, wherein the number of instances is calculated by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer. 
(see e.g., page 8, Algorithm 1, line 12, “current_req <- len(stage.containers) * batchSize”) Current_req is equivalent to the request queue length, while stage.containers.length is equivalent to the number of instances. The service level objective of a serverless function (the number of concurrent functions runnable) is equivalent to a batch size.

Regarding claim 8, Gunasekaran teaches: The method of claim 1, wherein the number of instances is calculated by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer and adding a margin constant that is a positive integer. (see e.g., page 8, Algorithm 1, line 12, “current_req <- len(stage.containers) * batchSize”) Current_req is equivalent to the request queue length, while stage.containers.length is somewhat equivalent to the number of instances. The service level objective of a serverless function (the number of concurrent functions runnable) is equivalent to a batch size. (see e.g., page 8, Algorithm 1, line 15, “est_containers <- (PQ_LEN – current_req)”) This takes the previous output and uses an integer (presumably positive, as it is a length value) in further processing. 
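The arithmetic recited in claims 7 and 8 is a ceiling division, optionally plus a margin constant. Purely for illustration (the function name, parameter names, and example values are assumptions, not drawn from Algorithm 1 of Gunasekaran):

```python
import math

# Hypothetical sketch of the claimed instance-count calculation; not code
# from the application or the cited references.

def instances_needed(request_queue_len, slo_max_concurrent, margin=0):
    # Claim 7: round queue_length / SLO up to the next whole integer.
    # Claim 8: additionally add a positive-integer margin constant.
    return math.ceil(request_queue_len / slo_max_concurrent) + margin


print(instances_needed(10, 4))            # ceil(10/4) = 3
print(instances_needed(10, 4, margin=1))  # 3 + 1 = 4
```

Under this reading, the dispute is whether Gunasekaran's `est_containers` computation, which subtracts serviced requests rather than dividing and rounding up, actually performs the claimed ceiling division.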
Regarding claim 9, Gunasekaran teaches: A computing system having a memory having computer readable instructions and one or more processors for executing the computer readable instructions, the computer readable instructions controlling the one or more processors to perform operations (see e.g., page [003], lines [034]-[038], left column, “In this paper, we present, Fifer, which to the best of our knowledge, is the first work that employs stage-aware container provisioning and management of function chains for serverless platforms.”) comprising: obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; (see e.g., page [003], lines [033-035], left column, “Leveraging slack allows individual functions to be queued in batches at existing containers without violating the application-level SLOs”) (see e.g., page [005], lines [032-038], right column, “On top of these RM frameworks, one can additionally batch the requests by queuing them at every stage of an application, which we name as Request Batching RM (RBRM). The number of requests which can be queued in a container is defined as the batch size (B_size) of the container. Essentially, B_Size is the length of the processing queue each container.”) obtaining a command queue length for a unit disposed on each of a plurality of compute nodes in the cloud computing system; (see e.g., page [009], lines [009-011], left column, “Each container has a local queue of length equal to the number of free-slots in the container.”) obtaining a request queue length of the serverless function; (see e.g., page [006], lines [039-044], right column, “Fifer utilizes a request queue, which holds all the incoming tasks for each stage. We design a load balancer along with a load monitor for efficiently scaling containers for the application. 
Since we know the execution time and available slack, the LB can calculate the batch size (B_size) for each stage.”) calculating a number of instances of the serverless function to deploy in the cloud computing system, wherein the number of instances is determined based on the service level objective and the request queue length of the serverless function; (see e.g., page [006], lines [045-052], right column, “To accurately determine the number of containers needed at every stage which is a function of B_size and queue length, we need to periodically measure the queuing delay due to batching of requests. As shown in Algorithm 1a, for a given monitoring interval at every stage, the LM monitors the scheduled requests in the last 10s to determine if there are any SLO violations due to queuing delays.” )(see e.g., page [006], lines [010-021], right column, “by knowing the available slack and execution time at each stage, we can accurately determine the number of requests that can be executed in a batch in one container.”) (see e.g., page [006], lines [037-042], right column, “Fifer utilizes a request queue, which holds all the incoming tasks for each stage 1. We design a load balancer (LB) 2 along with a load monitor that are integrated to each stage (LM) 3 for efficiently scaling containers for the application. Since we know the execution time and available slack, the LB can calculate the batch size (B_size) for each stage.”) identifying [based on the queue for each computer resource] compute nodes from the plurality of compute nodes to deploy each of the number of instances of the serverless function; (see e.g., page [007], lines [010-013], right column, “In Fifer, we design a scheduling policy such that, each stage will submit the request to the container with the least remaining free-slots where the number of free-slots is calculated using the container’s batch-size.”) The examiner would like to note that the container can be considered a type of computing node. 
and creating an instance of the serverless function on each of the identified compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to […]”) (see e.g., pages [008-009], lines [038-002], “by default, [the framework] creates a worker pod for each job, which in turn handles container creation and scheduling of tasks within the job and destroys the containers after job completion.”)

Gunasekaran fails to explicitly teach: based at least in part on the command queue length for the GPU of each of the plurality of compute nodes; and a graphical processing unit (GPU).

However, Ibryam teaches, in the context of the scaling of serverless functions, that the various processor types are functionally interchangeable. (see e.g., paragraph [043], “All the disclosed methods and procedures described herein can be implemented using one or more computer programs or components. […] The instructions may be provided as software or firmware, and/or may be implemented in whole or in part in hardware components such as GPUs, ASICs, or any other similar devices.”) Gunasekaran and Ibryam are considered to be analogous art to the claimed invention as they are reasonably pertinent to the problem faced by the inventor of optimizing instances of serverless functions. 
Therefore, it would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date, to attempt to combine the systems for managing serverless functions taught by Gunasekaran with the computer elements taught by Ibryam to run such systems, in order to utilize the strengths of the GPU (being able to run the same instruction on many pieces of different data extremely efficiently) in a setting that might call for it (running a group of commands with various input values repeatedly and rapidly) and to increase the potential batch size or otherwise increase throughput. One aspect that could easily be substituted is the container, which could be replaced with a GPU-based environment; in this situation, a GPU has a request queue analogous to a container's.

Regarding claim 10, Gunasekaran teaches: The method of claim 9, further comprising monitoring the request queue length of the serverless function. (see e.g., page [006], figure 5, an image showing a load monitor for the request queue)

[Figure 5 of Gunasekaran: load monitor for the request queue]

Regarding claim 11, Gunasekaran teaches: The method of claim 10, further comprising: determining, based on the request queue length of the serverless function, that the number of compute nodes is insufficient to meet the service level objective; (see e.g., pages [006-007], lines [050-006], “This is because there are not enough containers to handle all the queued requests. In that case, we estimate the additional containers needed using the Estimate_Containers function. 
By knowing the B_size and number of pending requests in the Queue (PQten), the function can estimate the number of containers”) identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, an additional compute node from the plurality of compute nodes to deploy an additional instance of the serverless function; (see e.g., page [009], lines [040 – 048], left column, “we make modifications to the MostRequestedPriority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. […] We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”) and creating the additional instance of the serverless function on each of the additional compute node. (see e.g., page [006], lines [003-004], left column, “Additional containers would be spawned if the arrival rate [of requests] increases”)

Regarding claim 12, Gunasekaran teaches: The method of claim 11, wherein the additional compute node is further identified based on an available memory capacity of the GPU of each of the plurality of compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to the Most Requested Priority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. For our experiments, each container requires 0.5 CPU-core and memory within 1GB. Hence, we set the CPU limit for all containers to be 0.5. 
We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”)

Regarding claim 13, Gunasekaran teaches: The method of claim 9, wherein the compute nodes are identified from the plurality of compute nodes by ranking the command queue lengths for the GPU of each of the plurality of compute nodes and selecting the number of lowest ranked compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to the Most Requested Priority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. For our experiments, each container requires 0.5 CPU-core and memory within 1GB. Hence, we set the CPU limit for all containers to be 0.5. We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”)

Regarding claim 14, Gunasekaran teaches: The method of claim 9, wherein the compute nodes are identified from the plurality of compute nodes based at least in part on one or more of a GPU utilization of each of the plurality of compute nodes and an available memory capacity of the GPU of each of the plurality of compute nodes. (see e.g., page [007], lines [010-013], right column, “In Fifer, we design a scheduling policy such that, each stage will submit the request to the container with the least remaining free-slots where the number of free-slots is calculated using the container’s batch-size.”)

Regarding claim 15, Gunasekaran teaches: The method of claim 9, wherein the number of instances is calculated by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer. 
(see e.g., page 8, Algorithm 1, line 12, “current_req <- len(stage.containers) * batchSize”) Current_req is equivalent to the request queue length, while stage.containers.length is equivalent to the number of instances. The service level objective of a serverless function (the number of concurrent functions runnable) is equivalent to a batch size.

Regarding claim 16, Gunasekaran teaches: The method of claim 9, wherein the number of instances is calculated by rounding up a result of dividing the request queue length of the serverless function by the service level objective of the serverless function to a next whole integer and adding a margin constant that is a positive integer. (see e.g., page 8, Algorithm 1, line 12, “current_req <- len(stage.containers) * batchSize”) Current_req is equivalent to the request queue length, while stage.containers.length is somewhat equivalent to the number of instances. The service level objective of a serverless function (the number of concurrent functions runnable) is equivalent to a batch size. (see e.g., page 8, Algorithm 1, line 15, “est_containers <- (PQ_LEN – current_req)”) This takes the previous output and uses an integer (presumably positive, as it is a length value) in further processing. 
Regarding claim 17, Gunasekaran teaches: A computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to perform operations (see e.g., page [003], lines [034]-[038], left column, “In this paper, we present, Fifer, which to the best of our knowledge, is the first work that employs stage-aware container provisioning and management of function chains for serverless platforms.”) comprising: obtaining a service level objective for a serverless function, wherein the service level objective specifies a maximum number of concurrent requests a compute node of the cloud computing system can process for the serverless function; (see e.g., page [003], lines [033-035], left column, “Leveraging slack allows individual functions to be queued in batches at existing containers without violating the application-level SLOs”) (see e.g., page [005], lines [032-038], right column, “On top of these RM frameworks, one can additionally batch the requests by queuing them at every stage of an application, which we name as Request Batching RM (RBRM). The number of requests which can be queued in a container is defined as the batch size (B_size) of the container. Essentially, B_Size is the length of the processing queue each container.”) obtaining a command queue length for a unit disposed on each of a plurality of compute nodes in the cloud computing system; (see e.g., page [009], lines [009-011], left column, “Each container has a local queue of length equal to the number of free-slots in the container.”) obtaining a request queue length of the serverless function; (see e.g., page [006], lines [039-044], right column, “Fifer utilizes a request queue, which holds all the incoming tasks for each stage. We design a load balancer along with a load monitor for efficiently scaling containers for the application. 
Since we know the execution time and available slack, the LB can calculate the batch size (B_size) for each stage.”) calculating a number of instances of the serverless function to deploy in the cloud computing system, wherein the number of instances is determined based on the service level objective and the request queue length of the serverless function; (see e.g., page [006], lines [045-052], right column, “To accurately determine the number of containers needed at every stage which is a function of B_size and queue length, we need to periodically measure the queuing delay due to batching of requests. As shown in Algorithm 1a, for a given monitoring interval at every stage, the LM monitors the scheduled requests in the last 10s to determine if there are any SLO violations due to queuing delays.” )(see e.g., page [006], lines [010-021], right column, “by knowing the available slack and execution time at each stage, we can accurately determine the number of requests that can be executed in a batch in one container.”) (see e.g., page [006], lines [037-042], right column, “Fifer utilizes a request queue, which holds all the incoming tasks for each stage 1. We design a load balancer (LB) 2 along with a load monitor that are integrated to each stage (LM) 3 for efficiently scaling containers for the application. Since we know the execution time and available slack, the LB can calculate the batch size (B_size) for each stage.”) identifying [based on the queue for each computer resource] compute nodes from the plurality of compute nodes to deploy each of the number of instances of the serverless function; (see e.g., page [007], lines [010-013], right column, “In Fifer, we design a scheduling policy such that, each stage will submit the request to the container with the least remaining free-slots where the number of free-slots is calculated using the container’s batch-size.”) The examiner would like to note that the container can be considered a type of computing node. 
and creating an instance of the serverless function on each of the identified compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to […]”) (see e.g., pages [008-009], lines [038-002], “by default, [the framework] creates a worker pod for each job, which in turn handles container creation and scheduling of tasks within the job and destroys the containers after job completion.”)

Gunasekaran fails to explicitly teach: based at least in part on the command queue length for the GPU of each of the plurality of compute nodes; and a graphical processing unit (GPU).

However, Ibryam teaches, in the context of the scaling of serverless functions, that the various processor types are functionally interchangeable. (see e.g., paragraph [043], “All the disclosed methods and procedures described herein can be implemented using one or more computer programs or components. […] The instructions may be provided as software or firmware, and/or may be implemented in whole or in part in hardware components such as GPUs, ASICs, or any other similar devices.”) Gunasekaran and Ibryam are considered to be analogous art to the claimed invention as they are reasonably pertinent to the problem faced by the inventor of optimizing instances of serverless functions. 
Therefore, it would have been prima facie obvious to one of ordinary skill in the art, before the effective filing date, to attempt to combine the systems for managing serverless functions taught by Gunasekaran with the computer elements taught by Ibryam to run such systems, in order to utilize the strengths of the GPU (being able to run the same instruction on many pieces of different data extremely efficiently) in a setting that might call for it (running a group of commands with various input values repeatedly and rapidly) and to increase the potential batch size or otherwise increase throughput. One aspect that could easily be substituted is the container, which could be replaced with a GPU-based environment; in this situation, a GPU has a request queue analogous to a container's.

Regarding claim 18, Gunasekaran teaches: The method of claim 17, further comprising monitoring the request queue length of the serverless function. (see e.g., page [006], figure 5, an image showing a load monitor for the request queue)

[Figure 5 of Gunasekaran: load monitor for the request queue]

Regarding claim 19, Gunasekaran teaches: The method of claim 18, further comprising: determining, based on the request queue length of the serverless function, that the number of compute nodes is insufficient to meet the service level objective; (see e.g., pages [006-007], lines [050-006], “This is because there are not enough containers to handle all the queued requests. In that case, we estimate the additional containers needed using the Estimate_Containers function. 
By knowing the B_size and number of pending requests in the Queue (PQten), the function can estimate the number of containers”) identifying, based at least in part on the command queue length for the GPU of each of the plurality of compute nodes, an additional compute node from the plurality of compute nodes to deploy an additional instance of the serverless function; (see e.g., page [009], lines [040 – 048], left column, “we make modifications to the MostRequestedPriority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. […] We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”) and creating the additional instance of the serverless function on each of the additional compute node. (see e.g., page [006], lines [003-004], left column, “Additional containers would be spawned if the arrival rate [of requests] increases”)

Regarding claim 20, Gunasekaran teaches: The method of claim 19, wherein the additional compute node is further identified based on an available memory capacity of the GPU of each of the plurality of compute nodes. (see e.g., page [009], lines [039-048], left column, “In order to efficiently bin-pack containers into fewer nodes, we make modifications to the Most Requested Priority scheduling policy in Kubernetes such that it always chooses the node with the least-available-resources to satisfy the Pod requirements. For our experiments, each container requires 0.5 CPU-core and memory within 1GB. Hence, we set the CPU limit for all containers to be 0.5. 
We determine idle cores in a node by calculating the difference between number of cores in a node and the sum of cpu-shares for all allocated pods in that node.”)

Conclusion

Any inquiry concerning this communication or earlier communications from the examiner should be directed to Connor Imiola Blackburn whose telephone number is (571)272-6547. The examiner can normally be reached M-Th 7-5. Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice. If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kevin Young can be reached at (571) 270 - 3180. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300. Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/C.I.B./Examiner, Art Unit 2194
/KEVIN L YOUNG/Supervisory Patent Examiner, Art Unit 2194

Prosecution Timeline

Aug 11, 2023
Application Filed
Mar 09, 2026
Non-Final Rejection — §101, §103 (current)
