Prosecution Insights
Last updated: April 19, 2026
Application No. 18/176,818

Resource Allocation Method, Electronic Device and Storage Medium

Final Rejection (§101, §103, §112)
Filed: Mar 01, 2023
Examiner: RIGGINS, ARI FAITH COLEMA
Art Unit: 2197
Tech Center: 2100 — Computer Architecture & Software
Assignee: BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD.
OA Round: 2 (Final)
Grant Probability: 0% (At Risk)
OA Rounds: 3-4
To Grant: 3y 3m
With Interview: 0%

Examiner Intelligence

Career Allow Rate: 0% (grants only 0% of cases; 0 granted / 1 resolved; -55.0% vs TC avg)
Interview Lift: +0.0% (minimal lift; based on resolved cases with interview)
Avg Prosecution: 3y 3m (typical timeline; 38 currently pending)
Total Applications: 39 (career history; across all art units)

Statute-Specific Performance

§101: 27.8% (-12.2% vs TC avg)
§103: 41.5% (+1.5% vs TC avg)
§102: 9.5% (-30.5% vs TC avg)
§112: 21.2% (-18.8% vs TC avg)
TC averages are estimates; based on career data from 1 resolved case.

Office Action

§101 §103 §112
DETAILED ACTION

The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This Office Action is in response to claims filed on 11/26/2025. Claims 1-2, 4, 6, 8-9, and 17-28 are pending.

Claim Rejections - 35 USC § 112

The following is a quotation of 35 U.S.C. 112(b):

(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.

The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.

The following claims are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.

Claims 2, 18, and 20 recite the limitation “from the second first node set”. It is unclear which node set is being referenced, the first node set or the second node set, or whether this references a different first node set. For the sake of compact prosecution, Examiner will interpret this limitation to mean “from the second node set”.

Claims 4, 6, 21-22, and 25-26 recite the limitation “and/or”. It is unclear whether this is intended to mean “and” or “or”. For the sake of compact prosecution, Examiner will interpret this to mean “or”.

Claims 6, 22, and 26 recite the limitation “each node combination” in line 7. There is insufficient antecedent basis for this limitation in the claim. There is no prior mention, in the claims or in the claims from which they depend, of any node combination, and it is unclear which nodes comprise each node combination.
For the sake of compact prosecution, Examiner will interpret this to mean “a node combination of one or more nodes from the first, third, or fourth node sets”.

Claims 4, 6, 8, 21-23, and 25-27 depend, directly or indirectly, from rejected claims, do not resolve the deficiencies thereof, and are therefore rejected for at least the same reasons.

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows:

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.

Claims 1-2, 4, 6, 8-9, and 17-28 are rejected under 35 U.S.C. 101 because the claimed invention recites a judicial exception (an abstract idea), is directed to that judicial exception because it has not been integrated into a practical application, and does not recite significantly more than the judicial exception. Examiner has evaluated the claims under the framework provided in the 2019 Patent Eligibility Guidance published in the Federal Register on 01/07/2019 and provides that analysis below.

Step 1: Claims 1-2, 4, 6, and 8-9 are directed to a method and fall within the statutory category of processes. Claims 17-18 and 25-28 are directed to an electronic device and fall within the statutory category of machines. Claims 19-24 are directed to a non-transitory computer-readable storage medium and fall within the statutory category of machines. Therefore, “Are the claims to a process, machine, manufacture or composition of matter?” Yes.

To evaluate the Step 2A inquiry “Is the claim directed to a law of nature, a natural phenomenon or an abstract idea?”, we must determine, at Step 2A Prong 1, whether the claim recites a law of nature, a natural phenomenon or an abstract idea, and further whether the claim recites additional elements that integrate the judicial exception into a practical application.
Step 2A Prong 1: Claims 1, 17, and 19:

The limitations of “creating, by the electronic device, the pod for the target task;” and “creating the pod for the target task;”, as drafted, recite a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can mentally assign one or more containers to a task in order to create a pod.

Further, the limitation of “selecting (, by the electronic device,) based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes;”, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe available node information, available GPU resource information, and a plurality of nodes and, based on these observations, through mental comparison, can mentally select a first node set satisfying the GPU resource requirement information. This may also be done with pencil and paper.

Further, the limitation of “selecting (, by the electronic device,) based on the first node set, an extended node set from the plurality of nodes;”, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe a first node set and, based on these observations, can mentally select an extended node set. This may also be done with pencil and paper.
Further, the limitation of “and allocating (, by the electronic device,) a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource, for the pod, from the first node set or from the first and extended node sets”, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe a first node set and an extended node set and, based on these observations, can mentally allocate, through mental assignment, a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource, for the pod. This may also be done with pencil and paper.

Therefore, yes, claims 1, 17, and 19 recite a judicial exception.

Step 2A Prong 2: Claims 1, 17, and 19:

The judicial exception is not integrated into a practical application.
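As an aside, the claim-1 sequence characterized in the Prong 1 analysis above (select a first node set from idle nodes, extend it, then allocate a first and second target node) can be illustrated in a short sketch. The data model, the free-GPU-memory metric, and the tie-breaking rules below are invented for illustration only and are not taken from the application.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    idle: bool           # "available node information": node is in an idle state
    free_gpu_mem: int    # assumed stand-in for "available GPU resource information"
    switch: str          # used here to build the extended node set

def allocate(nodes, required_gpu_mem):
    """Sketch of the claim-1 flow: first node set -> extended node set -> two targets."""
    available = [n for n in nodes if n.idle]
    # first node set: idle nodes whose GPU resources satisfy the requirement
    first_set = [n for n in available if n.free_gpu_mem >= required_gpu_mem]
    if not first_set:
        return None
    # extended node set: other idle nodes sharing a switch with a first-set node
    switches = {n.switch for n in first_set}
    extended = [n for n in available if n not in first_set and n.switch in switches]
    # first target node: holds the target GPU resource (here: most free GPU memory)
    first_target = max(first_set, key=lambda n: n.free_gpu_mem)
    # second target node: executes the pod, drawn from the first or extended sets
    others = [n for n in first_set + extended if n is not first_target]
    second_target = others[0] if others else first_target
    return first_target, second_target
```

The sketch collapses the claimed "GPU resource requirement information" to a single integer purely to keep the control flow visible.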
In particular, the claims recite additional element recitations of “A resource allocation method/An electronic device/non-transitory computer-readable storage medium, applied to a target cluster in a machine learning scenario, wherein the target cluster comprises a plurality of nodes, and any one of the plurality of nodes is a bare machine, a physical machine, or a virtual machine; wherein the target cluster is independent of an electronic device but connected to the electronic device, or the electronic device is deployed on the target cluster, and the electronic device is configured to allocate resources for a pod corresponding to a target task executed by the target cluster;”, “wherein the first target node is a node where a target GPU resource allocated to the pod is located,”, “and the second target node is a node where the pod allocated to the pod is located”, “and a memory connected in communication with the at least one processor; wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute operations, comprising:”, and “wherein the non-transitory computer-readable storage medium stores a computer instruction thereon, and the computer instruction is used to cause the computer to execute operations, comprising:”, which are merely recitations of technological environment/field of use (see MPEP § 2106.05(h)) that do not integrate a judicial exception into a practical application.
Further, the claims recite additional element recitations of “acquiring (, by the electronic device,) Graphics Processing Unit (GPU) resource requirement information of the target task;” and “acquiring (, by the electronic device,) available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state;”, which are merely recitations of data gathering, which is insignificant extra-solution activity (see MPEP § 2106.05(g)) that does not integrate a judicial exception into a practical application.

Further, the claims recite additional element recitations of “wherein the electronic device comprises: at least one processor;”, which are merely recitations of generic computing components (see MPEP § 2106.05(f)) that do not integrate a judicial exception into a practical application.

Therefore, “Do the claims recite additional elements that integrate the judicial exception into a practical application?” No, these additional elements do not integrate the abstract idea into a practical application and they do not impose any meaningful limits on practicing the abstract idea. The claims are directed to an abstract idea.

After having evaluated the inquiries set forth in Steps 2A Prong 1 and 2, it has been concluded that claims 1, 17, and 19 not only recite a judicial exception but are directed to that judicial exception, as it has not been integrated into a practical application.

Step 2B: Claims 1, 17, and 19:

The claims do not include additional elements, alone or in combination, that are sufficient to amount to significantly more than the judicial exception.
As discussed above with respect to integration of the abstract idea into a practical application, the additional elements amount to no more than generic computing components, field of use/technological environment, and insignificant extra-solution activity, which do not amount to significantly more than the abstract idea. Further, the insignificant extra-solution activity is well-understood, routine, and conventional in the art: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory” [MPEP § 2106.05(d)(II)].

Therefore, “Do the claims recite additional elements that amount to significantly more than the judicial exception?” No, these additional elements, alone or in combination, do not amount to significantly more than the judicial exception.

Having concluded the analysis within the provided framework, claims 1, 17, and 19 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claims 2, 18, and 20, the claims recite additional abstract idea recitations of “determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod;” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe available node information and available GPU resource information and, based on these observations, through mental comparison, can mentally determine a plurality of candidate nodes which satisfy the amount of GPU resources requested.
Further, the claims recite additional abstract idea recitations of “determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set;” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe a plurality of candidate nodes and, based on these observations, through mental comparison, can mentally determine at least one second node which satisfies the type of GPU card and topology structure of the GPU card.

Further, the claims recite additional abstract idea recitations of “determining an idle quantity of GPU resource corresponding to each GPU of each second node in the second node set;” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe a quantity of GPU resource corresponding to each GPU of each second node in a second node set and, based on these observations, can mentally determine an idle quantity of GPU resource corresponding to each GPU of each second node.

Further, the claims recite additional abstract idea recitations of “and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind.
For example, a person can observe a second node set and, based on these observations, can mentally select at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task.

Further, the claims recite additional element recitations of “wherein the GPU resource requirement information comprises amount of GPU resources requested by the target task, a type of a GPU card, and a topology structure of the GPU card,”, which are merely recitations of technological environment/field of use (see MPEP § 2106.05(h)) that do not integrate a judicial exception into a practical application.

Further, claims 2, 18, and 20 do not recite any further additional elements and, for the same reasons given above with regard to integration into a practical application and whether additional elements amount to significantly more, claims 2, 18, and 20 fail Step 2A Prong 2 (thus the claims are directed to the judicial exception, as it has not been integrated into a practical application) and fail Step 2B as not amounting to significantly more. Therefore, claims 2, 18, and 20 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claims 4, 21, and 25, the claims recite additional abstract idea recitations of “and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node;” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind.
For example, a person can observe information of a switch corresponding to each node in a first node set and, based on these observations, can mentally group nodes, through mental assignment, that correspond to a same switch as a first node or that correspond to a different switch from the first node. This may also be done with pencil and paper.

Further, the claims recite additional element recitations of “acquiring information of a switch corresponding to each first node in the first node set;”, which is merely a recitation of data gathering, which is insignificant extra-solution activity (see MPEP § 2106.05(g)) that does not integrate a judicial exception into a practical application.

Further, the claims recite additional element recitations of “wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node”, which are merely recitations of technological environment/field of use (see MPEP § 2106.05(h)) that do not integrate a judicial exception into a practical application.

Further, the insignificant extra-solution activity is well-understood, routine, and conventional in the art: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory” [MPEP § 2106.05(d)(II)].
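To make the grouping concrete, the third/fourth node set construction recited in claims 4, 21, and 25 amounts to partitioning nodes by whether they share a switch with a given first node. The sketch below is a hypothetical rendering; the switch-map representation is an assumption, not taken from the application.

```python
def group_by_switch(first_node, other_nodes, switch_of):
    """Partition other_nodes into a third node set (same switch as first_node)
    and a fourth node set (different switch), per the claim language."""
    first_switch = switch_of[first_node]
    third_set = [n for n in other_nodes if switch_of[n] == first_switch]
    fourth_set = [n for n in other_nodes if switch_of[n] != first_switch]
    return third_set, fourth_set
```

Per the claim's "and/or", the extended node set would then be the third set, the fourth set, or their union.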
Further, claims 4, 21, and 25 do not recite any further additional elements and, for the same reasons given above with regard to integration into a practical application and whether additional elements amount to significantly more, claims 4, 21, and 25 fail Step 2A Prong 2 (thus the claims are directed to the judicial exception, as it has not been integrated into a practical application) and fail Step 2B as not amounting to significantly more. Therefore, claims 4, 21, and 25 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claims 6, 22, and 26, the claims recite additional abstract idea recitations of “determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute;” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe a third or fourth node set and, based on these observations, can mentally determine an attribute of each node and a corresponding weight value for the attribute.

Further, the claims recite additional abstract idea recitations of “determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node;” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can mentally observe an attribute and corresponding weight value of each node and, based on these observations, can mentally determine a total weight value of each node combination.
Further, the claims recite additional abstract idea recitations of “and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node” which, as drafted, is a process that, but for the recitation of generic computing components, under its broadest reasonable interpretation, covers performance of the limitation in the mind. For example, a person can observe and mentally compare a total weight value of each node combination and, based on these observations, can mentally determine a node combination with a highest total weight value.

Further, the claims recite additional element recitations of “acquiring load situations respectively corresponding to the first node set, the third node set and the fourth node set;”, which is merely a recitation of data gathering, which is insignificant extra-solution activity (see MPEP § 2106.05(g)) that does not integrate a judicial exception into a practical application.

Further, the insignificant extra-solution activity is well-understood, routine, and conventional in the art: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory” [MPEP § 2106.05(d)(II)].

Further, claims 6, 22, and 26 do not recite any further additional elements and, for the same reasons given above with regard to integration into a practical application and whether additional elements amount to significantly more, claims 6, 22, and 26 fail Step 2A Prong 2 (thus the claims are directed to the judicial exception, as it has not been integrated into a practical application) and fail Step 2B as not amounting to significantly more.
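For illustration, the weighted selection recited in claims 6, 22, and 26 reduces to scoring each candidate node combination by a weighted sum of node attributes and keeping the highest-scoring combination. The attribute names and weight values below are invented examples, not taken from the claims.

```python
def total_weight(combination, attributes, weights):
    """Sum the weighted attribute values over every node in a combination."""
    return sum(weights[attr] * attributes[node][attr]
               for node in combination
               for attr in weights)

def best_combination(combinations, attributes, weights):
    """Pick the node combination with the highest total weight value,
    per the 'highest total weight value' limitation."""
    return max(combinations, key=lambda c: total_weight(c, attributes, weights))
```

A combination may be the first node alone or the first node plus one extended node, which is why combinations of length one and two both appear in the usage below.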
Therefore, claims 6, 22, and 26 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claims 8, 23, and 27, the claims recite additional element recitations of “wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node”, which are merely recitations of technological environment/field of use (see MPEP § 2106.05(h)) that do not integrate a judicial exception into a practical application.

Further, claims 8, 23, and 27 do not recite any further additional elements and, for the same reasons given above with regard to integration into a practical application and whether additional elements amount to significantly more, claims 8, 23, and 27 fail Step 2A Prong 2 (thus the claims are directed to the judicial exception, as it has not been integrated into a practical application) and fail Step 2B as not amounting to significantly more. Therefore, claims 8, 23, and 27 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

With regard to claims 9, 24, and 28, the claims recite additional element recitations of “sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task”, which are merely recitations of data transmission, which is insignificant extra-solution activity (see MPEP § 2106.05(g)) that does not integrate a judicial exception into a practical application.
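The service-address handoff recited in claims 9, 24, and 28 can be pictured as the scheduler forwarding the first target node's GPU service address to the second target node, which then invokes that address to run the task. Everything below (class names, the address format, the in-memory "invocation") is a hypothetical sketch, not the application's actual protocol.

```python
class SecondTargetNode:
    """Toy stand-in for the node that hosts the pod and executes the task."""
    def __init__(self):
        self.invocations = []

    def receive_service_address(self, address, task):
        # A real node would issue an RPC to the GPU service at `address`;
        # here the invocation is just recorded.
        self.invocations.append((address, task))
        return f"executed {task} via {address}"

def send_service_address(gpu_service_address, second_node, task):
    """Scheduler-side step: forward the first target node's GPU service
    address to the second target node so it can execute the target task."""
    return second_node.receive_service_address(gpu_service_address, task)
```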
Further, the insignificant extra-solution activity is well-understood, routine, and conventional in the art: “The courts have recognized the following computer functions as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity. i. Receiving or transmitting data over a network…iv. Storing and retrieving information in memory” [MPEP § 2106.05(d)(II)].

Further, claims 9, 24, and 28 do not recite any further additional elements and, for the same reasons given above with regard to integration into a practical application and whether additional elements amount to significantly more, claims 9, 24, and 28 fail Step 2A Prong 2 (thus the claims are directed to the judicial exception, as it has not been integrated into a practical application) and fail Step 2B as not amounting to significantly more. Therefore, claims 9, 24, and 28 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

Therefore, claims 1-2, 4, 6, 8-9, and 17-28 do not recite patent-eligible subject matter under 35 U.S.C. § 101.

Claim Rejections - 35 USC § 103

The following is a quotation of 35 U.S.C. 103, which forms the basis for all obviousness rejections set forth in this Office action:

A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.

Claims 1-2, 4, 17-21, and 25 are rejected under 35 U.S.C.
103 as being unpatentable over Baillargeon (US 2023/0153162 A1) in view of Sun (US 2019/0197655 A1) and further in view of Zhao (US 2019/0312772 A1).

With regard to claim 1, Baillargeon teaches:

A resource allocation method, applied to a target cluster: “In some embodiments, cluster capacity request and allocation using resource quotas allow a cluster user to quickly discover current cluster resources throughout a network and effectively communicate resource requirements to the cluster administrator, thus ensuring that a certain amount of resources are readily available at the right time and to the right cluster as desired for optimal resource allocations for RAN/5G operations” [Baillargeon ¶ 37].

wherein the target cluster comprises a plurality of nodes, and any one of the plurality of nodes is a bare machine, a physical machine, or a virtual machine: “A cluster consists of one or more master machines and multiple worker machines or nodes. The master runs the control plane functions and coordinates between all the nodes running the actual workloads knowns as pods” [Baillargeon ¶ 3]. “From a radio access network (RAN) developer perspective, Kubernetes gives an infrastructure provider the tools to create powerful, production-ready applications to run on virtual machines or physical servers (known as workers or nodes)” [Baillargeon ¶ 2].

wherein the target cluster is independent of an electronic device but connected to the electronic device, or the electronic device is deployed on the target cluster: “FIG. 15 is a block diagram of an example implementation of an OAM cluster 12 in communication via network 18 with a workload cluster 20. The OAM cluster 12 has a communication interface 44, which may communicate with the network 18, either wirelessly or by wireline … The communication interface 44 may be configured to facilitate a connection to other devices, e.g., workload cluster 20, via network 18” [Baillargeon ¶ 109]. “The OAM cluster 12 also has processing circuitry 46.
The processing circuitry 46 may include a memory 48 and a processor 50” [Baillargeon ¶ 110]. “These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” [Baillargeon ¶ 124].

and the electronic device is configured to allocate resources for a pod corresponding to a target task executed by the target cluster: “Referring now to the drawing figures, where like elements are like numbered, there is shown in FIG. 1 a diagram of an Operations, Administration and Maintenance (OAM) network 10 that may be employed to implement methods described herein for managing resource quotas” [Baillargeon ¶ 45]. “In some embodiments, cluster capacity request and allocation using resource quotas allow a cluster user to quickly discover current cluster resources throughout a network and effectively communicate resource requirements to the cluster administrator, thus ensuring that a certain amount of resources are readily available at the right time and to the right cluster as desired for optimal resource allocations for RAN/5G operations” [Baillargeon ¶ 37].

wherein the resource allocation method comprises: creating, by the electronic device, the pod for the target task: “Cluster users create resources (pods, services, etc.) in the namespace, and the resource quota system tracks resource allocation (not the same as resource utilization) to ensure that the resource allocation does not exceed hard resource limits defined in the resource quota specification” [Baillargeon ¶ 69].
acquiring, by the electronic device, Graphics Processing Unit (GPU) resource requirement information of the target task: “Some embodiments described herein may impose pod and/or container resource requirements. In some embodiments, a cluster user consumes container resources produced by nodes in a k8s cluster. Container resources are classified in 2 categories: 1. "Basic" resources are defined in the kubernetes/io domain: cpu, memory, hugepages, ephemeral-storage; 2. Extended resources are defined outside the kubernetes/io domain: e.g., nvidia.com/gpu The cluster user may "request" container resources in the pod specification, (spec. containers [ ].resources):” [Baillargeon ¶¶ 56-60]. “Example: The number of NVDIA GPUs requested by the pod/container” [Baillargeon ¶ 67 Table].

and allocating, by the electronic device, a first target node … for the pod: “The resource quota targets are requests informing the cluster administrator to increase or decrease the container resource allocation (by adding new nodes to a cluster or allocating existing cluster capacity to other users) for a namespace on a specific cluster” [Baillargeon ¶ 10]. “Management of resources for cluster network control may be implemented at the network level 34, the cluster level 36, the node level 38, the pod level 40 and the container level 42” [Baillargeon ¶ 45].
Baillargeon fails to teach: in a machine learning scenario; acquiring, by the electronic device, available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state; and allocating, by the electronic device, a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource.

However, Sun teaches:

in a machine learning scenario: “For example, GPUs are used to accelerate data processing in high-performance computing (HPC) and embedded computing systems, for various applications such as financial modeling, scientific research, machine learning, data mining, video data transcoding, image analysis, image recognition, virus pattern matching, augmented reality, encryption/decryption, weather forecasting, big data comparisons, and other applications with computational workloads that have an inherently parallel nature” [Sun ¶ 4].

acquiring, by the electronic device, available node information of the target cluster and available GPU resource information of the target cluster: “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU sever nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66].
wherein the available node information comprises information of a node in an idle state among the plurality of nodes, “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. “If a single GPU server node allocation is determined (in block 418) to be sufficient to handle the GPU processing task(s) associated with the GPU service request, the GPU server allocation and scheduling module 142 will select a single registered GPU server node within the pool of GPU server nodes which has available GPU resources to handle the GPU processing task(s) (block 420)” [Sun ¶ 69]. “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59 Examiner notes a node in an idle state is interpreted as a node with idle resources]. and the available GPU resource information comprises information of a GPU resource of the node in the idle state; “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59]. 
and allocating, by the electronic device, a first target node having a target GPU resource satisfying the GPU resource requirement information “For example, the service attributes can specify a quality of service (QoS) and a priority level for executing the GPU processing tasks, wherein the allocation of one or more GPU server nodes within the server cluster 150 is dynamically determined so that the allocated GPU server nodes will collectively have sufficient processing resources to satisfy the service attributes specified in the GPU service request” [Sun ¶ 28]. and a second target node able to execute the target task by using the target GPU resource, “When executing the GPU processing tasks, the master GPU server node (second target node) will coordinate access to all GPU devices and resources access across the allocated (logically bound) master and slave GPU server nodes, returning processing results to the client system 110 only through the master GPU server node” [Sun ¶ 31]. “When the logical binding is complete, the GPU service controller 140 will return a response message to the GPU API 314 of the client system 310, wherein the response message comprises connection information to enable the GPU API 314 to connect to the elected master GPU server node to commence execution of the GPU processing task(s) associated with the GPU service request (block 424)” [Sun ¶ 72]. Sun is considered to be analogous to the claimed invention because it is in the same field of logical partitioning of resources. 
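The mapping above relies on the examiner's interpretation that a node in an idle state is a node with idle GPU resources. Purely as an illustration of that interpretation (the record fields and numbers are hypothetical, not drawn from Sun), the acquisition of available node information and available GPU resource information might be sketched as:

```python
def available_gpu_info(nodes):
    """Return (available_node_ids, idle_gpus_per_node) for the nodes
    that have at least one GPU not consumed by scheduled jobs."""
    available, idle_gpus = [], {}
    for node in nodes:
        free = node["total_gpus"] - node["allocated_gpus"]
        if free > 0:  # "idle state" read as: the node has idle GPU resources
            available.append(node["id"])
            idle_gpus[node["id"]] = free
    return available, idle_gpus

# Hypothetical cluster state.
cluster = [
    {"id": "node-a", "total_gpus": 8, "allocated_gpus": 8},  # fully busy
    {"id": "node-b", "total_gpus": 8, "allocated_gpus": 3},  # 5 idle GPUs
    {"id": "node-c", "total_gpus": 4, "allocated_gpus": 0},  # fully idle
]
```

Under this reading, node-a would be excluded from the available node information while node-b and node-c, with their idle GPU counts, would constitute the available GPU resource information.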
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon to incorporate the teachings of Sun and include: in a machine learning scenario, acquiring, by the electronic device, available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state; and allocating, by the electronic device, a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource. Doing so would allow for more flexibility in the scheduling of tasks for server nodes. “In addition, to solve the issue of GPU scaling, embodiments of the invention provide techniques to extend GPUaaS functionality by allowing multiple GPU servers to logically bind together to build a logical server across multiple GPU server nodes, thereby combining GPU resources to create a pool of GPU resources that can be utilized for handling GPU processing tasks requested by a client. These scaling techniques allow the GPUaaS system to present a larger logical pool of GPU devices than is available on any one GPU server node, and provides flexibility for a system administrator to acquire and apply GPU resources in smaller increments as needed” [Sun ¶ 18]. 
Baillargeon in view of Sun fails to explicitly teach selecting, by the electronic device, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; selecting, by the electronic device, based on the first node set, an extended node set from the plurality of nodes; and allocating … from the first node set or from the first and extended node sets. However, Zhao teaches: selecting, by the electronic device, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; “The control server node will determine a set of candidate GPU devices across the cluster of GPU server nodes which can meet the resource demands of the server request (block 602). For example, based on the resource demands of the service request, the control server node can determine a set of all qualified GPU devices across the server cluster which match the resource demands, and which are free for allocation. The set of candidate GPU devices can be GPU devices that reside on multiple GPU server nodes.” [Zhao ¶ 70, fig. 6]. selecting, by the electronic device, based on the first node set, an extended node set from the plurality of nodes; “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72, Fig. 5A]. 
“In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. and allocating … from the first node set or from the first and extended node sets. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. Zhao is considered to be analogous to the claimed invention because it is in the same field of allocation of resources. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun to incorporate the teachings of Zhao and include selecting, by the electronic device, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; selecting, by the electronic device, based on the first node set, an extended node set from the plurality of nodes; and allocating … from the first node set or from the first and extended node sets. Doing so would allow for optimized communication resources in the provisioning of computing devices. “… maintain information regarding the hardware connection topology of server nodes within a heterogeneous cluster, as well as current bandwidth usage information regarding intra-node and inter-node communication links of the server nodes, and utilize such information to provision computing devices (e.g., GPUs) in a way that optimizes communication bus and networking resources (mitigates or eliminates waste of network resources), and which optimally utilizes bidirectional connection topologies, in a balanced manner, to mitigate communication bottlenecks between computing resources” [Zhao ¶ 14]. 
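Zhao's Rule 1 selection, as quoted above, starts from the highest-ranked interconnection topology and falls back to lower-ranked ones until a sufficiently large same-topology group is found. A minimal sketch of that rule follows; the ranking order and the device records are illustrative assumptions based on the quoted ¶ 72, not Zhao's actual data structures:

```python
# Interconnection topologies ranked highest first (illustrative ordering
# following Zhao's examples: NVLink above PIX, PXB, etc.).
TOPOLOGY_RANK = ["NVLink", "PIX", "PXB", "PHB"]

def select_devices(candidates, n):
    """Pick N candidate GPU devices sharing the same interconnection
    topology, trying the highest-ranked topology first (Zhao, Rule 1)."""
    for topo in TOPOLOGY_RANK:
        group = [d for d in candidates if d["topology"] == topo]
        if len(group) >= n:
            return group[:n]
    return []  # no single-topology group is large enough

# Hypothetical candidate set: two NVLink devices, three PIX devices.
devices = [
    {"id": "gpu-0", "topology": "NVLink"},
    {"id": "gpu-1", "topology": "NVLink"},
    {"id": "gpu-2", "topology": "PIX"},
    {"id": "gpu-3", "topology": "PIX"},
    {"id": "gpu-4", "topology": "PIX"},
]
```

With these hypothetical candidates, a request for two devices would be satisfied at the NVLink rank, while a request for three would fall back to the PIX group, mirroring the fallback behavior described in Zhao ¶ 72.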
With regard to claim 2, Baillargeon in view of Sun in view of Zhao teaches the method of claim 1, as referenced above. Baillargeon further teaches wherein the GPU resource requirement information comprises amount of GPU resources requested by the target task, “Also, in some existing implementations, the cluster administrator must monitor usage of resources of a resource quota of resources assigned to a cluster to determine a desired or required capacity to be allocated to the cluster” [Baillargeon ¶ 8]. “Some embodiments described herein may impose pod and/or container resource requirements. In some embodiments, a cluster user consumes container resources produced by nodes in a k8s cluster. Container resources are classified in 2 categories: 1. "Basic" resources are defined in the kubernetes.io domain: cpu, memory, hugepages, ephemeral-storage; 2. Extended resources are defined outside the kubernetes.io domain: e.g., nvidia.com/gpu The cluster user may "request" container resources in the pod specification, (spec.containers[].resources):” [Baillargeon ¶ 56-60]. “Example: The number of NVIDIA GPUs requested by the pod/container” [Baillargeon ¶ 67 Table]. “A cluster auto-scaler (CA) is a tool of the cluster administrator to find pods that cannot be scheduled, and determines if adding a new cluster node similar to other nodes of the cluster would materially aid a desired allocation of resources, while attempting to meet cluster requirements” [Baillargeon ¶ 8]. Baillargeon fails to explicitly teach determining an idle quantity of GPU resource corresponding to each GPU of each second node. 
However, Sun teaches determining an idle quantity of GPU resource corresponding to each GPU of each second node “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59]. “For instance, the client request may specify a number of GPU devices for handling the GPU processing tasks associated with the GPU service request, wherein the allocation of one or more GPU server nodes within the server cluster 150 is determined so that the allocated GPU server nodes comprise a total number of available GPU devices that meet the specified number of GPU devices as requested in the service request” [Sun ¶ 28]. 
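The per-GPU idle-quantity determination that Sun is cited for might look like the following minimal sketch. The per-GPU utilization model and all field names are assumptions for illustration only; Sun describes the idle/busy distinction at the level of client usage, not this data structure:

```python
def idle_quantity_per_gpu(node):
    """For each GPU of a node, report the idle share of its capacity,
    modelled here as 1.0 minus a fractional utilization in [0.0, 1.0]."""
    return {
        gpu["id"]: round(1.0 - gpu["utilization"], 2)
        for gpu in node["gpus"]
    }

# Hypothetical second node with one partially busy and one idle GPU.
second_node = {
    "id": "node-b",
    "gpus": [
        {"id": "gpu-0", "utilization": 0.75},  # 25% of capacity idle
        {"id": "gpu-1", "utilization": 0.0},   # fully idle
    ],
}
```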
Baillargeon in view of Sun fails to teach a type of a GPU card, and a topology structure of the GPU card, and selecting, by the electronic device, the first node set satisfying the GPU resource requirement information from the plurality of nodes, comprises: determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod; determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set; each second node in the second node set; and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set. However, Zhao teaches: a type of a GPU card, “A service request can include various user-specified conditions and demands for executing a given job (e.g., DL training) associated with the service request. For example, a service request may specify (i) a desired number (N) of accelerator devices (e.g., GPU devices) to provision for the requested job, (ii) a specific type/model of accelerator device (e.g., NVidia P100 GPU, Tensor flow TPU, etc.) to be utilized for the requested job, (iii) whether the provisioned accelerator devices should be exclusively allocated for the requested job or can be shared with other jobs, and/or (iv) other conditions based on a service level agreement (SLA) with the given client” [Zhao ¶ 20]. and a topology structure of the GPU card, “A service request can include various user-specified conditions and demands for executing a given job (e.g., DL training) associated with the service request. 
For example, a service request may specify (i) a desired number (N) of accelerator devices (e.g., GPU devices) to provision for the requested job, (ii) a specific type/model of accelerator device (e.g., NVidia P100 GPU, Tensor flow TPU, etc.) to be utilized for the requested job, (iii) whether the provisioned accelerator devices should be exclusively allocated for the requested job or can be shared with other jobs, and/or (iv) other conditions based on a service level agreement (SLA) with the given client” [Zhao ¶ 20]. and selecting, by the electronic device, the first node set satisfying the GPU resource requirement information from the plurality of nodes, comprises: determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod; “The control server node will determine a set of candidate GPU devices across the cluster of GPU server nodes which can meet the resource demands of the server request (block 602). For example, based on the resource demands of the service request, the control server node can determine a set of all qualified GPU devices across the server cluster which match the resource demands, and which are free for allocation. The set of candidate GPU devices can be GPU devices that reside on multiple GPU server nodes.” [Zhao ¶ 70, fig. 6]. determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set; “Next, the control server node will evaluate the candidate GPU devices using topology information in the topology database 146 to select an optimal set of GPU devices to provision for handling the service request (block 604)” [Zhao ¶ 71]. 
“For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices (second node set) among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72]. each second node in the second node set; “In this instance, the set of N GPU devices (second node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set. “In this instance, the set of N GPU devices which have the same NVLink interconnection topology can be selected for scheduling and provisioning. On the other hand, there may not be enough (less than N) candidate GPU devices that implement the highest ranked (e.g., NVLink) communication protocol, but rather there may be N candidate GPU devices that implement a next highest ranked (e.g., PCIe) interconnection topology. In this case, the set of N candidate GPU devices which implement the next highest ranked interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72 Examiner notes the selected set of N GPU devices and their corresponding nodes are the first node set]. With regard to claim 4, Baillargeon in view of Sun in view of Zhao teaches the method of claim 2, as referenced above. 
Baillargeon in view of Sun fails to teach wherein selecting, by the electronic device, the extended node set from the plurality of nodes, comprises: acquiring information of a switch corresponding to each first node in the first node set; and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node; wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node. However, Zhao teaches: wherein selecting, by the electronic device, the extended node set from the plurality of nodes, comprises: acquiring information of a switch corresponding to each first node in the first node set; “FIG. 4 illustrates an example hardware topology of a GPU server node 400, and a corresponding system topology view 420, which can be generated and reported by a reporting agent 162 (FIG. 1) using a topology detection command utility, according to an embodiment of the invention” [Zhao ¶ 62]. “The system topology view 420 includes information which indicates that: (i) 4 GPUs were detected in the example topology 400; (ii) GPU0 and GPU1 are interconnected via an internal PCIe switch (PIX) with a CPU affinity to NUMA socket 0 (CPU0-7, 16-23), connected with Mellanox RoCE (single port) (mlx5_0) via host PCIe switch (PHB); and that (iii) GPU2 and GPU3 are interconnected via an internal PCIe switch (PIX), with a CPU affinity to NUMA socket1, with a long communication path between the Mellanox RoCE card and GPU2/GPU3” [Zhao ¶ 64, Fig. 4]. 
and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node; “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72, Fig. 5A]. “In this instance, the set of N GPU devices (third node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning. On the other hand, there may not be enough (less than N) candidate GPU devices that implement the highest ranked (e.g., NVLink) communication protocol, but rather there may be N candidate GPU devices (fourth node set) that implement a next highest ranked (e.g., PCIe) interconnection topology. In this case, the set of N candidate GPU devices which implement the next highest ranked interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. 
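The claimed switch-based grouping (a third node set of nodes on the same switch as a first node, and a fourth node set of nodes on a different switch) admits a direct sketch. The switch identifiers and node records are hypothetical illustrations of the claim language, not taken from Zhao:

```python
def extended_node_set(first_node, nodes):
    """Split the other nodes into a third node set (same switch as
    first_node) and a fourth node set (different switch)."""
    third = [n for n in nodes
             if n["id"] != first_node["id"] and n["switch"] == first_node["switch"]]
    fourth = [n for n in nodes if n["switch"] != first_node["switch"]]
    return third, fourth  # extended set: third and/or fourth node sets

# Hypothetical cluster wiring.
cluster_nodes = [
    {"id": "n1", "switch": "sw-1"},
    {"id": "n2", "switch": "sw-1"},  # same switch as n1 -> third node set
    {"id": "n3", "switch": "sw-2"},  # different switch  -> fourth node set
]
```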
With regard to claim 17, Baillargeon teaches: An electronic device, applied to a target cluster “In some embodiments, cluster capacity request and allocation using resource quotas allow a cluster user to quickly discover current cluster resources throughout a network and effectively communicate resource requirements to the cluster administrator, thus ensuring that a certain amount of resources are readily available at the right time and to the right cluster as desired for optimal resource allocations for RAN/5G operations” [Baillargeon ¶ 37]. wherein the target cluster comprises a plurality of nodes, and any one of the plurality of nodes is a bare machine, a physical machine, or a virtual machine; “A cluster consists of one or more master machines and multiple worker machines or nodes. The master runs the control plane functions and coordinates between all the nodes running the actual workloads known as pods” [Baillargeon ¶ 3]. “From a radio access network (RAN) developer perspective, Kubernetes gives an infrastructure provider the tools to create powerful, production-ready applications to run on virtual machines or physical servers (known as workers or nodes)” [Baillargeon ¶ 2]. wherein the target cluster is independent of an electronic device but connected to the electronic device, or the electronic device is deployed on the target cluster, “FIG. 15 is a block diagram of an example implementation of an OAM cluster 12 in communication via network 18 with a workload cluster 20. The OAM cluster 12 has a communication interface 44, which may communicate with the network 18, either wirelessly or by wireline … The communication interface 44 may be configured to facilitate a connection to other devices, e.g., workload cluster 20, via network 18” [Baillargeon ¶ 109]. “The OAM cluster 12 also has processing circuitry 46. The processing circuitry 46 may include a memory 48 and a processor 50” [Baillargeon ¶ 110]. 
“These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” [Baillargeon ¶ 124]. and the electronic device is configured to allocate resources for a pod corresponding to a target task executed by the target cluster; “Referring now to the drawing figures, where like elements are like numbered, there is shown in FIG. 1 a diagram of an Operations, Administration and Maintenance (OAM) network 10 that may be employed to implement methods described herein for managing resource quotas” [Baillargeon ¶ 45]. “In some embodiments, cluster capacity request and allocation using resource quotas allow a cluster user to quickly discover current cluster resources throughout a network and effectively communicate resource requirements to the cluster administrator, thus ensuring that a certain amount of resources are readily available at the right time and to the right cluster as desired for optimal resource allocations for RAN/5G operations” [Baillargeon ¶ 37]. wherein the electronic device comprises: at least one processor; and a memory connected in communication with the at least one processor; wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute operations, comprising: “The memory 48 is configured to store data, programmatic software code and/or other information described herein. 
In some embodiments, the software may include instructions that, when executed by the processor 50 and/or processing circuitry 46, causes the processor 50 and/or processing circuitry 46 to perform the processes described herein with respect to OAM cluster 12, e.g., the functions of the cluster user quota controller 14 and/or the cluster admin quota controller 16” [Baillargeon ¶ 111]. Creating the pod for the target task; “Cluster users create resources (pods, services, etc.) in the namespace, and the resource quota system tracks resource allocation (not the same as resource utilization) to ensure that the resource allocation does not exceed hard resource limits defined in the resource quota specification” [Baillargeon ¶ 69]. acquiring Graphics Processing Unit (GPU) resource requirement information of the target task; “Some embodiments described herein may impose pod and/or container resource requirements. In some embodiments, a cluster user consumes container resources produced by nodes in a k8s cluster. Container resources are classified in 2 categories: 1. "Basic" resources are defined in the kubernetes.io domain: cpu, memory, hugepages, ephemeral-storage; 2. Extended resources are defined outside the kubernetes.io domain: e.g., nvidia.com/gpu The cluster user may "request" container resources in the pod specification, (spec.containers[].resources):” [Baillargeon ¶ 56-60]. “Example: The number of NVIDIA GPUs requested by the pod/container” [Baillargeon ¶ 67 Table]. and allocating a first target node … for the pod, “The resource quota targets are requests informing the cluster administrator to increase or decrease the container resource allocation (by adding new nodes to a cluster or allocating existing cluster capacity to other users) for a namespace on a specific cluster” [Baillargeon ¶ 10]. 
“Management of resources for cluster network control may be implemented at the network level 34, the cluster level 36, the node level 38, the pod level 40 and the container level 42” [Baillargeon ¶ 45]. Baillargeon fails to teach in a machine learning scenario, acquiring available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state; and allocating a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource. However, Sun teaches: in a machine learning scenario, “For example, GPUs are used to accelerate data processing in high-performance computing (HPC) and embedded computing systems, for various applications such as financial modeling, scientific research, machine learning, data mining, video data transcoding, image analysis, image recognition, virus pattern matching, augmented reality, encryption/decryption, weather forecasting, big data comparisons, and other applications with computational workloads that have an inherently parallel nature” [Sun ¶ 4]. acquiring available node information of the target cluster and available GPU resource information of the target cluster; “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. 
wherein the available node information comprises information of a node in an idle state among the plurality of nodes, “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. “If a single GPU server node allocation is determined (in block 418) to be sufficient to handle the GPU processing task(s) associated with the GPU service request, the GPU server allocation and scheduling module 142 will select a single registered GPU server node within the pool of GPU server nodes which has available GPU resources to handle the GPU processing task(s) (block 420)” [Sun ¶ 69]. “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59 Examiner notes a node in an idle state is interpreted as a node with idle resources]. and the available GPU resource information comprises information of a GPU resource of the node in the idle state; “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59]. 
and allocating a first target node having a target GPU resource satisfying the GPU resource requirement information “For example, the service attributes can specify a quality of service (QoS) and a priority level for executing the GPU processing tasks, wherein the allocation of one or more GPU server nodes within the server cluster 150 is dynamically determined so that the allocated GPU server nodes will collectively have sufficient processing resources to satisfy the service attributes specified in the GPU service request” [Sun ¶ 28]. and a second target node able to execute the target task by using the target GPU resource, “When executing the GPU processing tasks, the master GPU server node (second target node) will coordinate access to all GPU devices and resources access across the allocated (logically bound) master and slave GPU server nodes, returning processing results to the client system 110 only through the master GPU server node” [Sun ¶ 31]. “When the logical binding is complete, the GPU service controller 140 will return a response message to the GPU API 314 of the client system 310, wherein the response message comprises connection information to enable the GPU API 314 to connect to the elected master GPU server node to commence execution of the GPU processing task(s) associated with the GPU service request (block 424)” [Sun ¶ 72]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon to incorporate the teachings of Sun and include: in a machine learning scenario, acquiring available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state; and allocating a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource. Doing so would allow for more flexibility in the scheduling of tasks for server nodes. “In addition, to solve the issue of GPU scaling, embodiments of the invention provide techniques to extend GPUaaS functionality by allowing multiple GPU servers to logically bind together to build a logical server across multiple GPU server nodes, thereby combining GPU resources to create a pool of GPU resources that can be utilized for handling GPU processing tasks requested by a client. These scaling techniques allow the GPUaaS system to present a larger logical pool of GPU devices than is available on any one GPU server node, and provides flexibility for a system administrator to acquire and apply GPU resources in smaller increments as needed” [Sun ¶ 18]. Baillargeon in view of Sun fails to explicitly teach selecting, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; selecting based on the first node set, an extended node set from the plurality of nodes; and allocating … from the first node set or from the first and extended node sets. 
However, Zhao teaches: selecting, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; “The control server node will determine a set of candidate GPU devices across the cluster of GPU server nodes which can meet the resource demands of the server request (block 602). For example, based on the resource demands of the service request, the control server node can determine a set of all qualified GPU devices across the server cluster which match the resource demands, and which are free for allocation. The set of candidate GPU devices can be GPU devices that reside on multiple GPU server nodes.” [Zhao ¶ 70, fig. 6]. selecting based on the first node set, an extended node set from the plurality of nodes; “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72, Fig. 5A]. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. and allocating … from the first node set or from the first and extended node sets. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72].
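Zhao's selection rule (¶¶ 70-72) can be sketched in a few lines: gather the qualified candidate GPU devices, then try to pick N devices sharing the highest-ranked interconnection topology, falling back to lower-ranked topologies. This sketch is illustrative only and not part of the record; the ranking list, data shapes, and function names are assumptions rather than Zhao's actual implementation.

```python
# Interconnect types ranked best-first, following the ordering Zhao's
# Fig. 5A suggests (NVLink above PIX, PXB, etc.). Illustrative only.
TOPOLOGY_RANK = ["NVLink", "PIX", "PXB", "PHB"]

def select_gpu_set(candidates, n_requested):
    """candidates: list of dicts with 'id' and 'topology' keys (assumed shape).
    Returns the first set of n_requested devices sharing one topology,
    walking from the highest-ranked interconnect down, else None."""
    for topo in TOPOLOGY_RANK:
        same_topo = [g for g in candidates if g["topology"] == topo]
        if len(same_topo) >= n_requested:
            # Enough devices share this interconnect; provision these.
            return same_topo[:n_requested]
    return None  # no single-topology set is large enough

candidates = [
    {"id": "gpu0", "topology": "NVLink"},
    {"id": "gpu1", "topology": "NVLink"},
    {"id": "gpu2", "topology": "PIX"},
]
print(select_gpu_set(candidates, 2))
# [{'id': 'gpu0', 'topology': 'NVLink'}, {'id': 'gpu1', 'topology': 'NVLink'}]
```

If fewer than N candidates implement the top-ranked interconnect, the loop falls through to the next-ranked topology, mirroring Zhao's NVLink-then-PCIe fallback in ¶ 72.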
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun to incorporate the teachings of Zhao and include selecting, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; selecting based on the first node set, an extended node set from the plurality of nodes; and allocating … from the first node set or from the first and extended node sets. Doing so would allow for optimized communication resources in the provisioning of computing devices. “… maintain information regarding the hardware connection topology of server nodes within a heterogeneous cluster, as well as current bandwidth usage information regarding intra-node and inter-node communication links of the server nodes, and utilize such information to provision computing devices (e.g., GPUs) in a way that optimizes communication bus and networking resources (mitigates or eliminates waste of network resources), and which optimally utilizes bidirectional connection topologies, in a balanced manner, to mitigate communication bottlenecks between computing resources” [Zhao ¶ 14]. With regard to claim 18, Baillargeon in view of Sun in view of Zhao teaches the electronic device of claim 17, as referenced above. Baillargeon further teaches wherein the GPU resource requirement information comprises amount of GPU resources requested by the target task, “Also, in some existing implementations, the cluster administrator must monitor usage of resources of a resource quota of resources assigned to a cluster to determine a desired or required capacity to be allocated to the cluster” [Baillargeon ¶ 8]. “Some embodiments described herein may impose pod and/or container resource requirements. In some embodiments, a cluster user consumes container resources produced by nodes in a k8s cluster.
Container resources are classified in 2 categories: 1. "Basic" resources are defined in the kubernetes/io domain: cpu, memory, hugepages, ephemeral-storage; 2. Extended resources are defined outside the kubernetes/io domain: e.g., nvidia.com/gpu The cluster user may "request" container resources in the pod specification, (spec. containers [ ].resources):” [Baillargeon ¶ 56-60]. “Example: The number of NVIDIA GPUs requested by the pod/container” [Baillargeon ¶ 67 Table]. “A cluster auto-scaler (CA) is a tool of the cluster administrator to find pods that cannot be scheduled, and determines if adding a new cluster node similar to other nodes of the cluster would materially aid a desired allocation of resources, while attempting to meet cluster requirements” [Baillargeon ¶ 8]. Baillargeon fails to explicitly teach determining an idle quantity of GPU resource corresponding to each GPU of each second node. However, Sun teaches determining an idle quantity of GPU resource corresponding to each GPU of each second node “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59].
“For instance, the client request may specify a number of GPU devices for handling the GPU processing tasks associated with the GPU service request, wherein the allocation of one or more GPU server nodes within the server cluster 150 is determined so that the allocated GPU server nodes comprise a total number of available GPU devices that meet the specified number of GPU devices as requested in the service request” [Sun ¶ 28]. Baillargeon in view of Sun fails to teach a type of a GPU card, and a topology structure of the GPU card, and selecting the first node set satisfying the GPU resource requirement information from the plurality of nodes, comprises: determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod; determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set; the second node set; and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set. However, Zhao teaches: a type of a GPU card, “A service request can include various user-specified conditions and demands for executing a given job (e.g., DL training) associated with the service request. For example, a service request may specify (i) a desired number (N) of accelerator devices (e.g., GPU devices) to provision for the requested job, (ii) a specific type/model of accelerator device (e.g., NVidia P100 GPU, Tensor flow TPU, etc.)
to be utilized for the requested job, (iii) whether the provisioned accelerator devices should be exclusively allocated for the requested job or can be shared with other jobs, and/or (iv) other conditions based on a service level agreement (SLA) with the given client” [Zhao ¶ 20]. and a topology structure of the GPU card, “A service request can include various user-specified conditions and demands for executing a given job (e.g., DL training) associated with the service request. For example, a service request may specify (i) a desired number (N) of accelerator devices (e.g., GPU devices) to provision for the requested job, (ii) a specific type/model of accelerator device (e.g., NVidia P100 GPU, Tensor flow TPU, etc.) to be utilized for the requested job, (iii) whether the provisioned accelerator devices should be exclusively allocated for the requested job or can be shared with other jobs, and/or (iv) other conditions based on a service level agreement (SLA) with the given client” [Zhao ¶ 20]. and selecting the first node set satisfying the GPU resource requirement information from the plurality of nodes, comprises: determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod; “The control server node will determine a set of candidate GPU devices across the cluster of GPU server nodes which can meet the resource demands of the server request (block 602). For example, based on the resource demands of the service request, the control server node can determine a set of all qualified GPU devices across the server cluster which match the resource demands, and which are free for allocation. The set of candidate GPU devices can be GPU devices that reside on multiple GPU server nodes.” [Zhao ¶ 70, fig. 6].
determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set; “Next, the control server node will evaluate the candidate GPU devices using topology information in the topology database 146 to select an optimal set of GPU devices to provision for handling the service request (block 604)” [Zhao ¶ 71]. “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices (second node set) among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72]. each second node in the second node set; “In this instance, the set of N GPU devices (second node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set. “In this instance, the set of N GPU devices which have the same NVLink interconnection topology can be selected for scheduling and provisioning. On the other hand, there may not be enough (less than N) candidate GPU devices that implement the highest ranked (e.g., NVLink) communication protocol, but rather there may be N candidate GPU devices that implement a next highest ranked (e.g., PCIe) interconnection topology.
In this case, the set of N candidate GPU devices which implement the next highest ranked interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72 Examiner notes the selected set of N GPU devices and their corresponding nodes are the first node set]. With regard to claim 19, Baillargeon teaches: A non-transitory computer-readable storage medium, applied to a target cluster “In some embodiments, cluster capacity request and allocation using resource quotas allow a cluster user to quickly discover current cluster resources throughout a network and effectively communicate resource requirements to the cluster administrator, thus ensuring that a certain amount of resources are readily available at the right time and to the right cluster as desired for optimal resource allocations for RAN/5G operations” [Baillargeon ¶ 37]. wherein the target cluster comprises a plurality of nodes, and any one of the plurality of nodes is a bare machine, a physical machine, or a virtual machine; “A cluster consists of one or more master machines and multiple worker machines or nodes. The master runs the control plane functions and coordinates between all the nodes running the actual workloads known as pods” [Baillargeon ¶ 3]. “From a radio access network (RAN) developer perspective, Kubernetes gives an infrastructure provider the tools to create powerful, production-ready applications to run on virtual machines or physical servers (known as workers or nodes)” [Baillargeon ¶ 2]. wherein the target cluster is independent of a computer but connected to the computer, or the computer is deployed on the target cluster, “FIG. 15 is a block diagram of an example implementation of an OAM cluster 12 in communication via network 18 with a workload cluster 20.
The OAM cluster 12 has a communication interface 44, which may communicate with the network 18, either wirelessly or by wireline … The communication interface 44 may be configured to facilitate a connection to other devices, e.g., workload cluster 20, via network 18” [Baillargeon ¶ 109]. “The OAM cluster 12 also has processing circuitry 46. The processing circuitry 46 may include a memory 48 and a processor 50” [Baillargeon ¶ 110]. “These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks” [Baillargeon ¶ 124]. and the computer is configured to allocate resources for a pod corresponding to a target task executed by the target cluster; “Referring now to the drawing figures, where like elements are like numbered, there is shown in FIG. 1 a diagram of an Operations, Administration and Maintenance (OAM) network 10 that may be employed to implement methods described herein for managing resource quotas” [Baillargeon ¶ 45]. “In some embodiments, cluster capacity request and allocation using resource quotas allow a cluster user to quickly discover current cluster resources throughout a network and effectively communicate resource requirements to the cluster administrator, thus ensuring that a certain amount of resources are readily available at the right time and to the right cluster as desired for optimal resource allocations for RAN/5G operations” [Baillargeon ¶ 37]. 
wherein the non-transitory computer-readable storage medium stores a computer instruction thereon, and the computer instruction is used to cause the computer to execute operations, comprising: “Furthermore, the disclosure may take the form of a computer program product on a tangible computer usable storage medium having computer program code embodied in the medium that can be executed by a computer” [Baillargeon ¶ 123]. “The memory 48 is configured to store data, programmatic software code and/or other information described herein. In some embodiments, the software may include instructions that, when executed by the processor 50 and/or processing circuitry 46, causes the processor 50 and/or processing circuitry 46 to perform the processes described herein with respect to OAM cluster 12, e.g., the functions of the cluster user quota controller 14 and/or the cluster admin quota controller 16” [Baillargeon ¶ 111]. creating the pod for the target task; “Cluster users create resources (pods, services, etc.) in the namespace, and the resource quota system tracks resource allocation (not the same as resource utilization) to ensure that the resource allocation does not exceed hard resource limits defined in the resource quota specification” [Baillargeon ¶ 69]. acquiring Graphics Processing Unit (GPU) resource requirement information of the target task; “Some embodiments described herein may impose pod and/or container resource requirements. In some embodiments, a cluster user consumes container resources produced by nodes in a k8s cluster. Container resources are classified in 2 categories: 1. "Basic" resources are defined in the kubernetes/io domain: cpu, memory, hugepages, ephemeral-storage; 2. Extended resources are defined outside the kubernetes/io domain: e.g., nvidia.com/gpu The cluster user may "request" container resources in the pod specification, (spec. containers [ ].resources):” [Baillargeon ¶ 56-60]. 
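The container-resource request that Baillargeon describes (spec.containers[].resources, with nvidia.com/gpu as an extended resource outside the kubernetes.io domain) can be sketched as a plain data structure. This is an illustrative sketch only; the values and the helper function are invented for illustration and are not part of Baillargeon's disclosure.

```python
# Sketch of a Kubernetes pod specification requesting the extended
# nvidia.com/gpu resource alongside basic resources. Values are examples.
pod_spec = {
    "spec": {
        "containers": [{
            "name": "trainer",  # hypothetical container name
            "resources": {
                "requests": {"cpu": "4", "memory": "16Gi", "nvidia.com/gpu": 2},
                "limits": {"nvidia.com/gpu": 2},
            },
        }]
    }
}

def gpus_requested(spec):
    # Sum the nvidia.com/gpu requests across all containers in the pod;
    # this total is what a scheduler would match against node capacity.
    return sum(int(c["resources"]["requests"].get("nvidia.com/gpu", 0))
               for c in spec["spec"]["containers"])

print(gpus_requested(pod_spec))  # 2
```

The summed request corresponds to the claimed "amount of GPU resources requested by the target task" that candidate nodes must satisfy.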
“Example: The number of NVIDIA GPUs requested by the pod/container” [Baillargeon ¶ 67 Table]. and allocating a first target node … for the pod, “The resource quota targets are requests informing the cluster administrator to increase or decrease the container resource allocation (by adding new nodes to a cluster or allocating existing cluster capacity to other users) for a namespace on a specific cluster” [Baillargeon ¶ 10]. “Management of resources for cluster network control may be implemented at the network level 34, the cluster level 36, the node level 38, the pod level 40 and the container level 42” [Baillargeon ¶ 45]. Baillargeon fails to teach in a machine learning scenario, acquiring available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state; and allocating a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource. However, Sun teaches: in a machine learning scenario, “For example, GPUs are used to accelerate data processing in high-performance computing (HPC) and embedded computing systems, for various applications such as financial modeling, scientific research, machine learning, data mining, video data transcoding, image analysis, image recognition, virus pattern matching, augmented reality, encryption/decryption, weather forecasting, big data comparisons, and other applications with computational workloads that have an inherently parallel nature” [Sun ¶ 4].
acquiring available node information of the target cluster and available GPU resource information of the target cluster; “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. wherein the available node information comprises information of a node in an idle state among the plurality of nodes, “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. “If a single GPU server node allocation is determined (in block 418) to be sufficient to handle the GPU processing task(s) associated with the GPU service request, the GPU server allocation and scheduling module 142 will select a single registered GPU server node within the pool of GPU server nodes which has available GPU resources to handle the GPU processing task(s) (block 420)” [Sun ¶ 69]. “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59 Examiner notes a node in an idle state is interpreted as a node with idle resources].
and the available GPU resource information comprises information of a GPU resource of the node in the idle state; “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59]. and allocating a first target node having a target GPU resource satisfying the GPU resource requirement information “For example, the service attributes can specify a quality of service (QoS) and a priority level for executing the GPU processing tasks, wherein the allocation of one or more GPU server nodes within the server cluster 150 is dynamically determined so that the allocated GPU server nodes will collectively have sufficient processing resources to satisfy the service attributes specified in the GPU service request” [Sun ¶ 28]. and a second target node able to execute the target task by using the target GPU resource, “When executing the GPU processing tasks, the master GPU server node (second target node) will coordinate access to all GPU devices and resources access across the allocated (logically bound) master and slave GPU server nodes, returning processing results to the client system 110 only through the master GPU server node” [Sun ¶ 31]. “When the logical binding is complete, the GPU service controller 140 will return a response message to the GPU API 314 of the client system 310, wherein the response message comprises connection information to enable the GPU API 314 to connect to the elected master GPU server node to commence execution of the GPU processing task(s) associated with the GPU service request (block 424)” [Sun ¶ 72]. 
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon to incorporate the teachings of Sun and include: in a machine learning scenario, acquiring available node information of the target cluster and available GPU resource information of the target cluster; wherein the available node information comprises information of a node in an idle state among the plurality of nodes, and the available GPU resource information comprises information of a GPU resource of the node in the idle state; and allocating a first target node having a target GPU resource satisfying the GPU resource requirement information and a second target node able to execute the target task by using the target GPU resource. Doing so would allow for more flexibility in the scheduling of tasks for server nodes. “In addition, to solve the issue of GPU scaling, embodiments of the invention provide techniques to extend GPUaaS functionality by allowing multiple GPU servers to logically bind together to build a logical server across multiple GPU server nodes, thereby combining GPU resources to create a pool of GPU resources that can be utilized for handling GPU processing tasks requested by a client. These scaling techniques allow the GPUaaS system to present a larger logical pool of GPU devices than is available on any one GPU server node, and provides flexibility for a system administrator to acquire and apply GPU resources in smaller increments as needed” [Sun ¶ 18]. Baillargeon in view of Sun fails to explicitly teach selecting, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; selecting based on the first node set, an extended node set from the plurality of nodes; and allocating … from the first node set or from the first and extended node sets.
However, Zhao teaches: selecting, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; “The control server node will determine a set of candidate GPU devices across the cluster of GPU server nodes which can meet the resource demands of the server request (block 602). For example, based on the resource demands of the service request, the control server node can determine a set of all qualified GPU devices across the server cluster which match the resource demands, and which are free for allocation. The set of candidate GPU devices can be GPU devices that reside on multiple GPU server nodes.” [Zhao ¶ 70, fig. 6]. selecting based on the first node set, an extended node set from the plurality of nodes; “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72, Fig. 5A]. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. and allocating … from the first node set or from the first and extended node sets. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun to incorporate the teachings of Zhao and include selecting, based on the available node information and the available GPU resource information, a first node set satisfying the GPU resource requirement information from the plurality of nodes; selecting based on the first node set, an extended node set from the plurality of nodes; and allocating … from the first node set or from the first and extended node sets. Doing so would allow for optimized communication resources in the provisioning of computing devices. “… maintain information regarding the hardware connection topology of server nodes within a heterogeneous cluster, as well as current bandwidth usage information regarding intra-node and inter-node communication links of the server nodes, and utilize such information to provision computing devices (e.g., GPUs) in a way that optimizes communication bus and networking resources (mitigates or eliminates waste of network resources), and which optimally utilizes bidirectional connection topologies, in a balanced manner, to mitigate communication bottlenecks between computing resources” [Zhao ¶ 14]. With regard to claim 20, Baillargeon in view of Sun in view of Zhao teaches the non-transitory computer-readable storage medium of claim 19, as referenced above. Baillargeon further teaches wherein the GPU resource requirement information comprises amount of GPU resources requested by the target task, “Also, in some existing implementations, the cluster administrator must monitor usage of resources of a resource quota of resources assigned to a cluster to determine a desired or required capacity to be allocated to the cluster” [Baillargeon ¶ 8]. “Some embodiments described herein may impose pod and/or container resource requirements.
In some embodiments, a cluster user consumes container resources produced by nodes in a k8s cluster. Container resources are classified in 2 categories: 1. "Basic" resources are defined in the kubernetes/io domain: cpu, memory, hugepages, ephemeral-storage; 2. Extended resources are defined outside the kubernetes/io domain: e.g., nvidia.com/gpu The cluster user may "request" container resources in the pod specification, (spec. containers [ ].resources):” [Baillargeon ¶ 56-60]. “Example: The number of NVIDIA GPUs requested by the pod/container” [Baillargeon ¶ 67 Table]. “A cluster auto-scaler (CA) is a tool of the cluster administrator to find pods that cannot be scheduled, and determines if adding a new cluster node similar to other nodes of the cluster would materially aid a desired allocation of resources, while attempting to meet cluster requirements” [Baillargeon ¶ 8]. Baillargeon fails to explicitly teach determining an idle quantity of GPU resource corresponding to each GPU of each second node. However, Sun teaches determining an idle quantity of GPU resource corresponding to each GPU of each second node “In particular, the GPU server allocation and scheduling module 142 will access the database of GPU server registration information 146 to determine all available GPU resources and GPU server nodes within the current GPU resource pool of the GPU service platform 130, and determine all pending jobs that are currently scheduled for execution (or which are being executed) by the GPU server nodes” [Sun ¶ 66]. “In this regard, when a first client system is idling (e.g., the user is not executing the GPU-accelerated application, or the GPU accelerated application is not utilizing the GPU device at a given time), the GPU device can be utilized by a second client system” [Sun ¶ 59].
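The idle-quantity determination Sun is cited for (¶ 66) amounts to bookkeeping: registered GPU capacity per node, minus the resources held by pending or running jobs, yields the idle quantity available for allocation. The sketch below is illustrative only; the data shapes, units, and names are assumptions, not Sun's actual structures.

```python
def idle_gpu_quantities(registered, pending_jobs):
    """registered: {node: {gpu_id: total_units}} (assumed shape);
    pending_jobs: list of (node, gpu_id, units_in_use) tuples.
    Returns per-GPU idle quantities without mutating the inputs."""
    idle = {node: dict(gpus) for node, gpus in registered.items()}
    for node, gpu_id, used in pending_jobs:
        # Subtract capacity already committed to scheduled/executing jobs.
        idle[node][gpu_id] -= used
    return idle

registered = {"node-a": {"gpu0": 100, "gpu1": 100}}
pending = [("node-a", "gpu0", 60)]
print(idle_gpu_quantities(registered, pending))
# {'node-a': {'gpu0': 40, 'gpu1': 100}}
```

A scheduler would then compare these idle quantities against the amount of GPU resources the pod requests when narrowing the second node set down to the first node set.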
“For instance, the client request may specify a number of GPU devices for handling the GPU processing tasks associated with the GPU service request, wherein the allocation of one or more GPU server nodes within the server cluster 150 is determined so that the allocated GPU server nodes comprise a total number of available GPU devices that meet the specified number of GPU devices as requested in the service request” [Sun ¶ 28]. Baillargeon in view of Sun fails to teach a type of a GPU card, and a topology structure of the GPU card, and selecting the first node set satisfying the GPU resource requirement information from the plurality of nodes, comprises: determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod; determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set; the second node set; and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set. However, Zhao teaches: a type of a GPU card, “A service request can include various user-specified conditions and demands for executing a given job (e.g., DL training) associated with the service request. For example, a service request may specify (i) a desired number (N) of accelerator devices (e.g., GPU devices) to provision for the requested job, (ii) a specific type/model of accelerator device (e.g., NVidia P100 GPU, Tensor flow TPU, etc.)
to be utilized for the requested job, (iii) whether the provisioned accelerator devices should be exclusively allocated for the requested job or can be shared with other jobs, and/or (iv) other conditions based on a service level agreement (SLA) with the given client” [Zhao ¶ 20]. and a topology structure of the GPU card, “A service request can include various user-specified conditions and demands for executing a given job (e.g., DL training) associated with the service request. For example, a service request may specify (i) a desired number (N) of accelerator devices (e.g., GPU devices) to provision for the requested job, (ii) a specific type/model of accelerator device (e.g., NVidia P100 GPU, Tensor flow TPU, etc.) to be utilized for the requested job, (iii) whether the provisioned accelerator devices should be exclusively allocated for the requested job or can be shared with other jobs, and/or (iv) other conditions based on a service level agreement (SLA) with the given client” [Zhao ¶ 20]. and selecting the first node set satisfying the GPU resource requirement information from the plurality of nodes, comprises: determining, based on the available node information and the available GPU resource information, a plurality of candidate nodes satisfying the amount of GPU resources requested by the target task for the pod; “The control server node will determine a set of candidate GPU devices across the cluster of GPU server nodes which can meet the resource demands of the server request (block 602). For example, based on the resource demands of the service request, the control server node can determine a set of all qualified GPU devices across the server cluster which match the resource demands, and which are free for allocation. The set of candidate GPU devices can be GPU devices that reside on multiple GPU server nodes.” [Zhao ¶ 70, fig. 6].
determining, from the plurality of candidate nodes, at least one second node satisfying the type of the GPU card and the topology structure of the GPU card, to obtain a second node set; “Next, the control server node will evaluate the candidate GPU devices using topology information in the topology database 146 to select an optimal set of GPU devices to provision for handling the service request (block 604)” [Zhao ¶ 71]. “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices (second node set) among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72]. acquiring an idle quantity of GPU resource of each second node in the second node set; “In this instance, the set of N GPU devices (second node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. and selecting, from the second first node set, at least one first node where the idle quantity of GPU resource satisfies the amount of GPU resources requested by the target task, to obtain the first node set. “In this instance, the set of N GPU devices which have the same NVLink interconnection topology can be selected for scheduling and provisioning. On the other hand, there may not be enough (less than N) candidate GPU devices that implement the highest ranked (e.g., NVLink) communication protocol, but rather there may be N candidate GPU devices that implement a next highest ranked (e.g., PCIe) interconnection topology.
In this case, the set of N candidate GPU devices which implement the next highest ranked interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72 Examiner notes the selected set of N GPU devices and their corresponding nodes are the first node set].

With regard to claim 21, Baillargeon in view of Sun in view of Zhao teaches the non-transitory computer-readable storage medium of claim 20, as referenced above.

Baillargeon in view of Sun fails to teach wherein selecting the extended node set from the plurality of nodes, comprises: acquiring information of a switch corresponding to each first node in the first node set; and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node; wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node.

However, Zhao teaches: wherein selecting the extended node set from the plurality of nodes, comprises: acquiring information of a switch corresponding to each first node in the first node set; “FIG. 4 illustrates an example hardware topology of a GPU server node 400, and a corresponding system topology view 420, which can be generated and reported by a reporting agent 162 (FIG. 1) using a topology detection command utility, according to an embodiment of the invention” [Zhao ¶ 62].
“The system topology view 420 includes information which indicates that: (i) 4 GPUs were detected in the example topology 400; (ii) GPU0 and GPU1 are interconnected via an internal PCIe switch (PIX) with a CPU affinity to NUMA socket 0 (CPU0-7, 16-23), connected with Mellanox RoCE (single port) (mlx5_0) via host PCIe switch (PHB); and that (iii) GPU2 and GPU3 are interconnected via an internal PCIe switch (PIX), with a CPU affinity to NUMA socket1, with a long communication path between the Mellanox RoCE card and GPU2/GPU3” [Zhao ¶ 64, Fig. 4]. and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node; “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72, Fig. 5A]. “In this instance, the set of N GPU devices (third node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning. On the other hand, there may not be enough (less than N) candidate GPU devices that implement the highest ranked (e.g., NVLink) communication protocol, but rather there may be N candidate GPU devices (fourth node set) that implement a next highest ranked (e.g., PCIe) interconnection topology.
In this case, the set of N candidate GPU devices which implement the next highest ranked interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72].

With regard to claim 25, Baillargeon in view of Sun in view of Zhao teaches the electronic device of claim 18, as referenced above.

Baillargeon in view of Sun fails to teach wherein selecting the extended node set from the plurality of nodes, comprises: acquiring information of a switch corresponding to each first node in the first node set; and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node; wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node.

However, Zhao teaches: wherein selecting the extended node set from the plurality of nodes, comprises: acquiring information of a switch corresponding to each first node in the first node set; “FIG. 4 illustrates an example hardware topology of a GPU server node 400, and a corresponding system topology view 420, which can be generated and reported by a reporting agent 162 (FIG. 1) using a topology detection command utility, according to an embodiment of the invention” [Zhao ¶ 62].
“The system topology view 420 includes information which indicates that: (i) 4 GPUs were detected in the example topology 400; (ii) GPU0 and GPU1 are interconnected via an internal PCIe switch (PIX) with a CPU affinity to NUMA socket 0 (CPU0-7, 16-23), connected with Mellanox RoCE (single port) (mlx5_0) via host PCIe switch (PHB); and that (iii) GPU2 and GPU3 are interconnected via an internal PCIe switch (PIX), with a CPU affinity to NUMA socket1, with a long communication path between the Mellanox RoCE card and GPU2/GPU3” [Zhao ¶ 64, Fig. 4]. and grouping a node among the plurality of nodes that corresponds to a same switch as the first node, into a third node set corresponding to the first node; or grouping a node among the plurality of nodes that corresponds to a different switch from the first node, into a fourth node set corresponding to the first node; “For example, one rule (e.g., Rule 1) can specify to determine a set of N GPU devices among the candidate GPU devices which have the same interconnection topology, starting from the highest ranked interconnection topology (e.g., NVLink, FIG. 5A), and then to lower ranked interconnection topologies (e.g., PIX, PXB, etc., FIG. 5A). For example, there may exist a plurality (N) of candidate GPU devices that reside on one or more server nodes which implement the NVLink communication protocol” [Zhao ¶ 72, Fig. 5A]. “In this instance, the set of N GPU devices (third node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning. On the other hand, there may not be enough (less than N) candidate GPU devices that implement the highest ranked (e.g., NVLink) communication protocol, but rather there may be N candidate GPU devices (fourth node set) that implement a next highest ranked (e.g., PCIe) interconnection topology.
In this case, the set of N candidate GPU devices which implement the next highest ranked interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72]. wherein an extended node set corresponding to the first node comprises the third node set corresponding to the first node and/or the fourth node set corresponding to the first node. “In this instance, the set of N GPU devices (extended node set) which have the same NVLink interconnection topology can be selected for scheduling and provisioning” [Zhao ¶ 72].

Claims 6, 8, 22, 23, 26, and 27 are rejected under 35 U.S.C. 103 as being unpatentable over Baillargeon (US 2023/0153162 A1) in view of Sun (US 2019/0197655 A1) in view of Zhao (US 2019/0312772 A1) in view of Randhawa (US 9,077,580 B1).

With regard to claim 6, Baillargeon in view of Sun in view of Zhao teaches the method of claim 4, as referenced above.

Baillargeon further teaches wherein allocating, by the electronic device, the first target node and the second target node for the pod, from the first node set or from the first and extended node sets, comprises: acquiring load situations respectively corresponding to the first node, the third node set and the fourth node set; “If a resource quota is enabled in a namespace for computing resources like CPU and memory, users may specify requests or limits for those values; otherwise, the resource quota system may reject pod creation” [Baillargeon ¶ 44]. “In some embodiments, a pod is only scheduled if all the resource "requests" are satisfied including CPU, memory and extended resources (load situations): The pod may remain in the PENDING state if a node cannot satisfy all the resource requirements; or The pod is not created and a pod controller is waiting for an increase in resource quota” [Baillargeon ¶ 64 Examiner notes this interpretation of load situation is in accordance with the description given in ¶ 61 of the instant specification].
Baillargeon in view of Sun fails to teach determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute. However, Zhao teaches: determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute; “The computing resource scheduling and provisioning module 142 is configured to implement a topology aware provisioning process that is based on a "weighted" consideration of factors (attributes) including current cluster topology and bandwidth usage, which enables the computing service platform 130 to provide intelligent, optimized computing infrastructures that can fully utilize state-of-the-art hardware accelerators (e.g., GPU, FPGA etc.) and better serve emerging workloads like distributed deep learning, or other HPC workloads” [Zhao ¶ 35]. Baillargeon in view of Sun in view of Zhao fails to teach determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. 
However, Randhawa teaches: determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; “A weighting module 313 of node preference management system 101 can weigh the different parameters as desired, with different weighting values being applied to different parameters in different embodiments, depending on the importance given to the various factors in the calculation of node suitability… In other words, weights can be applied to measured parameters concerning nodes 303, and the applied weights affect the relative consequence of the corresponding measured parameters in calculating preference ratings 301 (total weight value)” [Randhawa Col. 7 Lines 20-25, 26-30]. wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; “For example, the set that contains the individual node 303 with the highest preference rating 301 can be chosen, the set with the highest sum of individual preference ratings 301 can be chosen, the set with the highest average, mean, mode or median can be chosen, etc” [Randhawa Col. 10 lines 57-61]. and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. “For example, the set that contains the individual node 303 with the highest preference rating 301 can be chosen, the set with the highest sum of individual preference ratings 301 can be chosen, the set with the highest average, mean, mode or median can be chosen, etc” [Randhawa Col. 10 lines 57-61]. 
“In other words, in response to the node preference management system 101 determining that it is time to select one (or more) most preferred node(s) 303 to perform the specific functional role in the cluster 305, the node parameters are measured, the user preference values are gleaned (e.g., received from a user, read from a configuration file, default values used, etc.), and the preference ratings 301 are calculated. The node(s) 303 best suited to perform the specific functional role as indicated by the calculated preference ratings 301 is then appointed to perform the role in the cluster 305” [Randhawa Col. 9 Lines 3-13].

Randhawa is considered to be analogous to the claimed invention because it is in the same field of multiprogramming arrangements. The system of Baillargeon in view of Sun in view of Zhao includes a weighted consideration of attributes in the provisioning process for assigning GPU nodes to tasks. This can be combined with the process of Randhawa wherein node preferences are calculated using weighted parameters. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun in view of Zhao to incorporate the teachings of Randhawa and include determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. Doing so would allow for flexibility in the prioritization of different attributes within the node selection process.
“The preference rating for each specific node indicates its suitability for the specific functional role in the cluster, relative to the other nodes of the plurality. Weights can be applied to the measured parameters concerning nodes, such that the applied weights affect the relative consequence of the corresponding measured parameters in calculating preference ratings for the nodes” [Randhawa Col. 2 Lines 29-36]. With regard to claim 8, Baillargeon in view of Sun in view of Zhao in view of Randhawa teaches the method of claim 6, as referenced above. Baillargeon fails to teach wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node. However, Sun teaches wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node. 
“wherein the service request specifies one or more attributes associated with the GPU processing tasks specified by the service request, wherein the one or more attributes specify at least one of a quality of service (QoS) and a priority level for executing the GPU processing tasks, and wherein allocating comprises determining at least two GPU server nodes within the cluster of GPU server nodes having sufficient processing resources to satisfy the specified one or more attributes” [Sun Claim 4].

With regard to claim 22, Baillargeon in view of Sun in view of Zhao teaches the non-transitory computer-readable storage medium of claim 21, as referenced above.

Baillargeon further teaches wherein allocating the first target node and the second target node for the pod, from the first node set or from the first and extended node sets, comprises: acquiring load situations respectively corresponding to the first node, the third node set and the fourth node set; “If a resource quota is enabled in a namespace for computing resources like CPU and memory, users may specify requests or limits for those values; otherwise, the resource quota system may reject pod creation” [Baillargeon ¶ 44]. “In some embodiments, a pod is only scheduled if all the resource "requests" are satisfied including CPU, memory and extended resources (load situations): The pod may remain in the PENDING state if a node cannot satisfy all the resource requirements; or The pod is not created and a pod controller is waiting for an increase in resource quota” [Baillargeon ¶ 64 Examiner notes this interpretation of load situation is in accordance with the description given in ¶ 61 of the instant specification].

Baillargeon in view of Sun fails to teach determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute.
However, Zhao teaches: determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute; “The computing resource scheduling and provisioning module 142 is configured to implement a topology aware provisioning process that is based on a "weighted" consideration of factors (attributes) including current cluster topology and bandwidth usage, which enables the computing service platform 130 to provide intelligent, optimized computing infrastructures that can fully utilize state-of-the-art hardware accelerators (e.g., GPU, FPGA etc.) and better serve emerging workloads like distributed deep learning, or other HPC workloads” [Zhao ¶ 35]. Baillargeon in view of Sun in view of Zhao fails to teach determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. 
However, Randhawa teaches: determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; “A weighting module 313 of node preference management system 101 can weigh the different parameters as desired, with different weighting values being applied to different parameters in different embodiments, depending on the importance given to the various factors in the calculation of node suitability… In other words, weights can be applied to measured parameters concerning nodes 303, and the applied weights affect the relative consequence of the corresponding measured parameters in calculating preference ratings 301 (total weight value)” [Randhawa Col. 7 Lines 20-25, 26-30]. wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; “For example, the set that contains the individual node 303 with the highest preference rating 301 can be chosen, the set with the highest sum of individual preference ratings 301 can be chosen, the set with the highest average, mean, mode or median can be chosen, etc” [Randhawa Col. 10 lines 57-61]. and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. “For example, the set that contains the individual node 303 with the highest preference rating 301 can be chosen, the set with the highest sum of individual preference ratings 301 can be chosen, the set with the highest average, mean, mode or median can be chosen, etc” [Randhawa Col. 10 lines 57-61]. 
“In other words, in response to the node preference management system 101 determining that it is time to select one (or more) most preferred node(s) 303 to perform the specific functional role in the cluster 305, the node parameters are measured, the user preference values are gleaned (e.g., received from a user, read from a configuration file, default values used, etc.), and the preference ratings 301 are calculated. The node(s) 303 best suited to perform the specific functional role as indicated by the calculated preference ratings 301 is then appointed to perform the role in the cluster 305” [Randhawa Col. 9 Lines 3-13].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun in view of Zhao to incorporate the teachings of Randhawa and include determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. Doing so would allow for flexibility in the prioritization of different attributes within the node selection process. “The preference rating for each specific node indicates its suitability for the specific functional role in the cluster, relative to the other nodes of the plurality. Weights can be applied to the measured parameters concerning nodes, such that the applied weights affect the relative consequence of the corresponding measured parameters in calculating preference ratings for the nodes” [Randhawa Col. 2 Lines 29-36].
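To illustrate the selection logic recited in claims 6 and 22 (a total weight value per node combination, with the highest-scoring combination chosen as the target nodes), the following is a minimal, hypothetical sketch. The attribute names, weight values, and function names are illustrative assumptions for exposition only; they are not drawn from the claims or the cited references.

```python
# Hypothetical sketch of the claim 6/22 selection logic:
# score each node combination (a first node alone, or a first node
# plus one extended node) by a weighted sum of node attributes,
# then pick the combination with the highest total weight value.
# Attribute names and weights below are illustrative only.

WEIGHTS = {"idle_gpu_memory": 0.5, "idle_cpu": 0.3, "switch_throughput": 0.2}

def node_score(node):
    """Weighted sum of a node's attribute values (missing attributes count as 0)."""
    return sum(w * node.get(attr, 0.0) for attr, w in WEIGHTS.items())

def best_combination(first_nodes, extended_nodes):
    """Return the highest-scoring combination: either a first node alone,
    or a (first node, extended node) pair, mirroring the claim language."""
    combos = [(f,) for f in first_nodes]
    combos += [(f, e) for f in first_nodes for e in extended_nodes]
    return max(combos, key=lambda combo: sum(node_score(n) for n in combo))
```

This mirrors Randhawa's weighted preference-rating approach as applied by the Examiner: weights modulate how strongly each measured attribute influences the combination's total score.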
With regard to claim 23, Baillargeon in view of Sun in view of Zhao in view of Randhawa teaches the non-transitory computer-readable storage medium of claim 22, as referenced above.

Baillargeon fails to teach wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node.

However, Sun teaches wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node.

With regard to claim 26, Baillargeon in view of Sun in view of Zhao teaches the electronic device of claim 25, as referenced above.
Baillargeon further teaches wherein allocating the first target node and the second target node for the pod, from the first node set or from the first and extended node sets, comprises: acquiring load situations respectively corresponding to the first node, the third node set and the fourth node set; “If a resource quota is enabled in a namespace for computing resources like CPU and memory, users may specify requests or limits for those values; otherwise, the resource quota system may reject pod creation” [Baillargeon ¶ 44]. “In some embodiments, a pod is only scheduled if all the resource "requests" are satisfied including CPU, memory and extended resources (load situations): The pod may remain in the PENDING state if a node cannot satisfy all the resource requirements; or The pod is not created and a pod controller is waiting for an increase in resource quota” [Baillargeon ¶ 64 Examiner notes this interpretation of load situation is in accordance with the description given in ¶ 61 of the instant specification].

Baillargeon in view of Sun fails to teach determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute.

However, Zhao teaches: determining an attribute of the first node and each extended node in the third node set and/or the fourth node set, and a weight value corresponding to the attribute; “The computing resource scheduling and provisioning module 142 is configured to implement a topology aware provisioning process that is based on a "weighted" consideration of factors (attributes) including current cluster topology and bandwidth usage, which enables the computing service platform 130 to provide intelligent, optimized computing infrastructures that can fully utilize state-of-the-art hardware accelerators (e.g., GPU, FPGA etc.) and better serve emerging workloads like distributed deep learning, or other HPC workloads” [Zhao ¶ 35].
Baillargeon in view of Sun in view of Zhao fails to teach determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. However, Randhawa teaches: determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; “A weighting module 313 of node preference management system 101 can weigh the different parameters as desired, with different weighting values being applied to different parameters in different embodiments, depending on the importance given to the various factors in the calculation of node suitability… In other words, weights can be applied to measured parameters concerning nodes 303, and the applied weights affect the relative consequence of the corresponding measured parameters in calculating preference ratings 301 (total weight value)” [Randhawa Col. 7 Lines 20-25, 26-30]. wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; “For example, the set that contains the individual node 303 with the highest preference rating 301 can be chosen, the set with the highest sum of individual preference ratings 301 can be chosen, the set with the highest average, mean, mode or median can be chosen, etc” [Randhawa Col. 10 lines 57-61]. and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. 
“For example, the set that contains the individual node 303 with the highest preference rating 301 can be chosen, the set with the highest sum of individual preference ratings 301 can be chosen, the set with the highest average, mean, mode or median can be chosen, etc” [Randhawa Col. 10 lines 57-61]. “In other words, in response to the node preference management system 101 determining that it is time to select one (or more) most preferred node(s) 303 to perform the specific functional role in the cluster 305, the node parameters are measured, the user preference values are gleaned (e.g., received from a user, read from a configuration file, default values used, etc.), and the preference ratings 301 are calculated. The node(s) 303 best suited to perform the specific functional role as indicated by the calculated preference ratings 301 is then appointed to perform the role in the cluster 305” [Randhawa Col. 9 Lines 3-13].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun in view of Zhao to incorporate the teachings of Randhawa and include determining, based on the attribute and the weight value corresponding to the attribute, a total weight value of each node combination; wherein the node combination only includes the first node, or the node combination is composed of an extended node in the third or fourth node set and the first node; and determining a node combination with a highest total weight value among all node combinations generated based on the first node set, as the first target node and the second target node. Doing so would allow for flexibility in the prioritization of different attributes within the node selection process. “The preference rating for each specific node indicates its suitability for the specific functional role in the cluster, relative to the other nodes of the plurality.
Weights can be applied to the measured parameters concerning nodes, such that the applied weights affect the relative consequence of the corresponding measured parameters in calculating preference ratings for the nodes” [Randhawa Col. 2 Lines 29-36].

With regard to claim 27, Baillargeon in view of Sun in view of Zhao in view of Randhawa teaches the electronic device of claim 25, as referenced above. Baillargeon fails to teach wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node.

However, Sun teaches wherein the attribute of any node comprises at least one of: a set to which the node pertains; network throughput of a switch corresponding to the node; usage of the switch corresponding to the node; an idle quantity of video memory of a GPU corresponding to the node; an idle quantity of computing power of the GPU corresponding to the node; an idle quantity of magnetic disk corresponding to the node; an idle quantity of central processing unit (CPU) corresponding to the node; or a GPU priority corresponding to the node.
“wherein the service request specifies one or more attributes associated with the GPU processing tasks specified by the service request, wherein the one or more attributes specify at least one of a quality of service (QoS) and a priority level for executing the GPU processing tasks, and wherein allocating comprises determining at least two GPU server nodes within the cluster of GPU server nodes having sufficient processing resources to satisfy the specified one or more attributes” [Sun Claim 4].

Claims 9, 24, and 28 are rejected under 35 U.S.C. 103 as being unpatentable over Baillargeon (US 2023/0153162 A1) in view of Sun (US 2019/0197655 A1) in view of Zhao (US 2019/0312772 A1) in view of CAO (US 2024/0004700 A1).

With regard to claim 9, Baillargeon in view of Sun in view of Zhao teaches the method of claim 1, as referenced above. Baillargeon further teaches the second target node, “k8s cluster: A cluster consists of one or more master machines and multiple worker machines or nodes. The master (second target node) runs the control plane functions and coordinates between all the nodes running the actual workloads knowns as pods” [Baillargeon ¶ 3].

Baillargeon in view of Sun in view of Zhao fails to teach further comprising: sending, by the electronic device, a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task.
However, CAO teaches: (Associating a target GPU resource in a first target node with a second target node) “In some embodiments, if it is detected in the cluster 402 that the third GPU 4023 is switched to the first state, the association information between the address of the third virtual resource 4033 and the address of the first GPU 4021 may be switched to the association information between the address of the third virtual resource 4033 and the address of the third GPU 4023, and send the data corresponding to the first GPU 4021 in the k+ 1th processing data received by the first GPU 4021 to the third GPU 4023” [CAO ¶ 121].

Although CAO does not explicitly teach further comprising: sending, by the electronic device, a service address of the target GPU resource in the first target node to the second target node, CAO teaches further comprising: sending, by the electronic device, a service address of the target GPU resource in the first target node to the (control processing unit) second target node, “One aspect of this disclosure provides a task processing method including associating and controlling, by a processing unit, a first resource to perform a task processing operation through N virtual resource identifiers in response to a task processing instruction… The method further includes associating, by the processing unit, a second resource with N-n virtual resource identifiers when the second resource switches from a second state to a first state” [CAO ¶ 4].

“In some embodiments, the first address information may include the physical address information of the data processing resources. For example, the first address information may include the network address information of the data processing resources, such as the IP address of the data processing resources, or the port numbers of the data processing resources… After receiving the request message, the cluster may send the first address information to the processing unit” [CAO ¶ 72-73].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine this embodiment of CAO with the above embodiment and include further comprising: sending, by the electronic device, a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. Sending the service address of the target GPU resource to the second target node would allow the second target node to access the resource, which is a predictable result. “In the first manner, the processing unit may associate the first resource with N virtual resource identifiers. After the association is successful, at least some of the first resources in the first resource may be controlled to perform task processing operations through the N virtual resource identifiers” [CAO ¶ 31].

to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. “B2, the processing unit controls the second resource based on the third data through the target virtual resource identifier and controls the first resource based on the hth processing data through the n virtual resource identifiers to perform the hth task processing process” [CAO ¶ 111].

CAO is considered to be analogous to the claimed invention because it is in the same field of task scheduling strategies. Baillargeon includes a second target node, the master node, which coordinates the worker nodes to execute workloads. This could be combined with CAO to include the processing unit of CAO, which receives GPU addresses of various nodes in the system. Further, Baillargeon allows for the use of ports to communicate with sources external to the pod: “When the containers have to communicate outside the pod, they expose a port” [Baillargeon ¶ 4].
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun in view of Zhao to incorporate the teachings of CAO and include sending, by the electronic device, a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. Doing so would allow for improved flexibility in task assignment to different system resources. “Therefore, in the task processing method provided by the embodiments of the present disclosure, the association information can be determined by the processing unit, which improves the flexibility of the association information and greatly improves the flexibility of the processing unit to associate the corresponding data processing resources for different task processing operations” [CAO ¶ 79].

With regard to claim 24, Baillargeon in view of Sun in view of Zhao teaches the non-transitory computer-readable storage medium of claim 19, as referenced above. Baillargeon further teaches the second target node, “k8s cluster: A cluster consists of one or more master machines and multiple worker machines or nodes. The master (second target node) runs the control plane functions and coordinates between all the nodes running the actual workloads knowns as pods” [Baillargeon ¶ 3].

Baillargeon in view of Sun in view of Zhao fails to teach wherein the computer instruction is used to cause the computer to further execute operations, comprising: sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task.
However, CAO teaches: (Associating a target GPU resource in a first target node with a second target node) “In some embodiments, if it is detected in the cluster 402 that the third GPU 4023 is switched to the first state, the association information between the address of the third virtual resource 4033 and the address of the first GPU 4021 may be switched to the association information between the address of the third virtual resource 4033 and the address of the third GPU 4023, and send the data corresponding to the first GPU 4021 in the k+ 1th processing data received by the first GPU 4021 to the third GPU 4023” [CAO ¶ 121].

Although CAO does not explicitly teach further comprising: wherein the computer instruction is used to cause the computer to further execute operations, comprising: sending a service address of the target GPU resource in the first target node to the second target node, CAO teaches further comprising: sending, by the electronic device, a service address of the target GPU resource in the first target node to the (control processing unit) second target node, “One aspect of this disclosure provides a task processing method including associating and controlling, by a processing unit, a first resource to perform a task processing operation through N virtual resource identifiers in response to a task processing instruction… The method further includes associating, by the processing unit, a second resource with N-n virtual resource identifiers when the second resource switches from a second state to a first state” [CAO ¶ 4]. “In some embodiments, the first address information may include the physical address information of the data processing resources. 
For example, the first address information may include the network address information of the data processing resources, such as the IP address of the data processing resources, or the port numbers of the data processing resources… After receiving the request message, the cluster may send the first address information to the processing unit” [CAO ¶ 72-73].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine this embodiment of CAO with the above embodiment and include wherein the computer instruction is used to cause the computer to further execute operations, comprising: sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. Sending the service address of the target GPU resource to the second target node would allow the second target node to access the resource, which is a predictable result. “In the first manner, the processing unit may associate the first resource with N virtual resource identifiers. After the association is successful, at least some of the first resources in the first resource may be controlled to perform task processing operations through the N virtual resource identifiers” [CAO ¶ 31].

to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. “B2, the processing unit controls the second resource based on the third data through the target virtual resource identifier and controls the first resource based on the hth processing data through the n virtual resource identifiers to perform the hth task processing process” [CAO ¶ 111].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun in view of Zhao to incorporate the teachings of CAO and include sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. Doing so would allow for improved flexibility in task assignment to different system resources. “Therefore, in the task processing method provided by the embodiments of the present disclosure, the association information can be determined by the processing unit, which improves the flexibility of the association information and greatly improves the flexibility of the processing unit to associate the corresponding data processing resources for different task processing operations” [CAO ¶ 79].

With regard to claim 28, Baillargeon in view of Sun in view of Zhao teaches the electronic device of claim 17, as referenced above. Baillargeon further teaches the second target node, “k8s cluster: A cluster consists of one or more master machines and multiple worker machines or nodes. The master (second target node) runs the control plane functions and coordinates between all the nodes running the actual workloads knowns as pods” [Baillargeon ¶ 3].

Baillargeon in view of Sun in view of Zhao fails to teach wherein the instruction, when executed by the at least one processor, enables the at least one processor to further execute operations, comprising: sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task.
However, CAO teaches: (Associating a target GPU resource in a first target node with a second target node) “In some embodiments, if it is detected in the cluster 402 that the third GPU 4023 is switched to the first state, the association information between the address of the third virtual resource 4033 and the address of the first GPU 4021 may be switched to the association information between the address of the third virtual resource 4033 and the address of the third GPU 4023, and send the data corresponding to the first GPU 4021 in the k+ 1th processing data received by the first GPU 4021 to the third GPU 4023” [CAO ¶ 121].

Although CAO does not explicitly teach wherein the instruction, when executed by the at least one processor, enables the at least one processor to further execute operations, comprising: sending a service address of the target GPU resource in the first target node to the second target node, CAO teaches further comprising: sending, by the electronic device, a service address of the target GPU resource in the first target node to the (control processing unit) second target node, “One aspect of this disclosure provides a task processing method including associating and controlling, by a processing unit, a first resource to perform a task processing operation through N virtual resource identifiers in response to a task processing instruction… The method further includes associating, by the processing unit, a second resource with N-n virtual resource identifiers when the second resource switches from a second state to a first state” [CAO ¶ 4]. “In some embodiments, the first address information may include the physical address information of the data processing resources. 
For example, the first address information may include the network address information of the data processing resources, such as the IP address of the data processing resources, or the port numbers of the data processing resources… After receiving the request message, the cluster may send the first address information to the processing unit” [CAO ¶ 72-73].

It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to combine this embodiment of CAO with the above embodiment and include wherein the instruction, when executed by the at least one processor, enables the at least one processor to further execute operations, comprising: sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. Sending the service address of the target GPU resource to the second target node would allow the second target node to access the resource, which is a predictable result. “In the first manner, the processing unit may associate the first resource with N virtual resource identifiers. After the association is successful, at least some of the first resources in the first resource may be controlled to perform task processing operations through the N virtual resource identifiers” [CAO ¶ 31].

to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. “B2, the processing unit controls the second resource based on the third data through the target virtual resource identifier and controls the first resource based on the hth processing data through the n virtual resource identifiers to perform the hth task processing process” [CAO ¶ 111].
It would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Baillargeon in view of Sun in view of Zhao to incorporate the teachings of CAO and include sending a service address of the target GPU resource in the first target node to the second target node, to enable the second target node to invoke the target GPU resource based on the service address to execute the target task. Doing so would allow for improved flexibility in task assignment to different system resources. “Therefore, in the task processing method provided by the embodiments of the present disclosure, the association information can be determined by the processing unit, which improves the flexibility of the association information and greatly improves the flexibility of the processing unit to associate the corresponding data processing resources for different task processing operations” [CAO ¶ 79].

Response to Arguments

Applicant's arguments filed 11/26/2025 have been fully considered but they are not persuasive. Applicant argues in substance:

I. Second, the technical solution in the method claims is applied to technical fields of resource management, task allocation and the like in computer technology, and is applicable to scenarios such as machine learning. In a machine learning scenario, computing resources often need to be managed uniformly. When some clusters (such as a Kubernetes cluster, K8S cluster for short) execute a task, resources will be allocated for a pod corresponding to the task, and specifically, a node, a graphics processing unit (GPU) and other resources will be allocated to the pod. In the related art, only the node with a GPU resource may be allocated to the pod during allocation. However, this allocation method has low resource utilization rates.
Therefore, the technical solution in the method claims aims to address the problem that only the node with a GPU resource may be allocated to the pod during allocation, which has a low resource utilization rate. Lastly, there are several advantages to using the technical solution in the method claims: claim 1 adopts the allocation mode of allocating two nodes to the pod. One is the first target node where the target GPU resource allocated to the pod is located, and the other is the second target node where the pod is located. The second target node may not have a GPU resource, thereby enabling the second target node to execute the target task corresponding to the pod based on the GPU resource of the first target node. By adopting the allocation mode of allocating different nodes to the pod and the GPU resource, the decoupling between the node where the GPU resource is located and the node where the pod is located is realized, the limitation of only allocating the node with a GPU resource to the pod is eliminated, and the resource utilization rate is improved. Thus, a conclusion that the claimed inventions provide practical applications and also are necessarily rooted in computer technology in order to overcome one or more problems specifically arising in the realm of resource scheduling is undeniable. Therefore, the claims of the current Application are most decidedly patentable subject matter.

a) Examiner respectfully disagrees. As detailed in the rejection above, claim 1 recites abstract ideas of assignment and allocation, which are mental processes. A person can assign and allocate resources mentally or with a pencil and paper. The additional elements of this claim amount to mere generic computing components, a technological environment/field of use, and insignificant extra-solution activity, which do not integrate the abstract ideas into a practical application. Thus, the claim is directed to the judicial exception.
The allocation of claim 1 is a mental process which does not represent improvements to computer functionality or other technology. “… it is important to keep in mind that an improvement in the abstract idea itself (e.g. a recited fundamental economic concept) is not an improvement in technology. For example, in Trading Technologies Int’l v. IBG, 921 F.3d 1084, 1093-94, 2019 USPQ2d 138290 (Fed. Cir. 2019), the court determined that the claimed user interface simply provided a trader with more information to facilitate market trades, which improved the business process of market trading but did not improve computers or technology” [MPEP § 2106.05(a) II]. An improvement to the abstract idea of allocation does not amount to significantly more than the abstract idea. Further, the additional elements of the claim also fail to amount to significantly more than the abstract idea. The arguments have been considered but were not found to be persuasive.

II. Baillargeon fails to disclose or teach the above distinguished technical features. Furthermore, Applicant respectfully submits that there are no other reference documents cited by the Examiner that can remedy the aforementioned deficiencies of Baillargeon, and thus claim 1 is patentable over all cited references either individually or in combination. For at least the same reasons, the present independent claims 17 and 19 are also patentable over all cited references either individually or in combination.

a) Applicant's arguments fail to comply with 37 CFR 1.111(b) because they amount to a general allegation that the claims define a patentable invention without specifically pointing out how the language of the claims patentably distinguishes them from the references. Further, as detailed in the rejection above, the limitations of claim 1 are taught by Baillargeon in view of Sun in view of Zhao. Baillargeon is not cited to teach many of the features referenced by Applicant.
Baillargeon teaches the target cluster of nodes [¶ 2, 37], the electronic device configured to allocate resources [¶ 45], creating the pod for the target task [¶ 69], acquiring GPU resource requirement information [¶ 56-60], and allocating nodes for the pod [¶ 10, 45]. The combination of the teachings of Baillargeon with the teachings of Sun and Zhao, detailed in the rejection above, teaches the referenced features of claim 1.

Conclusion

Applicant's amendment necessitated the new ground(s) of rejection presented in this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a).

A shortened statutory period for reply to this final action is set to expire THREE MONTHS from the mailing date of this action. In the event a first reply is filed within TWO MONTHS of the mailing date of this final action and the advisory action is not mailed until after the end of the THREE-MONTH shortened statutory period, then the shortened statutory period will expire on the date the advisory action is mailed, and any nonprovisional extension fee (37 CFR 1.17(a)) pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event, however, will the statutory period for reply expire later than SIX MONTHS from the mailing date of this final action.

Examiner respectfully requests, in response to this Office action, that support be shown for language added to any original claims on amendment and any new claims. That is, indicate support for newly added claim language by specifically pointing to page(s) and line number(s) in the specification and/or drawing figure(s). This will assist Examiner in prosecuting the application.

When responding to this Office Action, Applicant is advised to clearly point out the patentable novelty which he or she thinks the claims present, in view of the state of the art disclosed by the references cited or the objections made.
He or she must also show how the amendments avoid such references or objections. See 37 CFR 1.111(c).

Any inquiry concerning this communication or earlier communications from the examiner should be directed to ARI F RIGGINS, whose telephone number is (571) 272-2772. The examiner can normally be reached Monday-Friday, 7:00 AM-4:30 PM.

Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO-supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, Bradley Teets, can be reached at (571) 272-3338. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.

Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

/A.F.R./
Examiner, Art Unit 2197

/BRADLEY A TEETS/
Supervisory Patent Examiner, Art Unit 2197
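For context on the disputed limitation, the "total weight value" node-combination selection described in the Randhawa mapping above can be sketched in a few lines of code. This is an illustrative sketch only: the attribute names, weight values, and node data below are invented for illustration and are not taken from the claims or any cited reference.

```python
# Hypothetical sketch of weighted node-combination selection:
# each node has attribute values, each attribute has a weight, and the
# combination with the highest total weight value is chosen.

def total_weight(combo, weights):
    """Sum the weighted attribute values over every node in a combination."""
    return sum(
        weights[attr] * value
        for node in combo
        for attr, value in node["attrs"].items()
    )

def select_target_combo(combos, weights):
    """Return the combination with the highest total weight value."""
    return max(combos, key=lambda c: total_weight(c, weights))

# Hypothetical data: a first node alone, or the first node plus an
# extended node (mirroring the two combination forms in the claim).
weights = {"idle_gpu_mem": 0.6, "switch_throughput": 0.4}
first_node = {"name": "node-a", "attrs": {"idle_gpu_mem": 8, "switch_throughput": 5}}
extended = {"name": "node-b", "attrs": {"idle_gpu_mem": 2, "switch_throughput": 9}}
combos = [[first_node], [first_node, extended]]

best = select_target_combo(combos, weights)  # the two-node combination here
```

Under these invented weights, the two-node combination scores 11.6 against 6.8 for the first node alone, so the extended combination would be selected.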

Prosecution Timeline

Mar 01, 2023
Application Filed
Aug 20, 2025
Non-Final Rejection — §101, §103, §112
Nov 26, 2025
Response Filed
Mar 07, 2026
Final Rejection — §101, §103, §112 (current)


Prosecution Projections

Expected OA Rounds: 3-4
Grant Probability: 0%
With Interview: 0% (+0.0% lift)
Median Time to Grant: 3y 3m
PTA Risk: Moderate
Based on 1 resolved case by this examiner. Grant probability derived from career allow rate.
