DETAILED ACTION
This action is responsive to the claims filed on 8 May 2023.
Claims 1-20 are pending for examination.
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Claim Objections
Claim 14 is objected to because of the following informalities: “a a model manager” in line 2 should be “a model manager”. Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 7-9 and 13-20 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 13 recites the limitation "the model manager" in line 5. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, the term "the model manager" has been construed to be “a model manager”. Claims 14-16, which are dependent on claim 13, are similarly rejected.
Claim 17 recites the limitation "the near-real time radio access network intelligent controller" in lines 2-3. There is insufficient antecedent basis for this limitation in the claim. For examination purposes, the term "the near-real time radio access network intelligent controller" has been construed to be “a near-real time radio access network intelligent controller”. Claims 18-20, which are dependent on claim 17, are similarly rejected.
The term “relevant” in claim 7 and analogous claim 19 is a relative term which renders the claim indefinite. The term “relevant” is not defined by the claim, the specification does not provide a standard for ascertaining the requisite degree, and one of ordinary skill in the art would not be reasonably apprised of the scope of the invention. As claimed, any degree of relation between models can be considered “relevant”. For examination purposes, “relevant” has been construed to encompass any model with any degree of relation produced as a result of the model searching procedure. Claims 8-9 are similarly rejected.
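Solely for purposes of illustration, and not as a characterization of the applicant's disclosure, the breadth of this construction can be expressed as a minimal Python sketch in which any nonzero degree of relation qualifies a model as "relevant" (all names and data below are hypothetical and are not drawn from the claims or any cited reference):

def search_models(query_keywords, deployed_models):
    # Under the construction above, any nonzero degree of relation between
    # the query and a deployed model qualifies the model as "relevant".
    results = []
    for model in deployed_models:
        overlap = len(set(query_keywords) & set(model["tags"]))
        if overlap > 0:  # any degree of relation suffices
            results.append((overlap, model["name"]))
    # Models with more overlap rank higher, but every match is returned.
    return [name for _, name in sorted(results, reverse=True)]

models = [{"name": "traffic_predictor", "tags": ["traffic", "load"]},
          {"name": "handover_model", "tags": ["handover", "mobility"]}]
print(search_models(["traffic"], models))  # ['traffic_predictor']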
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception (i.e., an abstract idea) without significantly more.
Step 1: This part of the eligibility analysis evaluates whether the claim(s) falls within any statutory category. MPEP 2106.03:
According to the first part of the Alice analysis, in the instant case, the claims were determined to fall within the statutory categories: a machine/system (claims 1-11), an article of manufacture (claims 12-16), and a method/process (claims 17-20). Because the claims were determined to be within one of the four categories (i.e., process, machine, manufacture, or composition of matter) (Step 1), it must be determined whether the claims are directed to a judicial exception (i.e., a law of nature, a natural phenomenon, or an abstract idea).
Step 2A Prong One: This part of the eligibility analysis evaluates whether the claim(s) recites a judicial exception.
Regarding independent claims 1, 12, and 17, the claims recite a judicial exception (i.e., an abstract idea enumerated in the 2019 PEG) without significantly more (Step 2A, Prong One). Under the broadest reasonable interpretation, the applicant's claim limitations cover activities classified as mental processes, i.e., concepts performed in the human mind (including an observation, evaluation, judgment, or opinion); see MPEP § 2106.04(a)(2), subsection III, and the 2019 PEG. As evaluated below:
Claims 1, 12, 17:
“determining that the near-real time radio access network intelligent controller has received, from a service management and orchestration device, first authentication data representative of a first authentication” (mental process of judgment)
If the identified limitation(s) falls within at least one of the groupings of abstract ideas, it is reasonable to conclude that the claim(s) recites an abstract idea in Step 2A Prong One.
Step 2A Prong Two: This part of the eligibility analysis evaluates whether the claim(s) as a whole integrates the recited judicial exception into a practical application of the exception. As evaluated below:
“receiving, from the service management and orchestration device, second authentication data representative of a second authentication to deploy a machine learning model on the serving hub device”
“comprising at least a predict application programming interface in which the machine learning model receives a request from the xApp and, in response, provides a prediction to the xApp to satisfy the request”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions for mere data gathering or data output, see MPEP 2106.05(g).
“to deploy an xApp on the near-real time radio access network intelligent controller”
“enabling authenticated communication for a model manager device that manages the machine learning model via a group of application programming interfaces”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea when considered as an ordered combination and as a whole.
Step 2B: This part of the eligibility analysis evaluates whether the claim, as a whole, amounts to significantly more than the recited exception, i.e., whether any additional element, or combination of additional elements, adds an inventive concept to the claim. MPEP 2106.05.
First, the additional elements considered as part of the preamble and the additional elements directed to the use of computer technology are deemed insufficient to transform the judicial exception to a patentable invention because they generally link the judicial exception to the technological environment, see MPEP 2106.05(h).
Second, the additional elements directed to mere application of the abstract idea or mere instructions to implement an abstract idea on a computer are deemed insufficient to transform the judicial exception to a patentable invention because the limitations generally apply a generic computer and/or process to the judicial exception, see MPEP 2106.05(f).
Third, the claims are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception. The courts have found these types of limitations insufficient to transform the judicial exception to a patentable invention, see MPEP 2106.05(h).
Lastly, the claims directed to data gathering activity, as noted above, are deemed directed to insignificant extra-solution activity. The courts have found these types of limitations insufficient to qualify as "significantly more", see MPEP 2106.05(g).
Furthermore, evidence has been considered in view of Berkheimer v. HP, Inc., 881 F.3d 1360, 1368, 125 USPQ2d 1649, 1654 (Fed. Cir. 2018); see the USPTO Berkheimer Memorandum (April 2018). The examiner notes Berkheimer Option 2: a citation to one or more of the court decisions discussed in MPEP § 2106.05(d)(II) as noting the well-understood, routine, conventional nature of the additional element(s) (e.g., limitations directed to mere data gathering):
The courts have recognized computer functions of this type (e.g., receiving or transmitting data over a network) as well-understood, routine, and conventional functions when they are claimed in a merely generic manner (e.g., at a high level of generality) or as insignificant extra-solution activity, see MPEP 2106.05(d).
The additional limitations, as analyzed, fail to integrate the judicial exception into a practical application at Step 2A and do not provide an inventive concept at Step 2B. Thus, considering the additional elements individually, in combination, and as part of the claims as a whole, the additional elements do not provide significantly more than the abstract idea. Therefore, examining the elements recited by the limitations individually and as an ordered combination, claims 1, 12, and 17 do not recite what the courts have identified as "significantly more", and the claims are not patent eligible.
Furthermore, regarding dependent claims 2-11, which depend from claim 1, claims 13-16, which depend from claim 12, and claims 18-20, which depend from claim 17, the claims are directed to a judicial exception (i.e., an abstract idea enumerated in the 2019 PEG, a law of nature, or a natural phenomenon) without significantly more, as shown below by evaluating the claim limitations under Steps 2A and 2B:
Claims 2, 13:
Incorporates the rejections of claims 1, 12, respectively.
“a deploy application programming interface that enables authenticated deployment, to the serving hub device, of the machine learning model that was pre-trained by the model manager device according to a first dataset”
“a retrain application programming interface that retrains all model weights of the machine learning model based on a received second dataset, resulting in a retrained machine learning model”
“calls the deploy application programming interface to deploy the retrained machine learning model on the serving hub device”
“a fine tune application that adjusts at least one model weight of the machine learning model based on a received third dataset, resulting in a tuned machine learning model”
“calls the deploy application programming interface to deploy the tuned machine learning model on the serving hub device”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claims 3, 14:
Incorporates the rejections of claims 1, 13, respectively.
“wherein the model manager device is situated in the near real time radio access network intelligent controller and uses resources of the near real time radio access network intelligent controller to process all calls to the group of application programming interfaces”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claim 4:
Incorporates the rejection of claim 1.
“wherein the model manager device is situated in the service management and orchestration device and uses resources of the service management and orchestration device to process specified calls to the group of application programming interfaces, while exposing the predict application programming interface in the serving hub device”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claim 15:
Incorporates the rejection of claim 13.
“further comprising a model manager device that manages the machine learning model, wherein the model manager device is situated in the service management and orchestration device and uses resources of the service management and orchestration device to process a group of calls to the group of application programming interfaces”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claim 16:
Incorporates the rejection of claim 15.
“wherein the model manager device exposes the predict application programming interface in the serving hub and uses resources of the near real time radio access network intelligent controller to process calls to the predict application”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claims 5, 18:
Incorporates the rejections of claims 1, 17, respectively.
“wherein the operations further comprise performing a caching procedure that determines, from among a group of machine learning models that is deployed on the serving hub device, a subscribed group of machine learning models that is to be placed in a cache” (mental process of judgment)
The recitation is directed to mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, and is considered to amount to adding the words "apply it" (or an equivalent) to the judicial exception, see MPEP 2106.05(f).
Limitations directed to mere instructions to implement an abstract idea on a computer/using computer as a tool cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claim 6:
Incorporates the rejection of claim 5.
“wherein the caching procedure determines the subscribed group as a function of at least one of: a first criterion, satisfaction of which is indicative of a low latency constraint associated with the xApp, a second criterion, satisfaction of which is indicative of a recency of use, a third criterion, satisfaction of which is indicative of a frequency of use, or a fourth criterion, satisfaction of which is indicative of a subscription flow to a given machine learning model of the group of machine learning models”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
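Solely for purposes of illustration, the caching criteria recited in claims 5-6 may be sketched in Python as a simple disjunctive test (the function name, thresholds, and record fields below are hypothetical assumptions, not taken from the claims or the cited references):

def select_for_cache(models, now_s, latency_limit_ms=10.0,
                     recency_window_s=300.0, min_calls_per_min=60):
    """Place a model in the cache if it satisfies at least one criterion."""
    cached = []
    for m in models:
        if (m["latency_budget_ms"] <= latency_limit_ms          # low-latency constraint of the xApp
                or now_s - m["last_used_s"] <= recency_window_s  # recency of use
                or m["calls_per_min"] >= min_calls_per_min       # frequency of use
                or m["has_subscription_flow"]):                  # subscription flow to the model
            cached.append(m["name"])
    return cached

models = [{"name": "kpi_forecaster", "latency_budget_ms": 5.0,
           "last_used_s": 0.0, "calls_per_min": 2, "has_subscription_flow": False}]
print(select_for_cache(models, now_s=1000.0))  # ['kpi_forecaster'] (low-latency criterion)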
Claims 7, 19:
Incorporates the rejections of claims 1, 17, respectively.
“wherein the operations further comprise performing a model searching procedure that, in response to a search query, determines, from among a group of machine learning models that is deployed on the serving hub device, a most relevant group of machine learning models that satisfies the search query and communicates documentation associated with the most relevant group of machine learning models” (mental process of judgment)
The recitation is directed to mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, and is considered to amount to adding the words "apply it" (or an equivalent) to the judicial exception, see MPEP 2106.05(f).
Limitations directed to mere instructions to implement an abstract idea on a computer/using computer as a tool cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claim 8:
Incorporates the rejection of claim 7.
“wherein the model searching procedure is configured to interpret a natural language search query by matching search keywords to model keywords entered as metadata model tags upon deployment to the serving hub device” (mental process of judgment)
The recitation is directed to mere instructions to implement an abstract idea on a computer, or merely uses a computer as a tool to perform an abstract idea, and is considered to amount to adding the words "apply it" (or an equivalent) to the judicial exception, see MPEP 2106.05(f).
Limitations directed to mere instructions to implement an abstract idea on a computer/using computer as a tool cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
Claim 9:
Incorporates the rejection of claim 8.
“wherein the model searching procedure uses at least one of a named entity recognition process or a syntactical and semantic matching process”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
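Solely for purposes of illustration, the keyword-to-metadata-tag matching recited in claims 8-9 can be sketched as follows; a naive token-overlap comparison is used here as a stand-in for a true named entity recognition or syntactical/semantic matching process, and all names and data are hypothetical:

def match_query_to_models(natural_language_query, model_registry):
    # Tokenize the free-text query into lowercase keywords.
    keywords = set(natural_language_query.lower().split())
    matches = {}
    for name, metadata_tags in model_registry.items():
        # Match search keywords against metadata model tags entered at deployment.
        shared = keywords & {t.lower() for t in metadata_tags}
        if shared:
            matches[name] = sorted(shared)
    return matches

registry = {"cell_load_model": ["load", "traffic", "cell"],
            "beam_model": ["beamforming", "mimo"]}
print(match_query_to_models("predict cell traffic load", registry))
# {'cell_load_model': ['cell', 'load', 'traffic']}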
Claims 10, 20:
Incorporates the rejections of claims 1, 17, respectively.
“wherein the operations further comprise performing a global training procedure in which the machine learning model is trained on global data that is collected from at least one of: multiple different service management and orchestration devices or multiple different radio access network intelligent controllers”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
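Solely for purposes of illustration, the global training procedure recited in claims 10 and 20 may be sketched as pooling data collected from multiple sources before fitting a single model; the single-weight "model" and all names below are hypothetical simplifications:

def global_train(datasets_by_source):
    """Train one model on data pooled from multiple SMO devices and/or RICs."""
    pooled = [sample for dataset in datasets_by_source.values() for sample in dataset]
    # A trivial one-parameter "model": the global mean of the pooled samples.
    return {"weight": sum(pooled) / len(pooled)}

data = {"smo_east": [1.0, 2.0], "smo_west": [3.0], "ric_7": [4.0, 5.0]}
print(global_train(data))  # {'weight': 3.0}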
Claim 11:
Incorporates the rejection of claim 1.
“wherein the operations further comprise performing a global sharing procedure in which a pre-trained machine learning model that is trained on global data or local data is shared by the service management and orchestration device to another service management and orchestration device”
These recitations are deemed insufficient to transform the judicial exception to a patentable invention because they are directed to instructions merely indicating a field of use or technological environment in which to apply a judicial exception, see MPEP 2106.05(h).
Limitations directed to mere instructions indicating a field of use or technological environment in which to apply a judicial exception cannot integrate a judicial exception into a practical application at Step 2A or provide an inventive concept in Step 2B.
The dependent claims, as analyzed above, do not recite limitations that integrate the judicial exception into a practical application (Step 2A). In addition, the claim limitations do not include additional elements that are sufficient to amount to significantly more than the judicial exception (Step 2B). As shown above, the dependent claims do not provide any additional elements that, when considered individually or as an ordered combination, amount to significantly more than the identified abstract idea; therefore, as a whole, the dependent claims do not recite what the courts have identified as "significantly more" than the recited judicial exception, see MPEP 2106.05. Accordingly, claims 2-11, 13-16, and 18-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to a judicial exception and does not recite, when the claim elements are examined individually and as a whole, elements that the courts have identified as "significantly more" than the recited judicial exception.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
Claims 1, 3-4, 11-12, 14-17 are rejected under 35 U.S.C. 103 as being unpatentable over Ranganath et al. (U.S. Pre-Grant Publication No. 2024/0259879, hereinafter 'Ranganath'), in view of Balasubramanian et al. (NPL: "RIC: A RAN Intelligent Controller Platform for AI-Enabled Cellular Networks", hereinafter 'Balasubramanian').
Regarding claim 1 and analogous claims 12, 17, Ranganath teaches A serving hub device, comprising: a processor that processes data for a near real time radio access network intelligent controller; and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, comprising ([0057] The vRAN processors 3c52 are processors that include (or are configured with) one or more optimizations for vRAN functionality. The vRAN processors 3c52 may be COTS HW or application-specific HW elements. As examples, the vRAN processors 3c52 may be Intel® Xeon® D processors, Intel® Xeon® Scalable processors, AMD® Epyc® 7000, AMD® “Rome” processors, and/or the like. The vRAN accelerators 3c54 are HW accelerators that are configured to accelerate 4G/LTE and 5G vRAN workloads. As examples, the vRAN accelerators 3c54 may be Forward Error Correction (FEC) accelerators (e.g., Intel® vRAN dedicated accelerator ACC100, Xilinx® T1 Telco Accelerator Card, and the like), low density parity check (LDPC) accelerators (e.g., AccelerComm® LE500 and LD500), networking accelerators (e.g., Intel® FPGA PAC N3000), and/or the like. Additionally or alternatively, the vRAN processors 3c52 may be the same or similar as processor(s) 1752 of FIG. 17, and the vRAN accelerators 3c54 may be the same or similar as the acceleration circuitry 1764 of FIG. 17. Interaction between the vRAN processors 3c52 and vRAN accelerators 3c54 may take place via an acceleration abstraction layer (AAL) for standardized interoperability, via an inline HW accelerator pipeline or functional chains, via virtual input/output (vI/O) interfaces, via single root I/O virtualization (SR-IOV) interfaces, and/or via some other interface or mechanism. The HW platform layer 3c50 also includes platform compute HW 3c56, which includes compute/processor, acceleration, memory, and storage resources that can be used for UE-specific data processing and/or RANF-specific data processing. The compute, acceleration, memory, and storage resources of the platform compute HW 3c56 correspond to the processor circuitry 1752, acceleration circuitry 1764, memory circuitry 1754, and storage circuitry 1758 of FIG. 17, respectively.):
determining that the near-real time radio access network intelligent controller has received, from a service management and orchestration device, first authentication data representative of a first authentication to deploy an xApp on the near-real time radio access network intelligent controller ([0026] FIG. 1 depicts an example O-RAN architecture 100 including various determining that the near-real time radio access network intelligent controller has received, from a service management and orchestration device interfaces between a RAN Intelligent Controller (RIC) 114 and service management and orchestration framework (SMO) 102. The SMO 102 may be the same or similar as the SMO 802, 902, 1002 and/or the MO 301, 3c02 discussed infra. The RIC 114 is an NF that also includes intelligent applications (apps) such as network ML/AI apps functioning with it to automate various NFs for predictive maintenance, enhanced operation, and the like. The O-RAN architecture 100 describes a model for RAN resource control, managed at the upper level by orchestration and automation components of the SMO 102 (e.g., policy, configuration, inventory, design, and non-RT RIC 112). These components control and communicate with the near-RT RIC 114 via the A1 interface. The near-RT RIC 114 provides management of and connectivity to RAN nodes (e.g., eNB/gNB 910, RU 816, DU 916, and the like). In some implementations, the near-RT RIC 114 may be the same or similar as the near-RT RIC 814 of FIG. 8 and/or the RIC 3c14 of FIG. 3c, and some aspects of the near-RT RIC 114 may be described infra w.r.t. FIG. 3c. Additionally, a core set of services provided by the near-RT RIC 114 is extensible by custom third-party xApps, which are instantiated as cloud services and have low-latency connectivity to RAN nodes. first authentication data representative of a first authentication to deploy an xApp on the near-real time radio access network intelligent controller xApps communicate with the RIC 114 and its managed RAN nodes via the E2 interface. O-RAN defines and clarifies the usage of various interfaces in the O-RAN architecture 100. These interfaces are summarized by Table 1.; [0035] FIG. 3a depicts an example RAN intelligent xApp manager architecture 300a in an O-RAN framework. In this example, the xApp manager architecture 300a includes an xApp manager analytics engine 310-a implemented as an xApp 310 in an app layer 330 of the near-RT RIC 114, and a counterpart xApp manager measurement engine 320 implemented by the O-DU 115. The app layer 330 also includes a set of xApps 310-1 to 310-N (where N is a number). The xApps 310-a, 310-1 to 310-N (collectively referred to as “xApps 310”) may be the same or similar as xApps 410, 1110, and 1210 of FIGS. 4, 11, and 12.);
enabling authenticated communication for a model manager device that manages the machine learning model via a group of application programming interfaces, comprising at least a predict application programming interface in which the machine learning model receives a request from the xApp and, in response, provides a prediction to the xApp to satisfy the request ([0072] For example, the xApp manager 425 can utilize the AI/ML model(s) 3c24 to make various predictions/inferences about future resource requirements for individual xApps 410 through various correlations.; [0075] Further, the enabling authenticated communication for a model manager device xApp manager 425 reads or obtains one or more policies 441 from the policy store 440. The telemetry data 515, measurement data 415, and policies 441 can be that manages the machine learning model via a group of application programming interfaces obtained via the service bus 435, one or more APIs, and/or network interfaces. The xApp manager 425 uses the telemetry data 515 (or profile information 515) and the measurement data 415 (either raw or processed), and in response, provides a prediction to the xApp to satisfy the request generates observability insights 525 using AI/ML mechanisms as discussed previously and in accordance with the one or more policies 441. The observability insights 525 are comprising at least a predict application programming interface in which the machine learning model receives a request from the xApp provided to one or more xApps 410, which use the insights 525 to adjust their performance. The insights 525 are provided to the xApps 410 via the service bus 435, one or more APIs, and/or network interfaces.).
Ranganath fails to teach receiving, from the service management and orchestration device, second authentication data representative of a second authentication to deploy a machine learning model on the serving hub device; and
Balasubramanian teaches receiving, from the service management and orchestration device, second authentication data representative of a second authentication to deploy a machine learning model on the serving hub device ([ML-Based Control Loops, pg. 11-12] The life-cycle training and mapping of the AI/ML models for the three control loops is illustrated in Figure 4. Loop 3: receiving, from the service management and orchestration device, second authentication data representative of a second authentication to deploy a machine learning model on the serving hub device The ML/AI analytics engine for loop 3 uses RAN KPI statistics reported from the RAN nodes to SMO over the O1 interface as the training dataset to build its ML models. SMO uses these models to make accurate decisions on policies and configuration of KPI objectives. These decisions are further communicated to the near-real-time RIC over A1. The aforementioned procedure is outlined in Steps 1–7 in Figure 4.; Loop 2: Due to the near-real-time nature of loop 2, the ML-based microservices at the RIC use hybrid models, comprising a mix of both offline and online ML. Online ML models (e.g., deep/recurrent neural networks, reinforcement learning) can achieve lower control loop latency since they are typically processed on a single stream of incoming RAN data to the RIC. However, since online ML models suffer from accuracy in generating optimized RRM decisions, complementary offline ML models are used leveraging historical information in R-NIB. Typically, while inferences are being made in loop 2 (in the order of millisecond), larger time-scale analysis is continuously being done in loop 3. Feedback on the precision and accuracy of the predictions in loop 2 is provided to the nonreal-time RIC in loop 3 via the O1 interface. This allows for fine tuning the ML models and guides the operation of the overall application toward a certain objective. If the ML models in loop 2 misbehave or exhibit degraded performance, the nonreal-time RIC may instruct loop 2 to terminate the ML model or switch to a more improved model. This results in enhanced RRM decisions at the RIC in near-real-time, as shown in Step 7; Loop 3: SMO offers policy guidance on the minimum fraction of traffic that should be split and served on any base station (eNB or gNB) participating in DC for any UE. It uses offline ML models built from large historical RAN KPI data reported over O1 to efficiently compute this threshold for more recent RAN conditions reflecting up to the current state. For example, offline ML models could suggest a load imbalance across the pairs of base stations, participating in DC, whose traffic split does not meet this minimum threshold. Such threshold fractions are communicated as policy guidance to the RIC over A1 and/or to the RAN over O1, and can be updated over coarser time scales.); and
Ranganath and Balasubramanian are considered to be analogous to the claimed invention because they are in the same field of network and edge computing. In view of the teachings of Ranganath, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Balasubramanian to Ranganath before the effective filing date of the claimed invention in order to improve customer experience and optimize spectral efficiency by using machine learning (ML)-driven policies to tailor the RAN for unique spectrum position and geography based on a holistic area-wide network view (cf. Balasubramanian, [Introduction] Disaggregation lowers the barrier to entry for new entrants, expands the ecosystem beyond incumbent domain vendors, allows mixing and matching best of breed technology parts to reduce costs and foster innovation that ultimately improves the end-user Quality of Experience (QoE) by efficient network customization. Disaggregation improves customer experience and optimizes spectral efficiency by using machine learning (ML)-driven policies to tailor the RAN for unique spectrum position and geography based on a holistic area-wide network view. Finally, by decoupling the CP and UP, the RAN is more flexible and allows for independent scaling.).
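Solely for purposes of illustration, the claimed arrangement of two SMO-issued authentications (one to deploy the xApp, one to deploy the machine learning model) followed by predict-API access may be sketched in Python as follows; the token scheme, class, and method names are hypothetical assumptions, and neither the applicant's specification nor the cited references are represented as using this code:

import hmac
import hashlib

class ServingHub:
    """Hypothetical serving hub honoring two SMO-issued authentications."""
    def __init__(self, smo_secret: bytes):
        self.smo_secret = smo_secret
        self.xapps = set()
        self.models = {}

    def _verify(self, message: bytes, token: str) -> bool:
        expected = hmac.new(self.smo_secret, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, token)

    def deploy_xapp(self, xapp_name: str, first_auth_token: str) -> None:
        # First authentication: SMO authorizes deploying the xApp on the near-RT RIC.
        if not self._verify(xapp_name.encode(), first_auth_token):
            raise PermissionError("first authentication failed")
        self.xapps.add(xapp_name)

    def deploy_model(self, model_name: str, model_fn, second_auth_token: str) -> None:
        # Second authentication: SMO authorizes deploying the ML model on the hub.
        if not self._verify(model_name.encode(), second_auth_token):
            raise PermissionError("second authentication failed")
        self.models[model_name] = model_fn

    def predict(self, xapp_name: str, model_name: str, features):
        # Predict API: the model receives a request from the xApp and returns a prediction.
        if xapp_name not in self.xapps:
            raise PermissionError("unknown xApp")
        return self.models[model_name](features)

secret = b"shared-with-smo"
hub = ServingHub(secret)
token = lambda s: hmac.new(secret, s.encode(), hashlib.sha256).hexdigest()
hub.deploy_xapp("load_balancer_xapp", token("load_balancer_xapp"))
hub.deploy_model("mean_model", lambda xs: sum(xs) / len(xs), token("mean_model"))
print(hub.predict("load_balancer_xapp", "mean_model", [2.0, 4.0]))  # 3.0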
Regarding claim 3 and analogous claim 14, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1 and The non-transitory computer-readable medium of claim 13, respectively.
Ranganath teaches wherein the model manager device is situated in the near real time radio access network intelligent controller and uses resources of the near real time radio access network intelligent controller to process all calls to the group of application programming interfaces ([0070] The xApps 420 also includes an wherein the model manager device xApp manager 425, which is a logical element/entity that leverages observation data, and generates meaningful insights/knowledge using one or more AI/ML models 3c24. The observation data can include measurement data 415 and/or platform telemetry data (or profiling information). For example, the xApp manager 425 collects E2 measurement data 415 via the E2 mediation function 460 and telemetry data via a collection agent (see e.g., FIG. 5), and analyzes the collected E2 measurement data 415 and telemetry data to determine HW, SW, and/or NW resource allocations for individual xApps 410. This may involve, for example, determining to scale up or down HW, SW, and/or NW resources for individual xApps 410, E2 nodes, and/or other elements in the O-RAN framework. The uses resources of the near real time radio access network intelligent controller to process all calls to the group of application programming interfaces resource allocations can also be included in signaling and/or PDUs/messages that are provided to individual xApps 410 via the service bus 435, and/or in events 416 provided to individual E2 nodes via the E2 mediation function 460 and the E2 interface. In these implementations, the events 416 and/or PDUs/messages can include instructions, commands, and/or relevant information (e.g., scaling factors, configuration data, and/or the like) for re-allocating and/or adjusting HW, SW, and/or NW resources for individual xApps 410 and/or individual RANFs operating on or by one or more E2 nodes. The xApp manager 425 adjusts or otherwise determines HW, SW, and/or NW resource usage/allocations according to service requirements for one or more network slices or service slices (e.g., as defined by KPIs, KPMs, and/or SLAs). The is situated in the near real time radio access network intelligent controller near-RT RIC's 414 (or the xApp manager's 425) control over xApps 410 and/or E2 nodes is steered or otherwise guided according to one or more policies 441 and/or enrichment information provided by the non-RT RIC 412 over the A1 interface.).
Ranganath and Balasubramanian are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 4, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1.
Balasubramanian teaches wherein the model manager device is situated in the service management and orchestration device and uses resources of the service management and orchestration device to process specified calls to the group of application programming interfaces, while exposing the predict application programming interface in the serving hub device ([ML-Based Control Loops, pg. 11-12] The life-cycle training and mapping of the AI/ML models for the three control loops is illustrated in Figure 4. Loop 3: wherein the model manager device is situated in the service management and orchestration device The ML/AI analytics engine for loop 3 uses RAN KPI statistics reported from the RAN nodes to SMO over the O1 interface as the training dataset to build its ML models. uses resources of the service management and orchestration device to process specified calls to the group of application programming interfaces, while exposing the predict application programming interface in the serving hub device SMO uses these models to make accurate decisions on policies and configuration of KPI objectives. These decisions are further communicated to the near-real-time RIC over A1. The aforementioned procedure is outlined in Steps 1–7 in Figure 4.; Loop 2: Due to the near-real-time nature of loop 2, the ML-based microservices at the RIC use hybrid models, comprising a mix of both offline and online ML. Online ML models (e.g., deep/recurrent neural networks, reinforcement learning) can achieve lower control loop latency since they are typically processed on a single stream of incoming RAN data to the RIC. However, since online ML models suffer from accuracy in generating optimized RRM decisions, complementary offline ML models are used leveraging historical information in R-NIB. Typically, while inferences are being made in loop 2 (in the order of millisecond), larger time-scale analysis is continuously being done in loop 3. Feedback on the precision and accuracy of the predictions in loop 2 is provided to the nonreal-time RIC in loop 3 via the O1 interface. This allows for fine tuning the ML models and guides the operation of the overall application toward a certain objective. If the ML models in loop 2 misbehave or exhibit degraded performance, the nonreal-time RIC may instruct loop 2 to terminate the ML model or switch to a more improved model. This results in enhanced RRM decisions at the RIC in near-real-time, as shown in Step 7; Loop 3: SMO offers policy guidance on the minimum fraction of traffic that should be split and served on any base station (eNB or gNB) participating in DC for any UE. It uses offline ML models built from large historical RAN KPI data reported over O1 to efficiently compute this threshold for more recent RAN conditions reflecting up to the current state. For example, offline ML models could suggest a load imbalance across the pairs of base stations, participating in DC, whose traffic split does not meet this minimum threshold. Such threshold fractions are communicated as policy guidance to the RIC over A1 and/or to the RAN over O1, and can be updated over coarser time scales.).
Ranganath and Balasubramanian are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 15, Ranganath, as modified by Balasubramanian, teaches The non-transitory computer-readable medium of claim 13.
Balasubramanian teaches further comprising a model manager device that manages the machine learning model, wherein the model manager device is situated in the service management and orchestration device and uses resources of the service management and orchestration device to process a group of calls to the group of application programming interfaces ([ML-Based Control Loops, pg. 11-12] The life-cycle training and mapping of the AI/ML models for the three control loops is illustrated in Figure 4. Loop 3: further comprising a model manager device that manages the machine learning model, wherein the model manager device is situated in the service management and orchestration device The ML/AI analytics engine for loop 3 uses RAN KPI statistics reported from the RAN nodes to SMO over the O1 interface as the training dataset to build its ML models. and uses resources of the service management and orchestration device to process a group of calls to the group of application programming interfaces SMO uses these models to make accurate decisions on policies and configuration of KPI objectives. These decisions are further communicated to the near-real-time RIC over A1. The aforementioned procedure is outlined in Steps 1–7 in Figure 4.; Loop 2: Due to the near-real-time nature of loop 2, the ML-based microservices at the RIC use hybrid models, comprising a mix of both offline and online ML. Online ML models (e.g., deep/recurrent neural networks, reinforcement learning) can achieve lower control loop latency since they are typically processed on a single stream of incoming RAN data to the RIC. However, since online ML models suffer from accuracy in generating optimized RRM decisions, complementary offline ML models are used leveraging historical information in R-NIB. Typically, while inferences are being made in loop 2 (in the order of millisecond), larger time-scale analysis is continuously being done in loop 3. Feedback on the precision and accuracy of the predictions in loop 2 is provided to the nonreal-time RIC in loop 3 via the O1 interface. This allows for fine tuning the ML models and guides the operation of the overall application toward a certain objective. If the ML models in loop 2 misbehave or exhibit degraded performance, the nonreal-time RIC may instruct loop 2 to terminate the ML model or switch to a more improved model. This results in enhanced RRM decisions at the RIC in near-real-time, as shown in Step 7; Loop 3: SMO offers policy guidance on the minimum fraction of traffic that should be split and served on any base station (eNB or gNB) participating in DC for any UE. It uses offline ML models built from large historical RAN KPI data reported over O1 to efficiently compute this threshold for more recent RAN conditions reflecting up to the current state. For example, offline ML models could suggest a load imbalance across the pairs of base stations, participating in DC, whose traffic split does not meet this minimum threshold. Such threshold fractions are communicated as policy guidance to the RIC over A1 and/or to the RAN over O1, and can be updated over coarser time scales.).
Ranganath and Balasubramanian are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 16, Ranganath, as modified by Balasubramanian, teaches The non-transitory computer-readable medium of claim 15.
Balasubramanian teaches wherein the model manager device exposes the predict application programming interface in the serving hub and uses resources of the near real time radio access network intelligent controller to process calls to the predict application ([ML-Based Control Loops, pg. 11-12] The life-cycle training and mapping of the AI/ML models for the three control loops is illustrated in Figure 4. Loop 3: The ML/AI analytics engine for loop 3 uses RAN KPI statistics reported from the RAN nodes to SMO over the O1 interface as the training dataset to build its ML models. wherein the model manager device exposes the predict application programming interface in the serving hub SMO uses these models to make accurate decisions on policies and configuration of KPI objectives. These decisions are further uses resources of the near real time radio access network intelligent controller to process calls to the predict application communicated to the near-real-time RIC over A1. The aforementioned procedure is outlined in Steps 1–7 in Figure 4.; Loop 2: Due to the near-real-time nature of loop 2, the ML-based microservices at the RIC use hybrid models, comprising a mix of both offline and online ML. Online ML models (e.g., deep/recurrent neural networks, reinforcement learning) can achieve lower control loop latency since they are typically processed on a single stream of incoming RAN data to the RIC. However, since online ML models suffer from accuracy in generating optimized RRM decisions, complementary offline ML models are used leveraging historical information in R-NIB. Typically, while inferences are being made in loop 2 (in the order of millisecond), larger time-scale analysis is continuously being done in loop 3. Feedback on the precision and accuracy of the predictions in loop 2 is provided to the nonreal-time RIC in loop 3 via the O1 interface. This allows for fine tuning the ML models and guides the operation of the overall application toward a certain objective. If the ML models in loop 2 misbehave or exhibit degraded performance, the nonreal-time RIC may instruct loop 2 to terminate the ML model or switch to a more improved model. This results in enhanced RRM decisions at the RIC in near-real-time, as shown in Step 7; Loop 3: SMO offers policy guidance on the minimum fraction of traffic that should be split and served on any base station (eNB or gNB) participating in DC for any UE. It uses offline ML models built from large historical RAN KPI data reported over O1 to efficiently compute this threshold for more recent RAN conditions reflecting up to the current state. For example, offline ML models could suggest a load imbalance across the pairs of base stations, participating in DC, whose traffic split does not meet this minimum threshold. Such threshold fractions are communicated as policy guidance to the RIC over A1 and/or to the RAN over O1, and can be updated over coarser time scales.).
Ranganath and Balasubramanian are combinable for the same rationale as set forth above with respect to claim 1.
Regarding claim 11, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1.
Balasubramanian teaches wherein the operations further comprise performing a global sharing procedure in which a pre-trained machine learning model that is trained on global data or local data is shared by the service management and orchestration device to another service management and orchestration device ([DC Optimization, pg. 12] Loop 3: shared by the service management and orchestration device to another service management and orchestration device SMO offers policy guidance on the minimum fraction of traffic that should be split and served on any base station (eNB or gNB) participating in DC for any UE. It uses performing a global sharing procedure in which a pre-trained machine learning model that is trained on global data or local data offline ML models built from large historical RAN KPI data reported over O1 to efficiently compute this threshold for more recent RAN conditions reflecting up to the current state. For example, offline ML models could suggest a load imbalance across the pairs of base stations, participating in DC, whose traffic split does not meet this minimum threshold. Such threshold fractions are communicated as policy guidance to the RIC over A1 and/or to the RAN over O1, and can be updated over coarser time scales.).
Ranganath and Balasubramanian are combinable for the same rationale as set forth above with respect to claim 1.
Claims 2, 10, 13, 20 are rejected under 35 U.S.C. 103 as being unpatentable over Ranganath, in view of Balasubramanian, and further in view of Pham et al. (NPL: "HexRIC: Building a Better Near-real Time Network Controller for the Open RAN Ecosystem", hereinafter 'Pham').
Regarding claim 2 and analogous claim 13, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1 and The non-transitory computer-readable medium of claim 12, respectively.
Ranganath teaches wherein the group of application programming interfaces further comprises at least one of: a deploy application programming interface that enables authenticated deployment, to the serving hub device, of the machine learning model that was pre-trained by the model manager device according to a first dataset ([0068] Based on RAN-specific slice SLA requirements, the non-RT RIC 412 and the near-RT RIC 414 can fine-tune RAN behaviors to assure network slice SLAs dynamically. Utilizing slice specific performance metrics (e.g., based on measurement data 415 received from E2 nodes and/or UEs), the non-RT RIC 412 monitors long-term trends and patterns regarding RAN slice subnets' performance, and trains of the machine learning model AI/ML models to be a deploy application programming interface that enables authenticated deployment, to the serving hub device deployed at the near-RT RIC 414 (e.g., trained AI/ML models 3c24 of FIG. 3c). The AI/ML models 3c24 may include heuristics and/or inference/predictive algorithms, which may be based on any of those discussed herein such as those shown by FIGS. 18 and 19. In various implementations, one or more of the that was pre-trained by the model manager device according to a first dataset trained AI/ML models 3c24 may be part of the xApp manager 425, which uses slice specific performance metrics as well as telemetry data (or profiling information) of the underlying platform to determine resource allocations for individual xApps 410 and/or other elements.);
Ranganath, as modified by Balasubramanian, fails to teach a retrain application programming interface that retrains all model weights of the machine learning model based on a received second dataset, resulting in a retrained machine learning model, and calls the deploy application programming interface to deploy the retrained machine learning model on the serving hub device; or a fine tune application that adjusts at least one model weight of the machine learning model based on a received third dataset, resulting in a tuned machine learning model, and calls the deploy application programming interface to deploy the tuned machine learning model on the serving hub device.
Pham teaches a retrain application programming interface that retrains all model weights of the machine learning model based on a received second dataset, resulting in a retrained machine learning model, and calls the deploy application programming interface to deploy the retrained machine learning model on the serving hub device; or a fine tune application that adjusts at least one model weight of the machine learning model based on a received third dataset, resulting in a tuned machine learning model, and calls the deploy application programming interface to deploy the tuned machine learning model on the serving hub device ([The MLOps Framework., pg. 18] The operational workflow of the framework has been shown in Fig. 4. First, we have the On-boarding Stage, which involves uploading an on-boarding package to the Model LCM entity. This on-boarding package consists of the model, a set of artifacts supporting the model, a manifest containing the model metadata such as versioning information, required libraries, etc., and a set of input and output parameter types. In a major step towards enhancing simplicity, preparing the on-boarding package requires no prior domain knowledge regarding the RIC, for e.g., there is no need for implementing yet another xApp each time a new ML model is added. The on-boarding package then undergoes a validation step, and, depending on the nature of the model, one of three possible stages are executed. For untrained models, the framework executes the Training Stage. As part of this stage, the ML Executor executes the provided untrained model which is then trained on data from the feature store. The trained model is then stored in the model and artifacts database. On the other hand, the Serving and Retraining Stage leverages models that have been trained previously (either internally as part of the Training Stage or externally at the non-RT RIC) and are ready for deployment. The output from such models is then sent over the messaging infrastructure to other xApps for consumption. Model performance is monitored by Prometheus and analyzed by M&E to identify model drift. Model drift is categorized as either sudden or incremental, with the a retrain application programming interface that retrains all model weights of the machine learning model based on a received second dataset, resulting in a retrained machine learning model, and calls the deploy application programming interface to deploy the retrained machine learning model on the serving hub device former resulting in retraining, while the latter necessitates a fine tune application that adjusts at least one model weight of the machine learning model based on a received third dataset, resulting in a tuned machine learning model, and calls the deploy application programming interface to deploy the tuned machine learning model on the serving hub device replacement with a previously trained model.).
Ranganath, Balasubramanian, and Pham are considered to be analogous to the claimed invention because they are in the same field of network and edge computing. In view of the teachings of Ranganath and Balasubramanian, it would have been obvious for a person of ordinary skill in the art to apply the teachings of Pham to Ranganath before the effective filing date of the claimed invention in order to simplify and automate the life-cycle management of ML models within the near-RT RIC (cf. Pham, [A Machine Learning Operations Framework., pg. 16] HexRIC introduces a machine learning operations (MLOps) framework to simplify and automate the life-cycle management of ML models within the near-RT RIC. With HexRIC’s MLOps framework, ML scientists and engineers can focus on key model development tasks without having to worry about the complexities of the RIC.).
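Solely for purposes of illustration, the distinction the claims draw between a retrain API (all model weights recomputed from a second dataset) and a fine-tune API (at least one existing weight adjusted based on a third dataset), each followed by a call to the deploy API, may be sketched as follows; the one-weight model, learning rate, and deploy stub are hypothetical simplifications, not taken from the claims or the cited references:

def deploy(model, hub):
    # Stand-in for the deploy API: register the model on the serving hub.
    hub[model["name"]] = dict(model)

def retrain(name, second_dataset, hub):
    # Retrain API: all model weights are recomputed from the second dataset,
    # then the retrained model is deployed via the deploy API.
    model = {"name": name, "weight": sum(second_dataset) / len(second_dataset)}
    deploy(model, hub)
    return model

def fine_tune(name, third_dataset, hub, lr=0.1):
    # Fine-tune API: at least one existing weight is adjusted toward the
    # third dataset, then the tuned model is deployed via the deploy API.
    model = dict(hub[name])
    target = sum(third_dataset) / len(third_dataset)
    model["weight"] += lr * (target - model["weight"])
    deploy(model, hub)
    return model

hub = {}
deploy({"name": "kpi_model", "weight": 1.0}, hub)  # pre-trained on a first dataset
retrain("kpi_model", [4.0, 6.0], hub)              # weight becomes 5.0
fine_tune("kpi_model", [10.0], hub)                # weight nudged from 5.0 to 5.5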
Regarding claim 10 and analogous claim 20, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1 and The method of claim 17, respectively.
Ranganath, as modified by Balasubramanian, fails to teach wherein the operations further comprise performing a global training procedure in which the machine learning model is trained on global data that is collected from at least one of: multiple different service management and orchestration devices or multiple different radio access network intelligent controllers.
Pham teaches wherein the operations further comprise performing a global training procedure in which the machine learning model is trained on global data that is collected from at least one of: multiple different service management and orchestration devices or multiple different radio access network intelligent controllers ([The MLOps Framework., pg. 17] With a view to simplifying ML-related operations within the RIC, HexRIC includes a novel MLOps Framework. The framework has been designed to address a number of challenges including: (i) enhancing ease-of-use, (ii) automating AI/ML workflows, (iii) improving scalability while reducing complexity, and (iv) ML model monitoring, evaluation, retraining, and redeployment on a near-RT timescale. As shown in Fig. 3, the framework consists of a Data Broker, a Model LCM component, a Monitoring and Evaluation (M&E) system, and a number of ML Executors and Serving Servers, in addition to a feature store from the Feast Project [3], an event monitoring service such as Prometheus [6], and MinIO [16] and MariaDB for storing models and artifacts. In particular, the performing a global training procedure in which the machine learning model is trained on global data that is collected from at least one of Data Broker aggregates and processes data from a number of sources, for e.g., xApps, the multiple different service management and orchestration devices RAN, external applications, and the multiple different radio access network intelligent controllers non-RT RIC. The processed data is then stored in the Feast feature store. Model LCM performs ML model lifecycle management and validation, while leveraging the ML Executors for model training and Serving Servers for inference. The ML Executor is a generic component used to execute any kind of ML file and decouples the MLOps Framework from specific ML libraries, thereby allowing for models to use all kinds of libraries, greatly enhancing ease-of-use.).
Ranganath, Balasubramanian, and Pham are combinable for the same rationale as set forth above with respect to claim 2.
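For purposes of illustration only, the following minimal Python sketch shows a global training procedure that pools data collected from multiple sources before training. The source labels ("smo-1", "near-rt-ric-1") and the trivial mean "model" are hypothetical assumptions for exposition, not drawn from Pham or the claims.

```python
def collect_global_data(sources: dict) -> list:
    """Pool data collected from multiple SMO devices and/or multiple RICs."""
    pooled = []
    for name, samples in sources.items():   # e.g., "smo-1", "near-rt-ric-1", ...
        pooled.extend(samples)
    return pooled

def global_train(sources: dict) -> float:
    """'Train' a trivial one-parameter model (the mean) on the pooled global data."""
    data = collect_global_data(sources)
    return sum(data) / len(data)

model = global_train({"smo-1": [1.0, 2.0], "smo-2": [3.0], "near-rt-ric-1": [4.0]})
```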
Claims 5-7, 18-19 are rejected under 35 U.S.C. 103 as being unpatentable over Ranganath, in view of Balasubramanian, and further in view of Li et al. (NPL: "DLHub: Simplifying publication, discovery, and use of machine learning models in science", hereinafter 'Li').
Regarding claim 5 and analogous claim 18, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1 and The method of claim 17, respectively.
Ranganath, as modified by Balasubramanian, fails to teach wherein the operations further comprise performing a caching procedure that determines, from among a group of machine learning models that is deployed on the serving hub device, a subscribed group of machine learning models that is to be placed in a cache.
Li teaches wherein the operations further comprise performing a caching procedure that determines, from among a group of machine learning models that is deployed on the serving hub device, a subscribed group of machine learning models that is to be placed in a cache ([4.3. Inference execution system, pg. 69] funcX: The funcX service implements a secure task execution model with hierarchical queues for reliability. Tasks are submitted to the funcX Web service where they are queued for execution. A Python Forwarder process is operated for each endpoint. The Forwarder retrieves tasks from the cloud-hosted queues and transmits them to the endpoint via a secure, low-latency, and reliable message communication channel. Once delivered to the endpoint, tasks are internally queued until they can be scheduled for execution on the resource. Results are returned via the same channel and deposited in a result queue until they can be retrieved by the user. funcX uses a Redis store to implement the cloud-based queues. Redis is an easy-to-scale, in-memory key–value store. Each function execution request is stored in a Redis hashmap and the task identifier is added to the endpoint’s queue. funcX uses ZeroMQ to establish high performance communication channels between the forwarder and endpoint.; DLHub and funcX: When a user publishes a model to DLHub, we [performing a caching procedure that determines, from among a group of machine learning models that is deployed on the serving hub device, a subscribed group of machine learning models that is to be placed in a cache] create and register a function with funcX and associate it with the DLHub servable container. This allows funcX to deploy the servable on-demand to perform DLHub invocations. When a user invokes a servable using DLHub the request is routed to a DLHub operated funcX endpoint. The funcX agent will then deploy the servable and, once the servable is ready, deliver the request for execution. The funcX agent is responsible for deploying and managing servables, monitoring incoming requests from DLHub (via the funcX service), and then executing waiting tasks. The funcX agent can be deployed in Docker environments, Kubernetes clusters, HPC resources via Singularity or Shifter, or locally via any of these containerization mechanisms.).
Ranganath, Balasubramanian, and Li are considered to be analogous to the claimed invention because they are in the same field of network and edge computing. In view of the teachings of Ranganath and Balasubramanian, it would have been obvious to a person of ordinary skill in the art to apply the teachings of Li to Ranganath before the effective filing date of the claimed invention in order to publish and share models and to serve them on a range of available computing resources (cf. Li, [Abstract, pg. 64] Machine Learning (ML) has become a critical tool enabling new methods of analysis and driving deeper understanding of phenomena across scientific disciplines. There is a growing need for ‘‘learning systems’’ to support various phases in the ML lifecycle. While others have focused on supporting model development, training, and inference, few have focused on the unique challenges inherent in science, such as the need to publish and share models and to serve them on a range of available computing resources. In this paper, we present the Data and Learning Hub for science (DLHub), a learning system designed to support these use cases. Specifically, DLHub enables publication of models, with descriptive metadata, persistent identifiers, and flexible access control. It packages arbitrary models into portable servable containers, and enables low-latency, distributed serving of these models on heterogeneous compute resources. We show that DLHub supports low-latency model inference comparable to other model serving systems including TensorFlow Serving, SageMaker, and Clipper, and improved performance, by up to 95%, with batching and memoization enabled. We also show that DLHub can scale to concurrently serve models on 500 containers. Finally, we describe five case studies that highlight the use of DLHub for scientific applications.).
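For purposes of illustration only, the following minimal Python sketch shows a caching procedure that selects, from the deployed models, a subscribed group to place in a cache. The subscription flag, recency field, and cache size are hypothetical assumptions for exposition, not drawn from Li's DLHub.

```python
def select_cached_models(deployed: dict, cache_size: int) -> set:
    """From the deployed models, pick the subscribed group to keep in cache."""
    subscribed = [name for name, meta in deployed.items() if meta["subscribed"]]
    # Keep at most cache_size subscribed models, most recently used first.
    subscribed.sort(key=lambda n: deployed[n]["last_used"], reverse=True)
    return set(subscribed[:cache_size])

deployed = {
    "traffic-predictor": {"subscribed": True,  "last_used": 172},
    "beam-selector":     {"subscribed": True,  "last_used": 305},
    "anomaly-detector":  {"subscribed": False, "last_used": 44},
}
cache = select_cached_models(deployed, cache_size=2)   # both subscribed models
```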
Regarding claim 6, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 5.
Ranganath, as modified by Balasubramanian, fails to teach wherein the caching procedure determines the subscribed group as a function of at least one of: a first criterion, satisfaction of which is indicative of a low latency constraint associated with the xApp, a second criterion, satisfaction of which is indicative of a recency of use, a third criterion, satisfaction of which is indicative of a frequency of use, or a fourth criterion, satisfaction of which is indicative of a subscription flow to a given machine learning model of the group of machine learning models.
Li teaches wherein the caching procedure determines the subscribed group as a function of at least one of: a first criterion, satisfaction of which is indicative of a low latency constraint associated with the xApp ([4.3. Inference execution system, pg. 69] DLHub coordinates the execution of inference tasks on remote resources. This architecture focuses on high performance and low latency model inference as well as flexibility in terms of where inference tasks are executed. Specifically, DLHub allows researchers to execute inference tasks on Kubernetes clusters, HPC resources, or clouds using various container technologies (e.g., Docker, Singularity or Shifter), on edge devices, or even on their own execution resources using any of these containerization mechanisms. DLHub supports both synchronous and asynchronous task execution. In asynchronous mode, the DLHub SDK returns a task UUID that can be used subsequently to monitor the status of the task and retrieve its result.; Implementation: DLHub’s on-demand inference is built on the funcX distributed Function-as-a-Service platform. We briefly describe funcX and outline how it is used by DLHub. funcX: funcX enables the managed execution of functions— snippets of Python code—on arbitrary remote resources. Users can register and discover functions through a cloud-hosted service and then execute those functions with arbitrary input parameters on arbitrary endpoints. Where an endpoint abstracts a specific compute resource, whether a single edge device or a supercomputer, in a manner defined by the funcX agent software. The funcX service implements a secure task execution model with hierarchical queues for reliability. [the caching procedure determines the subscribed group as a function of at least one of] Tasks are submitted to the funcX Web service where they are queued for execution. A Python Forwarder process is operated for each endpoint. The Forwarder [satisfaction of which is indicative of a low latency constraint associated with the xApp] retrieves tasks from the cloud-hosted queues and transmits them to the endpoint via a secure, low-latency, and reliable message communication channel.),
a second criterion, satisfaction of which is indicative of a recency of use ([6.7. Discussion, pg. 73] We briefly discuss the lessons learned by using DLHub in these five use cases. Prior to using DLHub, these use cases required a substantial amount of human effort to manually manage model versions, publish and share models with others, deploy complex software environments on distributed computing resources, and reliably deploy models for real-time inferences at scale. DLHub provides several benefits: First, DLHub manages different versions of the same model, removing challenges associated with tracking model versions and using incorrect versions. A key side effect of this is that researchers are able to [a second criterion, satisfaction of which is indicative of a recency of use] deploy new versions of their models and compare the performance to any of the previously published versions. Second, DLHub’s on-demand inference system abstracts the complexity of deploying models at different computing resources and enables researchers to easily deploy their models at scale, without requiring expert knowledge of batch submission interfaces and computing architectures. Finally, the containerization of models allows researchers to securely share models with others, removing the burden of porting models and environments to other locations.),
a third criterion, satisfaction of which is indicative of a frequency of use, or a fourth criterion, satisfaction of which is indicative of a subscription flow to a given machine learning model of the group of machine learning models. Because the criteria are recited in the alternative ("at least one of"), the cited art need only teach one of the recited criteria; Li's teaching of at least the first and second criteria, set forth above, therefore meets the limitation. A sketch combining all four criteria follows the combinability statement below.
Ranganath, Balasubramanian, and Li are combinable for the same rationale as set forth above with respect to claim 5.
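For purposes of illustration only, the following minimal Python sketch combines the four recited criteria into a single cache-admission score. The weights, the 10 ms latency threshold, and the score form are arbitrary assumptions for exposition and are not drawn from Li.

```python
def cache_score(meta: dict, now: float) -> float:
    """Higher score -> stronger candidate for the subscribed (cached) group."""
    latency_need = 1.0 if meta["xapp_latency_ms"] <= 10 else 0.0   # first criterion
    recency      = 1.0 / (1.0 + (now - meta["last_used"]))         # second criterion
    frequency    = float(meta["use_count"])                        # third criterion
    subscription = float(meta["active_subscriptions"])             # fourth criterion
    return 4.0 * latency_need + 2.0 * recency + frequency + subscription

meta = {"xapp_latency_ms": 5, "last_used": 990.0, "use_count": 12,
        "active_subscriptions": 3}
score = cache_score(meta, now=1000.0)   # higher score -> keep in cache
```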
Regarding claim 7 and analogous claim 19, Ranganath, as modified by Balasubramanian, teaches The serving hub device of claim 1 and The method of claim 17, respectively.
Ranganath, as modified by Balasubramanian, fails to teach wherein the operations further comprise performing a model searching procedure that, in response to a search query, determines, from among a group of machine learning models that is deployed on the serving hub device, a most relevant group of machine learning models that satisfies the search query and communicates documentation associated with the most relevant group of machine learning models.
Li teaches wherein the operations further comprise performing a model searching procedure that, in response to a search query, determines, from among a group of machine learning models that is deployed on the serving hub device, a most relevant group of machine learning models that satisfies the search query and communicates documentation associated with the most relevant group of machine learning models ([4.3. Inference execution system, pg. 69] Implementation: DLHub’s on-demand inference is built on the funcX distributed Function-as-a-Service platform. We briefly describe funcX and outline how it is used by DLHub. funcX: funcX enables the managed execution of functions— snippets of Python code—on arbitrary remote resources. Users can register and discover functions through a cloud-hosted service and then execute those functions with arbitrary input parameters on arbitrary endpoints. Where an endpoint abstracts a specific compute resource, whether a single edge device or a supercomputer, in a manner defined by the funcX agent software. The funcX service implements a secure task execution model with hierarchical queues for reliability. [performing a model searching procedure that, in response to a search query, determines, from among a group of machine learning models that is deployed on the serving hub device] Tasks are submitted to the funcX Web service where they are queued for execution. A Python Forwarder process is operated for each endpoint. The Forwarder retrieves tasks from the cloud-hosted queues and transmits them to the endpoint via a secure, low-latency, and reliable message communication channel. Once delivered to the endpoint, tasks are internally queued until they can be scheduled for execution on the resource. Results are returned via the same channel and deposited in a result queue until they can be retrieved by the user. funcX uses a Redis store [a most relevant group of machine learning models that satisfies the search query and communicates documentation associated with the most relevant group of machine learning models] to implement the cloud-based queues. Redis is an easy-to-scale, in-memory key–value store. Each function execution request is stored in a Redis hashmap and the task identifier is added to the endpoint’s queue. funcX uses ZeroMQ to establish high performance communication channels between the forwarder and endpoint.).
Ranganath, Balasubramanian, and Li are combinable for the same rationale as set forth above with respect to claim 5.
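For purposes of illustration only, the following minimal Python sketch shows a model searching procedure that ranks deployed models by keyword overlap with a search query and returns the associated documentation. The catalog layout and the overlap-based relevance measure are hypothetical assumptions for exposition, not DLHub's actual ranking.

```python
def search_models(query: str, catalog: dict, top_k: int = 3) -> list:
    """Rank deployed models by keyword overlap; return (name, documentation) pairs."""
    terms = set(query.lower().split())
    scored = []
    for name, meta in catalog.items():
        overlap = len(terms & set(meta["tags"]))   # shared keywords = relevance
        if overlap:
            scored.append((overlap, name, meta["documentation"]))
    scored.sort(reverse=True)                      # most relevant group first
    return [(name, doc) for _, name, doc in scored[:top_k]]

catalog = {
    "handover-opt": {"tags": ["handover", "mobility"], "documentation": "Docs A"},
    "load-balance": {"tags": ["traffic", "load"],      "documentation": "Docs B"},
}
results = search_models("mobility handover tuning", catalog)   # -> handover-opt first
```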
Claims 8-9 are rejected under 35 U.S.C. 103 as being unpatentable over Ranganath, in view of Balasubramanian, Li, and further in view of Pham.
Regarding claim 8, Ranganath, as modified by Balasubramanian and Li, teaches The serving hub device of claim 7.
Ranganath, as modified by Balasubramanian and Li, fails to teach wherein the model searching procedure is configured to interpret a natural language search query by matching search keywords to model keywords entered as metadata model tags upon deployment to the serving hub device.
Pham teaches wherein the model searching procedure is configured to interpret a natural language search query by matching search keywords to model keywords entered as metadata model tags upon deployment to the serving hub device ([The MLOps Framework., pg. 18] The operational workflow of the framework has been shown in Fig. 4. First, we have the On-boarding Stage, which involves uploading an on-boarding package to the Model LCM entity. This [wherein the model searching procedure is configured to interpret a natural language search query by matching search keywords to model keywords entered as metadata model tags upon deployment to the serving hub device] on-boarding package consists of the model, a set of artifacts supporting the model, a manifest containing the model metadata such as versioning information, required libraries, etc., and a set of input and output parameter types. In a major step towards enhancing simplicity, preparing the on-boarding package requires no prior domain knowledge regarding the RIC, for e.g., there is no need for implementing yet another xApp each time a new ML model is added. The on-boarding package then undergoes a validation step, and, depending on the nature of the model, one of three possible stages are executed. For untrained models, the framework executes the Training Stage. As part of this stage, the ML Executor executes the provided untrained model which is then trained on data from the feature store. The trained model is then stored in the model and artifacts database. On the other hand, the Serving and Retraining Stage leverages models that have been trained previously (either internally as part of the Training Stage or externally at the non-RT RIC) and are ready for deployment. The output from such models is then sent over the messaging infrastructure to other xApps for consumption. Model performance is monitored by Prometheus and analyzed by M&E to identify model drift. Model drift is categorized as either sudden or incremental, with the former resulting in retraining, while the latter necessitates replacement with a previously trained model.).
Ranganath, Balasubramanian, Li, and Pham are considered to be analogous to the claimed invention because they are in the same field of network and edge computing. In view of the teachings of Ranganath, Balasubramanian, and Li, it would have been obvious to a person of ordinary skill in the art to apply the teachings of Pham to Ranganath before the effective filing date of the claimed invention in order to simplify and automate the life-cycle management of ML models within the near-RT RIC (cf. Pham, [A Machine Learning Operations Framework., pg. 16] HexRIC introduces a machine learning operations (MLOps) framework to simplify and automate the life-cycle management of ML models within the near-RT RIC. With HexRIC’s MLOps framework, ML scientists and engineers can focus on key model development tasks without having to worry about the complexities of the RIC.).
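For purposes of illustration only, the following minimal Python sketch shows matching search keywords from a natural language query against metadata model tags recorded at deployment time. The manifest layout and the registry are hypothetical assumptions for exposition, not drawn from Pham's on-boarding package format.

```python
REGISTRY: dict = {}   # model name -> set of lowercased metadata tags (hypothetical)

def deploy_with_tags(name: str, manifest: dict) -> None:
    """On deployment, record the metadata model tags from the on-boarding manifest."""
    REGISTRY[name] = {tag.lower() for tag in manifest["tags"]}

def match_query(natural_language_query: str) -> list:
    """Match search keywords from a natural language query against stored tags."""
    keywords = {word.strip(".,?!").lower() for word in natural_language_query.split()}
    return [name for name, tags in REGISTRY.items() if keywords & tags]

deploy_with_tags("qos-forecaster", {"tags": ["QoS", "forecast", "latency"]})
hits = match_query("Which model can forecast QoS?")   # -> ["qos-forecaster"]
```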
Regarding claim 9, Ranganath, as modified by Balasubramanian, Li, and Pham, teaches The serving hub device of claim 8.
Li teaches wherein the model searching procedure uses at least one of a named entity recognition process or a syntactical and semantic matching process ([4.1. Management service and catalog, pg. 68] Model repository: The primary function of the Management Service is to support the publication and discovery of models. DLHub defines a general model schema that is used to describe all published models. The schema includes standard publication metadata (e.g., creator, date, [a named entity recognition process] name, description) as well as ML specific [or a syntactical and semantic matching process] metadata such as model type (e.g., Keras, TensorFlow) and input and output data types. These metadata are registered with a search catalog to enable flexible discovery. Model discovery: DLHub’s discovery interface supports fine grained, access-controlled search across registered model metadata. It provides a [the model searching procedure uses at least one of] rich search model, in which model metadata can be queried using free text queries, partial matching, range queries, faceted search, and more through both the DLHub CLI and SDK.).
Ranganath, Balasubramanian, Li, and Pham are combinable for the same rationale as set forth above with respect to claim 8.
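For purposes of illustration only, the following minimal Python sketch shows a syntactical matching step using the standard library's difflib; Li does not prescribe this mechanism, and a named entity recognition model could be substituted for the fuzzy matcher. The tag list is a hypothetical assumption for exposition.

```python
import difflib

MODEL_TAGS = ["handover", "beamforming", "traffic-steering", "anomaly-detection"]

def syntactic_match(keyword: str, cutoff: float = 0.6) -> list:
    """Fuzzy-match a query keyword against known model tags (tolerates misspellings)."""
    return difflib.get_close_matches(keyword.lower(), MODEL_TAGS, n=3, cutoff=cutoff)

matches = syntactic_match("beamformng")   # -> ["beamforming"] despite the typo
```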
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Melodia et al. (U.S. Pre-Grant Publication No. 20220167236) teaches a radio access network (RAN) intelligent controller (RIC), and a corresponding method, that may be implemented within a RAN in next-generation cellular networks to improve performance.
Any inquiry concerning this communication or earlier communications from the examiner should be directed to MAGGIE MAIDO whose telephone number is (703) 756-1953. The examiner can normally be reached M-Th: 6am - 4pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/MM/Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129