DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Continued Examination Under 37 CFR 1.114
A request for continued examination under 37 CFR 1.114, including the fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection. Since this application is eligible for continued examination under 37 CFR 1.114, and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the previous Office action has been withdrawn pursuant to 37 CFR 1.114. Applicant's submission filed on 10/14/2025 has been entered.
Response to Arguments
Applicant's arguments filed 10/14/2025 have been fully considered.
Regarding applicant’s remarks directed to the rejection of claims under 35 USC § 103, the arguments are directed to newly amended limitations that were not previously examined by the examiner. Applicant's arguments are therefore moot. The examiner refers to the rejection under 35 USC § 103 in the current Office action for more details.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
The factual inquiries for establishing a background for determining obviousness under 35 U.S.C. 103 are summarized as follows:
1. Determining the scope and contents of the prior art.
2. Ascertaining the differences between the prior art and the claims at issue.
3. Resolving the level of ordinary skill in the pertinent art.
4. Considering objective evidence present in the application indicating obviousness or nonobviousness.
Claims 1-2 and 9-12 are rejected under 35 U.S.C. 103 as being unpatentable over U.S. Pat. No. US10257275B1 Dirac et al. (“Dirac”) in view of Van Aken, Dana, et al. "Automatic database management system tuning through large-scale machine learning." Proceedings of the 2017 ACM international conference on management of data. 2017. (“Aken”).
In regards to claim 1,
Dirac teaches A computer implemented tuning method carried on an IT system comprising a System Under Test (SUT) including a stack of software layers, provided with a number of adjustable parameters,
(Dirac, Col. 17 line 67 – Col. 18 line 14, “A wide variety of tunable parameters [provided with a number of adjustable parameters] may be amenable for GP-based Bayesian optimization in different embodiments, at various levels of the hardware/software stacks [stack of software layers] used in the execution environments. Examples of such parameters in various embodiments may include a garbage collection algorithm to be used for memory management, a minimum or maximum size of a memory to be allocated for a heap managed using the garbage collection algorithm, various networking configuration parameters associated with networking protocols being used, the number of nodes to be included in a collection of nodes implementing a distributed application, database configuration parameters such as buffer pool sizes, hypervisor configuration parameters, or storage device configuration parameters.”)
Dirac teaches said initial tasks are selected according to a user-specified parameter N, indicating that a new task should be created every N tuning iterations.
(Dirac, Col. 5 lines 36-46, “The extent to which the clients on whose behalf the execution environment is configured influence or control [a user-specified parameter N] the optimization tool's operations may also vary in different embodiments. In some embodiments, clients may not even be aware that optimization tasks are being performed for them; in contrast, in other embodiments as described below, the clients may provide detailed guidance regarding the manner in which the tunable parameters should be varied, hierarchical or conditional relationships between tunable parameters, validity periods for observations collected from the execution environments, and so on [indicating that a new task should be created every N tuning iterations].”)
However, Dirac does not explicitly teach
the method comprising the steps of supplying
a characterization and prediction module,
a tuner module, and
a knowledge base (KB) composed by N tuples (s_i, w⃗_i, x⃗_i, y_i) being gathered over iterative tuning sessions where each iteration is started by applying to a System Under Test (SUT) s_i a configuration x⃗_i suggested by said tuner module, exposing said system s_i to an external working condition w_i and gathering performance metrics resulting in a performance indicator score y_i,
wherein said characterization and prediction module builds a characterization vector c⃗_i for each tuple stored in said knowledge base (KB) using the information stored in said knowledge base and produces a prediction about the characterization vector c⃗_{i+1} of the next tuning iteration i+1,
where characterization vector c⃗_i is a numeric representation of the tunable properties of system s_i when exposed to said external working condition w_i, and the distance between two characterization vectors c⃗_i, c⃗_j represents how similarly the systems s_i, s_j should have been configured in previous tuning iterations i, j when the two systems s_i, s_j were exposed to external working conditions w_i, w_j respectively,
said tuner module leverages said knowledge base (KB) and said characterization vectors c⃗_i to select a new suggested configuration x⃗_{i+1} to be applied to said System Under Test s_{i+1} in the next iteration i+1,
and further wherein,
assuming that there exists tuples of systems and working conditions (s_i, w_i) which should be configured in a similar way, and a group of said tuples t_k = ((s_i, w_i), (s_j, w_j), ...) is called archetypal task t_k,
said characterization module identifies said archetypal tasks in the knowledge base (KB),
said archetypal tasks are identified by assigning each tuple of the knowledge base (KB) to an initial task and then iteratively measuring distances between tasks and clustering similar ones, so that tuples which should be tuned similarly are assigned to the same task,
Aken teaches the method comprising the steps of supplying
a characterization and prediction module,
a tuner module, and
a knowledge base (KB)
[Image: media_image1.png, Greyscale (Aken, Fig. 3)]
(Aken, Fig. 3 teaches a workload characterization module that performs K-means clustering (thus generating a prediction of cluster membership), a tuner module, i.e., the Automatic Tuner, and a knowledge base, i.e., the Data Repository)
Aken teaches composed by N tuples (s_i, w⃗_i, x⃗_i, y_i) being gathered over iterative tuning sessions where each iteration is started by applying to a System Under Test (SUT) s_i a configuration x⃗_i suggested by said tuner module, exposing said system s_i to an external working condition w_i and gathering performance metrics resulting in a performance indicator score y_i,
[Image: media_image2.png, Greyscale (Aken, Fig. 2)]
(Aken, Section 3, “Fig. 2 shows an overview of OtterTune’s architecture. The system is comprised of two parts. The first is the client-side controller that interacts with the target DBMS to be tuned. It collects runtime information from the DBMS using a standard API (e.g., JDBC), installs new configurations, and collects performance measurements.
The second part is OtterTune’s tuning manager. It receives the information collected from the controller and stores it in its repository with data from previous tuning sessions [composed by N tuples (s_i, w⃗_i, x⃗_i, y_i) being gathered over iterative tuning sessions where each iteration is started by applying to a System Under Test (SUT) s_i a configuration x⃗_i suggested by said tuner module, exposing said system s_i to an external working condition w_i and gathering performance metrics resulting in a performance indicator score y_i]. This repository does not contain any confidential information about the DBMSs or their databases; it only contains knob configurations and performance data. OtterTune organizes this data per major DBMS version (e.g., Postgres v9.3 data is separate from Postgres v9.4). This prevents OtterTune from tuning knobs from older versions of the DBMS that may be deprecated in newer versions, or tuning knobs that only exist in newer versions. The manager is also supported by background processes that continuously analyze new data and refine OtterTune’s internal ML models. These models allow it to identify the relevant knobs and metrics without human input, and find workloads in the repository that are similar to the target.”)
Aken teaches wherein said characterization and prediction module builds a characterization vector c⃗_i for each tuple stored in said knowledge base (KB) using the information stored in said knowledge base and produces a prediction about the characterization vector c⃗_{i+1} of the next tuning iteration i+1, where characterization vector c⃗_i is a numeric representation of the tunable properties of system s_i when exposed to said external working condition w_i, and the distance between two characterization vectors c⃗_i, c⃗_j represents how similarly the systems s_i, s_j should have been configured in previous tuning iterations i, j when the two systems s_i, s_j were exposed to external working conditions w_i, w_j respectively,
(Aken, Section 4, “A better approach is to use the DBMS’s internal runtime metrics to characterize how a workload behaves. All modern DBMSs expose a large amount of information about the system. For example, MySQL’s InnoDB engine provides statistics on the number of pages read/written, query cache utilization, and locking overhead. OtterTune characterizes a workload using the runtime statistics recorded while executing it.”)
(Aken, Section 6.1, “The goal of this first step is to match the target DBMS’s workload with the most similar workload in its repository based on the performance measurements for the selected group of metrics. We find that the matched workload varies for the first few experiments before converging to a single workload. This suggests that the quality of the match made by OtterTune increases with the amount of data gathered from the target workload, which is what we would expect. For this reason, using a dynamic mapping scheme is preferable to static mapping (i.e., mapping one time after the end of the first observation period) because it enables OtterTune to make more educated matches as the tuning session progresses.
For each DBMS version, we build a set S of N matrices — one for every non-redundant metric — from the data in our repository [builds a characterization vector c⃗_i for each tuple stored in said knowledge base (KB) using the information stored in said knowledge base]. Similar to the Lasso and FA models, these matrices are constructed by background processes running on OtterTune’s tuning manager (see Sect. 3). The matrices in S (i.e., X0, X1, . . . XN−1) have identical row and column labels. Each row in matrix Xm corresponds to a workload in our repository and each column corresponds to a DBMS configuration from the set of all unique DBMS configurations that have been used to run any of the workloads. The entry Xm,i,j is the value of metric m observed when executing workload i with configuration j [where characterization vector c⃗_i is a numeric representation of the tunable properties of system s_i when exposed to said external working condition w_i]. If we have multiple observations from running workload i with configuration j, then entry Xm,i,j is the median of all observed values of metric m.
The workload mapping computations are straightforward. OtterTune calculates the Euclidean distance between the vector of measurements for the target workload and the corresponding vector for each workload i in the matrix Xm (i.e., Xm,i,:). It then repeats this computation for each metric m. In the final step, OtterTune computes a “score” for each workload i by taking the average of these distances over all metrics m [produces a prediction about the characterization vector c⃗_{i+1} of the next tuning iteration i+1]. The algorithm then chooses the workload with the lowest score as the one that is most similar to the target workload for that observation period [the distance between two characterization vectors c⃗_i, c⃗_j represents how similarly the systems s_i, s_j should have been configured in previous tuning iterations i, j when the two systems s_i, s_j were exposed to external working conditions w_i, w_j respectively].”)
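For illustration, the workload-mapping score described by Aken (per-metric Euclidean distances, averaged over all metrics, lowest score wins) can be sketched in Python. The data layout and all names below are the editor's hypothetical simplification, not OtterTune's actual code:

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length measurement vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def match_workload(target, repository):
    """target: {metric: measurement vector}; repository: {workload_id: {metric: vector}}.
    Score each repository workload by the average Euclidean distance over all
    target metrics; the workload with the lowest score is the most similar."""
    scores = {}
    for wid, metrics in repository.items():
        dists = [euclidean(target[m], metrics[m]) for m in target]
        scores[wid] = sum(dists) / len(dists)
    return min(scores, key=scores.get)
```

Here each repository entry is assumed to hold, per metric, a vector of measurements taken over the same configurations as the target's vector.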
Aken teaches said tuner module uses a surrogate model based on Bayesian Optimization (BO), said surrogate model being a Gaussian Process (GP);
(Aken, Section 6.2, “In the next step, OtterTune uses Gaussian Process (GP) regression [said surrogate model being a Gaussian Process (GP)] [42] to recommend configurations that it believes will improve the target metric. GP regression is a state-of-the-art technique with power approximately equal to that of deep networks. There are a number of attractive features of GPs that make it an appropriate choice for modeling the configuration space and making recommendations.”)
Aken teaches said tuner module leverages said knowledge base (KB) and said characterization vectors c⃗_i to select a new suggested configuration x⃗_{i+1} to be applied to said System Under Test s_{i+1} in the next iteration i+1,
(Aken, Section 6.1, “Now for each observation period in this step, OtterTune tries to find a better configuration than the best configuration that it has seen thus far in this session [said tuner module leverages said knowledge base (KB) and said characterization vectors c⃗_i to select a new suggested configuration x⃗_{i+1} to be applied to said System Under Test s_{i+1} in the next iteration i+1]. It does this by either (1) searching an unknown region in its GP (i.e., workloads for which it has little to no data for), or (2) selecting a configuration that is near the best configuration in its GP. The former strategy is referred to as exploration. This helps OtterTune look for configurations where knobs are set to values that are beyond the minimum or maximum values that it has tried in the past. This is useful for trying certain knobs where the upper limit might depend on the underlying hardware (e.g., the amount of memory available). The second strategy is known as exploitation. This is where OtterTune has found a good configuration and it tries slight modifications to the knobs to see whether it can further improve the performance.”)
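The exploration/exploitation step described above rests on the GP surrogate's posterior: the mean rewards exploitation, the variance rewards exploration. A minimal self-contained sketch of one acquisition step, assuming a one-dimensional configuration space, an RBF kernel, and an upper-confidence-bound acquisition rule (the names, kernel, and constants are the editor's illustrative assumptions, not OtterTune's implementation):

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on scalars."""
    return math.exp(-((a - b) ** 2) / (2 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_ucb_suggest(X, y, candidates, noise=1e-6, beta=2.0):
    """Fit a GP to observed (X, y) and pick the candidate maximizing
    posterior mean + beta * posterior std (exploitation + exploration)."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (noise if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    alpha = solve(K, y)                       # K^{-1} y
    best, best_score = None, -float("inf")
    for x in candidates:
        k_star = [rbf(x, xi) for xi in X]
        mu = sum(k_star[i] * alpha[i] for i in range(n))
        v = solve(K, k_star)                  # K^{-1} k*
        var = max(rbf(x, x) - sum(k_star[i] * v[i] for i in range(n)), 0.0)
        score = mu + beta * math.sqrt(var)
        if score > best_score:
            best, best_score = x, score
    return best
```

Candidates far from all observations carry high posterior variance and can win on the exploration term even when their predicted mean is modest.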
Aken teaches and further wherein, assuming that there exists tuples of systems and working conditions (s_i, w_i) which should be configured in a similar way, and a group of said tuples t_k = ((s_i, w_i), (s_j, w_j), ...) is called archetypal task t_k,
said characterization module identifies said archetypal tasks in the knowledge base (KB), said archetypal tasks are identified by assigning each tuple of the knowledge base (KB) to an initial task and then iteratively measuring distances between tasks and clustering similar tasks based on the measured distances, so that tuples which should be tuned similarly are assigned to the same task,
(Aken, Fig. 3 and 4, Section 4.2, “We use two well-studied techniques for this pruning. The first is a dimensionality reduction technique, called factor analysis(FA) [5], that transforms the (potentially) high dimensional DBMS metric data into lower dimensional data. We then use the second technique, called k-means [6], to cluster this lower dimensional data into meaningful groups. Using a dimensionality reduction technique is a preprocessing step for many clustering algorithms because they reduce the amount of “noise” in the data [31, 30]. This improves the robustness and the quality of the cluster analysis.
Given a set of real-valued variables that contain arbitrary correlations, FA reduces these variables to a smaller set of factors that capture the correlation patterns of the original variables. Each factor is a linear combination of the original variables; the factor coefficients are similar to and can be interpreted in the same way as the coefficients in a linear regression. Furthermore, each factor has unit variance and is uncorrelated with all other factors. This means that one can order the factors by how much of the variability in the original data they explain. We found that only the initial factors are significant for our DBMS metric data, which means that most of the variability is captured by the first few factors.
The FA algorithm takes as input a matrix X whose rows correspond to metrics and whose columns correspond to knob configurations that we have tried. The entry Xij is the value of metric i on configuration j. FA gives us a smaller matrix U: the rows of U correspond to metrics, while the columns correspond to factors, and the entry Uij is the coefficient of metric i in factor j. We can scatter-plot the metrics using elements of the ith row of U as coordinates for metric i. Metrics i and j will be close together if they have similar coefficients in U — that is, if they tend to correlate strongly in X. Removing redundant metrics now means removing metrics that are too close to one another in our scatter-plot. We then cluster the metrics via k-means, using each metric’s row of U as its coordinates [assuming that there exists tuples of systems and working conditions (s_i,w_i) which should be configured in a similar way, and a group of said tuples t_k=((s_i,w_i),(s_j,w_j),...) is called archetypal task t_k,
said characterization module identifies said archetypal tasks in the knowledge base (KB), said archetypal tasks are identified by assigning each tuple of the knowledge base (KB) to an initial task and then iteratively measuring distances between tasks and clustering similar tasks based on the measured distances; k-means uses Euclidean distance, so that tuples which should be tuned similarly are assigned to the same task]. We keep a single metric for each cluster, namely, the one closest to the cluster center. One of the drawbacks of using k-means is that it requires the optimal number of clusters (K) as its input. We use a simple heuristic [40] to fully automate this selection process and approximate K. Although this approach is not guaranteed to find the optimal solution, it does not require a human to manually interpret a graphical representation of the problem to determine the optimal number of clusters. We compared this heuristic with other techniques [55, 48] for choosing K and found that they select values that differ by one to two clusters at most from our approximations. Such variations made little difference in the quality of configurations that OtterTune generated in our experimental evaluation in Sect. 7.”)
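The k-means pruning step Aken describes (cluster the metrics, then keep only the metric closest to each cluster center) can be sketched as follows. The FA dimensionality-reduction step is omitted and the input points are assumed to already be low-dimensional coordinates; all names are the editor's own:

```python
import math

def dist(u, v):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroid(cluster):
    """Mean point of a non-empty cluster."""
    n = len(cluster)
    return tuple(sum(p[d] for p in cluster) / n for d in range(len(cluster[0])))

def kmeans(points, k, iters=25):
    """Plain k-means with a deterministic init (first k points as centers)."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[nearest].append(p)
        centers = [centroid(cl) if cl else centers[i]
                   for i, cl in enumerate(clusters)]
    return centers, clusters

def prune(points, k):
    """Keep one representative per cluster: the point closest to its center."""
    centers, clusters = kmeans(points, k)
    return [min(cl, key=lambda p: dist(p, centers[i]))
            for i, cl in enumerate(clusters) if cl]
```

As in Aken, the choice of K is left to the caller; a heuristic or elbow-style criterion could supply it automatically.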
In regards to claim 2,
Dirac and Aken teach The computer implemented tuning method as in claim 1,
Aken teaches wherein the external working condition w_i is characterized by an external characterization methodology, and this characterization is used as an input for a clustering algorithm to derive said initial tasks instead of the user-specified parameter N.
(Aken, Fig. 3, “We use two well-studied techniques for this pruning. The first is a dimensionality reduction technique, called factor analysis(FA) [5], that transforms the (potentially) high dimensional DBMS metric data into lower dimensional data. We then use the second technique, called k-means [6], to cluster this lower dimensional data into meaningful groups. Using a dimensionality reduction technique is a preprocessing step for many clustering algorithms because they reduce the amount of “noise” in the data [31, 30]. This improves the robustness and the quality of the cluster analysis.
Given a set of real-valued variables that contain arbitrary correlations, FA reduces these variables to a smaller set of factors that capture the correlation patterns of the original variables. Each factor is a linear combination of the original variables; the factor coefficients are similar to and can be interpreted in the same way as the coefficients in a linear regression. Furthermore, each factor has unit variance and is uncorrelated with all other factors. This means that one can order the factors by how much of the variability in the original data they explain. We found that only the initial factors are significant for our DBMS metric data, which means that most of the variability is captured by the first few factors.
The FA algorithm takes as input a matrix X whose rows correspond to metrics and whose columns correspond to knob configurations that we have tried. The entry Xij is the value of metric i on configuration j. FA gives us a smaller matrix U: the rows of U correspond to metrics, while the columns correspond to factors, and the entry Uij is the coefficient of metric i in factor j. We can scatter-plot the metrics using elements of the ith row of U as coordinates for metric i. Metrics i and j will be close together if they have similar coefficients in U — that is, if they tend to correlate strongly in X. Removing redundant metrics now means removing metrics that are too close to one another in our scatter-plot. We then cluster the metrics via k-means, using each metric’s row of U as its coordinates. We keep a single metric for each cluster, namely, the one closest to the cluster center. One of the drawbacks of using k-means is that it requires the optimal number of clusters (K) as its input. We use a simple heuristic [40] to fully automate this selection process and approximate K. Although this approach is not guaranteed to find the optimal solution, it does not require a human to manually interpret a graphical representation of the problem to determine the optimal number of clusters. We compared this heuristic with other techniques [55, 48] for choosing K [wherein the external working condition w_i is characterized by an external characterization methodology, and this characterization is used as an input for a clustering algorithm to derive said initial tasks instead of the user-specified parameter N] and found that they select values that differ by one to two clusters at most from our approximations. Such variations made little difference in the quality of configurations that OtterTune generated in our experimental evaluation in Sect. 7.”)
Dirac and Aken are both considered to be analogous to the claimed invention because they are in the same field of automated system tuning using machine learning. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dirac to incorporate the teachings of Aken in order to provide an automated approach that leverages supervised and unsupervised machine learning methods to enable transfer of experience and recommend configurations that perform at the same level as, or better than, state-of-the-art solutions. (Aken, Abstract, “Database management system (DBMS) configuration tuning is an essential aspect of any data-intensive application effort. But this is historically a difficult task because DBMSs have hundreds of configuration “knobs” that control everything in the system, such as the amount of memory to use for caches and how often data is written to storage. The problem with these knobs is that they are not standardized (i.e., two DBMSs use a different name for the same knob), not independent (i.e., changing one knob can impact others), and not universal (i.e., what works for one application may be sub-optimal for another). Worse, information about the effects of the knobs typically comes only from (expensive) experience. To overcome these challenges, we present an automated approach that leverages past experience and collects new information to tune DBMS configurations: we use a combination of supervised and unsupervised machine learning methods to (1) select the most impactful knobs, (2) map unseen database workloads to previous workloads from which we can transfer experience, and (3) recommend knob settings. We implemented our techniques in a new tool called OtterTune and tested it on three DBMSs. Our evaluation shows that OtterTune recommends configurations that are as good as or better than ones generated by existing tools or a human expert.”)
In regards to claim 9,
Dirac and Aken teach The computer implemented tuning method as in claim 1,
Dirac teaches wherein it is provided a preliminary step when said knowledge base (KB) is empty, including preliminary tuning iterations dedicated to bootstrap said knowledge base.
(Dirac, Col. 13 lines 34-37, “For one or more initial observation collection intervals, random selection among the permissible values may be used to select tunable parameter settings [preliminary step when said knowledge base (KB) is empty, including preliminary tuning iterations dedicated to bootstrap said knowledge base; wherein random selection of values is used to bootstrap the KB wherein the KB is analogous to the data repository of Aken]; later during the optimization process, the results collected using the randomly-selected values may help to guide the optimization tools towards appropriate values for the parameter settings;”)
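The bootstrap Dirac describes (random selection among permissible values for the initial observation intervals) can be sketched as follows; the parameter names and ranges are purely illustrative assumptions, not drawn from Dirac:

```python
import random

def bootstrap_configs(param_ranges, n, seed=0):
    """Sample n random configurations from per-parameter (low, high) ranges
    to seed an empty knowledge base with preliminary tuning iterations."""
    rng = random.Random(seed)
    return [{name: rng.uniform(lo, hi)
             for name, (lo, hi) in param_ranges.items()}
            for _ in range(n)]
```

Each sampled configuration would be applied and measured once, and the resulting tuples stored as the first entries of the knowledge base.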
In regards to claim 10,
Dirac and Aken teach The computer implemented tuning method as in claim 9,
Dirac teaches wherein said preliminary tuning iterations include evaluating vendor default (baseline) configuration.
(Dirac, Col. 4 line 49 – Col. 5 line 6, “A number of iterations of executions of the Bayesian model using GP priors may be performed in various embodiments, interleaved with iterations of execution results collection from the execution environment. The model input parameter value settings used for a given iteration may be based at least in part on the execution results (and the corresponding objective function value) obtained in the most recent collection of execution results from the execution environment [evaluating vendor default (baseline) configuration; wherein the most recent collection is the baseline configuration], and the output model parameter values of the given iteration may be used to set the tunable parameter values for the next observation collection interval. In at least some embodiments, the optimization tool may determine the boundaries (e.g., start and end times) of the observation collection intervals as discussed below. The iterations of model execution and observation collection may be repeated until either the targeted extremum of the objective function has been attained (at least to within some tolerance limit), or until the resources available for the optimization task have been exhausted. The combination of tunable parameter settings that correspond to the attainment of the optimization goal (or the combination of tunable parameter settings that came closest to the optimization goal) may be stored in various embodiments, e.g., in a persistent storage repository accessible to the optimization tool and/or in a knowledge base.”)
Claim 11 is rejected under 35 U.S.C. 103 on the same rationale as claim 1, as the claims are substantially similar.
In regards to claim 12,
Dirac and Aken teach The computer implemented tuning method as in claim 10,
Dirac teaches wherein said preliminary tuning iterations further include evaluating a number of user-defined randomly-selected configurations.
(Dirac, Col. 3 lines 13-24, “Regardless of the manner in which the optimization tool is presented to clients, in various embodiments the clients may not be required to be experts in, or even be particularly familiar with, the statistics or mathematics involved in generating and executing the models. Instead, at least in some embodiments in which the client is aware that an optimization task is going to be performed, the client may simply specify or indicate various properties or characteristics of the software execution environment optimization target, the tunable parameters of interest [a number of user-defined randomly-selected configurations], and/or constraints (if any) to be placed on the resources used for the optimization task itself.”)
Claim 3 is rejected under 35 U.S.C. 103 as being unpatentable over Dirac in view of Aken in further view of Yehuda Koren “Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model.” (“Koren”).
In regards to claim 3,
Dirac and Aken teach The computer implemented tuning method as in claim 1,
Koren teaches wherein the distance between two tasks t, t′ used for the task clustering procedure, is computed as:
[Image: media_image4.png, Greyscale (claimed distance formula between tasks t and t′)]
(Koren, Section 2.4, “First, quality of the results is usually measured by their root mean squared error (RMSE) [the distance between two tasks t, t′ used for the task clustering procedure, is computed as]:
[Image: media_image5.png, Greyscale (Koren's RMSE formula: RMSE = √( Σ_{(u,i)∈TestSet} (r̂_ui − r_ui)² / |TestSet| ))]
The two sets contain many more ratings by users that do not rate much and are harder to predict. In a way, they represent real requirements from a CF system, which needs to predict new ratings from older ones, and to equally address all users, not only the heavy raters.”; wherein items are equivalent to optimizations, i.e., configurations, and users are equivalent to programs, such that a program rates an optimization, i.e., a configuration, to exploit previous ratings, i.e., performance)
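Koren's RMSE quality measure is a standard root-mean-squared-error computation over predicted versus observed ratings; a minimal sketch (names are illustrative):

```python
import math

def rmse(predicted, actual):
    """Root mean squared error over paired predicted/actual ratings."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```

The same computation, applied to per-configuration relevance scores of two tasks instead of ratings, yields an RMSE-style distance between tasks.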
However, Koren does not explicitly teach where {x_i}_{i=1}^N is the set of configurations that have been evaluated on both tasks t and t′, and r_t(x_i) is a relevance score being a scalar value indicating how much task t benefits from the configuration x_i.
Aken teaches where {x_i}_{i=1}^N is the set of configurations that have been evaluated on both tasks t and t′,
(Aken, Section 6.1, “For each DBMS version, we build a set S of N matrices — one for every non-redundant metric — from the data in our repository. Similar to the Lasso and FA models, these matrices are constructed by background processes running on OtterTune’s tuning manager (see Sect. 3). The matrices in S (i.e., X0, X1, … XN-1) have identical row and column labels. Each row in matrix Xm corresponds to a workload in our repository and each column corresponds to a DBMS configuration from the set of all unique DBMS configurations that have been used to run any of the workloads [where {x_i}_{i=1}^N is the set of configurations that have been evaluated on both tasks t and t′].”)
Aken teaches and r_t(x_i) is a relevance score being a scalar value indicating how much task t benefits from the configuration x_i.
(Aken, Section 6.1, “The entry Xm,i,j is the value of metric m observed when executing workload i with configuration j.”)
Koren is considered to be analogous to the claimed invention because it is in the same field of recommender systems. Therefore, it would have been obvious to someone of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dirac in view of Aken to incorporate the teachings of Koren in order to provide a combined model (recommender system) that has improved prediction accuracy (Koren, Section 1., “In this work we suggest a combined model that improves prediction accuracy by capitalizing on the advantages of both neighborhood and latent factor approaches. To our best knowledge, this is the first time that a single model has integrated the two approaches. In fact, some past works (e.g., [2, 4]) recognized the utility of combining those approaches. However, they suggested post-processing the factorization results, rather than a unified model where neighborhood and factor information are considered symmetrically.”)
Claim(s) 6-8 are rejected under 35 U.S.C. 103 as being unpatentable over Dirac in view of Aken, and further in view of Andreas Krause et al., "Contextual Gaussian process bandit optimization" (“Krause”).
In regard to claim 6,
Dirac and Aken teach The computer implemented tuning method as in claim 1,
Krause teaches wherein said Gaussian Process (GP) is a contextual gaussian process bandit tuner (CGPTuner) which uses as context said characterization vector {right arrow over (cl)} provided by said computed similarity.
(Krause, Section 1, “In summary, as our main contributions we
• develop an efficient algorithm, CGP-UCB, for the contextual GP bandit problem; [contextual gaussian process bandit tuner (CGPTuner)]”)
(Krause, Section 5, “By choosing different kernel functions k : X×X → R, the CGP-UCB algorithm can be applied to a variety of different applications. A natural approach is to start with kernel functions kZ : Z×Z → R and kS : S × S → R on the space of contexts and actions, and use them to derive the kernel on the product space.”)
(Krause, Section 5.1, “One possibility is to consider a product kernel k = kS ⊗ kZ, by setting (kS ⊗ kZ)((s, z), (s′, z′)) = kZ(z, z′)kS(s, s′). The intuition behind this product kernel is a conjunction of the notions of similarities induced by the kernels over context and action spaces: Two context-action pairs are similar (large correlation) if the contexts are similar and actions are similar (Figure 1(a)). [uses as context said characterization vector {right arrow over (cl)} provided by said computed similarity]”)
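Krause's product-kernel construction can be sketched in a few lines, assuming simple squared-exponential kernels on both the context and action spaces (an illustration only, not Krause's implementation):

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel between two vectors."""
    d = np.asarray(a, float) - np.asarray(b, float)
    return float(np.exp(-0.5 * np.dot(d, d) / ell**2))

def product_kernel(s, z, s2, z2):
    """(kS ⊗ kZ)((s, z), (s′, z′)) = kZ(z, z′) * kS(s, s′)."""
    return rbf(z, z2) * rbf(s, s2)

# Two context-action pairs are strongly correlated only when both
# the contexts and the actions are similar.
same = product_kernel([0.0], [1.0], [0.0], [1.0])     # identical pair -> 1.0
far_ctx = product_kernel([0.0], [1.0], [0.0], [5.0])  # contexts differ -> near 0
assert same == 1.0 and far_ctx < 0.01
```

With a product kernel, a single dissimilar factor drives the whole correlation toward zero, which is the conjunction-of-similarities intuition quoted above.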
Krause is considered to be analogous to the claimed invention because they are in the same field of using recommender systems for optimizing systems. Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified Dirac in view of Aken to incorporate the teachings of Krause in order to provide an approach that guides the tradeoff between exploration and exploitation and estimates predictive uncertainty (Krause, Section 1, “Consider the problem of learning to optimize a complex system subject to varying environmental conditions. Or learning to retrieve relevant documents (ads), given context about the user. Or learning to solve a sequence of related optimization and search tasks, by taking into account experience with tasks solved previously. All these problems can be phrased as a contextual bandit problem (c.f., [1, 2], we review related work in Section 7), where in each round, we receive context (about the experimental conditions, the query, or the task), and have to choose an action (system parameters, document to retrieve). We then receive noisy feedback about the obtained payoff. The key challenge is to trade off exploration by gathering data for estimating the mean payoff function over the context-action space, and to exploit by choosing an action deemed optimal based on the gathered data. Without making any assumptions about the class of payoff functions under consideration, we cannot expect to do well. A natural approach is to choose a regularizer, encoding assumptions about smoothness of the payoff function. In this paper, we take a nonparametric approach, and model the payoff function as a sample from a Gaussian process defined over the joint context-action space (or having low norm in the associated RKHS). This approach allows us to estimate the predictive uncertainty in the payoff function estimated from previous experiments, guiding the tradeoff between exploration and exploitation.”)
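The exploration/exploitation tradeoff Krause describes is resolved by an upper-confidence-bound selection rule; the following is a minimal sketch of such a step, with hypothetical posterior values (not Krause's code):

```python
import numpy as np

def cgp_ucb_select(mu, sigma, beta):
    """Pick the action maximizing the upper confidence bound
    mu(a) + sqrt(beta) * sigma(a), given the GP posterior for the
    current context. High sigma favors exploration; high mu, exploitation."""
    ucb = np.asarray(mu, float) + np.sqrt(beta) * np.asarray(sigma, float)
    return int(np.argmax(ucb))

# Hypothetical posterior over 3 candidate actions for one context:
mu = [0.5, 0.9, 0.2]      # estimated payoffs
sigma = [0.8, 0.1, 0.05]  # predictive uncertainty
assert cgp_ucb_select(mu, sigma, beta=0.0) == 1  # pure exploitation
assert cgp_ucb_select(mu, sigma, beta=4.0) == 0  # uncertainty dominates
```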
In regard to claim 7,
Dirac, Aken, and Krause teach The computer implemented tuning method as in claim 6,
Krause teaches wherein Gaussian Process (GP) is provided with a combined kernel function [image: media_image6.png], which is the sum of configuration kernel [image: media_image7.png] defined over a configuration space (X) and a task kernel [image: media_image8.png] defined over a task characterization space (C).
(Krause, Section 5, “By choosing different kernel functions k : X×X → R, the CGP-UCB algorithm can be applied to a variety of different applications. A natural approach is to start with kernel functions kZ : Z×Z → R and kS : S × S → R on the space of contexts and actions, and use them to derive the kernel on the product space [the sum of configuration kernel [image: media_image7.png] defined over a configuration space (X) and a task kernel [image: media_image8.png] defined over a task characterization space (C)].”)
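The claimed sum of a configuration kernel and a task kernel corresponds to Krause's additive combination of kernels; a minimal sketch under the assumption of squared-exponential components (illustrative only):

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential kernel between two vectors."""
    d = np.asarray(a, float) - np.asarray(b, float)
    return float(np.exp(-0.5 * np.dot(d, d) / ell**2))

def additive_kernel(x, c, x2, c2):
    """k((x, c), (x′, c′)) = k_X(x, x′) + k_C(c, c′): a valid
    (positive semi-definite) kernel, since a sum of kernels is a kernel."""
    return rbf(x, x2) + rbf(c, c2)

# A pair retains correlation through the task kernel alone even
# when the configurations are far apart.
k = additive_kernel([0.0], [1.0], [3.0], [1.0])
assert 1.0 < k < 1.1  # config term ~0.011, task term = 1
```

Unlike a product kernel, the additive form preserves correlation through the task component even when the configurations differ substantially.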
In regard to claim 8,
Dirac, Aken, and Krause teach The computer implemented tuning method as in claim 7,
Krause teaches wherein said Gaussian Process (GP) has an additive structure made of a characterization component [image: media_image9.png] which models overall trends among tasks and a configuration component [image: media_image10.png] which models configuration-specific deviation from said overall trends.
(Krause, Section 5.1, “An alternative is to consider the additive combination [additive structure] (kS ⊕ kZ)((s, z), (s′, z′)) = kZ(z, z′) + kS(s, s′) [made of a characterization component [image: media_image9.png]] which is positive definite as well. The intuition behind this construction is that a GP with additive kernel can be understood as a generative model, which first samples a function fS(s, z) that is constant along z, and varies along s with regularity as expressed by kS; it then samples a function fZ(s, z), which varies along z and is constant along s; then f = fS + fZ. Thus, the fZ component models overall trends according to the context (e.g., encoding assumptions about similarity within clusters of contexts) [which models overall trends among tasks], and the fS models action-specific deviation from this trend (Figure 1(b)). [a configuration component [image: media_image10.png] which models configuration-specific deviation from said overall trends]”)
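Krause's generative reading of the additive kernel, f = fS + fZ, can be demonstrated with toy values (random numbers standing in for the sampled component functions, not a fitted GP; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n_contexts, n_actions = 4, 5

# f_Z varies only with the context (overall trend among tasks);
# f_S varies only with the action (configuration-specific deviation).
f_Z = rng.standard_normal(n_contexts)
f_S = rng.standard_normal(n_actions)

# Full payoff surface over context-action pairs: f = f_Z + f_S.
f = f_Z[:, None] + f_S[None, :]

# Along any fixed context (row), f is exactly f_S shifted by a constant:
# the context only raises or lowers the whole row.
assert np.allclose(f - f_S[None, :], f_Z[:, None])
```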
Allowable Subject Matter
Claim 4 was searched but not rejected under 35 U.S.C. 103.
Claim 4 is allowed.
The following is an examiner’s statement of reasons for allowance:
The references of record alone or in combination do not disclose or suggest the limitations of claim 4 as a whole with regard to the technical features recited by the claim limitations directed to: “wherein said relevance score r_t(x_i) is defined as:
[image: media_image11.png]
where J is the set of entries in the knowledge base (KB) assigned to task t for which x_i has been evaluated, ƒ_j(x_i) is the performance score y_j obtained at iteration j when the system is configured using configuration x_i, and ƒ_j(x_0) is the performance score obtained at iteration j by the baseline configuration.”
In summary, the references made of record fail to disclose the required claimed technical features recited by the noted claim limitations as a whole; the related dependent claims are found allowable for the reasons noted above.
Conclusion
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
US20100082599A1: HP teaches Characterizing Queries To Predict Execution In A Database
US11061902B2: Oracle teaches Automated configuration parameter tuning for database performance
Any inquiry concerning this communication or earlier communications from the examiner should be directed to JASMINE THAI whose telephone number is (703)756-5904. The examiner can normally be reached M-F 8-4.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael Huntley can be reached at (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/J.T.T./Examiner, Art Unit 2129
/MICHAEL J HUNTLEY/Supervisory Patent Examiner, Art Unit 2129