DETAILED ACTION
Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA. This action is in response to an amendment filed on November 25, 2025. Claims 1-9 are pending in the current application. The arguments in response to the rejection under 35 U.S.C. 101 were persuasive as to the previous grounds. As such, a modified rejection under 35 U.S.C. 101 is presented in this action, constituting a new ground of rejection, and accordingly this action is made Non-Final.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claim(s) 1-9 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, claim 1 is directed towards a method, which falls within one of the four statutory categories.
Next, under a Step 2A Prong 1 analysis, the claim recites “extracting… a series of cluster features from the set of clusters”, “performing… pairwise cross-correlation… resulting in potential candidates for an optimal hyperparameter value”, “aggregating… maximum or minimum values for the hyperparameter value at their respective indices”, and “selecting… an optimum value for the hyperparameter value.” As drafted, these are processes that, under the broadest reasonable interpretation, fall under the mental processes grouping of abstract ideas.
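For illustration only, the recited combination of steps can be sketched in Python; the choice of K-means, the particular cluster features, and the majority-vote aggregation below are hypothetical assumptions made for illustration and are not asserted to be the claimed implementation.

    # Hypothetical sketch of the recited steps; the estimator, the cluster
    # features, and the aggregation rule are illustrative assumptions only.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score, davies_bouldin_score

    def tune_hyperparameter(data, candidate_values):
        # "executing... the clustering algorithm resulting in a set of
        # clusters for each possible hyperparameter value"
        series = []
        for k in candidate_values:
            labels = KMeans(n_clusters=k, n_init=10).fit_predict(data)
            # "extracting... a series of cluster features from the set of clusters"
            series.append([silhouette_score(data, labels),
                           davies_bouldin_score(data, labels)])
        features = np.asarray(series)  # one row per hyperparameter value

        # "performing... pairwise cross-correlation" on the feature series
        candidates = []
        for i in range(features.shape[1]):
            for j in range(i + 1, features.shape[1]):
                corr = np.correlate(features[:, i], features[:, j], mode="same")
                # "aggregating maximum or minimum values... at their
                # respective indices"
                candidates.append(int(np.argmax(corr)))

        # "selecting... an optimum value" -- assumed here to be the most
        # frequently indicated candidate index
        best = max(set(candidates), key=candidates.count)
        return candidate_values[best]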
Therefore, the claim must be examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
“receiving… data to be used by a clustering algorithm;”
“receiving… a selection of a hyperparameter to tune a hyperparameter value”
a hyperparameter value optimization computer program
an electronic device
“executing… the clustering algorithm resulting in a set of clusters for each possible hyperparameter value”
“outputting, by the hyperparameter value optimization computer program, the optimum value for the hyperparameter value to the clustering algorithm”
and “consuming, by the clustering algorithm, the hyperparameter value.”
The “receiving… data to be used by a clustering algorithm”, “receiving… a selection of a hyperparameter to tune a hyperparameter value”, “outputting, by the hyperparameter value optimization computer program, the optimum value for the hyperparameter value to the clustering algorithm”, and “consuming, by the clustering algorithm, the hyperparameter value” steps are merely insignificant extra-solution activity. (See MPEP 2106.05(g)) The “executing… the clustering algorithm resulting in a set of clusters for each possible hyperparameter value” step, the hyperparameter value optimization computer program, and the electronic device are interpreted to be mere instructions to apply a judicial exception, as the claim instructs to execute the clustering algorithm and to use the hyperparameter value optimization computer program and an electronic device to perform the abstract ideas. (See MPEP 2106.05(f)) Therefore, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
Under a Step 2B analysis, the claim’s additional elements do not amount to significantly more than the judicial exception, as explained above in Step 2A Prong 2. Additionally, “receiving… data to be used by a clustering algorithm”, “receiving… a selection of a hyperparameter to tune a hyperparameter value”, and “outputting, by the hyperparameter value optimization computer program, the optimum value for the hyperparameter value to the clustering algorithm” are considered well-understood, routine, and conventional, as they amount to simply receiving or transmitting data over a network, (See MPEP 2106.05(d)(II)(i)) and “consuming, by the clustering algorithm, the hyperparameter value” is considered well-understood, routine, and conventional, as disclosed by scikit-learn. (The parameters in the clustering methods correspond to hyperparameter values that the clustering methods use. That scikit-learn is a well-known library for the Python programming language shows that consuming a hyperparameter value by the clustering algorithm is well-understood, routine, and conventional.) Therefore, the claim is ineligible.
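For context, the sketch below illustrates how scikit-learn clustering estimators consume hyperparameter values as constructor parameters; the particular values shown are arbitrary examples.

    # scikit-learn clustering estimators consume hyperparameter values as
    # constructor parameters; the values shown are arbitrary examples.
    from sklearn.cluster import KMeans, DBSCAN
    from sklearn.datasets import make_blobs

    X, _ = make_blobs(n_samples=200, centers=3, random_state=0)

    # K-means consumes a "number of clusters" hyperparameter value.
    kmeans_labels = KMeans(n_clusters=3, n_init=10).fit_predict(X)

    # DBSCAN consumes "eps" and "min_samples" hyperparameter values.
    dbscan_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)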
Regarding claim 4, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, claim 4 is directed towards a non-transitory computer readable storage medium, which is considered a manufacture and thus falls within one of the four statutory categories.
Next, under a Step 2A Prong 1 analysis, the claim recites “extracting a series of cluster features from the set of clusters”, “performing pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value”, “aggregating maximum or minimum values for the hyperparameter value at their respective indices”, and “selecting an optimum value for the hyperparameter value.” As drafted, these are processes that, under the broadest reasonable interpretation, fall under the mental processes grouping of abstract ideas.
Therefore, the claim must be examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
a non-transitory computer readable storage medium
“receiving a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm”
one or more computer processors
“for each possible hyperparameter value, executing… the clustering algorithm resulting in a set of clusters for each possible hyperparameter value”
“outputting the optimum value for the hyperparameter value to the clustering algorithm”
and “the clustering algorithm is configured to consume the hyperparameter value.”
The “receiving a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm”, “outputting the optimum value for the hyperparameter value to the clustering algorithm”, and “the clustering algorithm is configured to consume the hyperparameter value” elements are merely insignificant extra-solution activity. (See MPEP 2106.05(g)) The “executing the clustering algorithm resulting in a set of clusters for each possible hyperparameter value” step, the non-transitory computer readable storage medium, and the one or more computer processors are interpreted to be mere instructions to apply a judicial exception, as the claim instructs to execute the clustering algorithm to get a set of clusters for each possible hyperparameter value, and to use the non-transitory computer readable storage medium and the one or more computer processors as tools to perform the abstract ideas. (See MPEP 2106.05(f)) Therefore, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
Under a Step 2B analysis, the claim’s additional elements do not amount to significantly more than the judicial exception, as explained above in Step 2A Prong 2. Additionally, “receiving a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm” and “outputting the optimum value for the hyperparameter value to the clustering algorithm” are considered well-understood, routine, and conventional, as they amount to simply receiving or transmitting data over a network, (See MPEP 2106.05(d)(II)(i)) and “the clustering algorithm is configured to consume the hyperparameter value” is considered well-understood, routine, and conventional, as disclosed by scikit-learn. (The parameters in the clustering methods correspond to hyperparameter values that the clustering methods use. That scikit-learn is a well-known library for the Python programming language shows that consuming a hyperparameter value by the clustering algorithm is well-understood, routine, and conventional.) Therefore, the claim is ineligible.
Regarding claim 7, under Step 1 of the Subject Matter Eligibility Test for Products and Processes, claim 7 is directed towards an electronic device, which is considered a machine and thus falls within one of the four statutory categories.
Next, under a Step 2A Prong 1 analysis, the claim recites “extracting a series of cluster features from the set of clusters”, “performs pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value”, “aggregates maximum or minimum values for the hyperparameter value at their respective indices”, and “selects an optimum value for the hyperparameter value.” As drafted, these are processes that, under the broadest reasonable interpretation, fall under the mental processes grouping of abstract ideas.
Therefore, the claim must be examined under Step 2A Prong 2, which considers the additional elements within the claim. The claim’s additional elements are:
“receives a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm”
a computer processor
a memory storing a hyperparameter value optimization computer program
“executes the clustering algorithm resulting in a set of clusters for each possible hyperparameter value”
“outputs the optimum value for the hyperparameter value to the clustering algorithm”
and “the clustering algorithm is configured to consume the hyperparameter value.”
The “receives a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm”, “outputs the optimum value for the hyperparameter value to the clustering algorithm”, and “the clustering algorithm is configured to consume the hyperparameter value” elements are merely insignificant extra-solution activity. (See MPEP 2106.05(g)) The “executes the clustering algorithm resulting in a set of clusters for each possible hyperparameter value” element, the computer processor, and the memory storing a hyperparameter value optimization computer program are interpreted to be mere instructions to apply a judicial exception, as the claim instructs to execute the clustering algorithm to get a set of clusters for each possible hyperparameter value, and to use a computer processor and a memory containing a hyperparameter value optimization computer program as tools to perform the abstract ideas. (See MPEP 2106.05(f)) Therefore, these additional elements do not integrate the abstract idea into a practical application. The claim is directed to an abstract idea.
Under a Step 2B analysis, the claim’s additional elements do not amount to significantly more than the judicial exception, as explained above in Step 2A Prong 2. Additionally, “receives a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm” and “outputs the optimum value for the hyperparameter value to the clustering algorithm” are considered well-understood, routine, and conventional, as they amount to simply receiving or transmitting data over a network, (See MPEP 2106.05(d)(II)(i)) and “the clustering algorithm is configured to consume the hyperparameter value” is considered well-understood, routine, and conventional, as disclosed by scikit-learn. (The parameters in the clustering methods correspond to hyperparameter values that the clustering methods use. That scikit-learn is a well-known library for the Python programming language shows that consuming a hyperparameter value by the clustering algorithm is well-understood, routine, and conventional.) Therefore, the claim is ineligible.
Regarding claims 2, 5, and 8, “the clustering algorithm is selected from the group consisting of K-means clustering and DBScan” merely indicates the field of use or technological environment in which to apply the abstract idea, i.e., using K-means clustering and DBScan to help perform the abstract idea. (See MPEP 2106.05(h)) As such, these elements do not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself. Therefore, the claims are not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claims 1, 4, and 7.
Regarding claims 3, 6, and 9, “a first order difference in size, a normalized entropy, a Davies-Bouldin score, a Calinski-Harabasz index, and a silhouette coefficient” merely indicates the field of use or technological environment in which to apply the abstract idea, i.e., including a first order difference in size, a normalized entropy, a Davies-Bouldin score, a Calinski-Harabasz index, and a silhouette coefficient as cluster features that help perform the abstract idea. (See MPEP 2106.05(h)) As such, these elements do not integrate the abstract idea into a practical application nor provide significantly more than the abstract idea itself. Therefore, the claims are not eligible under 35 U.S.C. 101 for the same reasons as set forth in the rejection of claims 1, 4, and 7.
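For context, the recited cluster features correspond to standard computations, sketched below; the definitions assumed for “first order difference in size” (differences between sorted cluster sizes) and “normalized entropy” (cluster-size entropy divided by the log of the cluster count) are illustrative assumptions rather than the claimed definitions.

    # Illustrative computation of the recited cluster features; the
    # definitions of the first two features are assumptions only.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import (davies_bouldin_score,
                                 calinski_harabasz_score,
                                 silhouette_score)

    X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
    labels = KMeans(n_clusters=4, n_init=10).fit_predict(X)

    sizes = np.bincount(labels)
    first_order_diff = np.diff(np.sort(sizes))                   # assumed definition
    p = sizes / sizes.sum()
    norm_entropy = -(p * np.log(p)).sum() / np.log(len(sizes))   # assumed definition

    cluster_features = {
        "first_order_difference_in_size": first_order_diff,
        "normalized_entropy": norm_entropy,
        "davies_bouldin": davies_bouldin_score(X, labels),
        "calinski_harabasz": calinski_harabasz_score(X, labels),
        "silhouette": silhouette_score(X, labels),
    }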
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis (i.e., changing from AIA to pre-AIA) for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1, 2, 4, 5, 7, and 8 are rejected under 35 U.S.C. 103 as being unpatentable over Radwa ElShawi et al. (herein referred to as ElShawi) (A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Clustering) in view of Uttam Thakore (herein referred to as Thakore) (IMPROVING RELIABILITY AND SECURITY MONITORING IN ENTERPRISE AND CLOUD SYSTEMS BY LEVERAGING INFORMATION REDUNDANCY).
Regarding claim 1, ElShawi teaches a method for auto-thresholding for hyperparameter value selection, comprising: receiving data to be used by a clustering algorithm (“…for a large number of datasets, we collect both performance data and a set of meta-features, i.e., characteristics of the dataset that can be computed efficiently and that help determining which algorithm and evaluation metric to use on a new dataset”, pg. 3, under “A. Meta-Feature Extraction”) by a hyperparameter value optimization computer program executed by an electronic device (Fig. 1 cSmartML: Framework Architecture, pg. 2 (See Fig. 1 below)) (cSmartML is a framework built off of scikit-learn (Abstract), which itself is a library found on Python, which is a programming language used to write a program. cSmartML, under BRI, is a program designed for the purposes of hyperparameter value optimization and implicitly requires an electronic device capable of doing so.) receiving, by the hyperparameter value optimization computer program, a selection of a hyperparameter to tune a hyperparameter value (“Given a clustering algorithm along with combined internal indices obtained from the meta-learning recommendation component, this phase aims to find the best set of hyperparameters for the recommended algorithm.”, pg. 3, under “C. Hyper-parameters Optimization”; See also Table III on pg. 4, as well as the section on pg. 4 titled “Defining Hyper-parameter Search Space”) (The quotation is also shown in Fig. 1, (see below) under hyperparameter optimization, where the cluster method, along with the meta-learning recommendation from the previous step, are used for hyper-partition generation. Depending on the clustering algorithm, (such as DBScan) main hyperparameters and conditional hyperparameters are selected.) for each possible hyperparameter value, executing, by the hyperparameter value optimization computer program, the clustering algorithm resulting in a set of clusters for each possible hyperparameter value (“In order to efficiently explore the space of possible clustering solutions for the recommended clustering algorithm, we should be able to enumerate the sets of hyper-parameters which describe the recommended algorithm... Hyper-parameters may be categorical such as the metric used to compute the linkage in agglomerative clustering or numerical such as the number of the clusters to find in K-means clustering”, pg. 4, under “Defining Hyper-parameter Search Space”; See also Figure 1 on pg. 2) (Clustering algorithms typically have a hyperparameter for “number of clusters” or, as cited in this reference, “n-clusters”, which denotes the number of clusters to search or find. (See Table III on pg. 4 for reference) The result of the clustering algorithm’s execution is a set of clusters (as seen in Fig. 1) for each hyperparameter value, which teaches the limitation.) aggregating maximum or minimum values for the hyperparameter value at their respective indices; selecting an optimum value for the hyperparameter value (“To automatically tune the hyper-parameters in each of the hyper-partitions, we use the MuPlusLambda evolutionary algorithm [39] implemented in the Python package DEAP [40]... In the end, the final populations from the different partitions are merged, and an optimal configuration is selected using NSGA-II”, pg. 4, right column, last paragraph; pg. 5, left column, first paragraph) (A population in an evolutionary algorithm is an aggregation of the hyperparameters. An optimal hyperparameter denotes the maximum values.)
outputting, by the hyperparameter value optimization computer program, the optimum value for the hyperparameter value to the clustering algorithm. (“More specifically, this baseline performs an exhaustive search (grid search) over a grid of hyper-parameter settings for each of the 8 clustering algorithms and then select the clustering algorithm along with the set of hyper-parameters that best optimize a randomly selected internal index” pg. 5, right column, second paragraph) (A set of hyper-parameters that best optimize a selected internal index is output from the framework.) and consuming, by the clustering algorithm, the hyperparameter value. (“cSmartML aims to help non-expert machine learning users. One of the most commonly-used approaches by non-expert users is to try all clustering algorithms with their defaults hyper-parameters and then select the clustering algorithm that best optimizes a randomly chosen internal index… as stronger baseline, we consider various hyper-parameter settings for each of the 8 clustering algorithms considered in this work. More specifically, this baseline performs an exhaustive search (grid search) over a grid of hyper-parameter settings for each of the 8 clustering algorithms and then select the clustering algorithm along with the set of hyper-parameters that best optimize a randomly selected internal index.”, pg. 5, right column, bottom paragraph) (The clustering algorithm uses the hyperparameters to best optimize a randomly selected internal index, which teaches the limitation.)
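For context, the exhaustive (grid) search described in the cited passage can be sketched as follows; the search grid and the use of the silhouette coefficient as the internal index are illustrative assumptions.

    # Sketch of an exhaustive (grid) search over a clustering hyperparameter,
    # selecting the setting that best optimizes an internal index, in the
    # spirit of the baseline ElShawi describes. The grid and the choice of
    # silhouette as the internal index are illustrative assumptions.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import silhouette_score

    X, _ = make_blobs(n_samples=300, centers=5, random_state=0)

    grid = range(2, 10)  # candidate n_clusters values
    scores = [silhouette_score(X, KMeans(n_clusters=k, n_init=10).fit_predict(X))
              for k in grid]
    best_k = list(grid)[int(np.argmax(scores))]  # optimum hyperparameter value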
However, ElShawi does not explicitly teach extracting, by the hyperparameter value optimization computer program, a series of cluster features from the set of clusters, nor performing, by the hyperparameter value optimization computer program, pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value.
Thakore teaches extracting, by the hyperparameter value optimization computer program, a series of cluster features from the set of clusters, (“Our framework performs feature extraction… To further improve scalability, we propose adding additional levels to the clustering… to cluster features within, for example, the same physical or virtual machine, the same network subnet, etc.” pg. 24, under “Procedure for automated feature extraction”; pg. 31, under “Scalability”) (The feature extraction of Thakore can be easily configured to extract cluster features within the plurality of clusters of ElShawi to teach this limitation) and performing, by the hyperparameter value optimization computer program, pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value. (“The most expensive operation in our framework is the clustering-based feature reduction, which consists of two distinct steps performed repeatedly: 1) computing pairwise cross-correlation values across all features within each feature cluster at each level of clustering, and 2) finding all maximal cliques within each cluster.”, pg. 31, under “Scalability”) (The feature reduction separates a plurality of potential candidate features, which are used for analysis, and non-candidate features which are irrelevant. The analysis, alongside clustering, (among other steps) would then help determine an optimal hyperparameter value.)
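For context, the cited pairwise cross-correlation step can be sketched generically as follows; the synthetic feature matrix, the use of Pearson correlation via np.corrcoef, and the redundancy threshold are illustrative assumptions and do not reproduce Thakore's clique-finding step.

    # Generic illustration of computing pairwise cross-correlation values
    # across features; the synthetic data and the Pearson correlation via
    # np.corrcoef are assumptions for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)
    features = rng.normal(size=(100, 6))   # 100 observations, 6 features

    # corrcoef with rowvar=False gives the 6x6 pairwise correlation matrix.
    corr = np.corrcoef(features, rowvar=False)

    # Pairs whose absolute correlation exceeds a threshold could be treated
    # as redundant (candidates for reduction), per the cited framework.
    i, j = np.triu_indices(6, k=1)
    redundant_pairs = [(a, b) for a, b in zip(i, j) if abs(corr[a, b]) > 0.9]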
Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the current application’s filing date, to combine the method of hyperparameter optimization, as disclosed by ElShawi, with the feature extraction and framework of Thakore. One would be motivated to combine the two teachings, prior to the filing date of the current application, because removing redundant features and clustering allows for a faster analysis, as disclosed by Thakore. (“Our results show…our framework… facilitates more rapid root cause analysis… by enabling the clustering and removal of redundant features… our framework dramatically reduces the number of features that analysts must sift through during analysis.”, pg. 34, final paragraph)
[Image: Fig. 1 of ElShawi (cSmartML: Framework Architecture)]
Regarding claim 4, ElShawi teaches a non-transitory computer readable storage medium, including instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps (“cSmartML: A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Clustering”, Title; Fig. 1 cSmartML: Framework Architecture (See Fig. 1 above)) (This teaches the limitation, as cSmartML, under BRI, is a program designed for the purposes of hyperparameter value optimization and implicitly requires a non-transitory computer readable storage medium to run and/or distribute the framework.) comprising: receiving a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm (“Given a clustering algorithm along with combined internal indices obtained from the meta-learning recommendation component, this phase aims to find the best set of hyperparameters for the recommended algorithm.”, pg. 3, under “C. Hyper-parameters Optimization”; See also Table III on pg. 4, as well as the section on pg. 4 titled “Defining Hyper-parameter Search Space”) (The quotation is also shown in Fig. 1, (see above) under hyperparameter optimization, where the cluster method, along with the meta-learning recommendation from the previous step, are used for hyper-partition generation. Depending on the clustering algorithm, (such as DBScan) main hyperparameters and conditional hyperparameters are selected.) for each possible hyperparameter value, executing the clustering algorithm resulting in a set of clusters for each possible hyperparameter value, (“In order to efficiently explore the space of possible clustering solutions for the recommended clustering algorithm, we should be able to enumerate the sets of hyper-parameters which describe the recommended algorithm. For some clustering algorithms, the value of a particular hyper-parameter for a given clustering algorithm affects the selection of other hyper-parameters… Some clustering algorithms require the number of clusters to be specified, we consider the number of clusters to search varies between 2 to a fifth of the data size.”, pg. 4, under “Defining Hyper-parameter Search Space”; pgs. 5-6, under “B. Experimental Results”) (Clustering algorithms typically have a hyperparameter for “number of clusters” or, as cited in this reference, “n-clusters”, which denotes the number of clusters to search. (See Table III on pg. 4 for reference) The result of the clustering algorithm’s execution is a set of clusters for each hyperparameter value, which teaches the limitation.) aggregating maximum or minimum values for the hyperparameter value at their respective indices; selecting an optimum value for the hyperparameter value (“To automatically tune the hyper-parameters in each of the hyper-partitions, we use the MuPlusLambda evolutionary algorithm [39] implemented in the Python package DEAP [40]... In the end, the final populations from the different partitions are merged, and an optimal configuration is selected using NSGA-II”, pg. 4, right column, last paragraph; pg. 5, left column, first paragraph) (A population in an evolutionary algorithm is an aggregation of the hyperparameters. An optimal hyperparameter denotes the maximum values.) outputting the optimum value for the hyperparameter value to the clustering algorithm.
(“More specifically, this baseline performs an exhaustive search (grid search) over a grid of hyper-parameter settings for each of the 8 clustering algorithms and then select the clustering algorithm along with the set of hyper-parameters that best optimize a randomly selected internal index” pg. 5, right column, second paragraph) (A set of hyper-parameters that best optimize a selected internal index is output from the framework.) and the clustering algorithm is configured to consume the hyperparameter value. (“cSmartML aims to help non-expert machine learning users. One of the most commonly-used approaches by non-expert users is to try all clustering algorithms with their defaults hyper-parameters and then select the clustering algorithm that best optimizes a randomly chosen internal index… as stronger baseline, we consider various hyper-parameter settings for each of the 8 clustering algorithms considered in this work. More specifically, this baseline performs an exhaustive search (grid search) over a grid of hyper-parameter settings for each of the 8 clustering algorithms and then select the clustering algorithm along with the set of hyper-parameters that best optimize a randomly selected internal index.”, pg. 5, right column, bottom paragraph) (The clustering algorithm uses the hyperparameters to best optimize a randomly selected internal index, which teaches the limitation.)
However, ElShawi does not explicitly teach extracting a series of cluster features from the set of clusters, nor performing pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value.
Thakore teaches extracting a series of cluster features from the set of clusters, (“Our framework performs feature extraction… To further improve scalability, we propose adding additional levels to the clustering… to cluster features within, for example, the same physical or virtual machine, the same network subnet, etc.” pg. 24, under “Procedure for automated feature extraction”; pg. 31, under “Scalability”) (The feature extraction of Thakore can be easily configured to extract cluster features within the plurality of clusters of ElShawi to teach this limitation) and performing pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value. (“The most expensive operation in our framework is the clustering-based feature reduction, which consists of two distinct steps performed repeatedly: 1) computing pairwise cross-correlation values across all features within each feature cluster at each level of clustering, and 2) finding all maximal cliques within each cluster.”, pg. 31, under “Scalability”) (The feature reduction separates a plurality of potential candidate features, which are used for analysis, and non-candidate features which are irrelevant. The analysis, alongside clustering, (among other steps) would then help determine an optimal hyperparameter value.)
Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the current application’s filing date, to combine the method of hyperparameter optimization, as disclosed by ElShawi, with the feature extraction and framework of Thakore. One would be motivated to combine the two teachings, prior to the filing date of the current application, because removing redundant features and clustering allows for a faster analysis, as disclosed by Thakore. (“Our results show…our framework… facilitates more rapid root cause analysis… by enabling the clustering and removal of redundant features… our framework dramatically reduces the number of features that analysts must sift through during analysis.”, pg. 34, final paragraph)
Regarding claim 7, ElShawi teaches an electronic device comprising a computer processor, a memory storing a hyperparameter value optimization computer program (“cSmartML: A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Clustering”, Title; Fig. 1 cSmartML: Framework Architecture (See Fig. 1 above)) (This teaches the limitation, as cSmartML, under BRI, is a program designed for the purposes of hyperparameter value optimization and implicitly requires a computing device to run and/or distribute the framework.) receiving a selection of a hyperparameter to tune a hyperparameter value for a clustering algorithm (“Given a clustering algorithm along with combined internal indices obtained from the meta-learning recommendation component, this phase aims to find the best set of hyperparameters for the recommended algorithm.”, pg. 3, under “C. Hyper-parameters Optimization”; See also Table III on pg. 4, as well as the section on pg. 4 titled “Defining Hyper-parameter Search Space”) (The quotation is also shown in Fig. 1, (see above) under hyperparameter optimization, where the cluster method, along with the meta-learning recommendation from the previous step, are used for hyper-partition generation. Depending on the clustering algorithm, (such as DBScan) main hyperparameters and conditional hyperparameters are selected.) for each possible hyperparameter value, executing the clustering algorithm resulting in a set of clusters for each possible hyperparameter value (“In order to efficiently explore the space of possible clustering solutions for the recommended clustering algorithm, we should be able to enumerate the sets of hyper-parameters which describe the recommended algorithm. For some clustering algorithms, the value of a particular hyper-parameter for a given clustering algorithm affects the selection of other hyper-parameters… Some clustering algorithms require the number of clusters to be specified, we consider the number of clusters to search varies between 2 to a fifth of the data size.”, pg. 4, under “Defining Hyper-parameter Search Space”; pgs. 5-6, under “B. Experimental Results”) (Clustering algorithms typically have a hyperparameter for “number of clusters” or, as cited in this reference, “n-clusters”, which denotes the number of clusters to search. (See Table III on pg. 4 for reference) The result of the clustering algorithm’s execution is a set of clusters for each hyperparameter value, which teaches the limitation.) aggregating maximum or minimum values for the hyperparameter value at their respective indices; selecting an optimum value for the hyperparameter value (“To automatically tune the hyper-parameters in each of the hyper-partitions, we use the MuPlusLambda evolutionary algorithm [39] implemented in the Python package DEAP [40]... In the end, the final populations from the different partitions are merged, and an optimal configuration is selected using NSGA-II”, pg. 4, right column, last paragraph; pg. 5, left column, first paragraph) (A population in an evolutionary algorithm is an aggregation of the hyperparameters. An optimal hyperparameter denotes the maximum values.) outputting the optimum value for the hyperparameter value to the clustering algorithm. (“More specifically, this baseline performs an exhaustive search (grid search) over a grid of hyper-parameter settings for each of the 8 clustering algorithms and then select the clustering algorithm along with the set of hyper-parameters that best optimize a randomly selected internal index” pg. 5, right column, second paragraph) (A set of hyper-parameters that best optimize a selected internal index is output from the framework.) and the clustering algorithm is configured to consume the hyperparameter value. (“cSmartML aims to help non-expert machine learning users. One of the most commonly-used approaches by non-expert users is to try all clustering algorithms with their defaults hyper-parameters and then select the clustering algorithm that best optimizes a randomly chosen internal index… as stronger baseline, we consider various hyper-parameter settings for each of the 8 clustering algorithms considered in this work. More specifically, this baseline performs an exhaustive search (grid search) over a grid of hyper-parameter settings for each of the 8 clustering algorithms and then select the clustering algorithm along with the set of hyper-parameters that best optimize a randomly selected internal index.”, pg. 5, right column, bottom paragraph) (The clustering algorithm uses the hyperparameters to best optimize a randomly selected internal index, which teaches the limitation.)
However, ElShawi does not explicitly teach extracting a series of cluster features from the set of clusters, nor performing pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value.
Thakore teaches extracting a series of cluster features from the set of clusters, (“Our framework performs feature extraction… To further improve scalability, we propose adding additional levels to the clustering… to cluster features within, for example, the same physical or virtual machine, the same network subnet, etc.” pg. 24, under “Procedure for automated feature extraction”; pg. 31, under “Scalability”) (The feature extraction of Thakore can be easily configured to extract cluster features within the plurality of clusters of ElShawi to teach this limitation) and performing pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value. (“The most expensive operation in our framework is the clustering-based feature reduction, which consists of two distinct steps performed repeatedly: 1) computing pairwise cross-correlation values across all features within each feature cluster at each level of clustering, and 2) finding all maximal cliques within each cluster.”, pg. 31, under “Scalability”) (The feature reduction separates a plurality of potential candidate features, which are used for analysis, and non-candidate features which are irrelevant. The analysis, alongside clustering, (among other steps) would then help determine an optimal hyperparameter value.)
Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the current application’s filing date, to combine the method of hyperparameter optimization, as disclosed by ElShawi, with the feature extraction and framework of Thakore. One would be motivated to combine the two teachings, prior to the filing date of the current application, because removing redundant features and clustering allows for a faster analysis, as disclosed by Thakore. (“Our results show…our framework… facilitates more rapid root cause analysis… by enabling the clustering and removal of redundant features… our framework dramatically reduces the number of features that analysts must sift through during analysis.”, pg. 34, final paragraph)
Regarding claims 2, 5, and 8, ElShawi, as modified by Thakore, teaches the method, non-transitory computer readable medium, and electronic device of claims 1, 4, and 7, respectively, as well as that the clustering algorithm is selected from the group consisting of K-means clustering and DBScan. (“we evaluated a set of meta-features described in Section II-A on 8 clustering techniques, including, KMeans, DBSCAN, OPTICS, Birch, Spectral, Agglomerated, Affinity Propagation and MeanShift… For some clustering algorithms, the value of a particular hyper-parameter for a given clustering algorithm affects the selection of other hyper-parameters.”, pgs. 3 and 4, under Table II and “Defining Hyper-parameter Search Space” (ElShawi))
Claims 3, 6, and 9 are rejected under 35 U.S.C. 103 as being unpatentable over Radwa ElShawi et al. (herein referred to as ElShawi) (A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Clustering) in view of Uttam Thakore (herein referred to as Thakore) (IMPROVING RELIABILITY AND SECURITY MONITORING IN ENTERPRISE AND CLOUD SYSTEMS BY LEVERAGING INFORMATION REDUNDANCY), and further in view of Renjie Chen et al. (herein referred to as Chen) (Supervised Feature Selection With a Stratified Feature Weighting Method).
Regarding claims 3, 6, and 9, ElShawi, as modified by Thakore, teaches that the cluster features include a Davies-Bouldin score, a Calinski-Harabasz index, and a silhouette coefficient. (“For evaluation, the user should choose between three internal metrics including Calinski-Harabasz [17], the Davies-Bouldin Index [18], and the Silhouette [19].”, pg. 2, left column, second paragraph (ElShawi))
However, neither ElShawi nor Thakore teaches a first order difference in size or a normalized entropy.
Chen teaches a first order difference in size and a normalized entropy. (“Peng et al. [29] proposed a feature selection method based on the principle of Max-Relevance and Min-Redundancy. They used a first-order incremental process to attain optimal feature set… It iteratively partitions a data matrix into k × l disjoint co-clusters, where k is the number of object clusters and l is the number of feature clusters. Based on a partition process, quite a few partitional co-clustering algorithms have been proposed. Banerjee et al. [2] introduced minimum Bregman information (MBI) to co-clustering and proposed a Bregman Block Average co-clustering algorithm (BBAC). It attained optimal matrix approximation which simultaneously generalizes the maximum entropy and the standard least square.”, pg. 3, left column, paragraph 1; pg. 4, left column, bottom paragraph) (Chen discloses a feature selection method which utilizes a first-order process to attain a set of features. After features are selected, co-clustering takes place, wherein a data matrix, corresponding to feature clusters and object clusters, is partitioned, and in the process, a co-clustering algorithm is performed, which entails normalized entropy, teaching the limitation.)
Therefore, it would have been considered obvious to one of ordinary skill in the art, prior to the current application’s filing date, to combine the Davies-Bouldin score, Calinski-Harabasz index, and silhouette coefficient of ElShawi with the first-order process and entropy of Chen, as the Davies-Bouldin score, Calinski-Harabasz index, silhouette coefficient, first-order process, and entropy all relate to the evaluation, approximation, and optimization of matrices related to features. One would be motivated to combine the two teachings, as SFR (Subspace Feature Ranking), the feature selection method used by Chen, has proven effective for high-dimensional data, as disclosed by Chen. (“Experimental results show that our method can select features which are both informative and diverse. Therefore, SFR is effective for high-dimensional data.”, pg. 2, left column, paragraph 3)
Response to Arguments
Applicant's arguments filed November 25, 2025 have been fully considered, but they are not fully persuasive. Applicant's arguments have overcome the 35 U.S.C. 112(b) rejections of the previous Office action.
The Applicant argues in substance,
Argument 1: The claims integrate the judicial exception into a practical application by employing the information provided by the judicial exception to the clustering algorithm. These elements together recite a meaningful way of using the judicial exception beyond “generally linking.”
The statements made by the applicant amount to a persuasive argument and are addressed here in light of the modified rejection set forth in this action. The employment of the information provided by the judicial exception to the clustering algorithm merely recites an improvement to an abstract idea. The applicant is reminded that the requirement for eligibility is that an improvement must be made to a particular technological environment or to computer functionality. The claims recite elements that are directed to a technological environment (i.e., the clustering algorithm), but the claims never point to an improvement in that particular technological environment. Therefore, the rejection is maintained.
Argument 2: ElShawi does not teach “receiving, by the hyperparameter value optimization computer program, a selection of a hyperparameter to tune a hyperparameter value.” Notably, there is no disclosure of the selection of a hyperparameter to optimize a hyperparameter value for.
The examiner respectfully disagrees. Further explanation has been added above. ElShawi discloses a choice of clustering algorithms, each of which has hyperparameters to tune depending on the chosen algorithm, which teaches this limitation, as the choice of algorithm corresponds to, under the broadest reasonable interpretation, a selection of a hyperparameter. This is disclosed in ElShawi on pg. 4 under “Defining Hyper-parameter Search Space”, with Table III and Fig. 2 as visual aids describing the process.
Argument 3: ElShawi does not teach “for each possible hyperparameter value, executing, by the hyperparameter value optimization computer program, the clustering algorithm resulting in a set of clusters for each possible hyperparameter value.” Notably, there is no disclosure of executing a clustering algorithm that results in a set of clusters for each possible hyperparameter value.
The examiner respectfully disagrees. Further explanation has been added above. ElShawi discloses the execution of a hyperparameter optimization program that uses a clustering algorithm to output a set of clusters. This is shown in Fig. 1 on pg. 2, wherein the computing output is a clustering solution, the clustering solution corresponding to, under the broadest reasonable interpretation, a set of clusters, which teaches the limitation.
Argument 4: Thakore does not teach “extracting, by the hyperparameter value optimization computer program, a series of cluster features from the set of clusters.” Notably, the extraction has nothing to do with extracting a series of features from clusters.
The examiner respectfully disagrees. Thakore teaches feature extraction, which combined with the clusters of ElShawi, teaches the limitation, as feature extraction can be easily configured to work with ElShawi’s clustering solutions to get a plurality of cluster features, which corresponds to a series of cluster features. This is supported by Thakore, as they disclose “feature clustering” on pg. 26, and in other parts of the disclosure.
Argument 5: Thakore does not teach “performing, by the hyperparameter value optimization computer program, pairwise cross-correlation on the series of cluster features resulting in potential candidates for an optimal hyperparameter value.” Notably, instead of performing pairwise cross-correlation on a series of cluster features, it is performed across all features within each feature cluster at each level of clustering.
The examiner respectfully disagrees. Thakore discloses performing pairwise cross-correlation on “all features within each feature cluster at each level of clustering,” as the applicant asserts. “All features within each feature cluster at each level of clustering” is interpreted to be, under the broadest reasonable interpretation, a series of cluster features. Therefore, the rejection is maintained.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to Tyler E Iles whose telephone number is (571)272-5442. The examiner can normally be reached 9:00am - 5:00pm.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Kakali Chaki can be reached at (571) 272-3719. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/T.E.I./ Patent Examiner, Art Unit 2122
/KAKALI CHAKI/ Supervisory Patent Examiner, Art Unit 2122