Notice of Pre-AIA or AIA Status
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
Examiner’s Note
The Examiner encourages Applicant to schedule an interview to discuss the issues raised below under 35 U.S.C. §§ 112, 101, and 103, in order to move the application toward allowance.
Applicant is strongly requested to identify in the Remarks the supporting paragraph(s) for each limitation of any amended or new claim, so that the Examiner can interpret the claims clearly and definitely.
Priority
Acknowledgment is made of applicant's claim for benefit as a continuation (CON) application filed on 03/12/2021.
Claim Objections
Claim 7 is objected to because of the following informalities:
It appears that “each of the rules included in the set of the rules” (line 2) may need to read “each rule included in the set of rules” or something else.
It appears that “outputs” (line 5) may need to read “outputting” or something else.
Appropriate correction is required.
Claim Rejections - 35 USC § 112
The following is a quotation of 35 U.S.C. 112(b):
(b) CONCLUSION.—The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the inventor or a joint inventor regards as the invention.
The following is a quotation of 35 U.S.C. 112 (pre-AIA), second paragraph:
The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject matter which the applicant regards as his invention.
Claims 4-5, 7, 11, and 15 are rejected under 35 U.S.C. 112(b) or 35 U.S.C. 112 (pre-AIA), second paragraph, as being indefinite for failing to particularly point out and distinctly claim the subject matter which the inventor or a joint inventor (or, for applications subject to pre-AIA 35 U.S.C. 112, the applicant) regards as the invention.
Claim 4 recites the limitation “the node” (line 5). There is insufficient antecedent basis for this limitation in the claim, and it is not clear what the limitation refers to. It appears it may need to read “a node” or something else. For purposes of examination, “a node” is used. Claims 11 and 15 are rejected for the same reason.
Claim 5 recites the limitation “the feature” (line 2). There is insufficient antecedent basis for this limitation in the claim, and it is not clear what the limitation refers to. It appears it may need to read “a feature” or something else. For purposes of examination, “a feature” is used.
Claim 5 recites the limitation “the node directly coupled to the node that corresponds to a certain feature value” (line 3). There is insufficient antecedent basis for this limitation in the claim, and it is not clear what the limitation refers to. It appears it may need to read “a node directly coupled to a node that corresponds to a certain feature value” or something else. For purposes of examination, “a node directly coupled to a node that corresponds to a certain feature value” is used.
Claim 7 recites the limitation “the rules” (line 2). There is insufficient antecedent basis for this limitation in the claim, and it is not clear what the limitation refers to. It appears it may need to read “rules” or something else. For purposes of examination, “rules” is used.
Claim 7 recites the limitation “the data satisfying the condition included in the rule” (line 4). There is insufficient antecedent basis for this limitation in the claim, and it is not clear what the limitation refers to. It appears it may need to read “data satisfying a condition included in a rule” or something else. For purposes of examination, “data satisfying a condition included in a rule” is used.
Claim 7 recites the limitation “the rules” (line 5). There is insufficient antecedent basis for this limitation in the claim, and it is not clear what the limitation refers to, since it may indicate “a set of rules” (claim 6, line 3), “the rules” (line 2), or something else. It appears it may need to read “rules” or something else. For purposes of examination, “rules” is used.
Claims 4-5, 7, 11, and 15 each recite limitations that raise issues of indefiniteness as set forth above, and their dependent claims are rejected at least based on their direct and/or indirect dependency from the claims listed above. Appropriate explanation and/or amendment is required.
Claim Rejections - 35 USC § 101
35 U.S.C. 101 reads as follows:
Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title.
Claims 1-20 are rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Regarding claim 1
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1:
The limitations of
“…:
specifying a feature of a superordinate concept that has a feature included in a feature set as a subordinate concept; and
selecting the feature of the superordinate concept as a feature to be added to the feature set when a plurality of hypotheses each represented by a combination of features that include the feature of the subordinate concept satisfies a certain condition based on an objective variable, features of the subordinate concept being different from each other”, as drafted, under their broadest reasonable interpretation, cover performance of the limitations in the mind. That is, nothing in the claim elements precludes the steps from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“storing a feature selection program that”) – the act of storing data. This adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of storing data is recited at a high level of generality (i.e., as a generic act of storing data) such that it amounts to no more than mere data storage appended to the exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
The claim also recites additional elements that are mere instructions to implement an abstract idea on a computer, or that merely use a computer as a tool to perform an abstract idea. See MPEP 2106.05(f). In particular, the claim recites the additional element “causes at least one computer to execute a process, the process comprising” – using a device and/or a model to process data. The device and the model in each step are recited at a high level of generality (i.e., as a generic computer performing a generic computer function of processing data) such that they amount to no more than mere instructions to apply the exception using a generic computer component. Accordingly, these additional elements do not integrate the abstract idea into a practical application because they do not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
As discussed above, the claim recites the additional element of storing data at a high level of generality, which adds insignificant extra-solution activity – see MPEP 2106.05(g). The addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, this additional element does not provide an inventive concept or significantly more than the abstract idea. Thus, the claim is not patent eligible.
As discussed above with respect to integration of the abstract idea into a practical application, the additional elements of using a generic computer component to perform each step amount to no more than mere instructions to apply the exception using a generic computer component. Mere instructions to apply an exception using a generic computer component cannot provide an inventive concept. See MPEP 2106.05(f). The claim is not patent eligible.
Regarding claim 2
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1: The claim recites the abstract idea identified above regarding claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“wherein the certain condition includes a case where equal to or more than a certain rate of hypotheses among the plurality of hypotheses are established”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
Regarding claim 3
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1: The claim recites the abstract idea identified above regarding claim 1.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“wherein the certain condition includes a case where equal to or more than a certain rate of hypotheses among the plurality of hypotheses are established and a hypothesis obtained by replacing the feature of the subordinate concept with the feature of the superordinate concept is established”). This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not integrate the abstract idea into a practical application. See MPEP 2106.05(h).
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
This is a recitation of a particular type or source of model/data to be used in performing the abstract idea. Limiting the abstract idea to a particular type or source of model/data is an attempt to limit the abstract idea to a particular field of use or technological environment, which does not amount to significantly more than the abstract idea. See MPEP 2106.05(h).
Regarding claim 4
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1:
The limitations of
“wherein the specifying includes specifying, in a graph that includes a node that corresponds to a feature value and an edge associated with an attribute that indicates a relationship between nodes that includes a superordinate-subordinate relationship, a feature that corresponds to the node coupled to the node that corresponds to the feature value included in the feature set by the edge associated with the attribute that indicates the superordinate-subordinate relationship”, as drafted, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements. Thus, the claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Thus, the claim is not patent eligible.
Regarding claim 5
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1:
The limitations of
“wherein the feature set includes the feature that corresponds to the node directly coupled to the node that corresponds to a certain feature value by the edge in the graph”, as drafted, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements. Thus, the claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Thus, the claim is not patent eligible.
Regarding claim 6
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1:
The limitations of
“generating a set of rules in which a condition represented by the combination of features included in the feature set to which the selected feature of the superordinate concept is added is associated with the objective variable established under the condition”, as drafted, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application. In particular, the claim does not recite additional elements. Thus, the claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception. Thus, the claim is not patent eligible.
Regarding claim 7
The claim is rejected under 35 U.S.C. 101 because the claimed invention is directed to an abstract idea without significantly more.
Step 1: The claim recites a computer-readable storage medium; therefore, it falls into the statutory category of an article of manufacture.
Step 2A Prong 1:
The limitations of
“assigning, to each of the rules included in the set of the rules, an index according to a number of pieces of data that are positive examples with respect to the objective variable, the data satisfying the condition included in the rule, and …”, as drafted, under their broadest reasonable interpretation, cover performance of the limitation in the mind. That is, nothing in the claim element precludes the step from practically being performed in the mind. For example, the limitations in the context of this claim encompass the user mentally thinking with a physical aid (e.g., pencil and paper).
If a claim limitation, under its broadest reasonable interpretation, covers performance of the limitation in the mind but for the recitation of generic computer components, then it falls within the “Mental Processes” grouping of abstract ideas. Accordingly, the claim recites an abstract idea.
Step 2A Prong 2: This judicial exception is not integrated into a practical application.
In particular, the claim recites an additional element (“outputs the rules”) – the act of outputting data. This adds insignificant extra-solution activity to the judicial exception – see MPEP 2106.05(g). The act of outputting data is recited at a high level of generality (i.e., as a generic act of outputting data) such that it amounts to no more than mere data output appended to the exception. Accordingly, this additional element does not integrate the abstract idea into a practical application because it does not impose any meaningful limits on practicing the abstract idea. The claim is directed to an abstract idea.
Step 2B: The claim does not include additional elements that are sufficient to amount to significantly more than the judicial exception.
As discussed above, the claim recites the additional element of outputting data at a high level of generality, which adds insignificant extra-solution activity – see MPEP 2106.05(g). The addition of insignificant extra-solution activity does not amount to an inventive concept, particularly when the activity is well-understood, routine, and conventional. See MPEP 2106.05(d)(II) – “Receiving or transmitting data over a network” or “Storing and retrieving information in memory”. Accordingly, this additional element does not provide an inventive concept or significantly more than the abstract idea. Thus, the claim is not patent eligible.
Regarding claim 8
The claim recites “A feature selection device comprising: one or more memories; and one or more processors coupled to the one or more memories and the one or more processors configured to:” to perform precisely the process recited in Claim 1. As performance of an abstract idea on generic computer components (see MPEP 2106.05(f)) cannot integrate the abstract idea into a practical application or provide significantly more than the abstract idea itself, the claim is rejected for the reasons set forth in the rejection of Claim 1.
Regarding claim 9
The claim is rejected for the reasons set forth in the rejection of Claim 2 under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without integrating the judicial exception into a practical application or providing significantly more than the judicial exception.
Regarding claim 10
The claim is rejected for the reasons set forth in the rejection of Claim 3 under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without integrating the judicial exception into a practical application or providing significantly more than the judicial exception.
Regarding claim 11
The claim is rejected for the reasons set forth in the rejection of Claim 4 under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without integrating the judicial exception into a practical application or providing significantly more than the judicial exception.
Regarding claim 12
The claim recites “A feature selection method for a computer to execute a process comprising:” to perform precisely the process recited in Claim 1. As performance of an abstract idea on generic computer components (see MPEP 2106.05(f)) cannot integrate the abstract idea into a practical application or provide significantly more than the abstract idea itself, the claim is rejected for the reasons set forth in the rejection of Claim 1.
Regarding claim 13
The claim is rejected for the reasons set forth in the rejection of Claim 2 under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without integrating the judicial exception into a practical application or providing significantly more than the judicial exception.
Regarding claim 14
The claim is rejected for the reasons set forth in the rejection of Claim 3 under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without integrating the judicial exception into a practical application or providing significantly more than the judicial exception.
Regarding claim 15
The claim is rejected for the reasons set forth in the rejection of Claim 4 under 35 U.S.C. 101, mutatis mutandis, as reciting an abstract idea without integrating the judicial exception into a practical application or providing significantly more than the judicial exception.
Claim Rejections - 35 USC § 102
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless –
(a)(1) the claimed invention was patented, described in a printed publication, or in public use, on sale, or otherwise available to the public before the effective filing date of the claimed invention.
Claims 1-5 and 8-15 are rejected under 35 U.S.C. 102(a)(1) as being anticipated by Ristoski et al. (Feature Selection in Hierarchical Feature Spaces).
Regarding claim 1
Ristoski teaches
A non-transitory computer-readable storage medium storing a feature selection program that causes at least one computer to execute a process, the process comprising:
(Ristoski [sec(s) 5.2] “The proposed approach for feature selection, as well as all other related approaches, were implemented in a separate operator as part of the RapidMiner Linked Open Data extension. All experiments were run using standard laptop computer with 8GB of RAM and Intel Core i7-3540M 3.0GHz CPU. The RapidMiner processes and datasets used for the evaluation can be found online”;)
specifying a feature of a superordinate concept that has a feature included in a feature set as a subordinate concept; and
(Ristoski [fig(s) 1] [sec(s) 2] “For most problems, we expect the optimal features to be somewhere in the middle of the hierarchy, while the most general features are often too general for predictive models, and the most specific ones are too specific. The hierarchy level of the most valuable features depends on the task at hand. Fig. 1 shows a small part of the hierarchical feature space extracted for dataset Sports Tweets T (see section 5.1). If the task is to classify tweets into sports and non sports related, the optimal features are those in the upper rectangle, if the task is to classify them by different kinds of sports, then the features in the lower rectangle are more valuable.” [sec(s) 4] “The initial selection algorithm is shown in Algorithm 1. The algorithm takes as input the feature hierarchy H, the initial feature set F, a relevance similarity threshold t, and the relevance similarity measure s to be used by the algorithm. The relevance similarity threshold is used to decide whether two features would be similar enough, thus it controls how many nodes from different levels in the hierarchy will be merged. The algorithm starts with identifying the leaf nodes of the feature hierarchy. Then, starting from each leaf node l, it calculates the relevance similarity value between the current node and its direct ascendants d. The relevance similarity value is calculated using the selected relevance measure s. If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L.”;)
selecting the feature of the superordinate concept as a feature to be added to the feature set when a plurality of hypotheses each represented by a combination of features that include the feature of the subordinate concept satisfies a certain condition based on an objective variable, features of the subordinate concept being different from each other.
(Ristoski [fig(s) 1] [algorithm 1] “s:=Importance similarity measurement {“Information Gain”, “Correlation”}”, “11 if similarity ≥ threshold” and “17 add direct ascendants of l to L” [algorithm 2] [sec(s) 4] “The initial selection algorithm is shown in Algorithm 1. The algorithm takes as input the feature hierarchy H, the initial feature set F, a relevance similarity threshold t, and the relevance similarity measure s to be used by the algorithm. The relevance similarity threshold is used to decide whether two features would be similar enough, thus it controls how many nodes from different levels in the hierarchy will be merged. The algorithm starts with identifying the leaf nodes of the feature hierarchy. Then, starting from each leaf node l, it calculates the relevance similarity value between the current node and its direct ascendants d. The relevance similarity value is calculated using the selected relevance measure s. If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L. The algorithm for pruning is shown in Algorithm 2. The algorithm takes as input the feature hierarchy H and the previously reduced feature set F. The algorithm starts with identifying all paths P from all leaf nodes to the root node of the hierarchy. Then, for each path p it calculates the average information gain of all features on the path p. All features that have lower information gain than the average information gain on the path, are removed from the feature space F, and from the feature hierarchy H. In cases where a feature is located on more than one path, it is sufficient that the feature has greater information gain than the average information gain on at least one of the paths. This way, we prevent removing relevant features. Practically, the paths from the leafs to the root node, as well as the average information gain per path, can already be precomputed in the initial selection algorithm. The loop in the lines 3 − 6 is only added for illustrating the algorithm.”; e.g., “ascendants” read(s) on “superordinate”. In addition, e.g., paths read(s) on “hypotheses”.)
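(Note: For illustration only, the initial-selection procedure quoted above may be sketched as follows. This is a minimal rendering, not Ristoski's implementation; the hierarchy representation, the relevance function, and the similarity measure shown here are assumptions.)

```python
# Illustrative sketch of the quoted initial-selection step (Algorithm 1).
# All identifiers are hypothetical; the similarity measure (closeness of
# relevance scores) is an assumed stand-in for the paper's measure s.

def initial_selection(parents, feature_set, relevance, threshold):
    """parents: dict mapping each node to its set of direct ascendants (H).
    feature_set: set of features F, reduced in place.
    relevance: node -> relevance score (e.g., information gain).
    threshold: relevance-similarity threshold t."""
    all_parents = set().union(*parents.values()) if parents else set()
    frontier = [n for n in parents if n not in all_parents]  # leaf nodes
    seen = set(frontier)
    while frontier:
        node = frontier.pop()
        for parent in parents.get(node, ()):
            # If the node and its direct ascendant are similarly relevant,
            # drop the more specific (lower-level) feature from F.
            similarity = 1.0 - abs(relevance(node) - relevance(parent))
            if similarity >= threshold:
                feature_set.discard(node)
            if parent not in seen:  # continue with the direct ascendants
                seen.add(parent)
                frontier.append(parent)
    return feature_set
```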
Regarding claim 2
Ristoski teaches claim 1.
Ristoski further teaches
wherein the certain condition includes a case where equal to or more than a certain rate of hypotheses among the plurality of hypotheses are established.
(Ristoski [fig(s) 1] [algorithm 2] [sec(s) 4] “The algorithm for pruning is shown in Algorithm 2. The algorithm takes as input the feature hierarchy H and the previously reduced feature set F. The algorithm starts with identifying all paths P from all leaf nodes to the root node of the hierarchy. Then, for each path p it calculates the average information gain of all features on the path p. All features that have lower information gain than the average information gain on the path, are removed from the feature space F, and from the feature hierarchy H. In cases where a feature is located on more than one path, it is sufficient that the feature has greater information gain than the average information gain on at least one of the paths. This way, we prevent removing relevant features. Practically, the paths from the leafs to the root node, as well as the average information gain per path, can already be precomputed in the initial selection algorithm. The loop in the lines 3 − 6 is only added for illustrating the algorithm.”; e.g., paths read(s) on “hypotheses”.)
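(Note: For illustration only, the quoted path-based pruning step may be sketched as follows; the inputs are assumed shapes, not the paper's API. Each leaf-to-root path contributes its average information gain, and a feature survives if it reaches that average on at least one path it lies on.)

```python
# Illustrative sketch of the quoted pruning step (Algorithm 2). `paths`
# (leaf-to-root node lists) and `info_gain` are assumed inputs.

def prune_by_path_average(paths, feature_set, info_gain):
    """paths: iterable of lists of feature nodes, one per leaf-to-root path.
    feature_set: set of features F remaining after initial selection.
    info_gain: node -> information gain with respect to the target."""
    keep = set()
    for path in paths:
        avg = sum(info_gain(f) for f in path) / len(path)
        for f in path:
            # Sufficient to meet the path average on at least one path;
            # this prevents removing features relevant elsewhere.
            if info_gain(f) >= avg:
                keep.add(f)
    return feature_set & keep
```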
Regarding claim 3
Ristoski teaches claim 1.
Ristoski further teaches
wherein the certain condition includes a case where equal to or more than a certain rate of hypotheses among the plurality of hypotheses are established and a hypothesis obtained by replacing the feature of the subordinate concept with the feature of the superordinate concept is established.
(Ristoski [fig(s) 1] [algorithm 2] [sec(s) 4] “The relevance similarity threshold is used to decide whether two features would be similar enough, thus it controls how many nodes from different levels in the hierarchy will be merged. … If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L. The algorithm for pruning is shown in Algorithm 2. The algorithm takes as input the feature hierarchy H and the previously reduced feature set F. The algorithm starts with identifying all paths P from all leaf nodes to the root node of the hierarchy. Then, for each path p it calculates the average information gain of all features on the path p. All features that have lower information gain than the average information gain on the path, are removed from the feature space F, and from the feature hierarchy H. In cases where a feature is located on more than one path, it is sufficient that the feature has greater information gain than the average information gain on at least one of the paths. This way, we prevent removing relevant features. Practically, the paths from the leafs to the root node, as well as the average information gain per path, can already be precomputed in the initial selection algorithm. The loop in the lines 3 − 6 is only added for illustrating the algorithm.”; e.g., paths read(s) on “hypotheses”. In addition, e.g., “If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F” read(s) on “replacing”.)
Regarding claim 4
Ristoski teaches claim 1.
Ristoski further teaches
wherein the specifying includes specifying, in a graph that includes a node that corresponds to a feature value and an edge associated with an attribute that indicates a relationship between nodes that includes a superordinate-subordinate relationship, a feature that corresponds to the node coupled to the node that corresponds to the feature value included in the feature set by the edge associated with the attribute that indicates the superordinate-subordinate relationship.
(Ristoski [fig(s) 1] [algorithm 1] “H: Feature hierarchy”, “s:=Importance similarity measurement {“Information Gain”, “Correlation”}”, “11 if similarity ≥ threshold” and “17 add direct ascendants of l to L” [algorithm 2] [sec(s) 2] “We describe each instance as an n-dimensional binary feature vector v1,v2,...,vn , with vi ∈ {0,1} for all 1 ≤ i ≤ n. We call V = {v1,v2,...,vn} the feature space. Furthermore, we denote a hierarchic relation between two features vi and vj as vi < vj, i.e., vi is more specific than vj. For hierarchic features, the following implication holds: vi < vj → (vi = 1 → vj = 1), (1) i.e., if a feature vi is set, then vj is also set. Using the example of product categories, this means that a product belonging to a category also belongs to that product’s super categories. … Fig. 1 shows a small part of the hierarchical feature space extracted for dataset Sports Tweets T (see section 5.1). If the task is to classify tweets into sports and non sports related, the optimal features are those in the upper rectangle, if the task is to classify them by different kinds of sports, then the features in the lower rectangle are more valuable” [sec(s) 4] “The relevance similarity threshold is used to decide whether two features would be similar enough, thus it controls how many nodes from different levels in the hierarchy will be merged. The algorithm starts with identifying the leaf nodes of the feature hierarchy. Then, starting from each leaf node l, it calculates the relevance similarity value between the current node and its direct ascendants d. The relevance similarity value is calculated using the selected relevance measure s. If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L.”;)
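(Note: For illustration only, the hierarchic implication quoted above – vi < vj → (vi = 1 → vj = 1), i.e., setting a specific feature also sets its super categories – may be sketched as follows, with a hypothetical hierarchy and names.)

```python
# Illustrative sketch of the quoted hierarchic-feature implication:
# if v_i < v_j and v_i = 1, then v_j = 1 (ancestors are implied set).

def expand_with_ancestors(set_features, parents):
    """set_features: features set to 1; parents: node -> direct ascendants."""
    expanded = set(set_features)
    frontier = list(set_features)
    while frontier:
        node = frontier.pop()
        for parent in parents.get(node, ()):
            if parent not in expanded:
                expanded.add(parent)  # super-category is implied set
                frontier.append(parent)
    return expanded

# Hypothetical example: a product in a category also belongs to its
# super categories.
parents = {"TennisRacket": {"SportsEquipment"}, "SportsEquipment": {"Product"}}
assert expand_with_ancestors({"TennisRacket"}, parents) == {
    "TennisRacket", "SportsEquipment", "Product"}
```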
Regarding claim 5
Ristoski teaches claim 4.
Ristoski further teaches
wherein the feature set includes the feature that corresponds to the node directly coupled to the node that corresponds to a certain feature value by the edge in the graph.
(Ristoski [fig(s) 1] [algorithm 1] “H: Feature hierarchy”, “s:=Importance similarity measurement {“Information Gain”, “Correlation”}”, “3 D:=direct ascendants of node l”, “11 if similarity ≥ threshold” and “17 add direct ascendants of l to L” [algorithm 2] [sec(s) 2] “We describe each instance as an n-dimensional binary feature vector v1,v2,...,vn , with vi ∈ {0,1} for all 1 ≤ i ≤ n. We call V = {v1,v2,...,vn} the feature space. Furthermore, we denote a hierarchic relation between two features vi and vj as vi < vj, i.e., vi is more specific than vj. For hierarchic features, the following implication holds: vi < vj → (vi = 1 → vj = 1), (1) i.e., if a feature vi is set, then vj is also set. Using the example of product categories, this means that a product belonging to a category also belongs to that product’s super categories. … Fig. 1 shows a small part of the hierarchical feature space extracted for dataset Sports Tweets T (see section 5.1). If the task is to classify tweets into sports and non sports related, the optimal features are those in the upper rectangle, if the task is to classify them by different kinds of sports, then the features in the lower rectangle are more valuable” [sec(s) 4] “The relevance similarity threshold is used to decide whether two features would be similar enough, thus it controls how many nodes from different levels in the hierarchy will be merged. The algorithm starts with identifying the leaf nodes of the feature hierarchy. Then, starting from each leaf node l, it calculates the relevance similarity value between the current node and its direct ascendants d. The relevance similarity value is calculated using the selected relevance measure s. If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L.”;)
Regarding claim 8
The claim is a system claim corresponding to the computer readable medium claim 1, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 9
The claim is a system claim corresponding to the computer readable medium claim 2, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 10
The claim is a system claim corresponding to the computer readable medium claim 3, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 11
The claim is a system claim corresponding to the computer readable medium claim 4, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 12
The claim is a method claim corresponding to the computer readable medium claim 1, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 13
The claim is a method claim corresponding to the computer readable medium claim 2, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 14
The claim is a method claim corresponding to the computer readable medium claim 3, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Regarding claim 15
The claim is a method claim corresponding to the computer readable medium claim 4, and is directed to largely the same subject matter. Thus, it is rejected for the same reasons as given in the rejections of the computer readable medium claim.
Claim Rejections - 35 USC § 103
In the event the determination of the status of the application as subject to AIA 35 U.S.C. 102 and 103 (or as subject to pre-AIA 35 U.S.C. 102 and 103) is incorrect, any correction of the statutory basis for the rejection will not be considered a new ground of rejection if the prior art relied upon, and the rationale supporting the rejection, would be the same under either status.
This application currently names joint inventors. In considering patentability of the claims the examiner presumes that the subject matter of the various claims was commonly owned as of the effective filing date of the claimed invention(s) absent any evidence to the contrary. Applicant is advised of the obligation under 37 CFR 1.56 to point out the inventor and effective filing dates of each claim that was not commonly owned as of the effective filing date of the later invention in order for the examiner to consider the applicability of 35 U.S.C. 102(b)(2)(C) for any potential 35 U.S.C. 102(a)(2) prior art against the later invention.
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 6-7 are rejected under 35 U.S.C. 103 as being unpatentable over Ristoski et al. (Feature Selection in Hierarchical Feature Spaces) in view of Dai et al. (A MapReduce Implementation of C4.5 Decision Tree Algorithm).
Regarding claim 6
(Note: Hereinafter, if a limitation has bold brackets (i.e., [·]) around claim language, the bracketed claim language has not yet been taught by the current prior art reference but will be taught by another prior art reference afterwards.)
Ristoski teaches claim 1.
Ristoski further teaches
generating [a set of rules] in which a condition represented by the combination of features included in the feature set to which the selected feature of the superordinate concept is added is associated with the objective variable established under the condition.
(Ristoski [fig(s) 1] [algorithm 1] “H: Feature hierarchy”, “s:=Importance similarity measurement {“Information Gain”, “Correlation”}”, “3 D:=direct ascendants of node l”, “11 if similarity ≥ threshold” and “17 add direct ascendants of l to L” [algorithm 2] [sec(s) 2] “We describe each instance as an n-dimensional binary feature vector v1,v2,...,vn , with vi ∈ {0,1} for all 1 ≤ i ≤ n. We call V = {v1,v2,...,vn} the feature space. Furthermore, we denote a hierarchic relation between two features vi and vj as vi < vj, i.e., vi is more specific than vj. For hierarchic features, the following implication holds: vi < vj → (vi = 1 → vj = 1), (1) i.e., if a feature vi is set, then vj is also set. Using the example of product categories, this means that a product belonging to a category also belongs to that product’s super categories. … Fig. 1 shows a small part of the hierarchical feature space extracted for dataset Sports Tweets T (see section 5.1). If the task is to classify tweets into sports and non sports related, the optimal features are those in the upper rectangle, if the task is to classify them by different kinds of sports, then the features in the lower rectangle are more valuable” [sec(s) 4] “The relevance similarity threshold is used to decide whether two features would be similar enough, thus it controls how many nodes from different levels in the hierarchy will be merged. The algorithm starts with identifying the leaf nodes of the feature hierarchy. Then, starting from each leaf node l, it calculates the relevance similarity value between the current node and its direct ascendants d. The relevance similarity value is calculated using the selected relevance measure s. If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L.” [sec(s) 5.3] “To evaluate how well the feature selection approaches perform, we use three classifiers for each approach on all datasets, i.e., Naïve Bayes, k-Nearest Neighbors (with k = 3), and Support Vector Machine.”;)
However, Ristoski does not appear to explicitly teach:
generating [a set of rules] in which a condition represented by the combination of features included in the feature set to which the selected feature of the superordinate concept is added is associated with the objective variable established under the condition.
(Note: Hereinafter, if a limitation has one or more bold underlines, the underlined claim language is taught by the current prior art reference, while non-underlined claim language has already been taught by one or more previous prior art references.)
Dai teaches
generating a set of rules in which a condition represented by the combination of features included in the feature set to which the selected feature of the superordinate concept is added is associated with the objective variable established under the condition.
(Dai [fig(s) 1-2] [sec(s) 1] “At the training stage, each internal node split the instance space into two or more parts with the objective of optimizing the performance of classifier. After that, every path from the root node to the leaf node forms a decision rule to determine which class a new instance belongs to.” [sec(s) 2.1] “internal nodes associated with their edges split the instance space into two or more partitions. Each leaf node is a terminal node of the tree with a class label. For example, Figure 1 provides an illustration of a basic decision tree, where circle means decision node and square means leaf node. In this example, we have three splitting attributes, i.e., age, gender and criteria 3, along with two class labels, i.e., YES and NO. Each path from the root node to leaf node forms a classification rule. The general process of building a decision tree is as follows. Given a set of training data, apply a measurement function onto all attributes to find a best splitting attribute. Once the splitting attribute is determined, the instance space is partitioned into several parts. Within each partition, if all training instances belong to one single class, the algorithm terminates. Otherwise, the splitting process will be recursively performed until the whole partition is assigned to the same class. Once a decision tree is built, classification rules can be easily generated, which can be used for classification of new instances with unknown class labels. C4.5 [4] is a standard algorithm for inducing classification rules in the form of decision tree. As an extension of ID3 [5], the default criteria of choosing splitting attributes in C4.5 is information gain ratio. Instead of using information gain as that in ID3, information gain ratio avoids the bias of selecting attributes with many values.”;)
Therefore, it would have been obvious to one of ordinary skill in the art before the effective filing date of the claimed invention to have modified the system of Ristoski with the rule generation of Dai.
One of ordinary skill in the art would have been motivated to combine the references in order to accelerate the construction of decision trees and also ensure the accuracy of classification by leveraging parallel computing techniques.
(Dai [sec(s) 1] “To this end, in this paper we propose a distributed implementation of C4.5 algorithm using MapReduce computing model, and deploy it on a Hadoop cluster. Our goal is to accelerate the construction of decision trees and also ensure the accuracy of classification by leveraging parallel computing techniques.”)
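(Note: For illustration only, Dai's description of classification rules as root-to-leaf paths may be sketched as follows. The tree structure and names are hypothetical illustrations in the spirit of Dai's Figure 1, not Dai's MapReduce implementation.)

```python
# Illustrative sketch: every root-to-leaf path of a decision tree forms a
# classification rule (conditions along the path -> class label at the leaf).

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    label: Optional[str] = None  # class label if this node is a leaf
    branches: list = field(default_factory=list)  # (condition, child) pairs

def extract_rules(node, conditions=()):
    """Yield (condition list, class label) for each root-to-leaf path."""
    if node.label is not None:  # leaf: the path so far forms a rule
        yield list(conditions), node.label
        return
    for cond, child in node.branches:
        yield from extract_rules(child, conditions + (cond,))

# Hypothetical tree with age and gender splits (cf. Dai's Figure 1).
tree = Node(branches=[
    ("age<=30", Node(branches=[("gender=M", Node(label="YES")),
                               ("gender=F", Node(label="NO"))])),
    ("age>30", Node(label="YES")),
])
for conds, label in extract_rules(tree):
    print(" AND ".join(conds), "->", label)
```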
Regarding claim 7
The combination of Ristoski and Dai teaches claim 6.
Ristoski further teaches
wherein the generating includes assigning, to each of [the rules included in the set of the rules], an index according to a number of pieces of data that are positive examples with respect to the objective variable, the data satisfying the condition included in [the rule], and [outputs the rules].
(Ristoski [fig(s) 1] [table(s) 1-2] “positive” [algorithm 1] “s:=Importance similarity measurement {“Information Gain”, “Correlation”}”, “11 if similarity ≥ threshold” and “17 add direct ascendants of l to L” [algorithm 2] “avg=Information gain average of path p” [sec(s) 4] “If the relevance similarity value is greater or equal to the similarity threshold t, then the node from the lower level of the hierarchy is removed from the feature space F. Also, the node is removed from the feature hierarchy H, and the paths in the hierarchy are updated accordingly. For the next iteration, the direct ascendants of the current node are added in the list L. The algorithm for pruning is shown in Algorithm 2. The algorithm takes as input the feature hierarchy H and the previously reduced feature set F. The algorithm starts with identifying all paths P from all leaf nodes to the root node of the hierarchy. Then, for each path p it calculates the average information gain of all features on the path p. All features that have lower information gain than the average information gain on the path, are removed from the feature space F, and from the feature hierarchy H. In cases where a feature is located on more than one path, it is sufficient that the feature has greater information gain than the average information gain on at least one of the paths.” [sec(s) 5.1] “The following datasets were used in the evaluation (see Table 1):– Sports Tweets T dataset … – Sports Tweets C … – The Cities dataset … – The NY Daily dataset … – The StumbleUpon dataset” [sec(s) 5.3] “To evaluate how well the feature selection approaches perform, we use three classifiers for each approach on all datasets, i.e., Naïve Bayes, k-Nearest Neighbors (with k = 3), and Support Vector Machine.”;)
Dai further teaches
wherein the generating includes assigning, to each of the rules included in the set of the rules, an index according to a number of pieces of data that are positive examples with respect to the objective variable, the data satisfying the condition included in the rule, and outputs the rules.
(Dai [fig(s) 1-2] [sec(s) 1] “At the training stage, each internal node split the instance space into two or more parts with the objective of optimizing the performance of classifier. After that, every path from the root node to the leaf node forms a decision rule to determine which class a new instance belongs to.” [sec(s) 2.1] “internal nodes associated with their edges split the instance space into two or more partitions. Each leaf node is a terminal node of the tree with a class label. For example, Figure 1 provides an illustration of a basic decision tree, where circle means decision node and square means leaf node. In this example, we have three splitting attributes, i.e., age, gender and criteria 3, along with two class labels, i.e., YES and NO. Each path from the root node to leaf node forms a classification rule. The general process of building a decision tree is as follows. Given a set of training data, apply a measurement function onto all attributes to find a best splitting attribute. Once the splitting attribute is determined, the instance space is partitioned into several parts. Within each partition, if all training instances belong to one single class, the algorithm terminates. Otherwise, the splitting process will be recursively performed until the whole partition is assigned to the same class. Once a decision tree is built, classification rules can be easily generated, which can be used for classification of new instances with unknown class labels. C4.5 [4] is a standard algorithm for inducing classification rules in the form of decision tree. As an extension of ID3 [5], the default criteria of choosing splitting attributes in C4.5 is information gain ratio. Instead of using information gain as that in ID3, information gain ratio avoids the bias of selecting attributes with many values.”;)
Ristoski and Dai are combinable for the same rationale as set forth above with respect to claim 6.
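(Note: For illustration only, the indexing recited in claim 7 – assigning each rule an index according to the number of positive examples, with respect to the objective variable, that satisfy the rule's condition – may be sketched as follows. The rule representation and data layout are hypothetical.)

```python
# Illustrative sketch: rank rules by how many positive examples satisfy
# each rule's condition, then output the indexed rules.

def index_rules(rules, data, positive_label="YES"):
    """rules: list of (condition_fn, asserted_label) pairs.
    data: list of (record, label) pairs. Returns (index, rule) pairs."""
    def positive_support(rule):
        condition, _ = rule
        return sum(1 for record, label in data
                   if label == positive_label and condition(record))
    ranked = sorted(rules, key=positive_support, reverse=True)
    return list(enumerate(ranked))  # index 0 = best-supported rule

# Hypothetical data and rules.
data = [({"age": 25}, "YES"), ({"age": 40}, "YES"), ({"age": 45}, "NO")]
rules = [(lambda r: r["age"] > 30, "YES"), (lambda r: r["age"] <= 30, "YES")]
for index, (condition, label) in index_rules(rules, data):
    print(index, label)
```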
Prior Art
The prior art made of record and not relied upon is considered pertinent to applicant's disclosure.
Zhao et al. (Hierarchical Feature Selection with Recursive Regularization) teaches a hierarchical structure among the classes.
Tamaazousti et al. (Diverse Concept-Level Features for Multi-Object Classification) teaches superordinate and subordinate concepts in images.
Conclusion
Any inquiry concerning this communication or earlier communications from the examiner should be directed to SEHWAN KIM whose telephone number is (571)270-7409. The examiner can normally be reached Mon - Fri 9:00 AM - 5:00 PM.
Examiner interviews are available via telephone, in-person, and video conferencing using a USPTO supplied web-based collaboration tool. To schedule an interview, applicant is encouraged to use the USPTO Automated Interview Request (AIR) at http://www.uspto.gov/interviewpractice.
If attempts to reach the examiner by telephone are unsuccessful, the examiner’s supervisor, Michael J Huntley can be reached on (303) 297-4307. The fax phone number for the organization where this application or proceeding is assigned is 571-273-8300.
Information regarding the status of published or unpublished applications may be obtained from Patent Center. Unpublished application information in Patent Center is available to registered users. To file and manage patent submissions in Patent Center, visit: https://patentcenter.uspto.gov. Visit https://www.uspto.gov/patents/apply/patent-center for more information about Patent Center and https://www.uspto.gov/patents/docx for information about filing in DOCX format. For additional questions, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a USPTO Customer Service Representative, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.
/SEHWAN KIM/Examiner, Art Unit 2129
3/13/2026