DETAILED ACTION
The present application, filed on or after March 16, 2013, is being examined under the first inventor to file provisions of the AIA.
This final Office action is responsive to the amendment filed 20 August 2025.
Claims 1-18 are pending. Claims 1, 11, and 17 are independent claims.
Claim Rejections - 35 USC § 103
The following is a quotation of 35 U.S.C. 103 which forms the basis for all obviousness rejections set forth in this Office action:
A patent for a claimed invention may not be obtained, notwithstanding that the claimed invention is not identically disclosed as set forth in section 102, if the differences between the claimed invention and the prior art are such that the claimed invention as a whole would have been obvious before the effective filing date of the claimed invention to a person having ordinary skill in the art to which the claimed invention pertains. Patentability shall not be negated by the manner in which the invention was made.
Claims 1-10 are rejected under 35 U.S.C. 103 as being unpatentable over Moore et al. (US 2018/0336487, hereafter Moore) in view of Ramprashad (US 2018/0366138), further in view of Lobette et al. (WO 2018/110985, published 21 June 2018, hereafter Lobette), and further in view of Harang (US 2019/0266492, filed 28 February 2018).
As per independent claim 1, Moore discloses:
a computer-based method for building a learning machine to understand and explain learning machines, comprising: receiving, by a reference learning machine, a first set of input signals (Paragraph 0004, “Non-limiting examples of the present disclosure describe systems, methods and devices for optimizing machine-learned tree ensemble prediction systems. A plurality of instances may be provided to, and processed by, a computer-implemented tree ensemble.” Here, the tree ensemble is considered analogous to a reference learning machine, and providing a plurality of instances to the tree ensemble is considered analogous to receiving a first set of input signals.)
generating, at each node of the reference learning machine, first outputs for each input signal of the first set of input signals (Paragraph 0020, “As used herein, a leaf node describes an output node in a tree ensemble system. That is, a leaf node is the final node that an instance is provided to in a tree ensemble, and a value and determined hypothesis is provided upon an instance reaching such a node. An instance, as used herein, describes the processing of data through a tree ensemble (i.e., the processing of data that results in a generated hypothesis for that data).” Here, processing the data is considered analogous to generating, at each node of the reference learning machine, first outputs for each input signal of the first set of input signals.)
recording the first outputs generated at each node of the reference learning machine for each input signal of the first set of input signals (Paragraph 0057, “Server computing device 114 may compare the resulting classification of the processed instances to a pre-determined preferred classification of the data.” Here, the system being configured to compare outputs of the processed instances is considered analogous to recording the first outputs generated at each node of the reference learning machine for each input signal of the first set of input signals)
updating one or more parameters in a parameter matrix based at least on one of the recorded first outputs, derived products of the recorded first outputs and corresponding expected output for each input signal of the first set of input signals (Paragraph 0076, “A known node output value for leaf node 320 is the starting value needed to perform node value contribution determination for internal node 308's contribution to leaf node 320. The other value that must be determined is the expected node output value for the internal node directly upstream from leaf node 320. Thus, the expected node output value for internal node 308 may be determined according to the mechanisms described herein. Upon determining the expected node output value for internal node 308, that value may be subtracted from the node output value for leaf node 320 at operation 316 and the difference of those values may be assigned as the node contribution value for internal node 308”; Paragraph 0080, “From operation 408 flow continues to operation 410 where the instance travels through each downstream node in the tree ensemble (and more importantly each node for which the feature is redundant), as well as each node in other trees in the tree ensemble if the ensemble comprises a plurality of trees.” Here, determining node contribution values from output values and expected output values is considered analogous to updating one or more parameters in a parameter matrix based at least on one of the recorded first outputs and corresponding expected output for each input signal of the first set of input signals. Under the broadest reasonable interpretation of the claim, the entirety of node contribution values for each node in a tree are considered analogous to a parameter matrix.)
upon receiving a first query for a degree of importance of each node in the reference learning machine to the reference learning machine's generating of the first outputs given possible states, retrieving and returning, by a reference learning machine component state importance assignment module, a first set of parameters in the parameter matrix (Paragraph 0019, “Similarly, computing the node output value for nodes in an ensemble provides information that can be utilized in determining why instances processed by a tree ensemble end up at certain leaf nodes, as well as the reasons that instances have their finalized values at those leaf nodes”; Paragraph 0041, “The request to access the ensemble tree structure may comprise a request to determine the node contributions of one or more internal nodes, including an entrance node, of the ensemble tree structure as it relates to a particular instance. That is, a request may be received that specifies that a user generating the request would like to be provided with information related to the relative score contribution for a node as it relates to a split from that node when the node splits to a downstream node based on the feature determination made at that node”; Paragraph 0059, “According to additional examples, upon receiving a request to determine the node value contributions of one or more nodes in decision tree 106 and decision tree 108, node contribution values may be determined by server computing device 114 based on the processing of one or more instance through the tree ensemble, and information associated with those node contribution value determinations may be displayed via a graphical user interface on a computing device, such as computing device 104.” Here, the leaf nodes which the instances end up at during processing are considered analogous to possible states. 
The user requesting to see the contribution values of one or more nodes and the values being displayed is considered analogous to upon receiving a first query for a degree of importance of each node in the reference learning machine to the reference learning machine's generating of the first outputs given possible states, retrieving and returning, by a reference learning machine component state importance assignment module, a first set of parameters in the parameter matrix.)
constructing, by a description generator module, a description of why the reference learning machine generated the first outputs given an input signal of the first set of input signals, the description comprising a dominant state for each component of the input signal that has a degree of importance between a first lower bound and a first upper bound (Paragraph 0018, “Certain aspects provide mechanisms for making determinations regarding the output values that nodes have in a tree ensemble. In some aspects, node feature contribution values may be determined and ranked according to their impact on a resulting hypothesis for an instance and the generated output value that resulted in that hypothesis”; Paragraph 0019, “Similarly, computing the node output value for nodes in an ensemble provides information that can be utilized in determining why instances processed by a tree ensemble end up at certain leaf nodes, as well as the reasons that instances have their finalized values at those leaf nodes”; Paragraph 0041, “The request to access the ensemble tree structure may comprise a request to determine the node contributions of one or more internal nodes, including an entrance node, of the ensemble tree structure as it relates to a particular instance. That is, a request may be received that specifies that a user generating the request would like to be provided with information related to the relative score contribution for a node as it relates to a split from that node when the node splits to a downstream node based on the feature determination made at that node”; Paragraph 0031, “According to some examples, the mechanisms described herein may be utilized in automatically clustering instances … That is, aspects of the invention provide a per-instance, per-feature, score (contribution). As a result, each customer may have a score for each feature associated with that customer. 
If each customer's feature scores are ranked from highest to lowest, then all customers may be clustered automatically into different clusters by feature score ranking.” Here, the information shown to a user comprising node contribution values and why instances processed by a tree ensemble end up at certain leaf nodes is considered analogous to constructing, by a description generator module, a description of why the reference learning machine generated the first outputs given an input signal of the first set of input signals. The finalized values of each instance at a leaf node are considered analogous to a dominant state for each component of the input signal. In an embodiment of Moore, input instances may have feature contribution scores between a highest ranked and lowest ranked feature contribution score. Under the broadest reasonable interpretation of the claim, this is considered analogous to each component of the input signal having a degree of importance between a first lower bound and a first upper bound.)
Moore fails to teach:
the expected output for each input signal of the first set of input signals is determined by projecting the derived products of the recorded first outputs to a domain of the input signal;
and the domain of the input signal comprises a spatial domain, a time domain or a frequency domain.
However, Ramprashad, which is analogous to the claimed invention because it is directed toward machine learning, discloses:
the expected output for each input signal of the first set of input signals is determined by projecting the derived products of the recorded first outputs to a domain of the input signal; and the domain of the input signal comprises a spatial domain, a time domain or a frequency domain (Paragraph 0050, “a neural network processor trained to process a plurality of noisy, speech coding parameters that have been derived from an input speech sequence, to produce a plurality of clean speech coding parameters; a speech coding model generator configured to process the clean speech coding parameters into formant information, pitch information, or both; a spectral magnitude generator configured to process the format information, pitch information, or both, into estimate clean speech spectral magnitudes; a noise suppressor having a noise estimator that estimates noise that is present in a difference between an original frequency spectrum of the input speech sequence and a scaled version of the estimated clean speech spectral magnitudes, wherein the noise suppressor reduces gains in the original frequency spectrum in accordance with the estimated noise and estimated SNR in each frequency bin to produce an enhanced frequency spectrum; and an inverse transform block configured to convert the enhanced frequency spectrum into time domain, as an output speech sequence.” Here, a machine learning model receives a speech sequence as an input, produces outputs, and derivatives of those outputs are converted to the time domain as a new output. In a potential combination of Moore and Ramprashad, this is considered analogous to the expected output for each input signal of the first set of input signals is determined by projecting the derived products of the recorded first outputs to a domain of the input signal; and the domain of the input signal comprises a spatial domain, a time domain or a frequency domain.)
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Ramprashad with Moore, with a reasonable expectation of success, as it would have produced an enhanced signal with clean data to provide a lower error rate (Ramprashad: paragraph 0005). This would have provided the user with less noisy data for processing.
Further, Moore fails to specifically disclose:
determining, based on the first outputs, possible states for each node of the reference learning machine
determining, by a reference learning machine component state importance classification module, based on the possible states for each node of the reference learning machine, a dominant state for each node of the reference learning machine, wherein the dominant state of each node is selected from the possible states of the node based on a highest degree of importance
updating the nodes of the reference learning machine based on the description and the dominant state for each node of the reference learning machine
However, Lobette, which is analogous to the claimed invention because it is directed toward machine learning, discloses:
determining, based on the first outputs, possible states for each node of the reference learning machine (paragraphs 76-78: Here, a Q-Learning system obtains Q-Learning decisions based on current states (paragraph 76). As states are processed, a best Q-Value is determined (paragraph 77))
determining, by a reference learning machine component state importance classification module, based on the possible states for each node of the reference learning machine, a dominant state for each node of the reference learning machine, wherein the dominant state of each node is selected from the possible states of the node based on a highest degree of importance (paragraphs 76-78; claim 3: Here, the Q-Values for each state are determined (paragraph 77). Additionally, the possible action having the highest rating (highest degree of importance) for each state is determined (claim 3))
updating the nodes of the reference learning machine based on the description and the dominant state for each node of the reference learning machine (paragraphs 76-78: Here, the Q-Table is updated with the corresponding Q-Values)
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Lobette with Moore-Ramprashad, with a reasonable expectation of success, as it would have allowed improved decision making by using a decision-making structure (Lobette: paragraph 0007) storing the highest scoring state (Lobette: paragraph 0078 and claim 3).
Additionally, Moore fails to specifically disclose:
displaying, on a visual display, a visualization overlaid on a first input signal from the first set of input signals, the second visualization comprising a degree of importance of each component in the first input signal
providing the determined dominant state for each node of the reference learning machine to an input signal component state importance assignment module to project derived products
inputting the projected derived products to both an input signal component state importance assignment module and a reference learning machine component state importance classification neural network
training the reference learning machine based at least in part on the description
However, Harang, which is analogous to the claimed invention because it is directed toward determining shared importance of multiple nodes within a machine learning model, discloses:
a visualization overlaid on a first input signal from the first set of input signals, the second visualization comprising a degree of importance of each component in the first input signal (Figures 5A-5D; paragraph 0066: Here, a Fisher information generator creates a visualization of a degree of importance of a first and second document)
providing the determined dominant state for each node of the reference learning machine to an input signal component state importance assignment module to project derived products (paragraphs 0036-0038 and 0066-0067: Here, a zone of high importance for the first and second file are represented by the Fisher information associated with the nodes of the neural network. The .docx document weights and .doc document weights include a high importance zone where both the x-axis and y-axis are relatively high)
inputting the projected derived products to both an input signal component state importance assignment module and a reference learning machine component state importance classification neural network (Figure 2; paragraph 0040: Here, a machine learning model is modified based on shared importance of each node from a set of nodes in the machine learning model)
training the reference learning machine based at least in part on the description (Figure 2; paragraph 0040)
It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined Harang with Moore-Ramprashad-Lobette, with a reasonable expectation of success, as it would have allowed for calculating a shared importance value for each node of a set of nodes associated with a first and second classification in order to modify the neural network based upon the shared importance values (Harang: paragraph 0004).
Finally, the examiner takes official notice that it was notoriously well-known in the art at the time of the applicant’s effective filing date to display a visualization on a display as an overlay. It would have been obvious to one of ordinary skill in the art at the time of the applicant’s effective filing date to have combined this well-known feature with Moore-Ramprashad-Lobette-Harang, with a reasonable expectation of success, as it would have allowed for displaying data to a user.
As per dependent claim 2, Moore, Ramprashad, Lobette, and Harang disclose the limitations of claim 1, from which claim 2 depends, and the same rejection is incorporated herein. Moore further discloses:
upon receiving a second query for a set of nodes in the reference learning machine (Paragraph 0041, “The request to access the ensemble tree structure may comprise a request to determine the node contributions of one or more internal nodes, including an entrance node, of the ensemble tree structure as it relates to a particular instance. That is, a request may be received that specifies that a user generating the request would like to be provided with information related to the relative score contribution for a node as it relates to a split from that node when the node splits to a downstream node based on the feature determination made at that node”; Paragraph 0025, “For example, by providing the ability to view node value information and node contribution information as it relates to a corporate division meeting performance goals, and what feature nodes had high and low impact on that metric, managers can attempt to focus on the important features that are modifiable to increase performance of the division”; Here, in an embodiment of Moore, the user may request to see the contribution values of nodes. This is considered analogous to a second query for a set of nodes in the reference learning machine.)
analyzing, by a reference learning machine component state importance classification module, statistical properties of a set of parameters in the parameter matrix corresponding to each node in the reference learning machine (Paragraph 0061, “According to some examples, server computing device 114 may make a determination that one or more preferred classifications of instances from the dataset do not correspond to a classification made through processing of those instances through the tree ensemble. In such a case, server computing device 114 may further determine that one or more node contribution values of the tree ensemble contributed to a misclassification of the data. Upon such determination, server computing device may cause that information to be displayed such that a user may modify the tree ensemble such that it produces more accurate predictions. Server computing device 114 may also automatically modify one or more aspects of the tree ensemble based on its analysis by, for example, adjusting one or more node contribution values that contributed to the misclassification.” Here, under the broadest reasonable interpretation of the claims, determining which node contributions are associated with a misclassification is considered analogous to analyzing, by a reference learning machine component state importance classification module, statistical properties of a set of parameters in the parameter matrix corresponding to each node in the reference learning machine.)
and returning a set of nodes that have aggregated parameter values above a first classification threshold (Paragraph 0025, “For example, by providing the ability to view node value information and node contribution information as it relates to a corporate division meeting performance goals, and what feature nodes had high and low impact on that metric, managers can attempt to focus on the important features that are modifiable to increase performance of the division”; Paragraph 0059, “According to additional examples, upon receiving a request to determine the node value contributions of one or more nodes in decision tree 106 and decision tree 108, node contribution values may be determined by server computing device 114 based on the processing of one or more instance through the tree ensemble, and information associated with those node contribution value determinations may be displayed via a graphical user interface on a computing device, such as computing device 104.” Here, under the broadest reasonable interpretation of the claim, the determination and distinction between which nodes of a plurality of nodes had high impact on a metric is considered analogous to returning a set of nodes that have aggregated parameter values above a first classification threshold.)
and parameter value variances below a second classification threshold (Paragraph 0108, “In another aspect, the technology relates to a method for receiving, by an ensemble tree structure, an input instance, wherein the ensemble tree structure comprises a feature that causes a split at a plurality of nodes in the ensemble tree structure based on at least one threshold input value; increasing a lower bound of a baseline range for the feature when an input value for the feature causes a node split for the feature based on the input value being greater than a threshold input value for that node split; and reducing an upper bound of the baseline range for the feature when an input value for the feature causes a node split for the feature based on the input value being less than a threshold input value for that node split. In some examples, the split is a positive indicator split for a hypothesis of the ensemble tree structure. In other examples, the split is a negative indicator split for a hypothesis of the ensemble tree structure.” Here, node splits may be positive or negative. The limitation parameter value variance is not further defined in the claim. Therefore, in an embodiment of Moore, the returned nodes may have a negative indicator split, which is considered analogous to parameter value variances below a second classification threshold under the broadest reasonable interpretation of the claim.)
As per dependent claim 3, Moore, Ramprashad, Lobette, and Harang disclose the limitations of claim 1, from which claim 3 depends, and the same rejection is incorporated herein. Moore further discloses:
upon receiving a third query for a set of nodes in the reference learning machine (Paragraph 0041, “The request to access the ensemble tree structure may comprise a request to determine the node contributions of one or more internal nodes, including an entrance node, of the ensemble tree structure as it relates to a particular instance. That is, a request may be received that specifies that a user generating the request would like to be provided with information related to the relative score contribution for a node as it relates to a split from that node when the node splits to a downstream node based on the feature determination made at that node”; Paragraph 0025, “For example, by providing the ability to view node value information and node contribution information as it relates to a corporate division meeting performance goals, and what feature nodes had high and low impact on that metric, managers can attempt to focus on the important features that are modifiable to increase performance of the division”; Here, in an embodiment of Moore, the user may request to see the contribution values of nodes. This is considered analogous to a third query for a set of nodes in the reference learning machine.)
analyzing, by a reference learning machine component state importance classification module, statistical properties of a set of parameters in the parameter matrix corresponding to each node in the reference learning machine (Paragraph 0061, “According to some examples, server computing device 114 may make a determination that one or more preferred classifications of instances from the dataset do not correspond to a classification made through processing of those instances through the tree ensemble. In such a case, server computing device 114 may further determine that one or more node contribution values of the tree ensemble contributed to a misclassification of the data. Upon such determination, server computing device may cause that information to be displayed such that a user may modify the tree ensemble such that it produces more accurate predictions. Server computing device 114 may also automatically modify one or more aspects of the tree ensemble based on its analysis by, for example, adjusting one or more node contribution values that contributed to the misclassification.” Here, under the broadest reasonable interpretation of the claims, determining which node contributions are associated with a misclassification is considered analogous to analyzing, by a reference learning machine component state importance classification module, statistical properties of a set of parameters in the parameter matrix corresponding to each node in the reference learning machine.)
and returning a set of nodes that have aggregated parameter values above a third classification threshold (Paragraph 0025, “For example, by providing the ability to view node value information and node contribution information as it relates to a corporate division meeting performance goals, and what feature nodes had high and low impact on that metric, managers can attempt to focus on the important features that are modifiable to increase performance of the division”; Paragraph 0059, “According to additional examples, upon receiving a request to determine the node value contributions of one or more nodes in decision tree 106 and decision tree 108, node contribution values may be determined by server computing device 114 based on the processing of one or more instance through the tree ensemble, and information associated with those node contribution value determinations may be displayed via a graphical user interface on a computing device, such as computing device 104.” Here, under the broadest reasonable interpretation of the claim, the determination and distinction between which nodes of a plurality of nodes had high impact on a metric is considered analogous to returning a set of nodes that have aggregated parameter values above a third classification threshold.)
and parameter value variances above a fourth classification threshold. (Paragraph 0108, “In another aspect, the technology relates to a method for receiving, by an ensemble tree structure, an input instance, wherein the ensemble tree structure comprises a feature that causes a split at a plurality of nodes in the ensemble tree structure based on at least one threshold input value; increasing a lower bound of a baseline range for the feature when an input value for the feature causes a node split for the feature based on the input value being greater than a threshold input value for that node split; and reducing an upper bound of the baseline range for the feature when an input value for the feature causes a node split for the feature based on the input value being less than a threshold input value for that node split. In some examples, the split is a positive indicator split for a hypothesis of the ensemble tree structure. In other examples, the split is a negative indicator split for a hypothesis of the ensemble tree structure.” Here, node splits may be positive or negative. The limitation parameter value variance is not further defined in the claim. Therefore, in an embodiment of Moore, the returned nodes may have a positive indicator split, which is considered analogous to parameter value variances above a fourth classification threshold under the broadest reasonable interpretation of the claim.)
As per dependent claim 4, Moore, Ramprashad, Lobette, and Harang disclose the limitations of claim 1, from which claim 4 depends, and the same rejection is incorporated herein. Moore further discloses:
upon receiving a fourth query for a set of nodes in the reference learning machine (Paragraph 0041, “The request to access the ensemble tree structure may comprise a request to determine the node contributions of one or more internal nodes, including an entrance node, of the ensemble tree structure as it relates to a particular instance. That is, a request may be received that specifies that a user generating the request would like to be provided with information related to the relative score contribution for a node as it relates to a split from that node when the node splits to a downstream node based on the feature determination made at that node”; Paragraph 0025, “For example, by providing the ability to view node value information and node contribution information as it relates to a corporate division meeting performance goals, and what feature nodes had high and low impact on that metric, managers can attempt to focus on the important features that are modifiable to increase performance of the division.” Here, in an embodiment of Moore, the user may request to see the contribution values of nodes in order to determine which have low impact on a metric. This is considered analogous to a fourth query for a set of nodes in the reference learning machine.)
analyzing, by a reference learning machine component state importance classification module, statistical properties of a set of parameters in the parameter matrix corresponding to each node in the reference learning machine (Paragraph 0061, “According to some examples, server computing device 114 may make a determination that one or more preferred classifications of instances from the dataset do not correspond to a classification made through processing of those instances through the tree ensemble. In such a case, server computing device 114 may further determine that one or more node contribution values of the tree ensemble contributed to a misclassification of the data. Upon such determination, server computing device may cause that information to be displayed such that a user may modify the tree ensemble such that it produces more accurate predictions. Server computing device 114 may also automatically modify one or more aspects of the tree ensemble based on its analysis by, for example, adjusting one or more node contribution values that contributed to the misclassification.” Here, under the broadest reasonable interpretation of the claims, determining which node contributions are associated with a misclassification is considered analogous to analyzing, by a reference learning machine component state importance classification module, statistical properties of a set of parameters in the parameter matrix corresponding to each node in the reference learning machine.)
and returning a set of nodes that would not be returned for a second query for a set of nodes in the reference learning machine that returns a set of nodes that have aggregated parameter values above a first classification threshold and parameter value variances below a second classification threshold and for a third query for a set of nodes in the reference learning machine that returns a set of nodes that have aggregated parameter values above a third classification threshold and parameter value variances above a fourth classification threshold. (Paragraph 0025, “For example, by providing the ability to view node value information and node contribution information as it relates to a corporate division meeting performance goals, and what feature nodes had high and low impact on that metric, managers can attempt to focus on the important features that are modifiable to increase performance of the division”; Paragraph 0059, “According to additional examples, upon receiving a request to determine the node value contributions of one or more nodes in decision tree 106 and decision tree 108, node contribution values may be determined by server computing device 114 based on the processing of one or more instance through the tree ensemble, and information associated with those node contribution value determinations may be displayed via a graphical user interface on a computing device, such as computing device 104.” Here, under the broadest reasonable interpretation of the claim, the determination and distinction between which nodes of a plurality of nodes had low impact on a metric, as opposed to a high impact on a metric, is considered analogous to returning a set of nodes that would not be returned for a second query for a set of nodes in the reference learning machine that returns a set of nodes that have aggregated parameter values above a first classification threshold and parameter value variances below a second classification threshold and for a third query for a 
set of nodes in the reference learning machine that returns a set of nodes that have aggregated parameter values above a third classification threshold and parameter value variances above a fourth classification threshold, as described in the previous claims above.)
As per dependent claim 5, Moore, Ramprashad, Lobette, and Harang disclose the limitations similar to those in claim 1, and the same rejection is incorporated herein. Moore further discloses:
upon receiving a fifth query for a degree of importance of each component of a first input signal of the first set of input signals to the reference learning machine's generating of outputs associated with the possible states (Paragraph 0055, “A user may interact with the output values for the processed data sets and request, via network 112, that one or more operations be performed to optimize the tree ensemble that comprises decision tree 106 and decision tree 108. According to examples, such a request may comprise one or more of: a request to determine the expected output values of one or more nodes in decision tree 106 and decision tree 108, a request to determine the node value contributions of one or more nodes in decision tree 106 and decision tree 108, and a request to compute feature value ranges for one or more nodes in decision tree 106 and decision tree 108”; Paragraph 0031, “According to some examples, the mechanisms described herein may be utilized in automatically clustering instances. For example, if the instances are customers and the features are attributes associated with each customer, the mechanisms described herein may provide automatic customer clustering based on per instance feature scores. That is, aspects of the invention provide a per-instance, per-feature, score (contribution). As a result, each customer may have a score for each feature associated with that customer”; Paragraph 0032, “The instances may be clustered according to their feature impact rankings, and the accuracy of the tree ensemble predictions may be measured against the true labels for those instances. For clusters that meet a threshold number or percentage of incorrect model prediction, one or more instances within such clusters may be given larger weights and used to re-train the ensemble system. 
A training algorithm may take these weights into consideration and be more sensitive to correctly predicting these instances, thus improving ensemble accuracy without human intervention, leveraging per-instance feature contributions to do so.” Here, under the broadest reasonable interpretation of the claim, the user request to optimize the tree ensemble, which may include clustering of instances based on feature scores and improving ensemble accuracy based on the clusters, is considered analogous to a fifth query for a degree of importance of each component of a first input signal of the first set of input signals to the reference learning machine's generating of outputs associated with the possible states.)
feeding the first input signal through the reference learning machine (Paragraph 0004, “Non-limiting examples of the present disclosure describe systems, methods and devices for optimizing machine-learned tree ensemble prediction systems. A plurality of instances may be provided to, and processed by, a computer-implemented tree ensemble.” Here, providing an instance to the tree ensemble is considered analogous to feeding the first input signal through the reference learning machine.)
generating, at each node of the first input signal, by the reference learning machine, second outputs (Paragraph 0020, “As used herein, a leaf node describes an output node in a tree ensemble system. That is, a leaf node is the final node that an instance is provided to in a tree ensemble, and a value and determined hypothesis is provided upon an instance reaching such a node. An instance, as used herein, describes the processing of data through a tree ensemble (i.e., the processing of data that results in a generated hypothesis for that data).” Here, processing the data is considered analogous to generating, at each node of the first input signal, by the reference learning machine, second outputs.)
recording the second outputs generated by the reference learning machine (Paragraph 0057, “Server computing device 114 may compare the resulting classification of the processed instances to a pre-determined preferred classification of the data.” Here, the system being configured to compare outputs of the processed instances is considered analogous to recording the second outputs generated by the reference learning machine.)
updating parameters in the parameter matrix (Paragraph 0076, “A known node output value for leaf node 320 is the starting value needed to perform node value contribution determination for internal node 308's contribution to leaf node 320. The other value that must be determined is the expected node output value for the internal node directly upstream from leaf node 320. Thus, the expected node output value for internal node 308 may be determined according to the mechanisms described herein. Upon determining the expected node output value for internal node 308, that value may be subtracted from the node output value for leaf node 320 at operation 316 and the difference of those values may be assigned as the node contribution value for internal node 308”; Paragraph 0080, “From operation 408 flow continues to operation 410 where the instance travels through each downstream node in the tree ensemble (and more importantly each node for which the feature is redundant), as well as each node in other trees in the tree ensemble if the ensemble comprises a plurality of trees.” Here, determining node contribution values from output values and expected output values is considered analogous to updating parameters in the parameter matrix.)
feeding the recorded second outputs into an input signal component state importance assignment module, which queries a reference learning machine component state importance classification module for a first set of nodes in the reference learning machine, along with their dominant states (Paragraph 0076, “A known node output value for leaf node 320 is the starting value needed to perform node value contribution determination for internal node 308's contribution to leaf node 320. The other value that must be determined is the expected node output value for the internal node directly upstream from leaf node 320. Thus, the expected node output value for internal node 308 may be determined according to the mechanisms described herein. Upon determining the expected node output value for internal node 308, that value may be subtracted from the node output value for leaf node 320 at operation 316 and the difference of those values may be assigned as the node contribution value for internal node 308”; Paragraph 0080, “From operation 408 flow continues to operation 410 where the instance travels through each downstream node in the tree ensemble (and more importantly each node for which the feature is redundant), as well as each node in other trees in the tree ensemble if the ensemble comprises a plurality of trees.” Here, under the broadest reasonable interpretation of the claim, using the node output values to determine the node contributions of nodes which are directly upstream of leaf nodes is considered analogous to feeding the recorded second outputs into an input signal component state importance assignment module, which queries a reference learning machine component state importance classification module for a first set of nodes in the reference learning machine. The leaf nodes downstream from the nodes in the tree ensemble are considered analogous to dominant states.)
projecting, by the input signal component state importance assignment module, derived products of the recorded second outputs of the first set of nodes (Paragraph 0043, “The above described node contribution steps can be repeated for each additional upstream node. However, in so doing, the previously calculated expected output value replaces the leaf node output value as the minuend in the subtraction operation. The computed feature contributions may be added up for tree ensembles comprising a plurality of trees. Thus, when feature contributions are added up for one or more features that are redundant across a plurality of trees of an ensemble, those features may be ranked according to their overall impact to downstream nodes at the tree ensemble level.” Here, the feature contributions, as part of the determination of node contribution, are considered analogous to derived products of the recorded second outputs of the first set of nodes.)
and aggregating the projected derived products based on dominant states of their associated nodes to determine the degree of importance of each component of the first input signal to the reference learning machine's generating of the second outputs associated with possible states (Paragraph 0043, “The above described node contribution steps can be repeated for each additional upstream node. However, in so doing, the previously calculated expected output value replaces the leaf node output value as the minuend in the subtraction operation. The computed feature contributions may be added up for tree ensembles comprising a plurality of trees. Thus, when feature contributions are added up for one or more features that are redundant across a plurality of trees of an ensemble, those features may be ranked according to their overall impact to downstream nodes at the tree ensemble level”; Paragraph 0031, “According to some examples, the mechanisms described herein may be utilized in automatically clustering instances. For example, if the instances are customers and the features are attributes associated with each customer, the mechanisms described herein may provide automatic customer clustering based on per instance feature scores. That is, aspects of the invention provide a per-instance, per-feature, score (contribution). As a result, each customer may have a score for each feature associated with that customer. If each customer's feature scores are ranked from highest to lowest, then all customers may be clustered automatically into different clusters by feature score ranking.” Here, the clustering and ranking of instances based on their feature scores (contributions) is considered analogous to aggregating the projected derived products based on dominant states of their associated nodes to determine the degree of importance of each component of the first input signal to the reference learning machine's generating of the second outputs associated with possible states.)
As per dependent claim 6, Moore, Ramprashad, Lobette, and Harang disclose the limitations similar to those in claim 5, and the same rejection is incorporated herein. Moore further discloses:
wherein one or more values of one or more components of one or more input signals of the first set of input signals are replaced by one or more alternative values to create an altered first set of input signals (Paragraph 0004, “Non-limiting examples of the present disclosure describe systems, methods and devices for optimizing machine-learned tree ensemble prediction systems. A plurality of instances may be provided to, and processed by, a computer-implemented tree ensemble.” In an embodiment of Moore, a second half of the plurality of instances to be processed may have different values from the first half of the plurality of instances. Under the broadest reasonable interpretation of the claim, this is considered analogous to an altered first set of input signals.)
As per dependent claim 7, Moore, Ramprashad, Lobette, and Harang disclose the limitations similar to those in claim 6, and the same rejection is incorporated herein. Moore further discloses:
wherein the one or more values of the component of the input signal of the first set of input signals have a degree of importance between a second lower bound and a second upper bound (Paragraph 0031, “If each customer's feature scores are ranked from highest to lowest, then all customers may be clustered automatically into different clusters by feature score ranking.” Here, in an embodiment of Moore, a feature score may be ranked between the highest ranked and lowest ranked feature scores and then clustered by its ranking. Under the broadest reasonable interpretation of the claim, the parameters of each cluster are considered analogous to a second lower bound and a second upper bound.)
As per dependent claim 8, Moore, Ramprashad, Lobette, and Harang disclose the limitations similar to those in claim 7, and the same rejection is incorporated herein. Moore further discloses:
generating, at each node of the input signals of the altered first set of input signals, by the reference learning machine, third outputs (Paragraph 0020, “As used herein, a leaf node describes an output node in a tree ensemble system. That is, a leaf node is the final node that an instance is provided to in a tree ensemble, and a value and determined hypothesis is provided upon an instance reaching such a node. An instance, as used herein, describes the processing of data through a tree ensemble (i.e., the processing of data that results in a generated hypothesis for that data).” Here, processing the plurality of instances, some of which could be considered the altered first set of input signals as discussed above, is considered analogous to generating, at each node of the input signals of the altered first set of input signals, by the reference learning machine, third outputs.)
As per dependent claim 9, Moore, Ramprashad, Lobette, and Harang disclose the limitations similar to those in claim 8, and the same rejection is incorporated herein. Moore further discloses:
calculating and aggregating a difference between the first outputs and the third outputs (Paragraph 0031, “If each customer's feature scores are ranked from highest to lowest, then all customers may be clustered automatically into different clusters by feature score ranking”; Paragraph 0032, “According to additional examples, a predicted hypothesis may be generated for each instance, and a true or false label may be associated with each instance based on training data (e.g., an instance may be human classified as relating to one or more true hypothesis). The instances may be clustered according to their feature impact rankings, and the accuracy of the tree ensemble predictions may be measured against the true labels for those instances.” Here, since each instance has its own output (hypothesis), the ranking of each instance by feature score is considered analogous to calculating and aggregating a difference between the first outputs and the third outputs.)
and returning the difference as an additional metric for the degree of importance of each component of the first input signal of the first set of input signals to the reference learning machine's generating of the first outputs associated with possible states (Paragraph 0032, “According to additional examples, a predicted hypothesis may be generated fo