AI and ML for Finance Practitioners

Quiz LO 4.2.3

Test your knowledge of LO 4.2.3

Question 1 of 12

1. Is the following statement about ‘decision trees’ generally correct?

Statement:
“Tree-based methods involve segmenting the predictor space into a number of simple regions. In order to make a prediction for a given observation, we typically use the mean or the mode (majority vote) of the training observations in the region to which it belongs.”
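As a minimal illustration of this prediction rule, here is a pure-Python sketch. The region assignments and response values are invented for illustration; a real tree would derive the regions from its fitted splits:

```python
from statistics import mean, mode

# Hypothetical training observations already assigned to two regions
# (region boundaries would come from a fitted tree; values invented here).
regions = {
    "R1": {"y_regression": [2.0, 3.0, 4.0], "y_class": ["up", "up", "down"]},
    "R2": {"y_regression": [10.0, 12.0],    "y_class": ["down", "down"]},
}

def predict_regression(region):
    # Regression tree: predict the mean response of the region.
    return mean(regions[region]["y_regression"])

def predict_classification(region):
    # Classification tree: predict the most common class (majority vote).
    return mode(regions[region]["y_class"])

print(predict_regression("R1"))      # mean response in R1 -> 3.0
print(predict_classification("R2"))  # majority class in R2 -> "down"
```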

Question 2 of 12

2. Which of the following statements about ‘advantages and disadvantages of decision trees’ is correct?

Statement I:
“Decision trees are easier to interpret than other classification/regression methods.”
Statement II:
“In terms of prediction accuracy, however, decision trees typically are not competitive with the best supervised learning approaches.”
Statement III:
“We can increase the prediction accuracy of the decision trees using special techniques such as bagging, random forest, and boosting, at the expense of some loss in the interpretation.”

Question 3 of 12

3. Is the following statement about ‘greedy recursive binary splitting’ generally correct?

Statement:
“The recursive approach is top-down because it begins at the top of the tree and then successively splits the predictor space via binary splits;
each split is indicated by two new branches further down on the tree. It is greedy because, at each step of the tree-building process, the best split is made at that particular step, rather than looking ahead and picking a split that would lead to a better tree in some future step.”
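One greedy step can be sketched for a single predictor: try every cutpoint and keep the one that minimizes the total RSS of the two resulting half-planes. The toy data below is invented; recursing on each half would grow the tree:

```python
def rss(ys):
    # Residual sum of squares around the group mean.
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    # Greedy step: evaluate every cutpoint on one predictor and keep
    # the one with the lowest total RSS across the two regions.
    best = None
    for s in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x < s]
        right = [y for x, y in zip(xs, ys) if x >= s]
        cost = rss(left) + rss(right)
        if best is None or cost < best[1]:
            best = (s, cost)
    return best

# Toy data with an obvious break between x=3 and x=4:
xs = [1, 2, 3, 4, 5]
ys = [1.0, 1.0, 1.0, 9.0, 9.0]
print(best_split(xs, ys))  # -> (4, 0.0): splitting at x >= 4 separates perfectly
```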

Question 4 of 12

4. Is the following statement about ‘tree pruning’ generally correct?

Statement:
“Recursive binary splitting may produce a tree that is too complex. A better strategy is therefore to grow a very large tree T_{0}, and then prune it back in order to obtain a subtree T \subset T_{0}.”

Question 5 of 12

5. Is the following statement about ‘tree pruning and cost complexity pruning’ generally correct?

Statement:
“Recursive binary splitting may produce a tree that is too complex. A better strategy is therefore to grow a very large tree T_{0}, and then prune it back to obtain a subtree T \subset T_{0}. Cost complexity pruning – also known as weakest link pruning – provides an efficient way to do the pruning. Rather than considering every possible subtree, we consider a sequence of trees indexed by a nonnegative tuning parameter \gamma.”
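The idea behind the tuning parameter can be sketched as trading off training RSS against tree size: each candidate subtree is scored by RSS + \gamma |T|, where |T| is its number of terminal nodes. The subtrees and RSS values below are invented purely for illustration:

```python
def penalized_cost(subtree_rss, n_leaves, gamma):
    # Cost complexity criterion: training RSS plus a penalty of gamma
    # per terminal node; a larger gamma favours smaller subtrees.
    return subtree_rss + gamma * n_leaves

# Hypothetical candidate subtrees of a large tree T0 (values invented):
candidates = {
    "T0 (8 leaves)": (10.0, 8),
    "T1 (4 leaves)": (14.0, 4),
    "T2 (2 leaves)": (25.0, 2),
}

def prune(gamma):
    # Pick the subtree with the lowest penalized cost for this gamma.
    return min(candidates, key=lambda t: penalized_cost(*candidates[t], gamma))

print(prune(0.5))  # small penalty -> keep the large tree T0
print(prune(3.0))  # larger penalty -> prefer the 4-leaf subtree T1
```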

Question 6 of 12

6. Is the following statement about ‘classification error rate’ generally correct?

Statement:
The classification error rate is the fraction of the training observations in that region that do not belong to the most common class:
\boxed{E=1-\max _{k}\left(\widehat{p}_{m k}\right)}
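The formula can be checked directly from class counts in a node. A minimal sketch, with an invented node containing six ‘A’ and two ‘B’ observations:

```python
from collections import Counter

def classification_error(labels):
    # E = 1 - max_k(p_hat_mk): the fraction of observations in the node
    # that do not belong to the most common class.
    counts = Counter(labels)
    p_max = max(counts.values()) / len(labels)
    return 1 - p_max

print(classification_error(["A"] * 6 + ["B"] * 2))  # -> 0.25
```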

Question 7 of 12

7. Is the following statement about ‘Gini index’ generally correct?

Statement:
It is used as a measure of node purity: the lower the Gini index, the purer the node. The Gini index is defined by
\boxed{G=\sum_{k=1}^{K} \widehat{p}_{m k}\left(1-\widehat{p}_{m k}\right)}
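The same node counts used for the classification error rate can illustrate the Gini index; note that a pure node scores exactly zero:

```python
from collections import Counter

def gini(labels):
    # G = sum_k p_hat_mk * (1 - p_hat_mk); equals 0 for a pure node.
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

print(gini(["A"] * 6 + ["B"] * 2))  # mixed node -> 0.375
print(gini(["A"] * 8))              # pure node  -> 0.0
```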

Question 8 of 12

8. Given the complex relationship between X_{1} and X_{2} shown in the figure below, which model would be the best performer?

[Figure: relationship between X_{1} and X_{2} (source: assigned reading)]

Question 9 of 12

9. Is the following statement about ‘bagging’ generally correct?

Statement:
“Bootstrap aggregation, or bagging, is a general procedure for reducing the variance of a statistical learning method, based on the principle that averaging a set of observations reduces variance. In other words, assembling many weak predictors/classifiers together yields a stronger predictor/classifier.”
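A minimal sketch of the procedure, where the sample mean stands in for a full tree fit (data and model invented for illustration):

```python
import random
from statistics import mean

random.seed(0)

def bootstrap_sample(data):
    # Sample n observations with replacement from the training set.
    return [random.choice(data) for _ in data]

def bagged_prediction(data, fit, n_models=50):
    # Fit a (weak) predictor on each bootstrap sample and average the
    # resulting predictions, which reduces variance relative to any
    # single fit.
    return mean(fit(bootstrap_sample(data)) for _ in range(n_models))

# Toy 'model': predict the sample mean (a stand-in for a full tree).
data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(bagged_prediction(data, fit=mean))  # close to the true mean 3.0
```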

Question 10 of 12

10. Is the following statement about Out-of-bag (OOB) error generally correct?

Statement:
“The OOB error is a valid estimate of the test error for the bagged model since the response for each observation is predicted using only the trees that were not fit using that observation.”
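The bookkeeping behind OOB error can be sketched as follows. Each “tree” here is just the mean of its bootstrap sample (a stand-in for a real tree), and the data are invented; the key point is that each observation is predicted only by the models whose bootstrap sample excluded it:

```python
import random
from statistics import mean

random.seed(1)

data = [(i, float(i)) for i in range(20)]  # (index, response) pairs

# Grow B 'trees' (each just stores its bootstrap sample's mean),
# remembering which observation indices were in each bag.
B = 200
bags, fits = [], []
for _ in range(B):
    idx = [random.randrange(len(data)) for _ in data]
    bags.append(set(idx))
    fits.append(mean(data[i][1] for i in idx))

# OOB prediction: for each observation, average only over trees whose
# bootstrap sample did NOT include it (roughly one third of the trees).
oob_errors = []
for i, (_, y) in enumerate(data):
    preds = [f for bag, f in zip(bags, fits) if i not in bag]
    if preds:
        oob_errors.append((mean(preds) - y) ** 2)

print(f"OOB MSE estimate: {mean(oob_errors):.2f}")
```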

Question 11 of 12

11. Is the following statement about Random forests generally correct?

Statement:
“One of the issues of bagging is that strong predictors will dominate each tree, making the decision trees correlated. Unfortunately, the average of many highly correlated trees does not lead to a significant reduction in variance. Random forests provide a small tweak to de-correlate the trees.”
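The “small tweak” is that at each split only a random subset of roughly m = sqrt(p) of the p predictors is eligible, so a single strong predictor cannot dominate every tree. A sketch of that feature subsampling step (p = 16 chosen for illustration):

```python
import math
import random

random.seed(2)

def candidate_features(p):
    # Random forest tweak: at EACH split, consider only a random subset
    # of m ~ sqrt(p) predictors, which de-correlates the trees.
    m = max(1, round(math.sqrt(p)))
    return random.sample(range(p), m)

print(candidate_features(16))  # 4 of the 16 predictors, chosen at random
```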

Question 12 of 12

12. Is the following statement about Boosting generally correct?

Statement:
“Boosting works in a similar way to bagging, except that the trees are grown sequentially: each tree is grown using information from previously grown trees. Boosting does not involve bootstrap sampling; instead, each tree is fit on a modified version of the original dataset. Unlike fitting a single large decision tree to the data, which amounts to fitting the data hard and potentially overfitting, the boosting approach instead learns slowly and sequentially. Approaches that learn slowly tend to perform well.”
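The slow, sequential fitting can be sketched for regression: each depth-1 “stump” is fit to the current residuals, and its prediction is added with a small shrinkage factor. The data and shrinkage value are invented for illustration:

```python
def fit_stump(xs, rs):
    # Fit a depth-1 'tree' to residuals: the best single split and the
    # mean residual on each side of it.
    best = None
    for s in sorted(set(xs)):
        left = [r for x, r in zip(xs, rs) if x < s]
        right = [r for x, r in zip(xs, rs) if x >= s]
        lm = sum(left) / len(left) if left else 0.0
        rm = sum(right) / len(right) if right else 0.0
        cost = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or cost < best[0]:
            best = (cost, s, lm, rm)
    _, s, lm, rm = best
    return lambda x: lm if x < s else rm

def boost(xs, ys, n_trees=100, lam=0.1):
    # Learn slowly: each stump is fit to the CURRENT residuals, and its
    # prediction is added with a small shrinkage factor lam.
    preds = [0.0] * len(ys)
    stumps = []
    for _ in range(n_trees):
        resid = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, resid)
        stumps.append(stump)
        preds = [p + lam * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lam * s(x) for s in stumps)

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(xs, ys)
print(round(model(2), 2), round(model(5), 2))  # -> 1.0 5.0
```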
