AI and ML for Finance Practitioners

Quiz LO 4.2.3

Test your knowledge of LO 4.2.3

Question 1 of 12

1. Is the following statement about ‘decision trees’ generally correct?

Statement:
“Tree-based methods involve segmenting the predictor space into a number of simple regions. In order to make a prediction for a given observation, we typically use the mean or the mode (majority vote) of the training observations in the region to which it belongs.”
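As a minimal illustration of this prediction rule, here is a pure-Python sketch. The region assignments and response values are invented for illustration; a real tree would derive the regions from its fitted splits:

```python
from statistics import mean, mode

# Hypothetical training observations already assigned to two regions
# (region boundaries would come from a fitted tree; values invented here).
regions = {
    "R1": {"y_regression": [2.0, 3.0, 4.0], "y_class": ["up", "up", "down"]},
    "R2": {"y_regression": [10.0, 12.0],    "y_class": ["down", "down"]},
}

def predict_regression(region):
    # Regression tree: predict the mean response of the region.
    return mean(regions[region]["y_regression"])

def predict_classification(region):
    # Classification tree: predict the most common class (majority vote).
    return mode(regions[region]["y_class"])

print(predict_regression("R1"))      # mean response in R1 -> 3.0
print(predict_classification("R2"))  # majority class in R2 -> "down"
```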

Question 2 of 12

2. Which of the following statements about ‘advantages and disadvantages of decision trees’ is correct?

Statement I:
“Decision trees are easier to interpret than other classification/regression methods.”
Statement II:
“In terms of prediction accuracy, however, decision trees typically are not competitive with the best supervised learning approaches.”
Statement III:
“We can increase the prediction accuracy of the decision trees using special techniques such as bagging, random forest, and boosting, at the expense of some loss in the interpretation.”

Question 3 of 12

3. Is the following statement about ‘greedy recursive binary splitting’ generally correct?

Statement:
“The recursive approach is top-down because it begins at the top of the tree and then successively splits the predictor space via binary splits;
each split is indicated by two new branches further down on the tree. It is greedy because, at each step of the tree-building process, the best split is made at that particular step, rather than looking ahead and picking a split that would lead to a better tree in some future step.”
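One greedy step can be sketched for a single predictor: try every cutpoint and keep the one that minimizes the total RSS of the two resulting half-planes. The toy data below is invented; recursing on each half would grow the tree:

```python
def rss(ys):
    # Residual sum of squares around the group mean.
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    # Greedy step: evaluate every cutpoint on one predictor and keep
    # the one with the lowest total RSS across the two regions.
    best = None
    for s in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x < s]
        right = [y for x, y in zip(xs, ys) if x >= s]
        cost = rss(left) + rss(right)
        if best is None or cost < best[1]:
            best = (s, cost)
    return best

# Toy data with an obvious break between x=3 and x=4:
xs = [1, 2, 3, 4, 5]
ys = [1.0, 1.0, 1.0, 9.0, 9.0]
print(best_split(xs, ys))  # -> (4, 0.0): splitting at x >= 4 separates perfectly
```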

Question 4 of 12

4. Is the following statement about ‘tree pruning’ generally correct?

Statement:
“Recursive binary splitting may produce a tree that is too complex. A better strategy is therefore to grow a very large tree T_{0}, and then prune it back in order to obtain a subtree T \subset T_{0}.”

Question 5 of 12

5. Is the following statement about ‘tree pruning and cost complexity pruning’ generally correct?

Statement:
“Recursive binary splitting may produce a tree that is too complex. A better strategy is therefore to grow a very large tree T_{0}, and then prune it back to obtain a subtree T \subset T_{0}. Cost complexity pruning – also known as weakest link pruning – provides an efficient way to do the pruning. Rather than considering every possible subtree, we consider a sequence of trees indexed by a nonnegative tuning parameter \gamma.”
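The idea behind the tuning parameter can be sketched as trading off training RSS against tree size: each candidate subtree is scored by RSS + \gamma |T|, where |T| is its number of terminal nodes. The subtrees and RSS values below are invented purely for illustration:

```python
def penalized_cost(subtree_rss, n_leaves, gamma):
    # Cost complexity criterion: training RSS plus a penalty of gamma
    # per terminal node; a larger gamma favours smaller subtrees.
    return subtree_rss + gamma * n_leaves

# Hypothetical candidate subtrees of a large tree T0 (values invented):
candidates = {
    "T0 (8 leaves)": (10.0, 8),
    "T1 (4 leaves)": (14.0, 4),
    "T2 (2 leaves)": (25.0, 2),
}

def prune(gamma):
    # Pick the subtree with the lowest penalized cost for this gamma.
    return min(candidates, key=lambda t: penalized_cost(*candidates[t], gamma))

print(prune(0.5))  # small penalty -> keep the large tree T0
print(prune(3.0))  # larger penalty -> prefer the 4-leaf subtree T1
```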

Question 6 of 12

6. Is the following statement about ‘classification error rate’ generally correct?

Statement:
The classification error rate is the fraction of the training observations in that region that do not belong to the most common class:
\boxed{E=1-\max _{k}\left(\widehat{p}_{m k}\right)}
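The formula can be checked directly from class counts in a node. A minimal sketch, with an invented node containing six ‘A’ and two ‘B’ observations:

```python
from collections import Counter

def classification_error(labels):
    # E = 1 - max_k(p_hat_mk): the fraction of observations in the node
    # that do not belong to the most common class.
    counts = Counter(labels)
    p_max = max(counts.values()) / len(labels)
    return 1 - p_max

print(classification_error(["A"] * 6 + ["B"] * 2))  # -> 0.25
```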

Question 7 of 12

7. Is the following statement about ‘Gini index’ generally correct?

Statement:
It is used as a measure of node purity: the lower the Gini index, the purer the node. The Gini index is defined by
\boxed{G=\sum_{k=1}^{K} \widehat{p}_{m k}\left(1-\widehat{p}_{m k}\right)}
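The same node counts used for the classification error rate can illustrate the Gini index; note that a pure node scores exactly zero:

```python
from collections import Counter

def gini(labels):
    # G = sum_k p_hat_mk * (1 - p_hat_mk); equals 0 for a pure node.
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

print(gini(["A"] * 6 + ["B"] * 2))  # mixed node -> 0.375
print(gini(["A"] * 8))              # pure node  -> 0.0
```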

Question 8 of 12

8. Given the complex relationship between X_{1} and X_{2} shown in the figure below, which model would be the best performer?

[Figure: relationship between X_{1} and X_{2} (source: assigned reading)]

Question 9 of 12

9. Is the following statement about ‘bagging’ generally correct?

Statement:
“Bootstrap aggregation, or bagging, is a general procedure for reducing the variance of a statistical learning method, based on the principle that averaging a set of observations reduces variance. In other words, assembling many weak predictors/classifiers together yields a stronger predictor/classifier.”
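A minimal sketch of the procedure, where the sample mean stands in for a full tree fit (data and model invented for illustration):

```python
import random
from statistics import mean

random.seed(0)

def bootstrap_sample(data):
    # Sample n observations with replacement from the training set.
    return [random.choice(data) for _ in data]

def bagged_prediction(data, fit, n_models=50):
    # Fit a (weak) predictor on each bootstrap sample and average the
    # resulting predictions, which reduces variance relative to any
    # single fit.
    return mean(fit(bootstrap_sample(data)) for _ in range(n_models))

# Toy 'model': predict the sample mean (a stand-in for a full tree).
data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(bagged_prediction(data, fit=mean))  # close to the true mean 3.0
```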

Question 10 of 12

10. Is the following statement about Out-of-bag (OOB) error generally correct?

Statement:
“The OOB error is a valid estimate of the test error for the bagged model since the response for each observation is predicted using only the trees that were not fit using that observation.”
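The bookkeeping behind OOB error can be sketched as follows. Each “tree” here is just the mean of its bootstrap sample (a stand-in for a real tree), and the data are invented; the key point is that each observation is predicted only by the models whose bootstrap sample excluded it:

```python
import random
from statistics import mean

random.seed(1)

data = [(i, float(i)) for i in range(20)]  # (index, response) pairs

# Grow B 'trees' (each just stores its bootstrap sample's mean),
# remembering which observation indices were in each bag.
B = 200
bags, fits = [], []
for _ in range(B):
    idx = [random.randrange(len(data)) for _ in data]
    bags.append(set(idx))
    fits.append(mean(data[i][1] for i in idx))

# OOB prediction: for each observation, average only over trees whose
# bootstrap sample did NOT include it (roughly one third of the trees).
oob_errors = []
for i, (_, y) in enumerate(data):
    preds = [f for bag, f in zip(bags, fits) if i not in bag]
    if preds:
        oob_errors.append((mean(preds) - y) ** 2)

print(f"OOB MSE estimate: {mean(oob_errors):.2f}")
```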

Question 11 of 12

11. Is the following statement about Random forests generally correct?

Statement:
“One of the issues of bagging is that strong predictors will dominate each tree, making the decision trees correlated. Unfortunately, the average of many highly correlated trees does not lead to a significant reduction in variance. Random forests provide a small tweak to de-correlate the trees.”
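The “small tweak” is that at each split only a random subset of roughly m = sqrt(p) of the p predictors is eligible, so a single strong predictor cannot dominate every tree. A sketch of that feature subsampling step (p = 16 chosen for illustration):

```python
import math
import random

random.seed(2)

def candidate_features(p):
    # Random forest tweak: at EACH split, consider only a random subset
    # of m ~ sqrt(p) predictors, which de-correlates the trees.
    m = max(1, round(math.sqrt(p)))
    return random.sample(range(p), m)

print(candidate_features(16))  # 4 of the 16 predictors, chosen at random
```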

Question 12 of 12

12. Is the following statement about Boosting generally correct?

Statement:
“Boosting works in a similar way to bagging, except that the trees are grown sequentially: each tree is grown using information from previously grown trees. Boosting does not involve bootstrap sampling; instead, each tree is fit on a modified version of the original dataset. Unlike fitting a single large decision tree to the data, which amounts to fitting the data hard and potentially overfitting, the boosting approach instead learns slowly and sequentially. Approaches that learn slowly tend to perform well.”
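The slow, sequential fitting can be sketched for regression: each depth-1 “stump” is fit to the current residuals, and its prediction is added with a small shrinkage factor. The data and shrinkage value are invented for illustration:

```python
def fit_stump(xs, rs):
    # Fit a depth-1 'tree' to residuals: the best single split and the
    # mean residual on each side of it.
    best = None
    for s in sorted(set(xs)):
        left = [r for x, r in zip(xs, rs) if x < s]
        right = [r for x, r in zip(xs, rs) if x >= s]
        lm = sum(left) / len(left) if left else 0.0
        rm = sum(right) / len(right) if right else 0.0
        cost = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or cost < best[0]:
            best = (cost, s, lm, rm)
    _, s, lm, rm = best
    return lambda x: lm if x < s else rm

def boost(xs, ys, n_trees=100, lam=0.1):
    # Learn slowly: each stump is fit to the CURRENT residuals, and its
    # prediction is added with a small shrinkage factor lam.
    preds = [0.0] * len(ys)
    stumps = []
    for _ in range(n_trees):
        resid = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, resid)
        stumps.append(stump)
        preds = [p + lam * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lam * s(x) for s in stumps)

xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
model = boost(xs, ys)
print(round(model(2), 2), round(model(5), 2))  # -> 1.0 5.0
```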
