AI and ML for Finance Practitioners

Quiz LO 7.1.3

  Test your knowledge of LO 7.1.3

Question 1 of 7

1. Is the following definition of the meaning of “terms”, as used in the field of information retrieval, generally correct?

Definition:
“Terms (or tokens) are constitutive elements (usually words) contained in a document.”

Question 2 of 7

2. Is the following statement about the ‘Bag of Words’ approach generally correct?

Statement:
“In this approach, we treat every document as a collection of individual words. This approach ignores grammar, word order, sentence structure, and punctuation. It treats every word in a document as a potentially important keyword of the document. Each word is a token, and each document is represented by a one (if the token is present in the document) or a zero (the token is not present in the document). This approach simply reduces a document to the set of words contained in it.”
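As an illustration of the representation described in this statement, here is a minimal Python sketch of a binary bag of words. The helper names `tokenize` and `binary_bag_of_words`, the regular expression used for tokenizing, and the two sample documents are assumptions made for this example only; they do not come from the course material.

```python
import re

def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens (drops punctuation)."""
    return re.findall(r"[a-z']+", text.lower())

def binary_bag_of_words(documents):
    """Return the sorted vocabulary and one 0/1 presence vector per document."""
    token_sets = [set(tokenize(doc)) for doc in documents]
    vocabulary = sorted(set().union(*token_sets))
    vectors = [
        [1 if term in tokens else 0 for term in vocabulary]
        for tokens in token_sets
    ]
    return vocabulary, vectors

# Hypothetical sample documents.
docs = ["Rates rise as markets fall.", "Markets rise!"]
vocabulary, vectors = binary_bag_of_words(docs)
print(vocabulary)
print(vectors)
```

Word order and punctuation play no role here: each document is reduced to the set of tokens it contains, and each vector entry records only presence or absence.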

Question 3 of 7

3. Is the following statement about ‘Term frequency (TF)’ generally correct?

Statement:
“Term frequency measures how many times a word occurs in a document.”
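To make the counting idea concrete, the following short Python sketch computes raw term counts for one document. The function name `term_frequency`, the tokenizing rule, and the sample sentence are assumptions for illustration; other conventions (for example, normalizing by document length) are also in use.

```python
import re
from collections import Counter

def term_frequency(document):
    """Count how many times each lowercase word token occurs in the document."""
    tokens = re.findall(r"[a-z']+", document.lower())
    return Counter(tokens)

# Hypothetical sample sentence.
tf = term_frequency("Bond yields rose; bond prices fell.")
print(tf["bond"], tf["yields"], tf["equity"])
```

A `Counter` returns 0 for a term that never occurs, which matches the intuition that an absent term has zero frequency.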

Question 4 of 7

4. Is the following statement about ‘Inverse document frequency (IDF)’ generally correct?

Statement:
“IDF measures the sparseness of a term over a corpus (i.e., a collection of documents). The fewer documents in which a term occurs, the more significant it is. The sparseness of a term t is measured by the equation of the inverse document frequency (IDF), which is defined by”

\boxed{\operatorname{IDF}(t) = 1 + \log\left(\frac{\text{Total number of documents}}{\text{Number of documents containing } t}\right)}
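The boxed equation can be evaluated directly, as in the Python sketch below. It assumes the natural logarithm (the statement does not specify a base) and represents each document as a list of tokens. The function name `idf` and the tiny corpus are hypothetical.

```python
import math

def idf(term, corpus_tokens):
    """IDF(t) = 1 + log(total documents / documents containing t), natural log assumed."""
    n_containing = sum(1 for doc in corpus_tokens if term in doc)
    if n_containing == 0:
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return 1 + math.log(len(corpus_tokens) / n_containing)

# Hypothetical corpus of four tokenized documents.
corpus = [
    ["the", "bond", "market"],
    ["the", "equity", "market"],
    ["the", "credit", "spread"],
    ["the", "yield", "curve"],
]
common = idf("the", corpus)   # occurs in every document
rare = idf("bond", corpus)    # occurs in one document
print(common, rare)
```

Under these assumptions a term found in every document gets the minimum value of 1, and the value grows as the term appears in fewer documents.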

Question 5 of 7

5. Is the following statement about ‘TFIDF’ generally correct?

Statement:
“TFIDF, a standard indicator of the frequency of term t in a given document d, is obtained by combining the TF (Term frequency) and the IDF (Inverse document frequency). The TFIDF equation is defined by”

\boxed{\operatorname{TFIDF}(t, d) = \operatorname{TF}(t, d) \times \operatorname{IDF}(t)}
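A minimal Python sketch of the product in the boxed equation follows. It assumes raw counts for TF and the form IDF(t) = 1 + log(N / n_t) with the natural logarithm. The function names and the three-document corpus are hypothetical and serve only to show the arithmetic.

```python
import math
from collections import Counter

def tf(term, document_tokens):
    """Raw count of the term in one tokenized document."""
    return Counter(document_tokens)[term]

def idf(term, corpus_tokens):
    """1 + log(total documents / documents containing the term), natural log assumed."""
    n_containing = sum(1 for doc in corpus_tokens if term in doc)
    if n_containing == 0:
        raise ValueError(f"term {term!r} does not occur in the corpus")
    return 1 + math.log(len(corpus_tokens) / n_containing)

def tfidf(term, document_tokens, corpus_tokens):
    """TFIDF(t, d) = TF(t, d) * IDF(t)."""
    return tf(term, document_tokens) * idf(term, corpus_tokens)

# Hypothetical tokenized corpus.
corpus = [
    ["bond", "yields", "rose", "bond"],
    ["equity", "markets", "fell"],
    ["bond", "markets", "rallied"],
]
score = tfidf("bond", corpus[0], corpus)
print(score)
```

Here "bond" occurs twice in the first document and in two of the three documents, so the score is 2 × (1 + log(3/2)). The value is specific to the document d, while the IDF factor depends only on the corpus.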

Question 6 of 7

6. Is the following statement about ‘methods to create a TFIDF representation of a query’ generally correct?

Statement:
“In TFIDF, each document is represented as a feature vector, and the corpus is the set of these feature vectors. This set of feature vectors can then be used in a data mining algorithm for classification, clustering, or retrieval.”
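The feature-vector idea in this statement can be sketched in Python as below. The example builds one TFIDF vector per document over a shared vocabulary, then ranks the documents against a query by cosine similarity. Cosine similarity is a common choice for retrieval but is an assumption here, not something the statement prescribes. All names, the corpus, and the query are hypothetical.

```python
import math
from collections import Counter

def build_tfidf_vectors(corpus_tokens):
    """Return the vocabulary, the per-term IDF, and one TFIDF vector per document."""
    vocabulary = sorted({t for doc in corpus_tokens for t in doc})
    n_docs = len(corpus_tokens)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(t for doc in corpus_tokens for t in set(doc))
    idf = {t: 1 + math.log(n_docs / doc_freq[t]) for t in vocabulary}
    vectors = []
    for doc in corpus_tokens:
        counts = Counter(doc)
        vectors.append([counts[t] * idf[t] for t in vocabulary])
    return vocabulary, idf, vectors

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical tokenized corpus and query.
corpus = [
    ["bond", "yields", "rose"],
    ["equity", "markets", "fell"],
    ["bond", "markets", "rallied"],
]
vocabulary, idf, vectors = build_tfidf_vectors(corpus)

# The query is mapped into the same vector space, reusing the corpus IDF values.
query_counts = Counter(["bond", "yields"])
query_vector = [query_counts[t] * idf[t] for t in vocabulary]

ranking = sorted(
    range(len(corpus)),
    key=lambda i: cosine(query_vector, vectors[i]),
    reverse=True,
)
print(ranking)
```

In this sketch, query terms that do not appear in the corpus vocabulary are simply ignored. The same vectors could also be passed to a classification or clustering algorithm.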

Question 7 of 7

7. Is the following statement about ‘entropy in terms of the IDF measure’ generally correct?

Statement:
“The entropy of the term t is defined by”

\boxed{\operatorname{entropy}(t) = p \cdot \operatorname{IDF}(t) + (1-p) \cdot \operatorname{IDF}(\text{not } t)}
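For readers who want to check the expression numerically, the Python sketch below evaluates it under explicit assumptions that the statement itself leaves open: p is taken to be the fraction of documents containing t, IDF(not t) is computed from the fraction 1 - p, and IDF is taken in its plain form log(1/p) with the natural logarithm, without the additive constant 1. The function names are hypothetical.

```python
import math

def idf_plain(p):
    """Plain IDF as log(1 / p) for a document fraction 0 < p < 1 (natural log assumed)."""
    return math.log(1.0 / p)

def entropy_from_idf(p):
    """p * IDF(t) + (1 - p) * IDF(not t), with p the share of documents containing t."""
    return p * idf_plain(p) + (1 - p) * idf_plain(1 - p)

def binary_entropy(p):
    """Standard two-outcome entropy: -p log p - (1 - p) log(1 - p)."""
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

p = 0.25  # hypothetical: the term occurs in a quarter of the documents
print(entropy_from_idf(p), binary_entropy(p))
```

Under these assumptions the two functions agree for any 0 < p < 1. If the form IDF(t) = 1 + log(1/p) is used instead, the weighted sum comes out exactly 1 larger, because the weights p and 1 - p sum to 1.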
