Question

Remember that in Problem Set 6, we used different linkage distance measures to calculate the distances between clusters and decide which cluster a point should belong to. Consider this new method of finding linkage distances, which makes use of the linkage distance methods from the problem set:

You are given the following data points with the following feature values:

Answer the following 3 questions based on the above code.

You are asked to run the hierarchical clustering algorithm from Problem Set 6 with the singleLinkage, maxLinkage, averageLinkage, and mysteryLinkage distance metrics and asked to report the results at a cutoff of 4 clusters. The final clusters will be the same, no matter which linkage we use.

A. True
B. False
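
For reference, the three named linkage measures are commonly defined as the minimum, maximum, and mean of the pairwise distances between two clusters. The following is a minimal sketch under that assumption; the function names, the Euclidean metric, and the sample points are hypothetical stand-ins, not the Problem Set 6 code, and mysteryLinkage is left out because its definition is not shown here.

```python
import math
from itertools import product


def euclidean(p, q):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def pairwise_distances(cluster1, cluster2):
    """All distances between a point in cluster1 and a point in cluster2."""
    return [euclidean(p, q) for p, q in product(cluster1, cluster2)]


def single_linkage(cluster1, cluster2):
    """Distance between the two closest points of the clusters."""
    return min(pairwise_distances(cluster1, cluster2))


def max_linkage(cluster1, cluster2):
    """Distance between the two farthest points of the clusters."""
    return max(pairwise_distances(cluster1, cluster2))


def average_linkage(cluster1, cluster2):
    """Mean of all pairwise distances between the clusters."""
    distances = pairwise_distances(cluster1, cluster2)
    return sum(distances) / len(distances)


# Hypothetical sample clusters, chosen only for illustration.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(max_linkage(cluster_a, cluster_b))
print(average_linkage(cluster_a, cluster_b))
```

Because the three measures can rank the same pair of clusters differently, the order in which clusters merge can depend on the linkage chosen.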

More questions

Your boss comes back one last time with new information. He can now tell you the topic of each document. However, he found some more documents for which the topic is still unknown. Given this information, can we use a supervised learning algorithm to classify the new documents?

A. Yes
B. No
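
To make the setting concrete, here is a minimal sketch of one simple supervised method, a 1-nearest-neighbor classifier, applied to documents whose topics are known. The feature vectors, topic names, and function names are all hypothetical; they assume each document has already been turned into a list of keyword counts.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_neighbor_label(labeled, query):
    """Return the topic of the labeled document closest to the query.

    labeled is a list of (feature_vector, topic) pairs.
    """
    _, topic = min(labeled, key=lambda pair: euclidean(pair[0], query))
    return topic


# Hypothetical training data: keyword-count vectors with known topics.
labeled_docs = [
    ([5, 0, 1], "sports"),
    ([0, 4, 2], "finance"),
    ([4, 1, 0], "sports"),
]

# A hypothetical document whose topic is unknown.
new_doc = [0, 3, 3]
predicted = nearest_neighbor_label(labeled_docs, new_doc)
print(predicted)
```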

Your boss comes back with a list of 60 specific keywords as well as 5 specific topics that each keyword is best associated with. Which of the following is true, given this additional information?

A. We can switch to a supervised learning algorithm.
B. We can use the k-means clustering algorithm with k = 60.
C. We can use the k-means clustering algorithm with k = 5.

Given the above information, which of the following would be the most appropriate feature to use?

A. The author's name.
B. The number of pages in a document.
C. The number of times a particular keyword appears in a document.
D. The number of times particular keywords and common words (for example, "the", "in", "at") appear in a document.
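
As an illustration of what a keyword-count feature looks like in practice, the sketch below turns a document into a vector of counts, one per keyword. The keyword list, the sample text, and the function name are hypothetical, and the tokenizer is deliberately simple.

```python
import re
from collections import Counter


def keyword_counts(text, keywords):
    """Count how often each keyword occurs in text, in keyword order."""
    # Lowercase the text and split it into alphabetic tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Counter returns 0 for keywords that never appear.
    return [counts[k] for k in keywords]


# Hypothetical keywords and document text.
keywords = ["cluster", "linkage", "distance"]
doc = ("The linkage distance between a cluster and another cluster "
       "is the distance in question.")

vec = keyword_counts(doc, keywords)
print(vec)
```

Common words such as "the" and "in" are counted by the tokenizer but never enter the feature vector, since only the listed keywords are looked up.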

Suppose you are given a stack of documents and are told that documents with similar sets of keywords are about the same topic. Your job is to organize the documents as best you can by topic. The following 4 questions refer to this situation. For this situation, it is best to use an unsupervised learning algorithm.

A. True
B. False
