MINING:

Consider a 2-dimensional data set with points at: (1,1), (1,2), (1,100), (2,3), (2,4), (3,1), (3,3). Can you divide this data set into two clusters? State what (objective) criteria you would use for the clustering. Is it reasonable to require that clusters be equal in size?

Consider a mortgage company trying to decide whether to grant someone a loan. They have only 3 pieces of information available to them for this purpose: (1) whether the applicant has a job, (2) whether the applicant is married, (3) whether the applicant has defaulted on a loan before. A decision tree is being built to classify applicants into two categories: loanworthy and not loanworthy. What is the most complex decision tree that could be constructed under the circumstances (how many nodes, leaves, edges, levels)? Draw an example. What is the simplest decision tree possible? Draw an example.

Simulate the running of the A Priori algorithm over the following data set: ABC, ABD, ABCE, ABDFG, ABCG, BCEFG. Use a confidence threshold of 0.9 and a support threshold of 0.25. (The above data means that items A, B, and C were purchased together in the first transaction; A, B, and D in the second; and so on.)

Instead of 6 transactions, suppose I had 6 million transactions, but still only 7 different items. How should I change my confidence and support thresholds? Argue qualitatively.

If I have 6 million transactions and 100,000 different items (instead of 7), then how should I change my confidence and support thresholds?
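
A hand simulation of the A Priori (Apriori) question above can be checked against a short program. The following is a minimal Python sketch, not a reference solution. It assumes that support is the fraction of transactions containing an itemset, that confidence of X -> Y is support(X and Y) / support(X), and that both thresholds are inclusive (>=). The function names apriori and association_rules are illustrative, not taken from the question.

```python
from itertools import combinations

# The six transactions from the question; each letter is one item.
transactions = [set(t) for t in ["ABC", "ABD", "ABCE", "ABDFG", "ABCG", "BCEFG"]]
MIN_SUPPORT = 0.25     # assumed inclusive: support >= 0.25
MIN_CONFIDENCE = 0.9   # assumed inclusive: confidence >= 0.9


def apriori(transactions, min_support):
    """Return {frequent itemset: transaction count}, built level by level."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    level = [frozenset([item]) for item in items]  # candidate 1-itemsets
    k = 1
    while level:
        # Count each candidate and keep those meeting the support threshold.
        counts = {c: sum(1 for t in transactions if c <= t) for c in level}
        current = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(current)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        ]
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for every rule meeting the threshold."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is frequent, so the lookup succeeds.
                confidence = count / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules


frequent = apriori(transactions, MIN_SUPPORT)
found_rules = association_rules(frequent, MIN_CONFIDENCE)

for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), count)
for lhs, rhs, confidence in found_rules:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)), round(confidence, 2))
```

With 6 transactions, a support threshold of 0.25 corresponds to an itemset appearing in at least 2 transactions, which is the cut-off the sketch applies at every level.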