All responses should be of sufficient depth and detail. Answer the questions succinctly and clearly, and explain your answers. Use references, but do not quote anybody else; use your own words. Answers will be evaluated on the following criteria: key content, logical flow, and clarity.
Why is outlier mining important? Briefly describe the different approaches behind statistical-based outlier detection, distance-based outlier detection, density-based local outlier detection, and deviation-based outlier detection.
A group of students is linked in a social network via advisors, courses, research groups, and friendship relationships. Present a clustering method that may partition the students into different groups according to their research interests.
What are the differences between visual data mining and data visualization? Data visualization may suffer from the data abundance problem. For example, it is not easy to visually discover interesting properties of network connections if a social network is huge with complex and dense connections. Propose a visualization method that may help people see through the network topology to the interesting features of a social network.
An e-mail database is a database that stores a large number of electronic mail (e-mail) messages. It can be viewed as a semi-structured database consisting mainly of text data. Discuss the following:
a. What can be mined from such an e-mail database?
b. Suppose you have roughly classified a set of your previous e-mail messages as junk, unimportant, normal, or important. What type of data mining problem is this? Describe how a data mining system may take this set as training data to automatically classify new or previously unclassified e-mail messages.
Suppose that your local bank has a data mining system. The bank has been studying your credit and debit card usage patterns. Noticing that you make many transactions at home renovation stores, the bank decides to contact you, offering information regarding their special loans for home improvements. Discuss how this may conflict with your right to privacy.
The President of the University has approached you, a professor who teaches a data mining class. He has heard about this incredible tool called data mining. He does not know much about the technology but he has decided to mine all of the databases in the university to gain “actionable knowledge” and wants you to be the project chief.
Describe your response to him. Be sure to address the benefits of data mining in the context of a university, including the actionable knowledge that could be gained through this exercise. Outline a plan of action for implementing data mining at the university. Discuss all relevant issues and challenges and suggest how to address them. (Note: Any resemblance to real persons, living or dead, is purely coincidental.)