Mining of Big Datasets (5 cr)
Course unit code: C-10122-DATA--ML--370
General information
- Credits
- 5 cr
- Institution
- University of Tampere
Content
Core content
- The concept and terminology of data mining
- Understanding the principles of processing large, non-structured datasets
- Basic methods and algorithms for the analysis of large datasets
- Common tasks of mining large datasets, such as similarity analysis, link analysis, finding frequent itemsets, and clustering
- Common applications of mining large datasets, such as recommendation systems, web search, and mining of social network graphs
Complementary knowledge
- Mining data streams
- Special challenges of processing large datasets: memory usage and data formats
- Deep learning methods in mining large datasets
Specialist knowledge
- MapReduce algorithm
- Locality-sensitive hashing
- Distance measures
- More advanced algorithms for mining large datasets
Further information
Partial completions of the course must be carried out during the same implementation round.