
Mining of Big Datasets (5cr)

Course unit code: C-10122-DATA--ML--370

General information


Credits: 5 cr
Institution: University of Tampere

Content

Core content
The concept and terminology of data mining
Understanding the principles of processing large, non-structured datasets
Basic methods and algorithms for the analysis of large datasets
Common tasks of mining large datasets, such as similarity analysis, link analysis, finding frequent itemsets, and clustering
Common applications of mining large datasets, such as recommendation systems, web search, and mining of social network graphs

Complementary knowledge
Mining data streams
Special challenges of processing large datasets: memory usage and data formats
Deep learning methods in mining large datasets

Specialist knowledge
MapReduce algorithm
Locality-sensitive hashing
Distance measures
More advanced algorithms for mining large datasets

Further information

Partial completions of the course must be carried out during the same implementation round.
