The course provides advanced skills in data analysis. It offers insight into data mining methodologies and their application to knowledge extraction from data. The student will learn both the theoretical background and the practical issues of data analysis.
This course aims to provide an introductory and unifying view of information extraction and model building from data, as addressed by research fields such as Data Mining, Statistics, Computational Intelligence, Machine Learning, and Pattern Recognition. The course presents an overview of the theoretical background of learning from data, including the most widely used algorithms in the field, as well as practical applications in industrial areas such as transportation and manufacturing.
- Data, information and models: induction, deduction, abduction, transduction and retroduction.
- Statistical inference: Bayesians vs. Frequentists.
- Exploratory Data Analysis.
- Problem taxonomy: Classification, Regression, Clustering, Novelty Detection, Ranking.
- Naive and linear models: Association rules, Naive Bayes, k-NN, Perceptron, LS/RLS, LASSO, L1-L2, k-means.
- From linear to nonlinear models: Neural Networks, Trees and Forests, Kernelization and Support Vector Machines, RKLS, Spectral clustering.
- Model selection and error estimation: out-of-sample techniques (Hold Out, Cross Validation, Bootstrap).
- Advances in model selection and error estimation: Statistical Learning Theory, Union Bound, Vapnik-Chervonenkis, Rademacher Complexities, Algorithmic Stability, PAC-Bayes, Compression Bound, Differential Privacy.
- Applications in industrial areas such as transportation, manufacturing, etc.
- Implementations and computational issues.
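
As an illustration of the out-of-sample techniques listed above, the following is a minimal sketch of k-fold Cross Validation used to estimate the error of a 1-NN classifier. It is not course material: the function names (`k_fold_indices`, `nearest_neighbor_predict`, `cross_val_error`, `make_toy_data`) and the synthetic two-class dataset are assumptions made for this example, and only the Python standard library is used.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Split the indices 0..n-1 into k shuffled, nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_neighbor_predict(train_x, train_y, x):
    """Predict the label of x as the label of its closest training point."""
    best = min(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )
    return train_y[best]


def cross_val_error(xs, ys, k=5, seed=0):
    """Average misclassification rate over k held-out folds."""
    errors = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        # Train on every point that is not in the current fold.
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        wrong = sum(
            nearest_neighbor_predict(train_x, train_y, xs[i]) != ys[i]
            for i in fold
        )
        errors.append(wrong / len(fold))
    return sum(errors) / k


def make_toy_data(n_per_class=30, seed=1):
    """Hypothetical dataset: two Gaussian blobs in the plane, one per class."""
    rng = random.Random(seed)
    xs, ys = [], []
    for label, center in ((0, (0.0, 0.0)), (1, (5.0, 5.0))):
        for _ in range(n_per_class):
            xs.append((center[0] + rng.gauss(0, 1), center[1] + rng.gauss(0, 1)))
            ys.append(label)
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_toy_data()
    print(f"5-fold CV error estimate: {cross_val_error(xs, ys):.3f}")
```

Each point is held out exactly once, so the averaged fold error uses all the data for testing while never evaluating a point with a model that was trained on it. Hold Out corresponds to using a single split instead of k rotating ones.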