
University of Beira Interior
DATA SCIENCE
 What is Data Science? Data infrastructure: challenges due to volume, heterogeneity and inconsistency/incompleteness;
 Data Science Fundamentals: Framing Problems, Data Wrangling, Exploratory Analysis, Feature Extraction and Modelling;
 Data Encoding and File Formats;
 Databases: Relational, Non-Structured Data;
 Data Visualization and Summarization;
 Pie, Bar Charts, Histograms, Boxplots, Scatterplots and Heat maps;
 Dimensionality Reduction
 Axis Rotation (PCA);
 Type Transformation (Wavelets, Spectral Analysis)
 Probability Distributions;
 Anscombe’s Quartet;
 Big Data;
 Hadoop, HDFS, PySpark;
 MapReduce Paradigm;
 Frequent Pattern Mining Model;
 Outlier Analysis;
 Meta-Algorithms;
 Mining Web Data and Social Network Analysis;
 Software Engineering and Computational Performance
 CRAP Design;
 Key Data Structures;
 Amortized and Average Performance;
 C. Aggarwal. Data Mining: The Textbook. Springer, ISBN: 9783319141411, 2015.
 John Kelleher. Data Science. MIT Press Essential Knowledge Series, ISBN: 0262535432, 2018.
 Field Cady. The Data Science Handbook. Wiley, ISBN: 1119092949, 2017.
 Attendance (A): To pass this course, students must attend at least 80% of the theoretical and practical classes;
 Practical Project (P): The practical project accounts for 50% (10/20) of the final mark;
 To pass the course, a minimum mark of 5/20 must be obtained in the practical project part;
 The practical project mark is conditional on an individual presentation and discussion by each student;
 Written Test (F): Monday, June 6th, 2022, 14:00, Room 6.18
 Mark (M): M = (A >= 0.8) * (P * 10/20 + F * 10/20)
 Admission to Exams: Students with M >= 6 are admitted to the final exams;
 The practical project mark is considered in all exam periods;
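The mark formula in the grading criteria can be checked with a short Python sketch. It assumes that A is the attended fraction of classes (between 0 and 1) and that P and F are each marks on a 0-20 scale; the course page does not state these scales explicitly, and the function names are illustrative only.

```python
def final_mark(attendance: float, project: float, written_test: float) -> float:
    """Compute M = (A >= 0.8) * (P * 10/20 + F * 10/20).

    Assumptions (not stated on the course page): attendance is a fraction
    in [0, 1]; project and written_test are marks on a 0-20 scale.
    """
    meets_attendance = attendance >= 0.8  # boolean gate, counts as 0 or 1
    return int(meets_attendance) * (project * 10 / 20 + written_test * 10 / 20)


def admitted_to_exams(mark: float) -> bool:
    """Students with M >= 6 are admitted to the final exams."""
    return mark >= 6


if __name__ == "__main__":
    m = final_mark(attendance=0.9, project=14, written_test=12)
    print(m, admitted_to_exams(m))  # 13.0 True
```

Under these assumptions, a student below 80% attendance gets M = 0 regardless of P and F, because the attendance condition multiplies the whole weighted sum.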
Theoretical slides: [pdf]
Theoretical slides (Clustering): [pdf]
Theoretical slides (Model Interpretability): [pdf]
Theoretical slides (Meta-Learning): [pdf]
Theoretical slides (Semi-Supervised Learning): [pdf]