The aim is for students to develop a critical statistical sense, which is necessary in the business environment. The module covers traditional statistical methods such as regression models, decision trees and dimensionality reduction, as well as more recent Machine Learning and Deep Learning methodologies, including Random Forest, K-Means, Natural Language Processing and Neural Networks.
Data preparation and cleaning (2)
Exploratory data analysis
Pre-processing of data
Noise and outlier detection.
Processing of missing values
Treatment of the class imbalance problem
Structured and unstructured data
Evaluation of the distributions of variables
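As an illustration of two topics in this block (processing of missing values and outlier detection), the following is a minimal sketch using only the Python standard library. The function names `impute_median` and `iqr_outliers`, the sample readings and the 1.5 x IQR threshold are assumptions chosen for the example, not part of the syllabus; median imputation and the interquartile-range rule are only two of several possible techniques.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical readings with two missing entries and one suspicious value.
raw = [12.0, 14.0, None, 13.0, 15.0, 98.0, 14.0, None, 13.0]
clean = impute_median(raw)      # missing entries become the median, 14.0
outliers = iqr_outliers(clean)  # flags 98.0
print(clean, outliers)
```

In practice the order of the two steps matters: imputing before detecting outliers, as done here, lets an extreme value influence the imputed figure less when the median is used than when the mean is used.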
Statistical analysis of data (8)
Review of probability and random variables
Bernoulli, binomial, Poisson, negative binomial, exponential, normal distribution.
Multidimensional random variables. Joint density and mass. Conditional distributions. Covariance and correlation. Expectation of a random vector, variance and covariance matrix. Independence of random variables.
Inferential statistics. Sample and population. Central limit theorem. Point and interval estimation. Maximum likelihood estimators. Bootstrapping. Student's t-distribution and chi-square.
Hypothesis testing. Power, power ratio, significance level and sample size. Interpretation of p-value. Difference between statistically significant and technically significant.
Analysis of variance.
Multiple linear regression and logistic regression
Factor analysis and PCA
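The bootstrapping and interval estimation topics listed in this block can be sketched as follows. This is a minimal percentile-bootstrap example under stated assumptions: the function name `bootstrap_ci`, the number of resamples, the fixed seed and the sample values are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical sample of 12 observations.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4, 5.0, 4.9, 6.2, 3.9]
low, high = bootstrap_ci(sample)
print(f"mean = {statistics.fmean(sample):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The percentile method is the simplest bootstrap interval; it makes no normality assumption, which is why it is often introduced alongside the Student's t-based interval for comparison.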
Machine Learning (6)
Supervised vs. unsupervised learning. Classification rules, variable and model selection.
Deep Learning (3)
Advanced neural networks.
Markov random fields and Kalman-Bucy filters.
Cellular automata and discrete dynamical systems.
Starting from a basic level, the aim is for students to acquire intermediate to advanced knowledge and to be able to follow the rest of the subjects, as well as to develop a certain degree of autonomy for the personal learning phase that precedes the master's degree. The programming languages taught are R and Python, as they are the most popular and in demand in the professional field.
This module includes the data extraction phase, where students acquire the skills to work with traditional databases, which are currently the most common in companies. It also covers the management of true Big Data environments, such as Hadoop or Spark, and data collection techniques for social networks such as Twitter or Facebook, web scraping and image collection.
Finally, visualisation techniques will be addressed using the tools most in demand in today's business environment.
Python for data analysis (5)
Syntax and data structures.
Data storage and manipulation
NumPy, pandas, Matplotlib and Seaborn libraries
Databases (2)
Static structure of databases.
Entity-Relationship model
New types of databases.
NoSQL databases
Document type: ElasticSearch.
Graph type: Neo4J.
Column type: Cassandra.
General visualisation concepts
Storytelling with data
Commercial platforms for visualisation
Data collection techniques (2)
Master Data Management (MDM).
Data mining in environments similar to business (SQL, Hive).
Big Data Techniques (3)
Evolution of computer architecture, computer networks, birth of Big Data.
Parallelisation (MapReduce Paradigm). Hadoop vs Spark.
Big Data Frameworks: Hortonworks, Cloudera, MapR, BDE.
Cloud computing as an enabling technology for Big Data.
Secure access to cloud providers. Notions of network configuration and security.
Amazon Web Services and Big Data tools.
Google's Big Data tools.
The Master aims to provide solid training in technical knowledge, together with a business vision, so that once the Master is completed, students can act as a bridge between the executive and technical levels of a project. To this end, professionals from leading companies and multinationals will teach practical, successful cases that apply the concepts acquired in the first two modules. In addition, we have the collaboration of IESE Business School, the Business School of the University of Navarra.
Project management (5)
Project planning: identification, definition and objectives.
Privacy and transparency. Ethics of artificial intelligence.
Application to the capstone project
Workshops with companies (4)
Presentation of projects and real cases by companies.
The Master's Thesis plays an important role in the programme. It takes a practical approach that provides solutions to real problems and projects proposed by companies with which there are collaboration agreements. It can be co-directed by companies and academics from the University of Navarra, and is an excellent opportunity for students to lead the implementation of projects that have an impact on their professional environment.
Master's Thesis (18)
The Master's Thesis (TFM) will consist of an original work in which the competences acquired during the Master's degree must be put into practice. It can be developed within a business or institution that proposes a project involving data collection, cleaning, preparation, advanced analytics and visualisation of the results.
Ethical aspects of data processing, as well as the economic and social impact of the results, should be highlighted. The student must demonstrate that they know how to plan a project and carry it out in a real working environment, in such a way that they acquire a very practical experience in the field of Data Science and Big Data.
Data Analysis Module (19 ECTS credits)
Programming and Computing Module (14 ECTS credits)
|Preparation and data collection||