DATAI Seminars. Course 2023-2024


NEXT seminar
2 May 2024, 10:00 h.

RNA-to-Image synthesis: generating synthetic digital pathology tiles based on NGS data using deep generative models

Francisco Carrillo-Perez

Stanford University




David Martínez Rubio


Accelerated and Sparse Algorithms for Approximate Personalized PageRank

10 April 2024 / David Martínez Rubio



This talk will go over the basics of the PageRank problem, first studied by the founders of Google, who built their search engine by applying it to the internet graph, with hyperlinks defining the edges. Then, I will explain our new results on the problem for undirected graphs, whose main application is finding local clusters in networks, a task that arises in many branches of science. We now have algorithms that find local clusters fast, in a time that depends not on the whole graph but on the local cluster itself, which is significantly smaller. This is joint work with Elias Wirth and Sebastian Pokutta.
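As a rough illustration of why local clustering can run in time independent of the graph size, here is a minimal sketch of the classic "push" procedure for approximate personalized PageRank on an undirected graph. It is not the accelerated or sparse method presented in the talk; the function name, the parameters `alpha` and `eps`, and the example graph are assumptions chosen for illustration only.

```python
from collections import deque

# Hypothetical example graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
EXAMPLE_GRAPH = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

def approximate_ppr(graph, seed, alpha=0.15, eps=1e-4):
    """Classic push-style approximation of personalized PageRank (lazy walk).

    graph: dict mapping each node to a list of neighbours (undirected, no isolated nodes).
    Returns (p, r): the approximate PageRank mass and the remaining residual mass.
    Only nodes that receive mass are ever stored, so work stays local to the seed.
    """
    p = {}                # approximate PageRank vector (sparse)
    r = {seed: 1.0}       # residual mass still to be distributed
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        deg_u = len(graph[u])
        ru = r.get(u, 0.0)
        if ru < eps * deg_u:
            continue      # residual too small to push (stale queue entry)
        # Push: keep a fraction alpha at u, keep half of the rest as residual,
        # and spread the other half evenly over the neighbours.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1 - alpha) * ru / 2
        share = (1 - alpha) * ru / (2 * deg_u)
        for v in graph[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * len(graph[v]):
                queue.append(v)
        if r[u] >= eps * deg_u:
            queue.append(u)
    return p, r

if __name__ == "__main__":
    p, r = approximate_ppr(EXAMPLE_GRAPH, seed=0)
    # Nodes ranked by PageRank mass per unit of degree, as used in sweep-cut clustering.
    print(sorted(p, key=lambda v: p[v] / len(EXAMPLE_GRAPH[v]), reverse=True))
```

Each push moves at least `alpha * eps` of mass per unit of degree from the residual into `p`, so the total number of pushes is bounded in terms of `alpha` and `eps` rather than the number of nodes in the graph.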




Criminal Analysis and Criminal Forecasts

20 March 2024 / Gaston Pezzuchi



In essence, crime analysis aims to understand the criminal reality in order to provide anticipatory capacity to the public safety and criminal justice systems and to the police and judicial agencies involved in them. We will very briefly explore the state of the art and some advances in spatio-temporal crime forecasting methods. The disciplinary field we will explore includes tools from Data Science and from Geographic Information Science and Systems.



Santiago Mazuelas


Beyond Empirical Risk Minimization

21 February 2024 / Santiago Mazuelas



The empirical risk minimization (ERM) approach for supervised learning chooses prediction rules that fit training samples and are "simple" (generalize). This approach has been the workhorse of machine learning methods and has enabled a myriad of applications. However, ERM methods strongly rely on the specific training samples available and cannot easily address scenarios affected by distribution shifts and corrupted samples. Robust risk minimization (RRM) is an alternative approach that does not aim to fit training examples and instead chooses prediction rules minimizing the maximum expected loss (risk). This talk presents a learning framework based on the generalized maximum entropy principle that leads to minimax risk classifiers (MRCs). The proposed MRCs can efficiently minimize worst-case expected 0-1 loss and provide tight performance guarantees. In particular, MRCs are strongly universally consistent using feature mappings given by characteristic kernels. MRC learning is based on expectation estimates and does not strongly rely on specific training samples. Therefore, the methods presented can provide techniques that are robust to practical situations that defy conventional assumptions, e.g., training samples that follow a different distribution or are corrupted by noise.
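The contrast between the two approaches can be written schematically as follows. The notation is an assumption introduced here for illustration: $\mathcal{H}$ is a family of prediction rules, $\ell$ a loss function, $(x_i, y_i)$ the $n$ training samples, and $\mathcal{U}$ an uncertainty set of distributions which, in the framework described, would be determined from expectation estimates rather than from the individual samples.

```latex
\begin{align*}
\text{ERM:}\quad & \hat{h}_{\mathrm{ERM}} \in \arg\min_{h \in \mathcal{H}} \; \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(h, (x_i, y_i)\bigr) \\
\text{RRM:}\quad & \hat{h}_{\mathrm{RRM}} \in \arg\min_{h} \; \max_{p \in \mathcal{U}} \; \mathbb{E}_{(x,y) \sim p}\, \ell\bigl(h, (x, y)\bigr)
\end{align*}
```

ERM averages the loss over the observed samples, whereas RRM guards against the least favorable distribution in $\mathcal{U}$, which is what gives the worst-case performance guarantee.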



Vinny Dunne


Problems of the digital age: Computer forensics and data recovery

24 January 2024 / Vinny Dunne



In today's talk, I will delve into the realms of digital forensics, shedding light on the crucial aspects of my work. Together, we will explore the challenges and triumphs encountered in the pursuit of recovering invaluable data. It is not just a profession for me; it is a passion that has driven me to establish a pioneering data recovery lab right here in Navarra. I will share insights into real-world cases that highlight the intricacies of digital forensics. These cases will not only captivate your interest but also provide a glimpse into the critical role that data recovery plays in our technologically driven world. 



Fernando Carazo


Unveiling the Black Box: Applying Explainable AI in Precision Medicine

13 December 2023 / Fernando Carazo



In today's AI-driven landscape, understanding the decisions made by artificial intelligence systems is crucial. Explainable AI (XAI) emerges as a pivotal solution, shedding light on the opaque nature of some AI models. The significance of XAI spans various industries, and its application is particularly transformative in the pharmaceutical sector. As we delve into a practical case study, we'll witness how XAI has revolutionized drug research and development, offering efficiency gains, cost reductions, and a profound comprehension of critical decisions in this vital field.




Monitoring research and innovation from heterogeneous sources using knowledge graphs

22 November 2023 / Vanni Zaravella



Knowledge Graphs are machine-readable representations of information via predicative triples, typically defined by an underlying ontology schema. The recent rise of the Open Science paradigm and advances in Natural Language Processing models have led to the creation of Information Extraction pipelines that can generate large-scale scholarly Knowledge Graphs from scientific publications and patents, enabling advanced 'semantic' services such as fine-grained document classification, retrieval, question answering, and innovation tracking. However, tracking the complex research-industry dynamics of a target technological domain also requires incorporating alternative text sources like news and micro-blogging posts, from which conventional NLP methods and models typically struggle to extract information accurately and with high recall. In this talk, we present an enhanced information extraction pipeline tailored to the generation of a knowledge graph comprising open-domain entities from micro-blogging posts. It leverages dependency parsing and classifies entity relations in an unsupervised manner through hierarchical clustering over non-contextual word embeddings. We provide a use case that demonstrates the extraction of semantic triples within the domain of Digital Transformation from X/Twitter.
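To make the notion of a triple concrete, the sketch below stores a handful of (subject, predicate, object) triples and looks them up by subject and predicate. It only illustrates the data structure, not the extraction pipeline presented in the talk; every entity, relation name, and function here is hypothetical.

```python
from collections import defaultdict

# Hypothetical triples of the kind an extraction pipeline might emit.
TRIPLES = [
    ("AcmeCorp", "adopts", "cloud computing"),
    ("AcmeCorp", "partners_with", "ExampleLab"),
    ("ExampleLab", "researches", "digital twins"),
]

def build_index(triples):
    """Index the triples by (subject, predicate) for direct lookup."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index

def objects_of(index, subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return list(index.get((subject, predicate), []))

if __name__ == "__main__":
    index = build_index(TRIPLES)
    print(objects_of(index, "AcmeCorp", "adopts"))
```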




Resource-Constrained Project Scheduling Problem: A bi-objective approach with time-dependent resource costs

25 October 2023 / Laura Anton Sanchez




This talk provides new insights on bi-criteria resource-constrained project scheduling problems. We define a realistic problem where the objectives to combine are the makespan and the total cost of resource usage. Time-dependent costs are assumed for the resources, i.e., they depend on when a resource is used. An optimization model is presented, followed by the development of an algorithm for finding the set of Pareto solutions. The intractability of the optimization models underlying the problem also justifies the development of a metaheuristic for approximating the same front. We design a bi-objective evolutionary algorithm that includes problem-specific knowledge and is based on the Non-dominated Sorting Genetic Algorithm (NSGA-II). The results demonstrate the efficiency of the proposed metaheuristic. In more recent work, six further multi-objective evolutionary algorithms have been implemented to solve this problem, and their performance has been exhaustively compared with that of the NSGA-II-based algorithm. A computational and statistically supported study is conducted, using instances built from those available in the literature and applying a set of performance measures to the solution sets obtained by each methodology.
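The set of Pareto solutions mentioned above can be illustrated with a minimal non-dominated filter over candidate schedules, each scored by (makespan, total resource cost), both to be minimized. This is only the dominance check that underlies non-dominated sorting, not the evolutionary algorithm from the talk, and the candidate values are made up for illustration.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly better
    in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical candidate schedules as (makespan, total resource cost).
CANDIDATES = [(10, 50), (12, 40), (11, 55), (15, 30), (12, 45)]

if __name__ == "__main__":
    print(pareto_front(CANDIDATES))
```

In this example `(11, 55)` is dominated by `(10, 50)` and `(12, 45)` by `(12, 40)`, so three trade-off schedules remain. The pairwise filter is quadratic in the number of candidates, which is acceptable for a sketch but not for large populations.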