These are talks by a DaSCI senior researcher presenting the latest advances in DaSCI's consolidated research lines. Each seminar lasts about 1 hour and 15 minutes (45 min. talk + 30 min. for questions).
DaSCI Lectures 2023
New lectures coming soon
DaSCI Lectures 2022
DSGAME, a board game
Speaker: Isaac Martín de Diego has been a full professor at Universidad Rey Juan Carlos since 2018. He is co-founder and coordinator of the Data Science Lab. He leads the research group on Fundamentals and Applications of Data Science, as well as an associated teaching innovation group. He has more than 25 years of experience in data analysis across different sectors: medicine, economics, telecommunications, livestock, marketing, energy, etc. He is the author of more than 50 JCR-indexed articles and more than 70 conference communications. He has participated in more than 25 R&D projects funded through competitive calls and in more than 25 non-competitive R&D contracts.
Abstract: DSGAME is a board game designed by the Data Science Laboratory (DSLAB) of the URJC, with the collaboration of the Academia Joven de España. The objective of the game is to build a complete Data Science project, from acquiring data at a company interested in a given problem to developing the final product, passing through all the phases related to data cleaning, Machine Learning modelling, evaluation of results, presentation of results… We have designed a version for children, DSGAME-KIDS, and a version for researchers and undergraduate students in Data Science and Engineering and Artificial Intelligence. In this seminar I will show the components of the game and their relationship to the theoretical and practical sides of Data Science.
DaSCI Lectures 2021
Federated Learning for Preserving Data Privacy
Lecturer: Eugenio Martínez Cámara completed his degree and PhD in Computer Science at the University of Jaén (Spain). His research focuses on Natural Language Processing (NLP), with notable work in sentiment analysis and in the use of Deep Learning methods for several NLP tasks. He currently works on the research line of federated learning and the development of data-privacy-preserving AI methods. He worked as a postdoctoral researcher in the UKP research group at the Technische Universität Darmstadt, and he is now a postdoctoral fellow (Juan de la Cierva Incorporación) at the DaSCI Research Institute at the University of Granada (Spain).
Abstract: Artificial intelligence has to face the challenges posed by its own progress, such as the availability of ever greater amounts of data, the diversity of data distributions when data is spread across several sources, and the increasingly evident requirement of preserving data privacy and integrity. These challenges cannot be overcome with the traditional centralised or distributed paradigms of AI, because of the storage and communication costs and the difficulty of preserving data privacy. This talk focuses on Federated Learning, a nascent learning paradigm proposed as a solution to these challenges. It allows learning models to be trained in a federated way among several clients under the orchestration of a central entity or server, and its main feature is that the data are kept in their data silos. Hence, the availability of data is enlarged by creating networks of data centres, and the learning capacity of AI algorithms is furthered.
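The server-orchestrated loop described above can be sketched in a few lines. The snippet below is an illustrative federated averaging (FedAvg-style) loop, not the speaker's implementation; the linear least-squares client model and all function names are assumptions for the example:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """One client's local training: plain gradient descent on a linear
    least-squares model. The raw data (X, y) never leaves the client."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_averaging(global_w, clients, rounds=10):
    """Server orchestration: each round, every client trains locally from
    the current global model, and the server averages the returned
    weights, weighted by each client's local dataset size."""
    for _ in range(rounds):
        sizes = [len(y) for _, y in clients]
        local_ws = [local_update(global_w, X, y) for X, y in clients]
        total = sum(sizes)
        global_w = sum(n / total * w for n, w in zip(sizes, local_ws))
    return global_w
```

Only model weights cross the network; the data silos stay local, which is the privacy-preserving property the talk highlights.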
New pooling layer structures using linear combinations of increasing functions and grouping functions
Lecturer: Humberto Bustince Sola is a full professor of Computer Science and Artificial Intelligence at the Public University of Navarra and an honorary professor at the University of Nottingham since 2017. He is the principal investigator of the Research Group on Artificial Intelligence and Approximate Reasoning, whose research lines are both theoretical (data fusion functions, information and comparison measures, fuzzy sets and their extensions) and applied (deep learning, image processing, classification, machine learning, data mining, big data or the computational brain). He has led 13 research projects funded by national and regional governments, and two excellence networks on soft computing. He has been the main researcher in projects with companies and entities such as Caja de Ahorros de Navarra, INCITA, Gamesa, Tracasa or the Servicio Navarro de Salud, and has taken part in two international projects. He has authored or coauthored more than 300 works, according to Web of Science, including around 160 in Q1 journals. He was a highly cited researcher, among the top 1% most relevant scientists in the world in 2018, according to Clarivate Analytics. He collaborates with leading research groups in countries such as the United Kingdom, Belgium, Australia, the Czech Republic, Slovakia, Canada or Brazil. He is editor in chief of the Mathware & Soft Computing online magazine of the European Society for Fuzzy Logic and Technologies, and of the Axioms journal. He is an associate editor of the IEEE Transactions on Fuzzy Systems journal and a member of the editorial boards of the journals Fuzzy Sets and Systems, Information Fusion, International Journal of Computational Intelligence Systems and Journal of Intelligent & Fuzzy Systems. Moreover, he is a coauthor of a book on averaging functions and has co-edited several books. He has been in charge of organizing several first-level international conferences, such as EUROFUSE 2009 and AGOP 2013.
He is a Senior Member of the IEEE and a Fellow of the International Fuzzy Systems Association (IFSA), and has been a member of the Basque Academy of Sciences, Arts and Literature, Jakiunde, since 2018. He has advised 11 Ph.D. theses.
Abstract: In this talk, we begin by reviewing the main concepts of aggregation functions and how these functions have been applied to artificial intelligence. We also discuss how some applications in artificial intelligence have led to the consideration of specific classes of aggregations as well as wider families of increasing functions. As an example of the applicability of these developments, we consider the case of convolutional neural networks. Traditional convolutional neural networks use the maximum or the arithmetic mean to reduce the features extracted by convolutional layers. In this work we replace this downsampling process, known as the pooling operation, with several alternative functions. We consider linear combinations of order statistics and generalizations of the Sugeno integral, extending the latter's domain to the whole real line and setting the theoretical base for their application. We apply these new functions to three architectures of increasing complexity, showing that the best pooling function is architecture-dependent and should be fine-tuned like other model hyperparameters. However, we also empirically show over multiple datasets that linear combinations outperform traditional pooling functions in most cases, and that combinations with either the Sugeno integral or one of its generalizations usually yield the best results, proving to be a strong candidate for most architectures.
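To make the order-statistics idea concrete, here is a minimal sketch (an illustration, not the authors' code) of pooling a 2-D feature map with a linear combination of the sorted values in each patch; particular weight vectors recover max pooling and average pooling as special cases:

```python
import numpy as np

def sorted_pool(x, weights, window=2):
    """Pool non-overlapping window x window patches of a 2-D feature map
    by a linear combination of the sorted patch values (order statistics).
    weights must have window*window entries: [0, 0, 0, 1] recovers max
    pooling, and uniform weights recover the arithmetic mean."""
    h, w = x.shape
    h2, w2 = h // window, w // window
    patches = (x[:h2 * window, :w2 * window]
               .reshape(h2, window, w2, window)  # split rows/cols in blocks
               .transpose(0, 2, 1, 3)            # group each patch together
               .reshape(h2, w2, window * window))
    return np.sort(patches, axis=-1) @ np.asarray(weights, float)
```

Usage mirrors a pooling layer: `sorted_pool(features, [0, 0, 0.3, 0.7])` mixes the two largest activations per patch, which is the kind of learnable intermediate between max and mean that the talk discusses.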
Bayesian Modelling and Inference with Applications to Image Recovery and Classification
Lecturer: Rafael Molina received the degree in Mathematics (Statistics) and the Ph.D. degree in optimal design in linear models from the University of Granada, Granada, Spain, in 1979 and 1983, respectively. In 2000, he became Professor of Computer Science and Artificial Intelligence at the University of Granada. His research interests focus mainly on using Bayesian modelling and inference in image restoration (with applications to astronomy and medicine), super-resolution of images and video, active learning, supervised and unsupervised learning, and crowdsourcing.
Abstract: A fundamental principle of the Bayesian philosophy is to regard all parameters and unobservable variables of a given problem as unknown stochastic quantities. The inference goal is to calculate or approximate the distribution of all the unknowns given the observations.
Variational Bayesian (VB) inference is a family of deterministic procedures for approximating probability distributions that offers distinct advantages over alternative approaches based on stochastic sampling and over those providing only point estimates. VB inference is flexible enough to be applied to many practical problems, yet broad enough to subsume, as special cases, several alternative inference approaches, including Maximum A Posteriori (MAP) estimation and the Expectation-Maximization (EM) algorithm. VB inference and Expectation Propagation are both variational methods that minimize functionals based on the Kullback-Leibler (KL) divergence. Connections between VB and marginalization-based Loopy Belief Propagation (LBP) can also be easily established.
In this talk, I provide a personal overview of Bayesian modeling and inference methods for image recovery (regression) and classification problems. I will place emphasis on the relationship between the KL divergence and the Evidence Lower Bound (ELBO), the pros and cons of Variational Bayesian (VB) methods, their connections to other inference methods, and the use of local variational bounds. The talk will also include a (brief) description of some VB applications: blind image and color deconvolution, super-resolution, (deep) Gaussian processes, activation uncertainty in neural networks, histological image classification, multiple instance learning, and crowdsourcing in medicine and the LIGO problem.
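The relationship between the KL divergence and the ELBO rests on a standard identity (notation assumed: observations $\mathbf{x}$, latent unknowns $\mathbf{z}$, approximate posterior $q$):

```latex
\log p(\mathbf{x})
  = \underbrace{\mathbb{E}_{q(\mathbf{z})}\!\left[\log \frac{p(\mathbf{x},\mathbf{z})}{q(\mathbf{z})}\right]}_{\text{ELBO}(q)}
  \;+\;
  \underbrace{\mathrm{KL}\!\left(q(\mathbf{z}) \,\|\, p(\mathbf{z}\mid\mathbf{x})\right)}_{\,\ge\, 0}
```

Since the KL term is non-negative, the ELBO lower-bounds the log evidence, and maximising it over $q$ simultaneously tightens that bound and pulls $q$ toward the true posterior $p(\mathbf{z}\mid\mathbf{x})$.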
The Artificial Intelligence of the New Agriculture
Lecturer: Salvador Gutiérrez
Abstract: In agriculture, the aim is to reduce costs and environmental impact, improve sustainability, and increase crop quality and yield. In order to develop useful applications for farmers, crop information is needed that can support better decisions. New advances in non-invasive sensor technologies allow the acquisition of large amounts of data from the field, and, given the revolution brought about by artificial intelligence, combining AI with data from multiple sensors allows the extraction of information useful to the farmer. This talk will present the current advances in AI and sensors in agriculture, mainly in viticulture, and discuss future lines and challenges.
Artificial Intelligence from the point of view of ethics and law
First Lecturer (20 min): Francisco Lara
Title & Abstract: “The Ethics of Artificial Intelligence”. In my talk I will briefly review the main questions being asked by authors working on ethical issues in AI, from the more speculative (about the possibilities and effects of superintelligence or the status of robots) to the concrete and immediate (the bias or lack of transparency of algorithms). I will also talk a bit about EthAI+, the project I am currently coordinating, which aims to theorise about a virtual assistant that would improve the ethical abilities of human beings.
Second Lecturer (20 min): Javier Valls
Title & Abstract: “Legal Implications of Artificial Intelligence”. The impact of technology on society and the legal consequences, both positive and negative, that can be derived from it will be explained. To this end, some frameworks for assessing the human rights impact of artificial intelligence will be analysed. Finally, we will look at what an artificial intelligence risk system is.
Development of Community Finding Algorithms based on Evolutionary Strategies
Lecturer: David Camacho
Abstract: The seminar will provide a brief introduction to community finding algorithms, widely studied in the area of Graph Network Analysis and Degree-Based Computing, and will explore in depth the problem of detecting such groups of nodes, or clusters, when it is modelled as a temporal problem. A brief introduction will be given to traditional detection methods (static in nature) and to more current methods that attempt to address the problem from a dynamic perspective. In particular, the seminar will briefly describe how bio-inspired strategies, namely evolutionary algorithms (single- and multi-objective), are being used to find stable, high-quality communities over time. The talk will also briefly present the design, implementation and empirical analysis of a new multi-objective genetic algorithm that combines an immigrant-based scheme with local search strategies for dynamic community detection.
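For readers unfamiliar with the fitness side of such evolutionary algorithms, the sketch below (an illustration, not the speaker's implementation) computes Newman's modularity Q, the most common single-objective fitness used to score a candidate partition of a graph into communities:

```python
def modularity(adj, communities):
    """Newman modularity Q of a partition. adj is a symmetric 0/1
    adjacency matrix (list of lists); communities maps node index to
    community id. Q compares the fraction of within-community edges
    with the fraction expected under a random degree-preserving graph."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    m2 = sum(deg)                      # 2m: twice the edge count
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                q += adj[i][j] - deg[i] * deg[j] / m2
    return q / m2
```

In a genetic algorithm, each chromosome encodes a `communities` assignment and this Q (possibly combined with a temporal-smoothness objective in the dynamic, multi-objective setting) serves as its fitness.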
Artificial intelligence, refugees and border security. Ethical implications of technological and political worlds
Lecturer: Ana Valdivia
Abstract: Over the last decade, a large number of people have been on the move due to conflict, instability, the consequences of the climate emergency, and other economic reasons. In Europe, the so-called refugee crisis has become a testing ground for the use of artificial intelligence in law enforcement and border security. Interoperable databases, facial recognition and fingerprint registration, iris data collection, lie detectors, and other forms of data-driven risk assessment now all form part of European border policies for refugees.
In this webinar, we will explore which socio-technical systems are applied at European borders today by analysing their technical specifications. After that, we will discuss the ethical impact and the human rights violations this situation is causing. It is now necessary for computer scientists and data engineers to recognise how technology might perpetuate harms, and to collaborate with academics from other disciplines to mitigate discrimination.
EXplainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
Lecturer: Natalia Díaz Rodríguez
Abstract: The overview presented in this article examines the existing literature and contributions already made in the field of XAI, including a prospect toward what is yet to be reached. For this purpose we summarize previous efforts made to define explainability in Machine Learning, establishing a novel definition of explainable Machine Learning that covers such prior conceptual propositions with a major focus on the audience for which explainability is sought. Starting from this definition, we propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at explaining Deep Learning methods, for which a second dedicated taxonomy is built and examined in detail. This critical literature analysis serves as the motivating background for a series of challenges faced by XAI, such as the interesting crossroads of data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core. Our ultimate goal is to provide newcomers to the field of XAI with a thorough taxonomy that can serve as reference material in order to stimulate future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias for its lack of interpretability.
DaSCI Lectures 2020
Autoencoders: an Overview and Applications
Lecturer: David Charte
Abstract: In this talk, we motivate the need for representation learning techniques, especially those based on artificial neural networks. We arrive at a definition of autoencoders, which is then developed further in a step-by-step example. Next, several applications of autoencoders are described and illustrated with case studies as well as uses in the literature. Last, some comments on the current situation and possible future trends are provided.
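A minimal sketch of the idea (illustrative, not taken from the talk): a linear autoencoder, where an encoder matrix compresses each sample into a low-dimensional code and a decoder matrix reconstructs it, trained by gradient descent on the mean squared reconstruction error:

```python
import numpy as np

def train_autoencoder(X, code_dim=2, lr=0.01, epochs=1000, seed=0):
    """Minimal linear autoencoder: W_e encodes each d-dimensional sample
    to code_dim values, W_d decodes it back. Both matrices are trained
    by gradient descent on the mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_e = rng.normal(scale=0.1, size=(d, code_dim))
    W_d = rng.normal(scale=0.1, size=(code_dim, d))
    for _ in range(epochs):
        Z = X @ W_e                      # encode
        E = Z @ W_d - X                  # reconstruction error
        W_d -= lr * (Z.T @ E) / n        # gradient step on decoder
        W_e -= lr * (X.T @ (E @ W_d.T)) / n  # gradient step on encoder
    return W_e, W_d

def reconstruction_error(X, W_e, W_d):
    return float(np.mean((X @ W_e @ W_d - X) ** 2))
```

Replacing the matrix products with nonlinear layers gives the deep autoencoders the talk surveys; the training objective stays the same.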
AI Ethics: encompassing the facets of FATE (fairness, transparency and auditability)
Lecturer: José Daniel Pascual Triana
Abstract: Artificial Intelligence Ethics is the field that strives to apply human ethical and moral principles to the development and operation of machine learning systems. This includes, amongst other topics, bias reduction to enforce parity, transparency, and model auditing.
Due to the sheer amount of data that is currently generated and used, as well as the increased awareness of the population and the evolving legislation to keep up with the times, AI Ethics’ relevance keeps rising as a means to maintain the trust in the analysis and treatment of data. In this seminar, a taxonomy of AI Ethics will be presented, current techniques to promote it will be shown and the usefulness of several tools for data and model treatment will be discussed.
These are short talks by DaSCI PhD students who present recent results on the different DaSCI research lines, with two presentations per day. Each presentation will be approximately 30 minutes long, followed by 15 minutes for questions.
Descriptive analysis of breast cancer using data mining
Lecturer: Manuel Trasierras Fresco
Abstract: This work presents an approach based on emerging pattern mining to analyse cancer through genomic data. Unlike existing approaches, mainly focused on prediction, this proposal aims to improve the understanding of cancer descriptively, requiring neither prior knowledge nor a hypothesis to be validated. Additionally, it enables the consideration of high-order relationships, so both direct and indirect gene relationships related to different functional pathways in the disease can be retrieved. The prime hypothesis is that splitting genomic cancer data into two subsets, that is, cases and controls, will allow us to determine which genes, and which expression levels, are associated with the disease. The possibilities of the proposal are demonstrated by analysing a set of paired breast cancer samples in RNA-Seq format. Some of the extracted insights were already described in the related literature as good cancer biomarkers, while others could describe new functional relationships between genes.
PAF-ND: addressing multi-class imbalance learning with Nested Dichotomies
Lecturer: José Alberto Fernández Sánchez
Abstract: Multi-class classification tasks add several difficulties to the binary classification problem. Among them, the difficulty of obtaining a homogeneous distribution of the classes involved is one of the most recurrent issues in real-world problems, leading to what are known as imbalanced learning scenarios. In this work, we explore a method that improves the predictive ability of models when using a type of decomposition strategy known as Nested Dichotomies, which hierarchically decomposes the classes of the problem and uses a probability-based inference method. The method presented here modifies the probability estimates of the models within the hierarchy, by means of Bézier curves, towards a more equitable classification of the classes.
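As an illustration of how nested dichotomies turn binary probabilities into multi-class ones (the tree and numbers below are hypothetical, and the Bézier-based adjustment from the talk is not shown): each internal node holds the probability of taking its left branch, and a class's probability is the product of the branch probabilities along the path to its leaf.

```python
# Hypothetical nested dichotomy over classes {A, B, C, D}:
# first split {A, B} vs {C, D}, then split each pair.

def nested_dichotomy_probs(node, p=1.0, out=None):
    """Recursively accumulate leaf-class probabilities by multiplying
    branch probabilities down the dichotomy tree."""
    if out is None:
        out = {}
    if isinstance(node, str):            # leaf = class label
        out[node] = p
        return out
    p_left, left, right = node           # (P(left branch), subtree, subtree)
    nested_dichotomy_probs(left, p * p_left, out)
    nested_dichotomy_probs(right, p * (1 - p_left), out)
    return out

tree = (0.7, (0.6, "A", "B"), (0.2, "C", "D"))
probs = nested_dichotomy_probs(tree)
# e.g. P(A) = 0.7 * 0.6 = 0.42, and the four probabilities sum to 1
```

The method in the talk reshapes the per-node probability estimates before this product is taken, so that minority classes are not systematically underweighted.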
Reducing Data Complexity using Autoencoders with Class-informed Loss Functions
Lecturer: David Charte
Abstract: The data we currently use for knowledge extraction can show different kinds of complexity: class overlap, complex boundaries, dimensionality, etc. This work proposes and evaluates three autoencoder-based models which help reduce complexity by learning from class labels. We also check which complexity measures are better predictors of classification performance.
Multi-step Histogram Based Outlier Scores for Unsupervised Anomaly Detection: ArcelorMittal Engineering Dataset Case of Study
Lecturer: Ignacio Aguilera
Abstract: Anomaly detection is the task of detecting samples that behave differently from the rest of the data or that include abnormal values. Unsupervised anomaly detection is the most common scenario, meaning that the algorithms cannot train on labelled input and do not know the anomalous behaviour beforehand. Histogram-based methods are among the most popular and widely used approaches, offering good performance and low runtime. Despite this, histogram-based anomaly detectors are not capable of processing data flows while updating their knowledge, nor of dealing with large numbers of samples. In this work we propose a new histogram-based approach that addresses these problems by introducing the ability to update the information inside a histogram. We have applied these strategies to design a new algorithm called Multi-step Histogram Based Outlier Scores (MHBOS), including five new histogram update mechanisms. MHBOS has been validated using the ODDS Library as a general use case. A real engineering problem provided by the multinational company ArcelorMittal has been used to further validate its behaviour in a real scenario. The results show the validity of MHBOS and of the proposed strategies in terms of performance and computing time.
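As background, the static histogram-based score that MHBOS builds on can be sketched as follows (an HBOS-style illustration, not the MHBOS code itself): one histogram per feature, with each sample scored by how improbable its bins are.

```python
import numpy as np

def hbos_scores(X, bins=10):
    """HBOS-style outlier score: build one histogram per feature and
    score each sample by the summed negative log of the normalised
    heights of the bins it falls into. Higher score = more anomalous.
    MHBOS extends this idea with mechanisms that update the histograms
    as new data arrives, instead of rebuilding them from scratch."""
    X = np.asarray(X, float)
    scores = np.zeros(len(X))
    for j in range(X.shape[1]):
        hist, edges = np.histogram(X[:, j], bins=bins, density=True)
        # map each value to its bin via the interior edges
        idx = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, bins - 1)
        # small epsilon avoids log(0) for empty bins
        scores += -np.log(hist[idx] + 1e-12)
    return scores
```

Because each feature is scored independently, the method stays fast on high-dimensional data, which is the runtime advantage the abstract mentions.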
StyleGAN: Background and evolution
Lecturer: Guillermo Gómez Trenado
Abstract: The work developed by Tero Karras and his team at Nvidia has been the state of the art in GANs for image generation since 2017. In this DaSCI reading we’ll use these results to discuss different aspects of GANs, the iterative process by which the authors detected and corrected the limitations of their work, the technological solutions that enabled such results, and the difficulties we may face in related tasks.
Action Recognition for Anomaly Detection using Transfer Learning and Weak Supervision
Lecturer: Francisco Luque
Abstract: Automatic video surveillance is an emerging research area in which a huge number of publications appear every day, and action anomaly detection in particular is a highly relevant task nowadays. The mainstream deep-model approach to the problem consists of transfer learning from action recognition and weakly supervised fine-tuning for anomaly detection. The objective of the current study is to identify the key aspects of this approach and assess the importance of each decision in the training process. To this end, we propose a specific pipeline in which a model is defined by three key aspects: the action recognition model, the pretraining dataset and the weakly supervised fine-tuning policy. Furthermore, we perform extensive experiments to validate the impact of each of these aspects on the final solution.
Fuzzy Monitoring of In-bed Postural Changes for the Prevention of Pressure Ulcers using Inertial Sensors Attached to Clothing
Lecturer: Edna Rocío Bernal Monroy
Abstract: Postural changes while maintaining a correct body position are the most efficient method of preventing pressure ulcers. However, executing a protocol of postural changes over a long period of time is an arduous task for caregivers. To address this problem, we propose a fuzzy monitoring system for postural changes which recognizes in-bed postures by means of micro inertial sensors attached to patients’ clothes. First, we integrate a data-driven model to classify in-bed postures from the micro inertial sensors located in the socks and t-shirt of the patient. Second, a knowledge-based fuzzy model computes the priority of postural changes for body zones based on expert-defined protocols. Results show encouraging performance in the classification of in-bed postures and high adaptability of the knowledge-based fuzzy approach.
COVID-19 study based on chest X-rays of patients
Lecturer: Anabel Gómez
Abstract: COVID-19 is becoming one of the most infectious diseases of the 21st century. Due to the importance of its early detection, new ways to detect it are emerging. In this study, we focus on its detection using chest X-rays, pointing out the main problems of the data sets most commonly used for this purpose. We propose a new data set and a new methodology that allow us to detect cases of COVID-19 with an accuracy of 76.18%, which is higher than the accuracies obtained by experts.
Image inpainting using non-adversarial networks. Towards a deeper semantic understanding of images
Lecturer: Guillermo Gómez
Abstract: In this study we explore the problem of image inpainting from a non-adversarial perspective. Can we use general generative models to solve problems other than those they were trained for? Do models acquire a deeper, transferable knowledge about the nature of the images they generate? We propose a novel methodology for the image inpainting problem using the implicit knowledge acquired by non-adversarial generative models.
Sentiment Analysis based Multi-person Multi-criteria Decision Making (SA-MpMcDM) Methodology
Lecturer: Cristina Zuheros
Abstract: Traditional decision making models are limited by pre-defined numerical and linguistic terms. We present the SA-MpMcDM methodology, which allows experts to evaluate through unrestricted natural language and even through numerical ratings. We propose a deep learning model to extract the expert knowledge from the evaluations. We evaluate the methodology in a real case study, which we collect into the TripR-2020 dataset.
MonuMAI: Architectural information extraction of monuments through Deep Learning techniques
Lecturer: Alberto Castillo
Abstract: An important part of art history can be discovered through the visual information in monument facades. However, the analysis of this visual information, i.e., morphology and architectural elements, requires a high degree of expert knowledge. An automatic system for identifying the architectural style or detecting the architectural elements of a monument from a single image would certainly help improve our knowledge of art and history.
The aim of this seminar is to introduce the MonuMAI (Monument with Mathematics and Artificial Intelligence) framework published in the work referenced below. In particular, we designed the MonuMAI dataset considering the proposed architectural styles taxonomy, developed the MonuMAI deep learning pipeline, and built the citizen-science-based MonuMAI mobile app that uses the proposed deep learning pipeline and dataset to perform in real-life conditions.
Lamas, A., Tabik, S., Cruz, P., Montes, R., Martínez-Sevilla, Á., Cruz, T., & Herrera, F. (2020). MonuMAI: Dataset, deep learning pipeline and citizen science based app for monumental heritage taxonomy and classification. Neurocomputing. doi.org/10.1016/j.neucom.2020.09.041