Data Processing

In recent years the volume of data has grown immensely, giving rise to Big Data, which requires large computing infrastructure with high-performance processing capabilities. Preparing large datasets for analysis and knowledge extraction is difficult: raw data must be preprocessed to improve their quality. Data representation and quality are among the most important facets of the data science process. Data preprocessing is a preliminary step in data science in which raw data are transformed into a format suitable for analysis and for modeling algorithms. It improves data quality by cleaning, normalizing, transforming, and reducing the raw data, and by extracting relevant features from them. Preprocessing significantly improves the performance of machine learning algorithms, which in turn yields more accurate models. Discovering knowledge from noisy, irrelevant, and redundant data is difficult, so accurately identifying outliers, imputing missing values, and reducing data volume while retaining the useful information pose challenging problems in data science. Current challenges in data preprocessing center on automating these techniques and making accurate decisions when they are used in combination; adjusting them to handle complex data structures; and adapting them to increase the reliability, fairness, and transparency of the models subsequently obtained by data science algorithms.
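As a rough illustration of three of the operations mentioned above (imputing missing values, identifying outliers, and normalizing), the following minimal sketch uses only the Python standard library. The function names, the choice of median imputation, the 1.5 × IQR outlier rule, and min-max scaling are assumptions made for this example; they are common textbook choices, not the specific methods used by this research line.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outlier_mask(values, k=1.5):
    """Return True for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles (Python 3.8+)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw measurements with two missing values and one extreme reading.
raw = [12.0, None, 11.5, 13.0, 250.0, 12.5, None, 11.0]

filled = impute_median(raw)
mask = iqr_outlier_mask(filled)
cleaned = [v for v, is_outlier in zip(filled, mask) if not is_outlier]
scaled = min_max_scale(cleaned)

print(filled)
print(mask)
print(scaled)
```

The order of the steps matters in this sketch: imputing before outlier detection means the extreme reading still influences the fill value if a mean were used, which is one reason the median was chosen here. In practice the right combination depends on the data and the downstream model.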

Responsible: Salvador García López

Related researchers:

Name | Email | Area | Category
Val Muñoz, Coral del | delval@decsai.utUKy3gvxy@Ggr.es | DaSCI Technology Applications Area | PhD
Charte Ojeda, Francisco | fcharteUa_9qB1mY@g9@ujaen.es | Data Science and Big Data Area | PhD
Martínez del Río, Francisco | fmartin@ujae@zA7UkZLn.es | Data Science and Big Data Area | PhD
Górriz Sáez, Juan Manuel | gorriz@ugr.CeA_bn.lUifTes | DaSCI Technology Applications Area | PhD
Herrera Triguero, Francisco | herrera@decsaiG7jMgTw9v.ugr.es | DaSCI Technology Applications Area, Data Science and Big Data Area, Computational Intelligence Area | PhD
Benítez Sánchez, José Manuel | J.M.Benitez@dFeDB@DIGl.oecsai.ugr.es | Data Science and Big Data Area, Computational Intelligence Area | PhD
Cano de Amo, José Ramón | jrcano@ujaeF2DcCZoYAn.es | Data Science and Big Data Area | PhD
Luengo Martín, Julián | julianlm@d3Kbuzhyecsai.ugr.es | Data Science and Big Data Area | PhD
Romero Zaliz, Rocío | rocio@bVk7iziFugr.es | DaSCI Technology Applications Area | PhD
García López, Salvador | salvagl@decmV8SbGsai.ugr.es | Data Science and Big Data Area | PhD