Workshop: Data Preprocessing in Data Mining

Organizers: Julián Luengo, Salvador García and Francisco Herrera.

This workshop is proposed for the IEEE International Conference on Data Mining, which will be held in Barcelona, Spain, on December 13-15, 2016.

Data preprocessing for Data Mining (DM) addresses one of the most important issues within the well-known Knowledge Discovery from Data process. Data will likely contain inconsistencies, errors, out-of-range values, impossible data combinations or missing values; most significantly, the data may simply not be suitable for starting a DM process. In addition, the growing amount of data in current business applications, science, industry and academia demands more complex mechanisms to analyze it. Data preprocessing makes the impractical feasible by adapting the data to meet the input requirements of each DM algorithm.

Data preprocessing includes data preparation, which comprises integration, cleaning, normalization and transformation of data, and data reduction tasks, which aim at reducing the complexity of the data by detecting or removing irrelevant and noisy elements through feature selection, instance selection or discretization processes. The expected outcome of a reliable chain of data preprocessing steps is a final data set that can be considered correct and useful for further DM algorithms.
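
As a rough illustration of how such steps can be chained, the following minimal sketch applies missing value imputation, normalization and discretization to a single numeric attribute. The function names, the toy data and the choice of mean imputation, min-max scaling and equal-width binning are assumptions made for illustration only; they are not methods prescribed by the workshop.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Map values in [0, 1] to integer bin indices 0 .. n_bins - 1."""
    # The maximum value 1.0 is clamped into the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Hypothetical toy attribute with one missing value.
raw = [2.0, None, 4.0, 10.0, 6.0]

clean = impute_mean(raw)            # data cleaning
scaled = min_max_normalize(clean)   # normalization
bins = equal_width_bins(scaled, 4)  # discretization (data reduction)

print(clean)
print(scaled)
print(bins)
```

Each function takes and returns a plain list, so the steps can be reordered or replaced independently, which mirrors the idea of connecting preprocessing processes into one chain.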

Objectives and Scope

This workshop aims at gathering researchers with an interest in the research area described above. Specifically, we are interested in contributions towards the development of novel preprocessing techniques for DM problems, as well as approaches for emerging areas in DM such as Big Data.

Contributions to this workshop are expected to pay special attention to the rigorous motivation of the approaches put forward and to support all aspects of the models developed with a corresponding theoretically sound framework. Approaches lacking such scientific rigor are discouraged.

An indicative, but not exhaustive, list of topics covered in this workshop includes:

Data preprocessing for classical DM problems: classification, regression, association rules, time series, etc.
Feature and instance selection
Noise filtering and correction
Missing values treatment
Data transformation
Discretization
Instance generation
Imbalanced data treatment: oversampling and undersampling
Data preprocessing for multilabel, multi-instance and ordinal classification
Data streams preprocessing
Data preprocessing for subgroup discovery
Big Data preprocessing
Data preprocessing for Deep Learning

Organizers and Contact

Julián Luengo. Contact information:
Email address: julianlm@decsai.ugr.es
Postal address: Department of Computer Science and Artificial Intelligence, University of Granada, E-18071 Granada, Spain
Telephone number: +34-958-244258
Fax Number: +34 958 243317
Salvador García. Contact information:
Email address: salvagl@decsai.ugr.es
Postal address: Department of Computer Science and Artificial Intelligence, University of Granada, E-18071 Granada, Spain
Telephone number: +34-958-244258
Fax Number: +34 958 243317
Francisco Herrera. Contact information:
Email address: herrera@decsai.ugr.es
Postal address: Department of Computer Science and Artificial Intelligence, University of Granada, E-18071 Granada, Spain
Telephone number: +34-958-244258
Fax Number: +34 958 243317
