Evolutionary and Fuzzy Data Mining and Intelligent Systems

A Laboratory of the Research Group
"Soft Computing and Intelligent Information Systems"


Research Summary

EFDAMIS is a laboratory focused on the development of intelligent systems and data mining algorithms based on the use of fuzzy systems and evolutionary algorithms.

Fuzzy systems are a kind of intelligent system that, unlike classical hard computing techniques, tolerate imprecision, uncertainty, partial truth, and approximation, and exploit this tolerance to achieve tractability, robustness, and low solution cost on real-world problems. They have been applied successfully to many kinds of problems across a range of application domains.

Interpretability is crucial in Data Mining, where knowledge should be extracted from databases and represented in a comprehensible form, and in decision support systems, where the reasoning process should be transparent to the user. The use of linguistic variables and linguistic terms in a knowledge discovery process makes the resulting rules easier to interpret and avoids unnatural boundaries in the partitioning of the attribute domains.

Evolutionary algorithms, particularly Genetic Algorithms, have proven to be an important tool for learning and knowledge extraction. They have been used in combination with many knowledge representation models and learning tasks, such as neural networks, fuzzy rules, interval rules, prototype-based approaches, feature selection and instance selection, and the extraction of association rules.

Although evolutionary algorithms were designed as global search algorithms rather than specifically for learning, they offer a set of advantages for machine learning. Many machine learning methodologies are based on the search for a good model inside a space of models, such as the space of rule sets, the space of weights and topologies of neural networks, or the space of groups of prototypes, to mention just a few. These methodologies can therefore model the learning problem as a search problem or as an underlying optimization problem. Evolutionary algorithms, and particularly genetic algorithms, which have been used in most evolutionary learning models, search the space of models by encoding each model in a chromosome. In this sense they are very flexible, because the same evolutionary algorithm can be used with different representations.
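
As a rough illustration of this flexibility, the sketch below runs one generic genetic algorithm loop over binary chromosomes, where only the fitness function knows how a chromosome is decoded. The names (evolve, onemax, mask_fitness), the operators (binary tournament, one-point crossover, bit-flip mutation, elitism) and all parameter values are assumptions chosen for illustration, not the laboratory's actual algorithms.

```python
import random


def evolve(fitness, n_genes, pop_size=20, generations=40, p_mut=0.05, seed=0):
    """Generational GA over binary chromosomes (illustrative operators only)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        nxt = [pop[0][:]]  # elitism: copy the best chromosome unchanged
        while len(nxt) < pop_size:
            # binary tournament selection of two parents
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n_genes)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # bit-flip mutation, applied gene by gene
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness), history


def onemax(chrom):
    """Representation 1: the chromosome is a plain bit string."""
    return sum(chrom)


def mask_fitness(chrom):
    """Representation 2 (hypothetical): genes are keep/drop flags for features.

    Features 0 and 3 are assumed useful; every other kept feature is penalised.
    """
    useful = {0, 3}
    kept = {i for i, g in enumerate(chrom) if g}
    return 2 * len(kept & useful) - len(kept - useful)


best_bits, hist_bits = evolve(onemax, n_genes=16)
best_mask, hist_mask = evolve(mask_fitness, n_genes=8)
```

The same evolve function serves both problems. Swapping the fitness function is the only change needed to move from one representation to the other.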

In this laboratory we work on evolutionary and fuzzy models for Data Mining and Intelligent Systems. We develop new proposals for knowledge extraction using evolutionary algorithms and fuzzy logic, and we study intelligent systems based on fuzzy models and genetic algorithms with a view to applying them in robotics and other engineering applications.

The most important areas of development are:

Fuzzy Rule Based Systems. In a broad sense, a Fuzzy Rule Based System (FRBS) is a rule-based system where Fuzzy Logic is used as a tool for representing different forms of knowledge about the problem being solved, as well as for modeling the interactions and relationships existing among its variables.
Knowledge representation is enhanced by the use of linguistic variables and their linguistic values, which are defined by context-dependent fuzzy sets whose meanings are specified by gradual membership functions. Inference methods such as generalized modus ponens and generalized modus tollens form the basis of approximate reasoning, using similarity scores obtained by pattern matching.
We are interested in the analysis of FRBS components, in FRBS design, and in the application of FRBSs to different science and engineering problems.
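
A minimal sketch of these ideas, under stated assumptions: one linguistic variable (temperature) with three triangular membership functions, and three rules with singleton consequents combined by a weighted average. The variable, its partition, the rule outputs and the inference scheme are all hypothetical choices made to keep the example short. A full FRBS would normally offer several inference and defuzzification options.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set (feet a and c, peak b)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical linguistic variable "temperature" on the domain [0, 40].
# The outer feet lie outside the domain so that 0 and 40 reach degree 1.
TEMPERATURE = {
    "low": (-20.0, 0.0, 20.0),
    "medium": (0.0, 20.0, 40.0),
    "high": (20.0, 40.0, 60.0),
}

# Hypothetical rules: IF temperature is <term> THEN heater power is <value>.
RULES = {"low": 1.0, "medium": 0.5, "high": 0.0}


def infer(temp):
    """Weighted average of rule outputs, weighted by matching degree."""
    degrees = {term: triangular(temp, *abc) for term, abc in TEMPERATURE.items()}
    total = sum(degrees.values())
    if total == 0.0:
        raise ValueError("input lies outside the universe of discourse")
    return sum(degrees[term] * RULES[term] for term in RULES) / total
```

For example, a temperature of 10 belongs to "low" and "medium" with degree 0.5 each, so both rules fire partially and the output lies between their consequents.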

Genetic Fuzzy Systems (GFS). A GFS is a Fuzzy System augmented with an evolutionary learning process. The most widespread type of GFS is the Genetic Fuzzy Rule-Based System (GFRBS), in which a genetic algorithm is used to learn or tune different components of an FRBS. We are working on the development of new GFS models that focus on the trade-off between interpretability and precision. Multi-objective genetic algorithms are a useful tool for handling this trade-off within a genetic learning process.
We are interested in the development of GFS models and their application.
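
To make the interpretability/precision trade-off concrete, the sketch below keeps only the non-dominated candidates among a few hypothetical fuzzy models, each summarised as a pair (error, number of rules) with both objectives minimised. The candidate values are invented for illustration, and using the rule count as the interpretability measure is an assumption. A multi-objective genetic algorithm would apply this kind of dominance test inside its selection step.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates (all objectives minimised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (error, number of rules).
candidates = [(0.10, 30), (0.12, 12), (0.20, 5), (0.15, 20), (0.25, 9)]
front = pareto_front(candidates)
```

Here (0.15, 20) is dominated by (0.12, 12) and (0.25, 9) by (0.20, 5), so the front holds the three remaining models, from most precise to most compact.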

Preprocessing in Data Mining. Data preprocessing consists of manipulating and transforming raw data so that the information content of the data set is exposed or made more easily accessible. Data preparation comprises the techniques concerned with analyzing raw data so as to yield quality data, mainly data collection, data integration, data transformation, data cleaning, data reduction, and data discretization.
We are developing new approaches that use scalable evolutionary algorithms for data reduction (especially instance and feature selection) in order to obtain more interpretable models in the subsequent predictive process.
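
One common way to cast instance selection as an evolutionary search is to encode a subset of the training set as a binary mask and score it by how well it classifies the data and how much it reduces them. The sketch below shows such a fitness function with a 1-nearest-neighbour classifier. The function names, the weight alpha and the toy data are assumptions for illustration and are not taken from the laboratory's methods.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def selection_fitness(mask, X, y, alpha=0.5):
    """Score a binary mask over the training set (higher is better).

    accuracy:  fraction of ALL training instances classified correctly by
               1-NN when only the selected instances act as neighbours
               (a selected instance is its own nearest neighbour here).
    reduction: fraction of training instances that were discarded.
    """
    selected = [i for i, gene in enumerate(mask) if gene]
    if not selected:
        return 0.0  # an empty subset cannot classify anything
    hits = 0
    for xi, yi in zip(X, y):
        nearest = min(selected, key=lambda j: sq_dist(xi, X[j]))
        hits += int(y[nearest] == yi)
    accuracy = hits / len(X)
    reduction = 1.0 - len(selected) / len(X)
    return alpha * accuracy + (1.0 - alpha) * reduction


# Toy one-dimensional data set with two well-separated classes.
X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
y = [0, 0, 0, 1, 1, 1]
```

With these data, keeping one instance per class classifies every point correctly while discarding four of the six instances, so it scores higher than keeping the whole set.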

Subgroup Discovery (SD). The task is defined as follows: "given a population of individuals and a property of those individuals, we are interested in finding a population of subgroups as large as possible and have the most unusual statistical characteristic with respect to the property of interest". SD lies at the intersection of predictive and descriptive induction. In the SD task, the rules (or subgroups) are discovered using heuristics that try to find the best subgroups in terms of rule coverage and distributional unusualness.
We are working on the use of fuzzy rules to model the subgroups and on the use of genetic algorithms as a tool for searching for rules or subgroups.
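
Coverage and unusualness can be made concrete with two widely used rule quality measures. The sketch below computes the coverage of a rule and its weighted relative accuracy, WRAcc = p(Cond) * (p(Class | Cond) - p(Class)), which is one common choice of unusualness measure. The function names and the toy data are assumptions, and other quality measures are used in practice as well.

```python
def coverage(covered):
    """Fraction of examples that satisfy the rule's condition."""
    return sum(covered) / len(covered)


def wracc(covered, positive):
    """Weighted relative accuracy of the rule Cond -> Class.

    covered[i]  is truthy if example i satisfies the condition,
    positive[i] is truthy if example i has the property of interest.
    """
    n = len(covered)
    n_cond = sum(covered)
    if n_cond == 0:
        return 0.0  # a rule that covers nothing is not unusual
    p_class = sum(positive) / n
    p_class_given_cond = sum(1 for c, p in zip(covered, positive) if c and p) / n_cond
    return (n_cond / n) * (p_class_given_cond - p_class)


# Toy data: 10 examples, 4 covered by the rule, 5 positive overall,
# 3 of the covered examples positive.
covered = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
positive = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
```

In this toy case the subgroup covers 40% of the population and its positive rate (0.75) exceeds the overall rate (0.5), giving WRAcc = 0.4 * 0.25 = 0.1.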

Learning with imbalanced data sets. This is one of the recent challenges in machine learning. The problem appears when the data present a class imbalance, that is, when they contain many more examples of one class than of the other. Various solutions have been proposed for this problem, such as modifying the learning methods or applying a preprocessing stage. Many applications involve learning in imbalanced domains, such as fraud detection, intrusion detection, and biological and medical identification.
We are interested in the use of genetic algorithms as an instance selection approach that balances the classes so that good models can be learned. We are also interested in the use of Genetic Fuzzy Systems to obtain fuzzy models for this kind of problem.
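
As a point of reference for the preprocessing route, the sketch below measures the imbalance ratio of a labelled data set and balances it by random undersampling. Random undersampling is only a simple baseline swapped in for illustration. It is not the genetic instance selection described above, and the function names are assumptions.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, as large as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    n_min = min(counts.values())
    keep = []
    for label in counts:
        indices = [i for i, lab in enumerate(y) if lab == label]
        keep.extend(rng.sample(indices, n_min))
    keep.sort()  # preserve the original order of the retained examples
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data: 90 examples of class 0 and 10 of class 1.
X = [[float(i)] for i in range(100)]
y = [0] * 90 + [1] * 10
X_bal, y_bal = random_undersample(X, y)
```

An evolutionary approach would replace the random choice with a search over subsets guided by a fitness function, at a higher computational cost.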

Extracting association rules. Linguistic variables with linguistic terms can contribute substantially to the design of association rules and to subgroup discovery in particular, and more generally to the analysis of data in order to establish relationships and identify patterns. Genetic algorithms, for their part, are widely used in Data Mining for evolving rule extraction and pattern association. Their combination in the GFS field provides novel tools for pattern analysis and for extracting new kinds of useful information, with one main advantage over other techniques: interpretability in terms of fuzzy if-then rules.
We are interested in the development of GFS models for learning association rules.
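
A short sketch of how a fuzzy association rule can be scored, assuming the membership degrees of each record in the antecedent and consequent fuzzy sets are already known. Support is taken as the average of the minimum membership over the records, and confidence as the ratio of joint support to antecedent support. Using the minimum as the conjunction operator is an assumption (other t-norms are possible), and the function names and data are hypothetical.

```python
def fuzzy_support(*memberships):
    """Average over records of the minimum membership across the given fuzzy items."""
    n_records = len(memberships[0])
    return sum(min(degrees) for degrees in zip(*memberships)) / n_records


def fuzzy_confidence(antecedent, consequent):
    """Support of (antecedent AND consequent) divided by support of the antecedent."""
    supp_a = fuzzy_support(antecedent)
    if supp_a == 0.0:
        return 0.0  # the rule never fires, so confidence is reported as 0
    return fuzzy_support(antecedent, consequent) / supp_a


# Hypothetical membership degrees of four records in two fuzzy items,
# for example "age is young" and "income is low".
age_young = [1.0, 0.5, 0.0, 0.5]
income_low = [0.5, 0.5, 1.0, 0.0]
```

For these four records the rule "IF age is young THEN income is low" has support 0.25 and confidence 0.5.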

KEEL software tool. This platform allows us to analyze the behaviour of evolutionary learning on different kinds of problems: regression, classification, unsupervised learning, etc. It includes evolutionary learning algorithms based on the different approaches (Pittsburgh, Michigan, and iterative rule learning, IRL) and integrates them with different preprocessing techniques, making it possible to carry out a complete analysis of any learning model in comparison with existing ones.

Engineering applications. Knowledge-based systems can be an efficient approach to system management in engineering and industrial applications, providing automatic control strategies with Artificial Intelligence capabilities. With Artificial Intelligence, the system can assess, diagnose, and suggest the best operation mode. One important Artificial Intelligence tool for automatic control is the Fuzzy Logic Controller (FLC). FLCs are Fuzzy Rule-Based Systems that capture expert knowledge in the form of linguistic rules. These rules are usually written by an expert in the field of interest who can link facts or evidence with conclusions. When a real-world situation is presented to the computer, it can use these rules to draw conclusions much as an expert would. However, this approach sometimes fails to achieve optimal behavior. Evolutionary algorithms, particularly Genetic Algorithms, can be applied to improve the controller's behavior based on available data or system simulations.
In this framework, we have used genetic algorithms to develop smartly tuned FLCs for the control of Heating, Ventilating and Air Conditioning (HVAC) systems with respect to energy performance and indoor comfort requirements. This problem is of special interest because of the complexity of the system (many variables and rules), which makes it a good test bed for checking the performance of our new developments in Genetic Fuzzy Systems.

Intelligent Robotics. Autonomous mobile robots are robots that can move and perform tasks in real environments without human supervision. The environments in which an autonomous robot moves are unconstrained and highly uncertain. Furthermore, the information provided by robot sensors is noisy and unreliable. Fuzzy logic has proven to be a useful tool for dealing with this uncertainty and has been widely used for the design of behaviours in robotics.
We are especially interested in applying machine learning techniques to give mobile robots the capability to autonomously design strategies that improve their interaction with the environment. Specifically, we use different metaheuristics (such as genetic algorithms or ant colony optimization) to design fuzzy rule-based controllers by either supervised or reinforcement learning.

Webmaster: Julián Luengo
© 2007 EFDaMIS - Evolutionary and Fuzzy Data Mining and Intelligent Systems