Data Mining and Soft Computing

Post-Graduate Program Course: Data Mining and Soft Computing. Dottorato di Ricerca in Ingegneria dell'Informazione.

Francisco Herrera (Dpto. de Ciencias de la Computación e I.A.)

Summary of Sessions

Additional Slides:

  • Session 0: Presentation.
  • Session 1: Introduction to Data Mining and Knowledge Discovery.
  • Session 2: Data Preparation.
  • Session 3: Introduction to Prediction, Classification, Clustering and Association.
  • Session 4: Data Mining - From the Top 10 Algorithms to the New Challenges (Parts I and II).
  • Session 5: Introduction to Soft Computing, with a Focus on Fuzzy Logic and Evolutionary Computation.
  • Session 6: Soft Computing Techniques in Data Mining: Fuzzy Data Mining and Knowledge Extraction based on Evolutionary Learning.
  • Session 7: Genetic Fuzzy Systems: State of the Art and New Trends (Parts I and II).
    • J. Alcalá-Fdez et al. (March 2008). KEEL: A Software Tool to Assess Evolutionary Algorithms for Data Mining Problems
    • H. Ishibuchi (July 2007). Multiobjective Genetic Fuzzy Systems: Review and Future Research Directions
  • Session 8: Some Advanced Topics I: Classification with Imbalanced Data Sets.
  • Session 9: Some Advanced Topics II: Subgroup Discovery.
  • Session 10: Some Advanced Topics III: Data Complexity.
  • Session 11: Final Talk: How Should I Do My Experimental Study? Design of Experiments in Data Mining/Computational Intelligence. Using Non-parametric Tests. Some Case Studies.

Bibliography

J. Han, M. Kamber. Data Mining. Concepts and Techniques. Morgan Kaufmann, 2006 (Second Edition) http://www.cs.sfu.ca/~han/dmbook
I.H. Witten, E. Frank. Data Mining: Practical Machine Learning Tools and Techniques, Second Edition, Morgan Kaufmann, 2005. http://www.cs.waikato.ac.nz/~ml/weka/book.html
Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Introduction to Data Mining (First Edition). Addison Wesley, May 2, 2005. http://www-users.cs.umn.edu/~kumar/dmbook/index.php
Margaret H. Dunham. Data Mining: Introductory and Advanced Topics. Prentice Hall, 2003. http://lyle.smu.edu/~mhd/book
Dorian Pyle. Data Preparation for Data Mining. Morgan Kaufmann, Mar. 15, 1999.
Mamdouh Refaat. Data Preparation for Data Mining Using SAS. Morgan Kaufmann, Sep. 29, 2006.
O. Cordón, F. Herrera, F. Hoffmann, L. Magdalena. Genetic Fuzzy Systems. Evolutionary Tuning and Learning of Fuzzy Knowledge Bases, Vol. 19 of Advances in Fuzzy Systems - Applications and Theory. World Scientific, 2001 http://sci2s.ugr.es/publications/geneticFuzzySystems.php
H. Ishibuchi, T. Nakashima, M. Nii. Classification and Modeling with Linguistic Information Granules: Advanced Approaches to Linguistic Data Mining. Springer, 2004.
M. Basu and T.K. Ho (Eds.) Data Complexity in Pattern Recognition, Springer, 2006
J.H. Zar. Biostatistical Analysis, Prentice Hall, 1999. 
D. Sheskin. Handbook of parametric and nonparametric statistical procedures. Chapman & Hall/CRC, 2003.

 

  • Qiang Yang and Xindong Wu (Contributors: Pedro Domingos, Charles Elkan, Johannes Gehrke, Jiawei Han, David Heckerman, Daniel Keim, Jiming Liu, David Madigan, Gregory Piatetsky-Shapiro, Vijay V. Raghavan, Rajeev Rastogi, Salvatore J. Stolfo, Alexander Tuzhilin, and Benjamin W. Wah), 10 Challenging Problems in Data Mining Research, International Journal of Information Technology & Decision Making, Vol. 5, No. 4, 2006, 597-604
  • Xindong Wu, Vipin Kumar, J. Ross Quinlan, Joydeep Ghosh, Qiang Yang, Hiroshi Motoda, Geoffrey J. McLachlan, Angus Ng, Bing Liu, Philip S. Yu, Zhi-Hua Zhou, Michael Steinbach, David J. Hand, Dan Steinberg. Top 10 algorithms in data mining. Knowledge and Information Systems (2008) 14:1–37
  • Hans-Peter Kriegel, Karsten M. Borgwardt, Peer Kröger, Alexey Pryakhin, Matthias Schubert, Arthur Zimek. Future trends in data mining. Data Mining and Knowledge Discovery (2007) 15:87–97
  • Gregory Piatetsky-Shapiro. Data mining and knowledge discovery 1996 to 2005: overcoming the hype and moving from “university” to “business” and “analytics”. Data Mining and Knowledge Discovery (2007) 15:99–105
  • Piero P. Bonissone. Soft computing: the convergence of emerging reasoning technologies. Soft Computing (1997) 1:6-18
  • José L. Verdegay, Ronald R. Yager, Piero P. Bonissone. On heuristics as a fundamental constituent of soft computing. Fuzzy Sets and Systems (2008) 159:7 846-855
  • G.E.A.P.A. Batista, R.C. Prati, M.C. Monard. A study of the behavior of several methods for balancing machine learning training data. SIGKDD Explorations 6:1 (2004) 20-29.
  • T. Ho and M. Basu. Complexity measures of supervised classification problems. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(3):289–300, 2002.
  • N. Lavrac, B. Cestnik, D. Gamberger, P. Flach. Decision support through subgroup discovery: three case studies and the lessons learned. Machine Learning (2004) 57:115-143
  • E. Hüllermeier. Fuzzy methods in machine learning and data mining: Status and prospects. Fuzzy Sets and Systems 156(3), 2005, 387-406.
  • J. Demsar. Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research (2006) 7:1–30.
  • S. García, D. Molina, M. Lozano, F. Herrera. A Study on the Use of Non-Parametric Tests for Analyzing the Evolutionary Algorithms' Behaviour: A Case Study on the CEC'2005 Special Session on Real Parameter Optimization. Journal of Heuristics, in press (2008).
  • S. García, F. Herrera. An Extension on "Statistical Comparisons of Classifiers over Multiple Data Sets" for all Pairwise Comparisons. Journal of Machine Learning Research, in press (2008).