Missing Values in Data Mining - Bibliography

Bibliography on Missing Values in Data Mining, sorted by year:

1997 (1 paper)
1998 (1 paper)
1999 (1 paper)
2001 (3 papers)
2002 (4 papers)
2003 (6 papers)
2004 (7 papers)
2005 (8 papers)
2006 (6 papers)
2007 (18 papers)
2008 (3 papers)
2009 (2 papers)
2010 (8 papers)
2011 (6 papers)

1997 (1 paper)

S.M. Chen, M.S. Yeh. Generating fuzzy rules from relational database systems for estimating null values. Cybernetics and Systems 28:8 (1997) 695-723 doi:10.1080/019697297125912

1998 (1 paper)

M.R. Berthold, K.P. Huber. Missing Values and Learning of Fuzzy Rules. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6:2 (1998) 171-178 doi:10.1142/S021848859800015X

1999 (1 paper)

M. Kryszkiewicz. Rules in incomplete information systems. Information Sciences 113 (1999) 271-292 doi:10.1016/S0020-0255(98)10065-8

2001 (3 papers)

C.M. Ennett, M. Frize, C.R. Walker. Influence of missing values on artificial neural network performance. Medinfo 10 (2001) 449-453
T. Schneider. Analysis of incomplete climate data: Estimation of mean values and covariance matrices and imputation of missing values. Journal of Climate 14 (2001) 853-871 doi:10.1175/1520-0442(2001)014<0853:AOICDE>2.0.CO;2
O. Troyanskaya, M. Cantor, G. Sherlock, P. Brown, T. Hastie, R. Tibshirani, D. Botstein, R.B. Altman. Missing value estimation methods for DNA microarrays. Bioinformatics 17:6 (2001) 520-525 doi:10.1093/bioinformatics/17.6.520

2002 (4 papers)

B. Gabrys. Neuro-fuzzy approach to processing inputs with missing values in pattern recognition problems. International Journal of Approximate Reasoning 30:3 (2002) 149-179 doi:10.1016/S0888-613X(02)00070-1
X. Huang, Q. Zhu. A pseudo-nearest-neighbor approach for missing data recovery on Gaussian random data sets. Pattern Recognition Letters 23:13 (2002) 1613-1622 doi:10.1016/S0167-8655(02)00125-3
J.L. Schafer, J.W. Graham. Missing data: our view of the state of the art. Psychological Methods 7:2 (2002) 147-177 doi:10.1037/1082-989X.7.2.147
J.L. Schafer, R.M. Yucel. Computational strategies for multivariate linear mixed-effects models with missing values. Journal of Computational and Graphical Statistics 11:2 (2002) 437-457 doi:10.1198/106186002760180608

2003 (6 papers)

G.E.A.P.A. Batista, M.C. Monard. An analysis of four missing data treatment methods for supervised learning. Applied Artificial Intelligence 17 (2003) 519-533 doi:10.1080/713827181
S.M. Chen, C.M. Huang. Generating weighted fuzzy rules from relational database systems for estimating null values using genetic algorithms. IEEE Transactions on Fuzzy Systems 11:4 (2003) 495-506
S.M. Chen, S.W. Lee. A new method to generate fuzzy rules from relational database systems for estimating null values. Cybernetics and Systems 34:1 (2003) 33-57 doi:10.1080/01969720302850
S. Oba, M. Sato, I. Takemasa, M. Monden, K. Matsubara, S. Ishii. A Bayesian missing value estimation method for gene expression profile data. Bioinformatics 19:16 (2003) 2088-2096 doi:10.1093/bioinformatics/btg287
S.M. Tseng, K.H. Wang, C.I. Lee. A pre-processing method to deal with missing values by integrating clustering and regression techniques. Applied Artificial Intelligence 17:5-6 (2003) 535-544 doi:10.1080/713827170
X. Zhou, X. Wang, E.R. Dougherty. Missing-value estimation using linear and non-linear regression with Bayesian gene selection. Bioinformatics 19:17 (2003) 2302-2307 doi:10.1093/bioinformatics/btg323

2004 (7 papers)

O.T. Abdala, M.A. Saeed. Estimation of missing values in clinical laboratory measurements of ICU patients using a weighted K-nearest neighbors algorithm. Computers in Cardiology 31 (2004) 693-696 doi:10.1109/CIC.2004.1443033
F.A. Barzi, M.A. Woodward. Imputations of missing values in practice: Results from imputations of serum cholesterol in 28 cohort studies. American Journal of Epidemiology 160:1 (2004) 34-45 doi:10.1093/aje/kwh175
T.H. Bo, B. Dysvik, I. Jonassen. LSimpute: accurate estimation of missing values in microarray data with least squares methods. Nucleic Acids Research 32:3 (2004) 1-8 doi:10.1093/nar/gnh026
A. Figueroa, J.B. Borneman, T.A. Jiang. Clustering binary fingerprint vectors with missing values for DNA array data analysis. Journal of Computational Biology 11:5 (2004) 887-901 doi:10.1109/CSB.2003.1227302
P.A. Gourraud, E.B. Genin, A.A. Cambon-Thomsen. Handling missing values in population data: Consequences for maximum likelihood estimation of haplotype frequencies. European Journal of Human Genetics 12:10 (2004) 805-812 doi:10.1038/sj.ejhg.5201233
K. Honda, H. Ichihashi. Linear fuzzy clustering techniques with missing values and their application to local principal component analysis. IEEE Transactions on Fuzzy Systems 12:2 (2004) 183-193 doi:10.1109/TFUZZ.2004.825073
R.A. Little, H.A. An. Robust likelihood-based analysis of multivariate data with missing values. Statistica Sinica 14:3 (2004) 949-968

2005 (8 papers)

M. Abdella, T. Marwala. The use of genetic algorithms and neural networks to approximate missing data in database. Computing and Informatics 24:6 (2005) 577-589
S.M. Chen, S.W. Lee. Estimating null values in relational database systems based on genetic algorithms. Cybernetics and Systems 36:1 (2005) 85-106 doi:10.1080/01969720590887333
S.M. Chen, H.R. Hsiao. A new method to estimate null values in relational database systems based on automatic clustering techniques. Information Sciences 169:1 (2005) 47-69 doi:10.1016/j.ins.2004.02.012
H. Kim, G.H. Golub, H. Park. Missing value estimation for DNA microarray gene expression data: Local least squares imputation. Bioinformatics 21:2 (2005) 187-198 doi:10.1093/bioinformatics/bth499
S. Konias, I.A. Chouvarda, I.B. Vlahavas, N.A. Maglaveras. A novel approach for incremental uncertainty rule generation from databases with missing values handling: Application to dynamic medical databases. Medical Informatics and the Internet in Medicine 30:3 (2005) 211-225 doi:10.1080/14639230500209336
K. Pelckmans, J. De Brabanter, J.A.K. Suykens, B. De Moor. Handling missing values in support vector machine classifiers. Neural Networks 18:5-6 (2005) 684-692 doi:10.1016/j.neunet.2005.06.025
I.A. Scheel, M.B. Aldrin, I.K.A. Glad, R.A. Sorum, H.C. Lyng, A.B. Frigessi. The influence of missing value imputation on detection of differentially expressed genes from microarray data. Bioinformatics 21:23 (2005) 4272-4279 doi:10.1093/bioinformatics/bti708
S. Zhang, Z. Qin, C.X. Ling, S. Sheng. “Missing is Useful”: Missing values in cost-sensitive decision trees. IEEE Transactions on Knowledge and Data Engineering 17:12 (2005) 1-5 doi:10.1109/TKDE.2005.188

2006 (6 papers)

X.B.C. Chai, R.A.D. Pan. Test-cost sensitive classification on data with missing values. IEEE Transactions on Knowledge and Data Engineering 18:5 (2006) 626-637 doi:10.1109/TKDE.2006.84
I.A. Fortes, L.B. Mora-Lopez, R.B. Morales, F.B. Triguero. Inductive learning models with missing values. Mathematical and Computer Modelling 44:9-10 (2006) 790-806 doi:10.1016/j.mcm.2006.02.013
R.S. Lokupitiya, E.B. Lokupitiya, K.B. Paustian. Comparison of missing value imputation methods for crop yield data. Environmetrics 17:4 (2006) 339-349 doi:10.1002/env.773
M.K. Markey, G.D. Tourassi, M. Margolis, D.M. DeLong. Impact of missing data in evaluating artificial neural networks trained on complete data. Computers in Biology and Medicine 36:5 (2006) 516-525 doi:10.1016/j.compbiomed.2005.02.001
A. Vellido. Missing data imputation through GTM as a mixture of t-distributions. Neural Networks 19:10 (2006) 1624-1635 doi:10.1016/j.neunet.2005.11.003
X. Wang, Z. Jiang, H. Feng. Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme. BMC Bioinformatics 7:32 (2006) 1-10 doi:10.1186/1471-2105-7-32

2007 (18 papers)

J. Banasik, J. Crook. Reject inference, augmentation, and sample selection. European Journal of Operational Research 183:3 (2007) 1582-1594 doi:10.1016/j.ejor.2006.06.072
L.P. Bras, J.C. Menezes. Improving cluster-based missing value estimation of DNA microarray data. Biomolecular Engineering 24:2 (2007) 273-282 doi:10.1016/j.bioeng.2007.04.003
F.A. Dahl. Convergence of random k-nearest-neighbour imputation. Computational Statistics & Data Analysis 51:12 (2007) 5913-5917 doi:10.1016/j.csda.2006.11.007
M. Di Zio, U. Guarnera, O. Luzi. Imputation through finite Gaussian mixture models. Computational Statistics & Data Analysis 51:11 (2007) 5305-5316 doi:10.1016/j.csda.2006.10.002
A. Farhangfar, L.A. Kurgan, W. Pedrycz. A novel framework for imputation of missing values in databases. IEEE Transactions on Systems, Man, and Cybernetics 37:5 (2007) 692-709 doi:10.1109/TSMCA.2007.902631
J.W. Graham, A.E. Olchowski, T.D. Gilreath. How many imputations are really needed? Some practical clarifications of multiple imputation theory. Prevention Science 8:3 (2007) 206-213 doi:10.1007/s11121-007-0070-9
E.R. Hruschka Jr., E.R. Hruschka, N.F.F. Ebecken. Bayesian networks for imputation in classification problems. Journal of Intelligent Information Systems 29:3 (2007) 231-252 doi:10.1007/s10844-006-0016-x
K. Metaxoglou, A. Smith. Maximum likelihood estimation of VARMA models using a state-space EM algorithm. Journal of Time Series Analysis 28:5 (2007) 666-685 doi:10.1111/j.1467-9892.2007.00529.x
M. Mojirsheibani. Nonparametric curve estimation with missing data: A general empirical process approach. Journal of Statistical Planning and Inference 137:9 (2007) 2733-2758 doi:10.1016/j.jspi.2006.02.016
J.D. Parker, N. Schenker. Multiple imputation for national public-use datasets and its possible application for gestational age in United States Natality files. Paediatric and Perinatal Epidemiology 21:2 (2007) 97-105 doi:10.1111/j.1365-3016.2007.00866.x
H. Peng, S. Zhu. Handling of incomplete data sets using ICA and SOM in data mining. Neural Computing and Applications 16:2 (2007) 167-172 doi:10.1007/s00521-006-0058-6
Y. Qin, S. Zhang, X. Zhu, J. Zhang, C. Zhang. Semi-parametric optimization for missing data imputation. Applied Intelligence 27:1 (2007) 79-88 doi:10.1007/s10489-006-0032-0
M. Saar-Tsechansky, F. Provost. Handling missing values when applying classification models. Journal of Machine Learning Research 8 (2007) 1625-1657
T.H. Scheike, Y. Sun. Maximum likelihood estimation for tied survival data under Cox regression model via EM-algorithm. Lifetime Data Analysis 13:3 (2007) 399-420 doi:10.1007/s10985-007-9043-3
Q. Song, M. Shepperd. A new imputation method for small software project data sets. Journal of Systems and Software 80:1 (2007) 51-62 doi:10.1016/j.jss.2006.05.003
D. Williams, X. Liao, Y. Xue, L. Carin, B. Krishnapuram. On Classification with Incomplete Data. IEEE Transactions on Pattern Analysis and Machine Intelligence 29:3 (2007) 427-436 doi:10.1109/TPAMI.2007.52
D.S.V. Wong, F.K. Wong, G.R. Wood. A multi-stage approach to clustering and imputation of gene expression profiles. Bioinformatics 23:8 (2007) 998-1005 doi:10.1093/bioinformatics/btm053
D. Yoon, E.K. Lee, T. Park. Robust imputation method for missing values in microarray data. BMC Bioinformatics 8:2 (2007) 1-7 doi:10.1186/1471-2105-8-S2-S6

2008 (3 papers)

G. Corani, M. Zaffalon. Learning Reliable Classifiers From Small or Incomplete Data Sets: The Naive Credal Classifier 2. Journal of Machine Learning Research 9 (2008) 581-621
A. Farhangfar, L. Kurgan, J. Dy. Impact of imputation of missing values on classification error for discrete data. Pattern Recognition 41 (2008) 3692-3705 doi:10.1016/j.patcog.2008.05.019
Q. Song, M. Shepperd, X. Chen, J. Liu. Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation. Journal of Systems and Software 81:12 (2008) 2361-2370 doi:10.1016/j.jss.2008.05.008

2009 (2 papers)

P.J. García-Laencina, J.L. Sancho-Gómez, A.R. Figueiras-Vidal. Pattern classification with missing data: a review. Neural Computing and Applications 9:1 (2009) 1-12 doi:10.1007/s00521-009-0295-6
B. Twala. An empirical comparison of techniques for handling incomplete data using decision trees. Applied Artificial Intelligence 23 (2009) 373-405 doi:10.1080/08839510902872223

2010 (8 papers)

W-K. Ching, L. Li, N.K. Tsing, C.W. Tai, T.W. Ng, A.S. Wong. A Weighted Local Least Squares Imputation method for missing value estimation in microarray gene expression data. International Journal of Data Mining and Bioinformatics 4:3 (2010) 331-347
B. Twala, M. Cartwright. Ensemble missing data techniques for software effort prediction. Intelligent Data Analysis 14:3 (2010) 299-331
T.P. Hong, L.H. Tseng, B.C. Chien. Mining from incomplete quantitative data by fuzzy rough sets. Expert Systems with Applications 37:3 (2010) 2644-2653
I.A. Gheyas, L.S. Smith. A neural network-based framework for the reconstruction of incomplete data sets. Neurocomputing 73:16-18 (2010) 3039-3065
Y. Ding, J.S. Simonoff. An Investigation of Missing Data Methods for Classification Trees Applied to Binary Response Data. Journal of Machine Learning Research 11 (2010) 131-170
M. Ghannad-Rezaie, H. Soltanian-Zadeh, H. Ying, M. Dong. Selection-fusion approach for classification of data sets with missing values. Pattern Recognition 43 (2010) 2340-2350 doi:10.1016/j.patcog.2009.12.003
J. Luengo, S. García, F. Herrera. A Study on the Use of Imputation Methods for Experimentation with Radial Basis Function Network Classifiers Handling Missing Attribute Values: The good synergy between RBFs and EventCovering method. Neural Networks 23 (2010) 406-418 doi:10.1016/j.neunet.2009.11.014
P. Merlin, A. Sorjamaa, B. Maillet, A. Lendasse. X-SOM and L-SOM: A double classification approach for missing value imputation. Neurocomputing 73:7-9 (2010) 1103-1108

2011 (6 papers)

J. Ning, P.E. Cheng. A comparison study of nonparametric imputation methods. Statistics and Computing in press (2011) 1-13
Y. Endo, Y. Hasegawa, Y. Hamasuna, Y. Kanzawa. Fuzzy c-means clustering for uncertain data using quadratic penalty-vector regularization. Journal of Advanced Computational Intelligence and Intelligent Informatics 15:1 (2011) 76-82
P. Rey-del-Castillo, J. Cardeñosa. Fuzzy min-max neural networks for categorical data: application to missing data imputation. Neural Computing and Applications in press (2011) 1-14
E.L. Silva-Ramírez, R. Pino-Mejías, M. López-Coello, M.D. Cubiles-de-la-Vega. Missing value imputation on missing completely at random data using multilayer perceptrons. Neural Networks 24:1 (2011) 121-129
S. Zhang. Shell-neighbor method and its application in missing data imputation. Applied Intelligence 35:1 (2011) 123-133
X. Zhu, S. Zhang, Z. Jin, Z. Zhang, Z. Xu. Missing value estimation for mixed-attribute data sets. IEEE Transactions on Knowledge and Data Engineering 23:1 (2011) 110-121
