public class Datos
extends java.lang.Object
Constructor | Description |
---|---|
Datos(java.lang.String trainFileNames1, java.lang.String testFileNames1, int k) | Creates a new instance of Datos. |
Modifier and Type | Method | Description |
---|---|---|
int | diff(int numCarac, int numInstancia1, int numInstancia2) | Used by the RELIEF method; the data must be discretized. Returns 1 if the value of the given feature is equal in both given instances, 0 otherwise. |
int | findNearestHit(int posI) | Returns the nearest hit of the instance passed as an argument (in RELIEF terminology, its nearest instance of the same class). |
int | findNearestMiss(int posI) | Returns the nearest miss of the instance passed as an argument (in RELIEF terminology, its nearest instance of a different class). |
void | generarFicherosSalida(java.lang.String ficheroTrainSalida, java.lang.String ficheroTestSalida, boolean[] solucion) | Generates the output files (.tra and .tst), removing the non-selected features. |
double | LVO(boolean[] featuresVector) | Calculates the precision (errors/total_instances) in the prediction of the instance class. |
double | LVOTest(boolean[] featuresVector) | Calculates the precision (errors/total_instances) in the classification of all instances in the test dataset, using the given features and the same test dataset to predict. |
double | measureIEP(boolean[] featuresVector) | Calculates the inconsistent example pairs (IEP) ratio. |
double | medidaInconsistencia(boolean[] featuresVector) | Calculates the inconsistency ratio. |
double[][] | obtenerIMVars() | Computes the mutual information between every pair of variables. |
double[] | obtenerIMVarsClase() | Calculates the mutual information between each variable and the class. |
int | returnNumFeatures() | Returns the number of features in the datasets. |
int | returnNumInstances() | Returns the number of instances in the datasets. |
double | sumDifferentClasses(int posExample, int feature) | Sums the prior probabilities of the classes that differ from the class of the given example. |
double | validacionCruzada(boolean[] featuresVector) | Calculates the precision (errors/total_instances) in the classification of all instances in the test dataset, using the given features and the training dataset to predict. |
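To make the three RELIEF helpers in the summary above (diff, findNearestHit, findNearestMiss) more concrete, here is a minimal, self-contained sketch of how they might behave on discretized data. It is not the Datos implementation: the class name `ReliefSketch`, the `int[][]` data layout, the Hamming distance and the tie-breaking rule are all assumptions made for illustration. The `diff` convention follows the description above (1 when the values are equal); many RELIEF formulations use the opposite convention.

```java
final class ReliefSketch {
    // Hypothetical layout: DATA[i][f] is the discretized value of feature f in instance i.
    static final int[][] DATA = {
        {0, 1, 2},
        {0, 1, 1},
        {1, 0, 2},
        {1, 1, 2}
    };
    // Hypothetical class label of each instance.
    static final int[] LABELS = {0, 0, 1, 1};

    /** Returns 1 if both instances share the value of the feature, 0 otherwise. */
    static int diff(int feature, int a, int b) {
        return DATA[a][feature] == DATA[b][feature] ? 1 : 0;
    }

    /** Hamming distance (assumed metric): number of features whose values differ. */
    static int distance(int a, int b) {
        int d = 0;
        for (int f = 0; f < DATA[a].length; f++) {
            d += 1 - diff(f, a, b);
        }
        return d;
    }

    /**
     * Index of the closest other instance whose class equals (sameClass = true)
     * or differs from (sameClass = false) the class of pos; -1 if there is none.
     * Ties are resolved in favour of the lowest index.
     */
    static int nearest(int pos, boolean sameClass) {
        int best = -1;
        for (int i = 0; i < DATA.length; i++) {
            if (i == pos || (LABELS[i] == LABELS[pos]) != sameClass) {
                continue;
            }
            if (best == -1 || distance(pos, i) < distance(pos, best)) {
                best = i;
            }
        }
        return best;
    }

    static int findNearestHit(int pos) {
        return nearest(pos, true);
    }

    static int findNearestMiss(int pos) {
        return nearest(pos, false);
    }

    public static void main(String[] args) {
        System.out.println("nearest hit of instance 0:  " + findNearestHit(0));
        System.out.println("nearest miss of instance 0: " + findNearestMiss(0));
    }
}
```

In this toy dataset, instance 0 has a single same-class candidate (instance 1), while its nearest miss is the different-class instance that differs from it in the fewest features.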
public Datos(java.lang.String trainFileNames1, java.lang.String testFileNames1, int k)
trainFileNames1
- Training filename.
testFileNames1
- Test filename.
k
- number of nearest neighbours to consider.
public int returnNumFeatures()
public int returnNumInstances()
public double LVO(boolean[] featuresVector)
featuresVector
- a boolean array with the selected features
public double medidaInconsistencia(boolean[] featuresVector)
featuresVector
- a boolean array with the selected features
public double measureIEP(boolean[] featuresVector)
featuresVector
- a boolean array with the selected features
public double[] obtenerIMVarsClase()
public double[][] obtenerIMVars()
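As an illustration of what the two mutual information methods above compute, the following sketch estimates I(X;Y) = Σ p(x,y) · log2( p(x,y) / (p(x) · p(y)) ) from empirical frequencies. It is only a sketch under stated assumptions, not the Datos code: the class `MutualInformationSketch`, its method names, the column-major `int[][]` layout, the use of base-2 logarithms and the requirement that values be non-negative integers are all hypothetical choices.

```java
final class MutualInformationSketch {
    /** Empirical mutual information, in bits, between two equally long discrete columns. */
    static double mutualInformation(int[] x, int[] y) {
        int n = x.length;
        int[][] joint = new int[max(x) + 1][max(y) + 1];
        int[] countX = new int[joint.length];
        int[] countY = new int[joint[0].length];
        for (int i = 0; i < n; i++) {
            joint[x[i]][y[i]]++;
            countX[x[i]]++;
            countY[y[i]]++;
        }
        double mi = 0.0;
        for (int a = 0; a < countX.length; a++) {
            for (int b = 0; b < countY.length; b++) {
                if (joint[a][b] == 0) {
                    continue; // a zero-probability cell contributes nothing
                }
                double pxy = joint[a][b] / (double) n;
                double px = countX[a] / (double) n;
                double py = countY[b] / (double) n;
                mi += pxy * Math.log(pxy / (px * py)) / Math.log(2);
            }
        }
        return mi;
    }

    /** Largest value in a column of non-negative integers. */
    static int max(int[] values) {
        int m = 0;
        for (int v : values) {
            m = Math.max(m, v);
        }
        return m;
    }

    /** Mutual information between every pair of columns; columns[f][i] is feature f of instance i. */
    static double[][] pairwise(int[][] columns) {
        double[][] result = new double[columns.length][columns.length];
        for (int f = 0; f < columns.length; f++) {
            for (int g = 0; g < columns.length; g++) {
                result[f][g] = mutualInformation(columns[f], columns[g]);
            }
        }
        return result;
    }

    /** Mutual information between each column and the class labels. */
    static double[] withClass(int[][] columns, int[] labels) {
        double[] result = new double[columns.length];
        for (int f = 0; f < columns.length; f++) {
            result[f] = mutualInformation(columns[f], labels);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] columns = {{0, 0, 1, 1}, {0, 1, 0, 1}};
        int[] labels = {0, 0, 1, 1};
        System.out.println(java.util.Arrays.toString(withClass(columns, labels)));
    }
}
```

In the example data, the first column determines the class completely and the second is independent of it, so their scores sit at the two extremes of the measure for a balanced binary class.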
public int findNearestHit(int posI)
posI
- the position of the instance whose nearest instance is to be found
public int findNearestMiss(int posI)
posI
- the position of the instance whose nearest instance is to be found
public double sumDifferentClasses(int posExample, int feature)
posExample
- example id.
feature
- feature id.
public int diff(int numCarac, int numInstancia1, int numInstancia2)
numCarac
- the number of the feature to check
numInstancia1
- the position of the first instance
numInstancia2
- the position of the second instance
public double LVOTest(boolean[] featuresVector)
featuresVector
- a boolean array with the selected features
public double validacionCruzada(boolean[] featuresVector)
featuresVector
- a boolean array with the selected features
public void generarFicherosSalida(java.lang.String ficheroTrainSalida, java.lang.String ficheroTestSalida, boolean[] solucion)
ficheroTrainSalida
- the pathname of the output training file
ficheroTestSalida
- the pathname of the output test file
solucion
- a boolean array with the selected features
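The core of removing non-selected features is projecting each data row through the boolean mask. The sketch below shows that step on a single line of text. The file layout used by Datos is not described here, so the comma-separated format with the class label in the last column, the class `FeatureMaskSketch` and the method `filterLine` are all assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class FeatureMaskSketch {
    /**
     * Keeps the values whose flag in selected is true, plus the trailing class label.
     * Assumes the line holds selected.length feature values followed by one class label.
     */
    static String filterLine(String line, boolean[] selected) {
        String[] values = line.split(",");
        List<String> kept = new ArrayList<>();
        for (int f = 0; f < selected.length; f++) {
            if (selected[f]) {
                kept.add(values[f].trim());
            }
        }
        kept.add(values[values.length - 1].trim()); // the class label is always kept
        return String.join(", ", kept);
    }

    public static void main(String[] args) {
        boolean[] solucion = {true, false, true};
        System.out.println(filterLine("5.1, 3.5, 1.4, setosa", solucion));
    }
}
```

Applying the same projection to every data line of the training and test files, and adjusting any header that lists the features, would yield reduced files of the kind generarFicherosSalida is described as producing.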