public class MODL extends Discretizer
Fields inherited from class Discretizer: classOfInstances, cutPoints, iClassIndex, realAttributes, realValues
Constructor and Description |
---|
MODL(java.lang.String processType): Parameterized constructor. |
Modifier and Type | Method and Description |
---|---|
void | applyNeighbour(java.util.Vector cp, Neighbour neig): Applies the neighbour to the current interval set. |
static double | binomialLog(int m, int n): Returns the natural logarithm of the binomial coefficient "m over n". |
static double | combinatoria(int m, int n): Computes the combinatorial number of two integers. |
java.util.Vector | createCP(java.util.Vector intervals): Constructs the cut points from the set of intervals. |
protected java.util.Vector | discretizeAttribute(int attribute, int[] values, int begin, int end): This abstract method creates the cut points of the given attribute using its values for each of the given instances. |
protected java.util.Vector | exhaustiveMerge(int attribute, int[] values, int begin, int end): Performs an exhaustive bottom-up merge of all unitary intervals into a single interval. |
double | factDivision(int i, int[] ni, int[][] nij): Computes the division of factorials of the form (ni[i]! |
static double | factorialLog(int n): Returns the natural logarithm of n!. |
protected java.util.Vector | greedyMODL(int attribute, int[] values, int begin, int end): Implements the greedy version of the MODL discretizer. |
double | intervalCost(java.util.ArrayList<java.lang.Double> interval, int index, int[] values, int I): Computes the cost of the interval in the current discretization scheme. |
double | mergeCostVariation(java.util.ArrayList<java.lang.Double> na, int indexna, java.util.ArrayList<java.lang.Double> nb, int indexnb, int I, int[] values): Computes the cost variation derived from merging two adjacent intervals na and nb. |
double | modl(java.util.ArrayList<java.util.ArrayList<java.lang.Double>> disc, int[] values): Computes the MODL value for the current discretization scheme. |
double | modl(java.util.Vector disc, int[] values): Computes the MODL value for the current discretization scheme. |
protected java.util.Vector | optimalMODL(int attribute, int[] values, int begin, int end): Searches for the best possible optimization scheme. |
double | partitionCost(int I, int n): Computes the cost of the partition. |
protected java.util.Vector | postOptimizationMODL(int attribute, int[] values, int begin, int end): Implements the post-optimization procedure for MODL, after obtaining the best initial interval division. |
protected Neighbour | split(java.util.ArrayList<java.lang.Double> interv, int index, int n, int I, int[] values): Searches for the best cut point in a given interval. |
static double | stirling(int n): Stirling's formula for approximating log(n!). |
Methods inherited from class Discretizer: applyDiscretization, buildCutPoints, discretize, getCutPoint, getNumIntervals, sortValues
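The logarithmic helpers listed in the method summary (factorialLog, binomialLog, stirling) can be illustrated with a small standalone sketch. The class name LogCombinatorics and the argument checks are assumptions made for this example; it is not the implementation behind this class, only one plausible way to compute the same quantities.

```java
/** Hypothetical helper class for illustration; not part of the documented API. */
final class LogCombinatorics {

    /** Exact ln(n!) computed as ln(2) + ln(3) + ... + ln(n). */
    static double factorialLog(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        double sum = 0.0;
        for (int k = 2; k <= n; k++) {
            sum += Math.log(k);
        }
        return sum;
    }

    /** Stirling's approximation: ln(n!) ~ n ln(n) - n + 0.5 ln(2 pi n). */
    static double stirling(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        if (n == 0) {
            return 0.0; // ln(0!) = 0; the formula itself is undefined at 0
        }
        return n * Math.log(n) - n + 0.5 * Math.log(2.0 * Math.PI * n);
    }

    /** ln of the binomial coefficient: ln(m!) - ln(n!) - ln((m - n)!). */
    static double binomialLog(int m, int n) {
        if (n < 0 || n > m) {
            throw new IllegalArgumentException("require 0 <= n <= m");
        }
        return factorialLog(m) - factorialLog(n) - factorialLog(m - n);
    }

    public static void main(String[] args) {
        System.out.println("ln(10!) exact    = " + factorialLog(10));
        System.out.println("ln(10!) Stirling = " + stirling(10));
        System.out.println("C(10, 3)         = " + Math.exp(binomialLog(10, 3)));
    }
}
```

Working in log space keeps the values representable as a double even when n! itself would overflow, which is presumably why the cost functions of this class are expressed through logarithms.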
public MODL(java.lang.String processType)
Parameters:
processType - given processing type (optimal, greedy or optimized).

protected java.util.Vector discretizeAttribute(int attribute, int[] values, int begin, int end)
Specified by:
discretizeAttribute in class Discretizer
Parameters:
attribute - given attribute to discretize.
values - given attribute values in the dataset.
begin - First position of the section to discretize.
end - Last position of the section to discretize.

protected java.util.Vector postOptimizationMODL(int attribute, int[] values, int begin, int end)
Parameters:
attribute - the attribute which is being discretized
values - the values of the attribute
begin - the initial position of the values
end - the final position of the values

protected Neighbour split(java.util.ArrayList<java.lang.Double> interv, int index, int n, int I, int[] values)
Parameters:
interv - The interval which could be partitioned
index - Index of the first element of the interval in the complete real value list
n - Total number of real values
I - Current number of intervals
values - Mapping between instance number and the sorted rank by attribute values

public void applyNeighbour(java.util.Vector cp, Neighbour neig)
Parameters:
cp - The interval set
neig - The neighbour (Split, MergeSplit or MergeMergeSplit) we want to apply

public double partitionCost(int I, int n)
Parameters:
I - the number of intervals
n - the number of different elements

public double intervalCost(java.util.ArrayList<java.lang.Double> interval, int index, int[] values, int I)
Parameters:
interval - the interval to be considered
index - the index of the initial element of the interval in the global array of values
values - the global array of values
I - the current number of intervals

protected java.util.Vector exhaustiveMerge(int attribute, int[] values, int begin, int end)
Parameters:
attribute - The attribute of the data set we are discretizing
values - Mapping between instance number and the sorted rank by attribute values
begin - First position of values to be considered.
end - Last position of values to be considered.

protected java.util.Vector greedyMODL(int attribute, int[] values, int begin, int end)
Parameters:
attribute - the attribute which is being discretized
values - the global array of values (sorted)
begin - the initial position of the values to be discretized
end - the final position of the values to be discretized

protected java.util.Vector optimalMODL(int attribute, int[] values, int begin, int end)
Parameters:
attribute - the attribute to be discretized
values - the global array of values (sorted)
begin - the initial position of the array
end - the final position in the array

public double modl(java.util.Vector disc, int[] values)
Parameters:
disc - The discretization scheme to be evaluated. Comprises the intervals as ArrayList of values.
values - Array whose position i holds the number of the instance whose explanatory (real) value has rank i after sorting

public double modl(java.util.ArrayList<java.util.ArrayList<java.lang.Double>> disc, int[] values)
Parameters:
disc - The discretization scheme to be evaluated. Comprises the intervals as ArrayList of values.
values - Array whose position i holds the number of the instance whose explanatory (real) value has rank i after sorting

public double mergeCostVariation(java.util.ArrayList<java.lang.Double> na, int indexna, java.util.ArrayList<java.lang.Double> nb, int indexnb, int I, int[] values)
Parameters:
na - Left interval to merge
indexna - Index of the first element of na in the whole list of real values
nb - Right interval to merge
indexnb - Index of the first element of nb in the whole list of real values
I - Current number of intervals (the total intervals prior to the merging)
values - Array whose position i holds the number of the instance whose explanatory (real) value has rank i after sorting

public double factDivision(int i, int[] ni, int[][] nij)
Parameters:
i - The interval considered
ni - Number of instances which belong to interval i
nij - Number of instances of class j which belong to interval i

public static double factorialLog(int n)
Parameters:
n - argument
Returns:
log(n!)
Throws:
java.lang.IllegalArgumentException - if preconditions are not met.

public static double stirling(int n)
Parameters:
n - Number to factorize

public static double binomialLog(int m, int n)
Parameters:
m - Upper argument
n - Lower argument

public static double combinatoria(int m, int n)
Parameters:
m - first integer
n - second integer

public java.util.Vector createCP(java.util.Vector intervals)
Parameters:
intervals - Vector which contains the intervals in ArrayList format
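To make the cost-related methods above (partitionCost, intervalCost, factDivision, modl) more concrete, the following sketch evaluates the MODL criterion as it is usually stated in the literature: log(n) + log C(n+I-1, I-1) for the partition, plus log C(n_i+J-1, J-1) + log(n_i! / (n_i1! ... n_iJ!)) for each interval i. The class ModlCriterionSketch, its cost method and the class-count matrix input are assumptions made for illustration; how this class actually splits the terms among its methods is not stated in this reference.

```java
/** Hypothetical illustration of the MODL criterion; not the documented class. */
final class ModlCriterionSketch {

    /** Exact ln(n!) as a sum of logarithms. */
    static double factorialLog(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative: " + n);
        }
        double sum = 0.0;
        for (int k = 2; k <= n; k++) {
            sum += Math.log(k);
        }
        return sum;
    }

    /** ln of the binomial coefficient C(m, n). */
    static double binomialLog(int m, int n) {
        if (n < 0 || n > m) {
            throw new IllegalArgumentException("require 0 <= n <= m");
        }
        return factorialLog(m) - factorialLog(n) - factorialLog(m - n);
    }

    /**
     * MODL cost of a discretization, where nij[i][j] is the number of
     * instances of class j that fall into interval i.
     */
    static double cost(int[][] nij) {
        if (nij.length == 0 || nij[0].length == 0) {
            throw new IllegalArgumentException("need at least one interval and one class");
        }
        int intervals = nij.length;   // I
        int classes = nij[0].length;  // J
        int[] ni = new int[intervals];
        int n = 0;
        for (int i = 0; i < intervals; i++) {
            for (int j = 0; j < classes; j++) {
                ni[i] += nij[i][j];
                n += nij[i][j];
            }
        }
        if (n == 0) {
            throw new IllegalArgumentException("need at least one instance");
        }
        // Partition term: choice of the number of intervals and of their bounds.
        double total = Math.log(n) + binomialLog(n + intervals - 1, intervals - 1);
        for (int i = 0; i < intervals; i++) {
            // Choice of the class distribution inside interval i.
            total += binomialLog(ni[i] + classes - 1, classes - 1);
            // Multinomial term ln(n_i! / (n_i1! ... n_iJ!)).
            double multinomial = factorialLog(ni[i]);
            for (int j = 0; j < classes; j++) {
                multinomial -= factorialLog(nij[i][j]);
            }
            total += multinomial;
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] split = {{10, 0}, {0, 10}}; // two pure intervals
        int[][] merged = {{10, 10}};        // one mixed interval
        System.out.println("cost(split)  = " + cost(split));
        System.out.println("cost(merged) = " + cost(merged));
    }
}
```

A lower cost indicates a more probable discretization under this criterion, so a merge or split is worth applying when it decreases the total.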