public class UCPD extends Discretizer
This class implements the UCPD algorithm
Fields inherited from class Discretizer: classOfInstances, cutPoints, iClassIndex, realAttributes, realValues
| Constructor | Description |
|---|---|
| UCPD(InstanceSet is) | Constructor of the class |
| Modifier and Type | Method | Description |
|---|---|---|
| boolean | areSimilar(java.util.Vector<Itemset> A, java.util.Vector<Itemset> B) | It checks if two sets of frequent itemsets are similar |
| void | discretizeAllAttributes() | It computes the cutpoints for each continuous variable |
| protected java.util.Vector | discretizeAttribute(int attribute, int[] values, int begin, int end) | It returns a vector with the discretized values |
| java.util.Vector<Itemset> | frequentItemsetForInterval(java.util.Vector<Itemset> its, int[] instances, int numi) | It computes the frequent itemsets of the given instances |
| java.lang.Object[] | getInstancesInto(double[][] FinalData, int dim, int[] selected, double[] cutp, int ncp, int sp, int opt) | It computes the indexes of the instances that fall into the selected interval |
| double[] | KMeans(int k, double[][] FinalData, int dim) | It calculates the cutpoints using the K-Means algorithm |
| double | KNN(int att, double value, double[][] FinalData, int dim) | It computes the cutpoint using the KNN algorithm |
| void | normalizeAndCenter() | It normalizes the continuous attributes and centers them on their mean |
| double[] | uniformFrequencyCutpoints(int k, double[][] FinalData, int att) | It calculates the cutpoints with uniform frequency |
Methods inherited from class Discretizer: applyDiscretization, buildCutPoints, discretize, getCutPoint, getNumIntervals, sortValues
public UCPD(InstanceSet is)
Constructor of the class
is - set of instances
protected java.util.Vector discretizeAttribute(int attribute, int[] values, int begin, int end)
It returns a vector with the discretized values
discretizeAttribute in class Discretizer
attribute - index of the attribute to discretize
values - not used
begin - not used
end - not used
public void discretizeAllAttributes()
It computes the cutpoints for each continuous variable
public void normalizeAndCenter()
It normalizes the continuous attributes and centers them on their mean
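As a rough illustration of what normalizing and centering one continuous attribute can look like, the sketch below subtracts the column mean and divides by the standard deviation. The class and method names are hypothetical, and scaling by the standard deviation is an assumption: this page does not state which normalization UCPD applies.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the UCPD class. */
final class CenterAndScaleSketch {

    /**
     * Returns a copy of the column with its mean subtracted and the result
     * divided by the population standard deviation.
     * Assumption: z-score scaling; UCPD may normalize differently.
     */
    static double[] centerAndScale(double[] column) {
        int n = column.length;
        if (n == 0) {
            return new double[0];
        }
        double mean = Arrays.stream(column).average().orElse(0.0);

        // Sum of squared deviations from the mean.
        double sumSq = 0.0;
        for (double v : column) {
            sumSq += (v - mean) * (v - mean);
        }
        double std = Math.sqrt(sumSq / n);

        // A constant column has zero deviation; divide by 1 to avoid NaN.
        double scale = std > 0.0 ? std : 1.0;

        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = (column[i] - mean) / scale;
        }
        return out;
    }
}
```

After this transformation the column has mean zero, which is the usual precondition for a principal component analysis such as the one the FinalData parameters refer to.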
public double[] KMeans(int k, double[][] FinalData, int dim)
It calculates the cutpoints using the K-Means algorithm
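To make the idea concrete, here is a minimal one-dimensional K-Means sketch that clusters the values of a single dimension and reports the midpoints between neighbouring centroids as cutpoints. It is not the code behind this method: the class name, the evenly spaced initial centroids, the iteration cap and the midpoint rule are all assumptions made for the example.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the UCPD class. */
final class KMeansCutpointsSketch {

    /**
     * Clusters the values into k groups with Lloyd's algorithm and returns
     * k - 1 cutpoints, each halfway between two adjacent centroids.
     * Assumptions: deterministic evenly spaced initialization, midpoint rule.
     */
    static double[] cutpoints(double[] values, int k, int maxIter) {
        if (k < 2 || values.length == 0) {
            return new double[0];
        }
        double min = Arrays.stream(values).min().getAsDouble();
        double max = Arrays.stream(values).max().getAsDouble();

        // Spread the initial centroids evenly over the value range.
        double[] centroids = new double[k];
        for (int c = 0; c < k; c++) {
            centroids[c] = min + (c + 0.5) * (max - min) / k;
        }

        for (int iter = 0; iter < maxIter; iter++) {
            double[] sum = new double[k];
            int[] count = new int[k];

            // Assignment step: each value joins its nearest centroid.
            for (double v : values) {
                int best = 0;
                for (int c = 1; c < k; c++) {
                    if (Math.abs(v - centroids[c]) < Math.abs(v - centroids[best])) {
                        best = c;
                    }
                }
                sum[best] += v;
                count[best]++;
            }

            // Update step: move each non-empty centroid to its cluster mean.
            boolean changed = false;
            for (int c = 0; c < k; c++) {
                if (count[c] == 0) {
                    continue; // an empty cluster keeps its previous centroid
                }
                double updated = sum[c] / count[c];
                if (updated != centroids[c]) {
                    centroids[c] = updated;
                    changed = true;
                }
            }
            if (!changed) {
                break;
            }
        }

        Arrays.sort(centroids);
        double[] cuts = new double[k - 1];
        for (int c = 0; c < k - 1; c++) {
            cuts[c] = (centroids[c] + centroids[c + 1]) / 2.0;
        }
        return cuts;
    }
}
```

For k intervals the sketch yields k - 1 cutpoints; whether the actual method returns the same number is not stated on this page.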
k - number of intervals
FinalData - the data mapped onto the eigendimensions
dim - the eigendimension to discretize
public double KNN(int att, double value, double[][] FinalData, int dim)
It computes the cutpoint using the KNN algorithm
att - original index of the attribute for which the cutpoint is computed
value - value whose nearest neighbors are searched
FinalData - data matrix produced by PCA
dim - dimension in which the nearest neighbors are searched
public double[] uniformFrequencyCutpoints(int k, double[][] FinalData, int att)
It calculates the cutpoints with uniform frequency
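Uniform-frequency (equal-frequency) discretization places the cutpoints so that every interval holds roughly the same number of values. The sketch below shows one common way to do this on a single column. The class name is hypothetical, and both the rounding of the boundary positions and the choice of the midpoint between neighbouring sorted values are assumptions, not details taken from this class.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the UCPD class. */
final class UniformFrequencySketch {

    /**
     * Returns k cutpoints that split the values into k + 1 intervals of
     * roughly equal size. Each cutpoint is the midpoint between the two
     * sorted values on either side of an interval boundary (assumption).
     */
    static double[] cutpoints(double[] values, int k) {
        double[] sorted = values.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int n = sorted.length;
        if (k < 1 || n < 2) {
            return new double[0];
        }

        double[] cuts = new double[k];
        for (int i = 1; i <= k; i++) {
            // Position of the first value that belongs to the next interval.
            int idx = (int) Math.round((double) i * n / (k + 1));
            // Clamp so that idx - 1 and idx are both valid positions.
            idx = Math.max(1, Math.min(n - 1, idx));
            cuts[i - 1] = (sorted[idx - 1] + sorted[idx]) / 2.0;
        }
        return cuts;
    }
}
```

When the column contains many repeated values, or k is large relative to the number of values, this simple version can return duplicate cutpoints; a production implementation would normally merge them.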
k - number of cutpoints to compute
FinalData - data matrix produced by PCA
att - index of the dimension
public java.util.Vector<Itemset> frequentItemsetForInterval(java.util.Vector<Itemset> its, int[] instances, int numi)
It computes the frequent itemsets of the given instances
its -
instances -
numi -
public boolean areSimilar(java.util.Vector<Itemset> A, java.util.Vector<Itemset> B)
It checks if two sets of frequent itemsets are similar
A - first set of frequent itemsets
B - second set of frequent itemsets
public java.lang.Object[] getInstancesInto(double[][] FinalData, int dim, int[] selected, double[] cutp, int ncp, int sp, int opt)
It computes the indexes of the instances that fall into the selected interval
FinalData - data matrix produced by PCA
dim - index of the dimension
selected - indexes of the selected cutpoints
cutp - array of cutpoints
ncp - number of cutpoints
sp - index of the cutpoint that bounds the interval
opt - LEFT to select the interval to the left of sp, RIGHT to select the interval to the right
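The interval selection described above can be sketched as follows. This is a simplified illustration rather than the method's actual logic: it works on a single column instead of the FinalData matrix, ignores the selected and ncp parameters, models LEFT and RIGHT with a hypothetical enum instead of int constants, and assumes sorted cutpoints with intervals that are open on the left and closed on the right.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the UCPD class. */
final class IntervalMembershipSketch {

    /** Stand-in for the LEFT and RIGHT options mentioned in the description. */
    enum Side { LEFT, RIGHT }

    /**
     * Returns the indexes of the values that fall into the interval next to
     * cutpoint sp on the requested side.
     * Assumptions: cutp is sorted in ascending order; intervals are (lower, upper].
     */
    static List<Integer> instancesInInterval(double[] column, double[] cutp, int sp, Side side) {
        double lower;
        double upper;
        if (side == Side.LEFT) {
            // Interval between the previous cutpoint (if any) and cutpoint sp.
            lower = sp > 0 ? cutp[sp - 1] : Double.NEGATIVE_INFINITY;
            upper = cutp[sp];
        } else {
            // Interval between cutpoint sp and the next cutpoint (if any).
            lower = cutp[sp];
            upper = sp < cutp.length - 1 ? cutp[sp + 1] : Double.POSITIVE_INFINITY;
        }

        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < column.length; i++) {
            if (column[i] > lower && column[i] <= upper) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

For example, with cutpoints 1.0, 3.0 and 4.0 and sp = 1, the LEFT option covers values in (1.0, 3.0] and the RIGHT option covers values in (3.0, 4.0].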