public class Cut
extends java.lang.Object
Class that implements the computation of the cut point.

| Modifier and Type | Field and Description |
|---|---|
| `protected Classification` | `classification` Classification of class values. |
| `protected int` | `numSubsets` Number of subsets. |

| Constructor and Description |
|---|
| `Cut(Classification dist)` Constructor to use when no cut is necessary. |
| `Cut(int index, int nObj, double weights)` Constructor to initialize the cut model. |

| Modifier and Type | Method and Description |
|---|---|
| `int` | `attributeIndex()` Returns the index of the attribute to cut on. |
| `boolean` | `checkModel()` Function to check if the generated model is valid. |
| `Classification` | `classification()` Returns the classification created by the model. |
| `void` | `classify(MyDataset trainItemsets)` Function to create the cut point. |
| `double` | `classProbability(int classIndex, Itemset itemset, int subset)` Function to compute the probability for itemset. |
| `MyDataset[]` | `cutDataset(MyDataset data)` Function to cut the dataset into subsets. |
| `static java.lang.String` | `doubleToString(double value, int afterDecimalPoint)` Function to round a double and convert it into a String. |
| `static java.lang.String` | `doubleToString(double value, int width, int afterDecimalPoint)` Function to round a double and convert it into a String. |
| `double` | `gainRatioCutCrit(Classification values, double totalnoInst, double numerator)` Function to compute the gain ratio. |
| `double` | `getCutPoint()` Returns the cut point. |
| `double` | `getGainRatio()` Returns the gain ratio for the cut. |
| `double` | `getInfoGain()` Returns the information gain for the generated cut. |
| `double` | `infoGainCutCrit(Classification values, double totalNoInst, double oldEnt)` Function to compute the information gain. |
| `java.lang.String` | `label(int index, MyDataset data)` Function to print the label for subset index of itemsets. |
| `java.lang.String` | `leftSide(MyDataset data)` Function to print the left side of the condition. |
| `protected double` | `logFunc(double num)` Returns the log2 of the given number. |
| `double` | `newEntropy(Classification values)` Function to compute the entropy of the classification after cutting. |
| `int` | `numSubsets()` Returns the number of subsets created for the cut. |
| `double` | `oldEntropy(Classification values)` Function to compute the entropy of the classification before cutting. |
| `void` | `resetClassification(MyDataset data)` Function to reset the classification of the model. |
| `java.lang.String` | `rightSide(int index, MyDataset data)` Function to print the condition satisfied by itemsets in a subset. |
| `void` | `setCutPoint(MyDataset allItemsets)` Function to set the cut point. |
| `double[]` | `weights(Itemset itemset)` Returns weights if itemset is assigned to more than one subset, null otherwise. |
| `int` | `whichSubset(Itemset itemset)` Returns the index of the subset the itemset is assigned to. |

protected Classification classification
protected int numSubsets
public Cut(int index, int nObj, double weights)
Parameters:
index - The attribute index.
nObj - Minimum number of itemsets.
weights - The weight of all the itemsets.

public Cut(Classification dist)
Parameters:
dist - Distribution of values per class.

public void classify(MyDataset trainItemsets) throws java.lang.Exception
Parameters:
trainItemsets - The dataset to classify.
Throws:
java.lang.Exception - If the classification cannot be made.

public final double classProbability(int classIndex, Itemset itemset, int subset)
Parameters:
classIndex - The index of the class.
itemset - The itemset.
subset - The index of the subset.

public final void setCutPoint(MyDataset allItemsets)
Parameters:
allItemsets - The dataset used for the cut.

public final MyDataset[] cutDataset(MyDataset data) throws java.lang.Exception
Parameters:
data - The dataset to cut.
Throws:
java.lang.Exception - If the dataset cannot be cut.

public void resetClassification(MyDataset data) throws java.lang.Exception
Parameters:
data - The new dataset used.
Throws:
java.lang.Exception - If the classification cannot be reset.

public final double[] weights(Itemset itemset)
Parameters:
itemset - The itemset.

public final int whichSubset(Itemset itemset)
Parameters:
itemset - The itemset.

public final boolean checkModel()
public final Classification classification()
public final int numSubsets()
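
To illustrate how `whichSubset` and `cutDataset` relate to the cut point, here is a minimal sketch of a binary split on one numeric attribute. It does not use the real `Itemset` or `MyDataset` types: the class `ThresholdSplitSketch`, its method `cut`, and the rule "values less than or equal to the cut point go to subset 0, the rest to subset 1" are all assumptions made for illustration, not behavior confirmed by this page.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of the documented Cut class. */
final class ThresholdSplitSketch {
    private final double cutPoint;

    ThresholdSplitSketch(double cutPoint) {
        this.cutPoint = cutPoint;
    }

    /** Assumed convention: subset 0 holds values <= cutPoint, subset 1 the rest. */
    int whichSubset(double value) {
        return value <= cutPoint ? 0 : 1;
    }

    /** Distributes the attribute values into the two subsets defined by the cut point. */
    List<List<Double>> cut(double[] values) {
        List<List<Double>> subsets = new ArrayList<>();
        subsets.add(new ArrayList<>());
        subsets.add(new ArrayList<>());
        for (double v : values) {
            subsets.get(whichSubset(v)).add(v);
        }
        return subsets;
    }
}
```

With a cut point of 2.5, the values 1, 2, 3, 4 would be divided into {1, 2} and {3, 4} under this assumed convention.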
public final double gainRatioCutCrit(Classification values, double totalnoInst, double numerator)
Parameters:
values - The classification used to compute the gain ratio.
totalnoInst - Number of itemsets.
numerator - The information gain.

public final double infoGainCutCrit(Classification values, double totalNoInst, double oldEnt)
Parameters:
values - The classification used to compute the information gain.
totalNoInst - Number of itemsets.
oldEnt - The value of the entropy before cutting.

public final double oldEntropy(Classification values)
Parameters:
values - The classification used to compute the entropy before cutting.

public final double newEntropy(Classification values)
Parameters:
values - The classification used to compute the entropy after cutting.

protected final double logFunc(double num)
Parameters:
num - The number whose log2 is computed.

public final double getInfoGain()
public final double getGainRatio()
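
The entropy, information gain, and gain ratio methods above follow the usual decision-tree split criteria. As a rough sketch of those textbook definitions, the helper below works on plain count arrays (`counts[subset][class]`) instead of the `Classification` type. The class `CutCriteriaSketch` and its method names are hypothetical, and the actual implementation behind `infoGainCutCrit` and `gainRatioCutCrit` may differ, for example in how it weights itemsets or handles missing values.

```java
/** Hypothetical illustration; not part of the documented Cut class. */
final class CutCriteriaSketch {

    /** Base-2 logarithm. */
    static double log2(double num) {
        return Math.log(num) / Math.log(2.0);
    }

    /** Shannon entropy, in bits, of a vector of non-negative counts. */
    static double entropy(double[] counts) {
        double total = 0.0;
        for (double c : counts) {
            total += c;
        }
        if (total <= 0.0) {
            return 0.0;
        }
        double h = 0.0;
        for (double c : counts) {
            if (c > 0.0) {          // skip empty cells, since 0 * log2(0) is taken as 0
                double p = c / total;
                h -= p * log2(p);
            }
        }
        return h;
    }

    /** Sum of one row of counts. */
    static double sum(double[] row) {
        double s = 0.0;
        for (double c : row) {
            s += c;
        }
        return s;
    }

    /** Entropy before the cut minus the size-weighted entropy after the cut. */
    static double infoGain(double[][] counts) {
        double[] overall = new double[counts[0].length];
        double total = 0.0;
        for (double[] subset : counts) {
            for (int k = 0; k < subset.length; k++) {
                overall[k] += subset[k];
                total += subset[k];
            }
        }
        if (total <= 0.0) {
            return 0.0;
        }
        double after = 0.0;
        for (double[] subset : counts) {
            after += (sum(subset) / total) * entropy(subset);
        }
        return entropy(overall) - after;
    }

    /** Information gain divided by the split information (entropy of the subset sizes). */
    static double gainRatio(double[][] counts) {
        double[] sizes = new double[counts.length];
        for (int i = 0; i < counts.length; i++) {
            sizes[i] = sum(counts[i]);
        }
        double splitInfo = entropy(sizes);
        return splitInfo <= 0.0 ? 0.0 : infoGain(counts) / splitInfo;
    }
}
```

A cut that separates two equally frequent classes perfectly, such as `{{4, 0}, {0, 4}}`, yields an information gain of 1 bit in this sketch, while a cut that leaves the class mix unchanged in both subsets yields 0.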
public final java.lang.String leftSide(MyDataset data)
Parameters:
data - The dataset.

public final java.lang.String rightSide(int index, MyDataset data)
Parameters:
index - The index of the value.
data - The dataset.

public final java.lang.String label(int index, MyDataset data)
Parameters:
index - The index of the subset.
data - The dataset.

public final int attributeIndex()
public static java.lang.String doubleToString(double value, int afterDecimalPoint)
Parameters:
value - The value to print.
afterDecimalPoint - Number of decimal positions.

public static java.lang.String doubleToString(double value, int width, int afterDecimalPoint)
Parameters:
value - The value to print.
width - The width the generated string must have.
afterDecimalPoint - Number of decimal positions.

public double getCutPoint()
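
For the two `doubleToString` overloads, the page states only that the value is rounded and converted to a String. The sketch below shows one way such helpers could behave. The class `DoubleFormatSketch`, the half-up rounding mode, the retained trailing zeros, and the left padding with spaces are all assumptions, so the real methods may format values differently.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical illustration; not part of the documented Cut class. */
final class DoubleFormatSketch {

    /** Rounds value to afterDecimalPoint decimals (half-up) and returns plain notation. */
    static String doubleToString(double value, int afterDecimalPoint) {
        // BigDecimal.valueOf throws NumberFormatException for NaN and infinities.
        return BigDecimal.valueOf(value)
                .setScale(afterDecimalPoint, RoundingMode.HALF_UP)
                .toPlainString();
    }

    /** Same rounding, then left-pads with spaces until the string is at least width long. */
    static String doubleToString(double value, int width, int afterDecimalPoint) {
        String rounded = doubleToString(value, afterDecimalPoint);
        StringBuilder padded = new StringBuilder();
        for (int i = rounded.length(); i < width; i++) {
            padded.append(' ');
        }
        return padded.append(rounded).toString();
    }
}
```

Under these assumptions, `doubleToString(3.14159, 2)` gives `"3.14"`, and `doubleToString(3.14159, 8, 2)` gives the same digits preceded by four spaces.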