public class Ripper
extends java.lang.Object
Modifier and Type | Field and Description |
---|---|
static int |
A
Flag ('Accuracy' metric)
|
static int |
W
Flag ('Worth' metric)
|
Constructor and Description |
---|
Ripper(parseParameters parameters)
It reads the data from the input files (training, validation and test) and parses all the parameters
from the parameters array.
|
Modifier and Type | Method and Description |
---|---|
void |
execute()
It launches the algorithm.
|
MyDataset |
getData()
Returns the training dataset.
|
Rule |
grow(MyDataset data,
Mask positives,
Mask negatives)
It grows a rule by maximizing the following heuristic:
h = p * (log(p/t) - log(P/T))
p, t: number of positive and total instances covered by the current rule
P, T: number of positive and total instances
|
Rule |
grow(Rule rule,
MyDataset data,
Mask grow_pos,
Mask grow_neg)
It expands a rule by greedily adding simple rules that maximize the following heuristic:
h = p * (log(p/t) - log(P/T))
p, t: number of positive and total instances covered by the current rule
P, T: number of positive and total instances
|
Ruleset |
IREPstar(Ruleset rules,
MyDataset data,
Mask pos,
Mask neg)
It implements Ripper2's build phase:
it iteratively grows and prunes rules until the description length (DL) of the ruleset
and examples is 64 bits greater than the smallest DL found so far, or there are
no positive examples left, or the error rate is >= 50%.
|
Ruleset |
optimize(Ruleset rules,
MyDataset data,
Mask positives,
Mask negatives)
It implements Ripper2's optimization phase:
after generating the initial ruleset {Ri},
it generates and prunes two variants of each rule Ri from randomized data
using the grow and prune methods.
|
Rule |
prune(Rule rule,
MyDataset data,
Mask positives,
Mask negatives,
int metric)
It prunes a rule according to one of two heuristics:
W = (p + 1) / (t + 2)
A = (p + n') / T
p, t: number of positive and total instances covered by the current rule
n': number of negative instances not covered by the current rule (true negatives)
T: total number of instances
|
Ruleset |
ripperK(MyDataset data,
Mask positives,
Mask negatives)
It implements the Ripper2 algorithm:
1.
|
Ruleset[] |
ripperMulticlass(MyDataset data)
It implements the Ripperk algorithm itself:
1.
|
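The grow and prune heuristics listed in the method summary can be written out as plain arithmetic. The sketch below is illustrative only: the class `RipperHeuristics` and its method names are hypothetical and are not part of the documented `Ripper` API, the natural logarithm is assumed because the summary does not state a base, and returning 0 when the rule covers no positives is an assumed convention.

```java
/** Hypothetical helper for illustration; not part of the documented Ripper class. */
final class RipperHeuristics {

    private RipperHeuristics() {
        // Static helpers only.
    }

    /**
     * Grow heuristic: h = p * (log(p/t) - log(P/T)).
     * p, t: positive and total instances covered by the current rule.
     * P, T: positive and total instances.
     * The natural logarithm is an assumption; the source does not give the base.
     */
    static double growGain(int p, int t, int P, int T) {
        if (p == 0) {
            // Assumed convention: avoids 0 * log(0), which would evaluate to NaN.
            return 0.0;
        }
        return p * (Math.log((double) p / t) - Math.log((double) P / T));
    }

    /** Prune metric W = (p + 1) / (t + 2). */
    static double worth(int p, int t) {
        return (p + 1.0) / (t + 2.0);
    }

    /**
     * Prune metric A = (p + n') / T, where n' counts the negative
     * instances that the rule does not cover (true negatives).
     */
    static double accuracy(int p, int trueNegatives, int T) {
        return (double) (p + trueNegatives) / T;
    }
}
```

Under these assumptions, a rule covering p = 4 positives out of t = 5 instances, with P = 10 and T = 20, has a gain of 4 * ln(1.6), roughly 1.88, and a W value of 5/7.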
public static int W

public static int A

public Ripper(parseParameters parameters)
parameters - parseParameters It contains the input files, output files and parameters

public void execute()

public Rule grow(MyDataset data, Mask positives, Mask negatives)
data - MyDataset the dataset
positives - Mask active positive entries
negatives - Mask active negative entries

public Rule grow(Rule rule, MyDataset data, Mask grow_pos, Mask grow_neg)
rule - Rule the base rule
data - MyDataset the dataset
grow_pos - Mask active positive entries
grow_neg - Mask active negative entries

public Rule prune(Rule rule, MyDataset data, Mask positives, Mask negatives, int metric)
rule - Rule the rule to prune
data - MyDataset the dataset
positives - Mask active positive entries
negatives - Mask active negative entries
metric - int heuristic's selector (A or W)

public Ruleset[] ripperMulticlass(MyDataset data)
data - MyDataset the dataset

public Ruleset IREPstar(Ruleset rules, MyDataset data, Mask pos, Mask neg)
rules - Ruleset the rules generated so far
data - MyDataset the dataset
pos - Mask active positive entries of data
neg - Mask active negative entries of data

public Ruleset optimize(Ruleset rules, MyDataset data, Mask positives, Mask negatives)
rules - Ruleset the rules from the build phase
data - MyDataset the dataset
positives - Mask active positive entries
negatives - Mask active negative entries

public Ruleset ripperK(MyDataset data, Mask positives, Mask negatives)
data - MyDataset the dataset
positives - Mask active positive entries
negatives - Mask active negative entries

public MyDataset getData()
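
The three stopping conditions given for IREPstar in the method summary can be expressed as a single predicate. This is a minimal sketch, not the class's actual code: `BuildPhaseStop`, `shouldStop` and all parameter names are hypothetical, the description lengths are assumed to be computed elsewhere and passed in as bit counts, and treating the 64-bit margin as a strict comparison is an assumption.

```java
/** Hypothetical helper for illustration; not part of the documented Ripper class. */
final class BuildPhaseStop {

    /** Margin, in bits, over the smallest description length seen so far. */
    static final double DL_MARGIN_BITS = 64.0;

    private BuildPhaseStop() {
        // Static helper only.
    }

    /**
     * Returns true when the build phase should stop adding rules.
     *
     * @param currentDl          DL of the ruleset and examples, in bits (computed by the caller)
     * @param smallestDl         smallest DL found so far, in bits
     * @param remainingPositives positive examples not yet covered
     * @param errorRate          error rate of the latest rule, in [0, 1]
     */
    static boolean shouldStop(double currentDl, double smallestDl,
                              int remainingPositives, double errorRate) {
        // Assumption: "64 bits greater" is read as strictly more than 64 bits.
        boolean dlTooLarge = currentDl > smallestDl + DL_MARGIN_BITS;
        boolean noPositivesLeft = remainingPositives == 0;
        boolean ruleTooInaccurate = errorRate >= 0.5;
        return dlTooLarge || noPositivesLeft || ruleTooInaccurate;
    }
}
```

A caller would evaluate this after each grow-and-prune step and keep the ruleset built so far once it returns true.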