NoiseFramework
This framework proposes two Big Data preprocessing approaches for removing noisy examples: a homogeneous ensemble filter (HME_BD) and a heterogeneous ensemble filter (HTE_BD). A simple filtering approach based on similarities between instances (ENN_BD) is also implemented.
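To make the idea of similarity-based filtering concrete, here is a minimal sketch in plain Scala, assuming ENN_BD follows the classic Edited Nearest Neighbour rule: an instance is dropped when its label disagrees with the majority label of its k nearest neighbours. This is not the package's Spark implementation; the names EnnSketch, Instance, sqDist and ennFilter are hypothetical and exist only for this illustration.

```scala
// Illustrative only: plain Scala collections, NOT the package's Spark code.
// All names here (EnnSketch, Instance, sqDist, ennFilter) are hypothetical.
object EnnSketch {
  // A labelled instance: numeric features plus a class label.
  final case class Instance(features: Vector[Double], label: Int)

  // Squared Euclidean distance between two feature vectors of equal length.
  def sqDist(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Keep an instance only if its label matches the majority label among
  // its k nearest neighbours (the instance itself is excluded).
  // Ties in the vote are resolved arbitrarily, so an odd k is advisable.
  def ennFilter(data: Vector[Instance], k: Int): Vector[Instance] = {
    val indexed = data.zipWithIndex
    indexed.filter { case (inst, i) =>
      val neighbourLabels = indexed
        .filter { case (_, j) => j != i }
        .sortBy { case (other, _) => sqDist(inst.features, other.features) }
        .take(k)
        .map { case (other, _) => other.label }
      // With no neighbours there is nothing to vote on, so keep the instance.
      neighbourLabels.isEmpty ||
        (neighbourLabels.groupBy(identity).maxBy(_._2.size)._1 == inst.label)
    }.map { case (inst, _) => inst }
  }
}
```

Calling EnnSketch.ennFilter(data, 3) on a small dataset returns the same instances minus those outvoted by their neighbours. The sketch compares every pair of instances, which is quadratic in the dataset size; a Big Data implementation would need to distribute or approximate the neighbour search.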
Use
To include this package in your Spark application via spark-shell or PySpark, pass it with the --packages option:
$SPARK_HOME/bin/spark-shell --packages djgarcia:NoiseFramework:1.2
where $SPARK_HOME is the path to your Spark installation.
Release
Latest version: 1.2 / Date: 2018-04-18 / Scala version: 2.11