DiReliefF

Feature selection (FS) is a key step in machine learning: removing irrelevant and redundant features usually reduces the effort required to process a dataset while maintaining or even improving the accuracy of the learning algorithm. However, traditional FS algorithms lack the scalability needed to deal with the growing volumes of data available in the current Big Data era. ReliefF is one of the most important FS algorithms and has been applied successfully in many settings. DiReliefF is a completely redesigned, distributed version of the popular ReliefF algorithm, built on the Apache Spark cluster computing model.
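For readers unfamiliar with the Relief family, the following is a minimal single-machine sketch of the basic Relief weight update, which ReliefF generalizes. It is not the DiReliefF implementation and does not use Spark. It assumes numeric features and one nearest hit and one nearest miss per instance, and it leaves out ReliefF's k-nearest-neighbour averaging, class-prior weighting, and missing-value handling. The names `ReliefSketch`, `Instance`, `diff`, `distance`, and `weights` are hypothetical and chosen only for illustration.

```scala
object ReliefSketch {
  // One labelled instance: a numeric feature vector plus a class label.
  final case class Instance(features: Array[Double], label: Int)

  // Difference on feature j, scaled to [0, 1] by the feature's range.
  // Constant features (range 0) contribute nothing.
  def diff(j: Int, a: Instance, b: Instance, range: Array[Double]): Double =
    if (range(j) == 0.0) 0.0
    else math.abs(a.features(j) - b.features(j)) / range(j)

  // Manhattan distance over the range-normalized differences.
  def distance(a: Instance, b: Instance, range: Array[Double]): Double =
    a.features.indices.map(j => diff(j, a, b, range)).sum

  // Basic Relief: every instance is used once as the reference instance.
  def weights(data: Seq[Instance]): Array[Double] = {
    val numFeatures = data.head.features.length
    val range = Array.tabulate(numFeatures) { j =>
      val column = data.map(_.features(j))
      column.max - column.min
    }
    val w = Array.fill(numFeatures)(0.0)
    for ((inst, i) <- data.zipWithIndex) {
      val others = data.zipWithIndex.collect { case (o, k) if k != i => o }
      val (same, different) = others.partition(_.label == inst.label)
      if (same.nonEmpty && different.nonEmpty) {
        val hit = same.minBy(o => distance(inst, o, range))       // nearest same-class neighbour
        val miss = different.minBy(o => distance(inst, o, range)) // nearest other-class neighbour
        // Reward features that differ on the miss, penalize those that differ on the hit.
        for (j <- 0 until numFeatures)
          w(j) += (diff(j, inst, miss, range) - diff(j, inst, hit, range)) / data.size
      }
    }
    w
  }
}
```

On a toy dataset where the first feature separates the two classes and the second is noise, `ReliefSketch.weights` gives the first feature the larger weight. The nearest-neighbour search for every instance is the expensive part of this loop, and that cost is what makes a distributed version attractive for large datasets.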

Status

Use

To run DiReliefF, add the following lines to the project's .sbt build file:

name := "spark-relieff"
version := "0.1.0"
organization := "rauljosepalma"
scalaVersion := "2.10.5"
val sparkVersion = "1.6.0"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion)

Release

The latest version of DiReliefF is:

Reference

Palma-Mendoza, Raul-Jose, Daniel Rodriguez, and Luis de-Marcos. "Distributed ReliefF-based feature selection in Spark." Knowledge and Information Systems (2018): 1-20.
