This is a subproject for better file handling with mlr and OpenML.
Please install the CRAN release in the usual way, with install.packages("farff"). If you absolutely have to install the development version from GitHub (you should not):
devtools::install_github("mlr-org/farff")
ARFF files are like CSV files, with a small header of added meta information (such as column names and types) and standardized NA values. They are often used for machine learning data sets and were introduced for the WEKA machine learning toolbox, which is written in Java.
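To make the format concrete, here is a minimal sketch that writes a tiny ARFF file using only base R. The relation name, attribute names, and values are made up for illustration; `?` is the ARFF marker for a missing value.

```r
# A tiny, made-up ARFF file: header lines start with '@',
# and the data rows follow the @data line.
arff_lines = c(
  "@relation toy",
  "@attribute sepal_length numeric",
  "@attribute species {setosa,versicolor}",
  "@data",
  "5.1,setosa",
  "?,versicolor"  # '?' marks a missing value (NA)
)

# Write the lines to a temporary file and print it back.
toy_path = tempfile(fileext = ".arff")
writeLines(arff_lines, toy_path)
cat(readLines(toy_path), sep = "\n")
```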
Several reasons motivated the development of farff:
- The Java dependency of RWeka is annoying.
- The I/O code in RWeka is pretty slow; reading files, at least, is much faster in farff.
library(farff)
# export a data frame to an ARFF file
writeARFF(iris, path = "iris.arff")
# import the ARFF file back as a data frame
d = readARFF("iris.arff")
- We read the ARFF header with pure R code.
- We preprocess the data section a bit with custom C code and write the result into a temporary file TEMP.
- The TEMP file, i.e., the data section, is parsed with readr::read_delim. Support for data.table::fread is planned for future releases.
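The steps above can be roughly illustrated in base R. This is only a sketch and not farff's actual code: `parse_arff_header` is a hypothetical helper, it handles unquoted attribute names only, and `read.csv` stands in for the C preprocessing and `readr::read_delim` steps.

```r
# Hypothetical helper (not part of farff): parse an ARFF header in pure R.
parse_arff_header = function(lines) {
  lines = trimws(lines)
  # The header ends at the (case-insensitive) @data line.
  data_idx = which(tolower(lines) == "@data")[1]
  if (is.na(data_idx)) stop("no @data line found")
  header = lines[seq_len(data_idx - 1L)]
  # Keep only the @attribute declarations; comments ('%') and @relation are skipped.
  pattern = "^@attribute[[:space:]]+([^[:space:]]+)[[:space:]]+(.*)$"
  attr_lines = header[grepl(pattern, header, ignore.case = TRUE)]
  parts = regmatches(attr_lines, regexec(pattern, attr_lines, ignore.case = TRUE))
  list(
    col.names = vapply(parts, function(p) p[2], character(1)),
    col.types = vapply(parts, function(p) p[3], character(1)),
    data.start = data_idx + 1L  # index of the first data row
  )
}

# Made-up example input.
lines = c(
  "% toy example",
  "@relation toy",
  "@attribute sepal_length numeric",
  "@attribute species {setosa,versicolor}",
  "@data",
  "5.1,setosa",
  "?,versicolor"
)
h = parse_arff_header(lines)

# Hand the data section to a CSV reader, treating '?' as NA.
rows = lines[h$data.start:length(lines)]
d = read.csv(text = rows, header = FALSE, col.names = h$col.names, na.strings = "?")
```

Separating the header from the data section this way is what lets a fast, general-purpose delimited-file reader do the heavy lifting on the data rows.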