zlibsvm is an object-oriented, easy to use and simple Java binding for the famous LIBSVM library hosted on GitHub.
It encapsulates the cross-compiled Java code from LIBSVM behind an object-oriented API which can be easily used via Apache Maven in your own projects.
To use the latest release of zlibsvm, add the following dependency to your pom.xml:
<dependency>
    <groupId>de.hs-heilbronn.mi</groupId>
    <artifactId>zlibsvm-core</artifactId>
    <version>2.1.0</version>
</dependency>
A code example can be found here.
The dataset format for LIBSVM is:
label feature_id1:feature_value1 feature_id2:feature_value2 ...
Thus, every feature (or value) needs its own unique identifier.
For three different class labels 1, 2, 3 and a feature set consisting of a (id=1), b (id=2), c (id=3), a valid data representation for three data points d1, d2, d3 would be:
2 1:0.5325 3:0.523
3 2:0.7853 3:0.6326
1 1:0.53265 2:0.5422
Meaning:
d1 contains features a (id=1) and c (id=3)
d2 contains features b (id=2) and c (id=3)
d3 contains features a (id=1) and b (id=2)
Note that it is not necessary to provide a feature_id:feature_value pair for features that are not contained in the given data point.
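To make the format concrete, the following is a minimal sketch of parsing one such line into a label and a sparse feature map. The class LibsvmLineParser and the record DataPoint are hypothetical helpers written for this illustration and are not part of zlibsvm or LIBSVM. The sketch assumes integer class labels, as in the example above.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of zlibsvm): parses one line of the LIBSVM text format. */
final class LibsvmLineParser {

    /** A parsed data point: class label plus sparse features keyed by feature id. */
    record DataPoint(int label, Map<Integer, Double> features) {}

    static DataPoint parse(String line) {
        // Tokens are separated by whitespace; the first token is the label.
        String[] tokens = line.trim().split("\\s+");
        int label = Integer.parseInt(tokens[0]);

        // TreeMap keeps the features ordered by ascending feature id.
        Map<Integer, Double> features = new TreeMap<>();
        for (int i = 1; i < tokens.length; i++) {
            String[] pair = tokens[i].split(":", 2);
            if (pair.length != 2) {
                throw new IllegalArgumentException("Expected feature_id:feature_value, got: " + tokens[i]);
            }
            features.put(Integer.parseInt(pair[0]), Double.parseDouble(pair[1]));
        }
        return new DataPoint(label, features);
    }

    public static void main(String[] args) {
        // d1 from the example above: label 2, features a (id=1) and c (id=3)
        DataPoint d1 = parse("2 1:0.5325 3:0.523");
        System.out.println(d1.label() + " -> " + d1.features());
    }
}
```

Feature b (id=2) simply does not appear in the resulting map for d1, which mirrors the sparse representation described above.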
First of all, you need to implement a custom SvmDocument and a custom SvmFeature, which could look like this:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
// The zlibsvm imports for SvmDocument, SvmFeature and SvmClassLabel are omitted here.

public class SvmDocumentImpl implements SvmDocument {

    private final List<SvmFeature> features;
    // Class labels attached to this document
    private final List<SvmClassLabel> classLabels = new ArrayList<>();

    public SvmDocumentImpl(List<SvmFeature> features) {
        this.features = features;
    }

    public List<SvmFeature> getSvmFeatures() {
        return features;
    }

    public SvmClassLabel getClassLabelWithHighestProbability() {
        // No class label has been added yet
        if (classLabels.isEmpty()) {
            return null;
        }
        return Collections.max(classLabels);
    }

    @Override
    public List<SvmClassLabel> getAllClassLabels() {
        return Collections.unmodifiableList(classLabels);
    }

    @Override
    public void addClassLabel(SvmClassLabel classLabel) {
        assert (classLabel != null);
        this.classLabels.add(classLabel);
    }
}
public record SvmFeatureImpl(int index, double value) implements SvmFeature {

    public int getIndex() {
        return index;
    }

    public double getValue() {
        return value;
    }

    // Features are ordered by ascending index
    @Override
    public int compareTo(SvmFeature o) {
        return Integer.compare(getIndex(), o.getIndex());
    }
}
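Since features that are absent from a data point do not need to be listed, a feature list can be built sparsely. The following is a rough sketch of turning a dense value array into an index-sorted sparse list. It uses a local stand-in record Feature and a hypothetical helper class SparseFeatures, neither of which is part of zlibsvm. Treating a value of 0.0 as "feature not present" and using 1-based feature ids are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of zlibsvm): builds a sparse feature list from a dense vector. */
final class SparseFeatures {

    /** Local stand-in for an index/value feature, used only in this sketch. */
    record Feature(int index, double value) {}

    /**
     * Converts a dense vector into sparse features.
     * Assumption: position i in the array maps to feature id i + 1,
     * and a value of 0.0 means the feature is not present.
     */
    static List<Feature> fromDense(double[] dense) {
        List<Feature> features = new ArrayList<>();
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0.0) {
                features.add(new Feature(i + 1, dense[i]));
            }
        }
        // Iterating in array order already yields ascending feature ids.
        return features;
    }

    public static void main(String[] args) {
        // Dense form of d1 from the dataset example: a=0.5325, b absent, c=0.523
        System.out.println(fromDense(new double[] {0.5325, 0.0, 0.523}));
    }
}
```

In a real project, the same loop could create SvmFeatureImpl instances instead of the stand-in record and pass the resulting list to the SvmDocumentImpl constructor.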
To obtain an SvmModel, the SVM needs to be trained. The training is configured via an SvmConfigurationImpl.Builder(), which is used to specify your custom SVM configuration.
The default configuration is the same as described here: C_SVC, RBF kernel with gamma 0 and cost 1.
SvmTrainer trainer = new SvmTrainerImpl(new SvmConfigurationImpl.Builder().build(), "my-custom-trained-model");
SvmModel model = trainer.train(documentsForTraining);
After this step, the SvmModel can be used for prediction:
SvmClassifier classifier = new SvmClassifierImpl(model);
List<SvmDocument> classified = classifier.classify(documentsForPrediction, true);
for (SvmDocument d : classified) {
    System.out.println(d.toString() + " was classified as category: " + d.getClassLabelWithHighestProbability().getNumeric());
}
Published under Apache License 2.0