Transpile trained scikit-learn estimators to C, Java, JavaScript and others.
It's recommended for resource-constrained embedded systems and critical applications where performance matters most.
This table gives an overview of all supported combinations of estimators, programming languages and templates.
Programming language | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
C | Go | Java | JS | PHP | Ruby | |||||||||||||
svm.SVC | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | |
svm.NuSVC | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | |
svm.LinearSVC | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | ✓ | ✓ | × | |
tree.DecisionTreeClassifier | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | |
ensemble.RandomForestClassifier | × | ✓ᴾ | × | × | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | × | ||||||
ensemble.ExtraTreesClassifier | × | ✓ᴾ | × | × | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | ✓ᴾ | × | ||||||
ensemble.AdaBoostClassifier | × | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | ✓ᴾ | ||||||||||||
neighbors.KNeighborsClassifier | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | |||
naive_bayes.BernoulliNB | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | ||||||||||||
naive_bayes.GaussianNB | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | ||||||||||||
neural_network.MLPClassifier | ✓ᴾ | ✓ᴾ | × | ✓ᴾ | ✓ᴾ | × | ||||||||||||
neural_network.MLPRegressor | ✓ | ✓ | × | |||||||||||||||
ᴀ | ᴇ | ᴄ | ᴀ | ᴇ | ᴄ | ᴀ | ᴇ | ᴄ | ᴀ | ᴇ | ᴄ | ᴀ | ᴇ | ᴄ | ᴀ | ᴇ | ᴄ | |
Template |
✓ = support of predict, ᴾ = support of predict_proba, × = not supported or feasible
ᴀ = attached model data, ᴇ = exported model data (JSON), ᴄ = combined model data
Purpose | Branch | Command |
---|---|---|
Production | stable | pip install sklearn-porter |
Development | master | pip install https://github.com/nok/sklearn-porter/zipball/master |
In both environments the only prerequisite is scikit-learn >= 0.17, <= 0.22.
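The version constraint above can be checked programmatically before porting. The following is a minimal sketch and not part of sklearn-porter: the helper name `is_supported_sklearn` is hypothetical, and it assumes that only the major and minor components are compared, so that a patch release such as 0.22.2 still counts as within range.

```python
import re

# Bounds taken from the stated prerequisite: scikit-learn >= 0.17, <= 0.22.
MIN_VERSION = (0, 17)
MAX_VERSION = (0, 22)


def is_supported_sklearn(version):
    """Return True if `version` (e.g. '0.21.3') lies within the stated range.

    Hypothetical helper: only the major and minor parts are compared.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major_minor = (int(match.group(1)), int(match.group(2)))
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    for candidate in ("0.16.1", "0.17", "0.21.3", "0.22.2", "0.23.0"):
        print(candidate, is_supported_sklearn(candidate))
```

In a real environment the version string would typically come from `sklearn.__version__`.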
Try it out yourself by starting an interactive notebook with Binder:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn_porter import port, save, make, test
# 1. Load data and train a dummy classifier:
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier()
clf.fit(X, y)
# 2. Port or transpile an estimator:
output = port(clf, language='js', template='attached')
print(output)
# 3. Save the ported estimator:
src_path, json_path = save(clf, language='js', template='exported', directory='/tmp')
print(src_path, json_path)
# 4. Make predictions with the ported estimator:
y_classes, y_probas = make(clf, X[:10], language='js', template='exported')
print(y_classes, y_probas)
# 5. Always test the ported estimator by making an integrity check:
score = test(clf, X[:10], language='js', template='exported')
print(score)
The same workflow using the object-oriented Estimator class:
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn_porter import Estimator
# 1. Load data and train a dummy classifier:
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier()
clf.fit(X, y)
# 2. Port or transpile an estimator:
est = Estimator(clf, language='js', template='attached')
output = est.port()
print(output)
# 3. Save the ported estimator:
est.template = 'exported'
src_path, json_path = est.save(directory='/tmp')
print(src_path, json_path)
# 4. Make predictions with the ported estimator:
y_classes, y_probas = est.make(X[:10])
print(y_classes, y_probas)
# 5. Always test the ported estimator by making an integrity check:
score = est.test(X[:10])
print(score)
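The integrity check in step 5 compares the ported estimator with the original one. What the returned score measures is not stated here; one plausible reading is the fraction of samples on which both sides predict the same class. The sketch below illustrates that idea under this assumption. The function `agreement_score` is hypothetical and is not the library's implementation.

```python
def agreement_score(original_predictions, ported_predictions):
    """Fraction of samples where both prediction sequences agree (0.0 to 1.0)."""
    original = list(original_predictions)
    ported = list(ported_predictions)
    if len(original) != len(ported):
        raise ValueError("prediction sequences differ in length")
    if not original:
        raise ValueError("no predictions to compare")
    # Count the positions at which both sides predicted the same label.
    matches = sum(1 for a, b in zip(original, ported) if a == b)
    return matches / len(original)


if __name__ == "__main__":
    # Illustrative labels only; not output from a real estimator.
    python_side = [0, 0, 1, 2, 2]
    ported_side = [0, 0, 1, 2, 1]
    print(agreement_score(python_side, ported_side))
```

A score below 1.0 under this reading would mean the ported code disagrees with scikit-learn on at least one sample and should be inspected before use.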
In addition, you can use sklearn-porter on the command line. The command is called porter and is available after installation.
porter {show,port,save} [-h] [-v]
porter show [-l {c,go,java,js,php,ruby}] [-h]
porter port <estimator> [-l {c,go,java,js,php,ruby}]
[-t {attached,combined,exported}]
[--skip-warnings] [-h]
porter save <estimator> [-l {c,go,java,js,php,ruby}]
[-t {attached,combined,exported}]
[--directory DIRECTORY]
[--skip-warnings] [-h]
You can serialize an estimator and save it locally. For more details you can read the instructions to model persistence.
from joblib import dump
dump(clf, 'estimator.joblib', compress=0)
After that the estimator can be transpiled by using the subcommand port:
porter port estimator.joblib -l js -t attached > estimator.js
For further processing you can pass the result to another application, e.g. UglifyJS.
porter port estimator.joblib -l js -t attached | uglifyjs --compress -o estimator.min.js
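The same command can be driven from a Python build script. The sketch below only uses the options listed in the usage text above (`-l`, `-t`, `--skip-warnings`); the helper names `build_port_command` and `port_to_file` and their default values are assumptions made for illustration and are not part of sklearn-porter.

```python
import subprocess

# Choices copied from the command-line usage text.
LANGUAGES = ("c", "go", "java", "js", "php", "ruby")
TEMPLATES = ("attached", "combined", "exported")


def build_port_command(estimator_path, language="js", template="attached",
                       skip_warnings=False):
    """Build the argument list for `porter port` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError("unsupported language: %r" % language)
    if template not in TEMPLATES:
        raise ValueError("unsupported template: %r" % template)
    command = ["porter", "port", estimator_path, "-l", language, "-t", template]
    if skip_warnings:
        command.append("--skip-warnings")
    return command


def port_to_file(estimator_path, output_path, **kwargs):
    """Run `porter port` and write its standard output to `output_path`."""
    command = build_port_command(estimator_path, **kwargs)
    with open(output_path, "w") as handle:
        # check=True raises CalledProcessError if the command fails.
        subprocess.run(command, stdout=handle, check=True)


if __name__ == "__main__":
    print(" ".join(build_port_command("estimator.joblib")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.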
- In some rare cases the regression tests of the support vector machines, SVC and NuSVC, have failed since scikit-learn>=0.22. Because of that, a QualityWarning will be raised, which should remind you to evaluate the result by using the test method.
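Warnings of this kind can be collected with Python's standard `warnings` module so that a script can react to them. The sketch below is self-contained and uses stand-ins: the local `QualityWarning` class and the function `port_with_possible_warning` are placeholders defined here for illustration, because the import path of the library's own warning class is not given in this document.

```python
import warnings


class QualityWarning(UserWarning):
    """Stand-in for the library's warning class (assumption, defined locally)."""


def port_with_possible_warning():
    """Placeholder for a porting call that may emit a QualityWarning."""
    warnings.warn("estimator should be verified", QualityWarning)
    return "ported source"


def port_and_collect_warnings():
    """Run the porting step and return (result, list of quality warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of applying the default filters.
        warnings.simplefilter("always")
        result = port_with_possible_warning()
    quality = [w for w in caught if issubclass(w.category, QualityWarning)]
    return result, quality


if __name__ == "__main__":
    source, quality_warnings = port_and_collect_warnings()
    if quality_warnings:
        print("Quality warning raised; run the integrity test before deploying.")
```

With the real package, the placeholder function would be replaced by the actual porting call and the stand-in class by the library's own warning class.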
The following commands are useful time savers in daily development:
# Install a Python environment with `conda`:
make setup
# Start a Jupyter notebook with examples:
make notebook
# Start tests on the host or in a separate docker container:
make tests
make tests-docker
# Lint the source code with `pylint`:
make lint
# Generate notebooks with `jupytext`:
make examples
# Deploy a new version with `twine`:
make deploy
The prerequisite is Python 3.6 which you can install with conda:
conda create -n sklearn-porter_3.6 python=3.6
conda activate sklearn-porter_3.6
After that you have to install all required packages:
pip install --no-cache-dir -e ".[development,examples]"
All tests run against these combinations of scikit-learn and Python versions:
scikit-learn | Python 3.5 | Python 3.6 | Python 3.7 | Python 3.8 |
---|---|---|---|---|
0.17 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | not supported by scikit-learn | not supported by scikit-learn |
0.18 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | cython 0.27.3, numpy 1.9.3, scipy 0.16.0 | not supported by scikit-learn | not supported by scikit-learn |
0.19 | cython 0.27.3, numpy 1.14.5, scipy 1.1.0 | cython 0.27.3, numpy 1.14.5, scipy 1.1.0 | not supported by scikit-learn | not supported by scikit-learn |
0.20 | cython 0.27.3, numpy, scipy | cython 0.27.3, numpy, scipy | cython 0.27.3, numpy, scipy | not supported by joblib |
0.21 | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy |
0.22 | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy | cython, numpy, scipy |
For the regression tests we have to use specific compilers and interpreters. As of 19 November 2019, the following compilers and interpreters are used for these tests:
Name | Source | Version |
---|---|---|
GCC | https://gcc.gnu.org | 9.2.1 |
Go | https://golang.org | 1.13.4 |
Java (OpenJDK) | https://openjdk.java.net | 1.8.0 |
Node.js | https://nodejs.org | 10.17.0 |
PHP | https://www.php.net | 7.3.11 |
Ruby | https://www.ruby-lang.org | 2.5.7 |
Please note that in general you can use older compilers and interpreters with the generated source code. For instance, you can use Java 1.6 to compile and run models.
You can activate logging by changing the option logging.level.
from sklearn_porter import options
from logging import DEBUG
options['logging.level'] = DEBUG
You can run the unit and regression tests either on your local machine (host) or in a separate running Docker container.
pytest tests -v \
--cov=sklearn_porter \
--disable-warnings \
--numprocesses=auto \
-p no:doctest \
-o python_files="EstimatorTest.py" \
-o python_functions="test_*"
docker build \
-t sklearn-porter \
--build-arg PYTHON_VER=${PYTHON_VER:-python=3.6} \
--build-arg SKLEARN_VER=${SKLEARN_VER:-scikit-learn=0.21} \
.
docker run \
-v $(pwd):/home/abc/repo \
--detach \
--entrypoint=/bin/bash \
--name test \
-t sklearn-porter
docker exec -it test ./docker-entrypoint.sh \
pytest tests -v \
--cov=sklearn_porter \
--disable-warnings \
--numprocesses=auto \
-p no:doctest \
-o python_files="EstimatorTest.py" \
-o python_functions="test_*"
docker rm -f $(docker ps --all --filter name=test -q)
If you use this implementation in your work, please add a reference/citation to the paper. You can use the following BibTeX entry:
@unpublished{sklearn_porter,
author = {Darius Morawiec},
title = {sklearn-porter},
note = {Transpile trained scikit-learn estimators to C, Java, JavaScript and others},
url = {https://github.com/nok/sklearn-porter}
}
The package is Open Source Software released under the MIT license.