Benchmarks

All benchmarks reported here were run on an Intel i7-7820X CPU. GPU benchmarks were run on an NVIDIA GTX 1080 Ti.

Spark Comparison

The benchmark_spark.py script compares the AlternatingLeastSquares model in implicit to the implementation in Spark MLlib.

To run this comparison, you should first compile Spark with native BLAS support.

This benchmark compares the Conjugate Gradient solver in implicit, on both the CPU and the GPU, to the Cholesky solver used in Spark.
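To make the difference between the two solver families concrete, here is a minimal pure-Python sketch that solves one small symmetric positive-definite system both ways. It is an illustration only, not the code used by implicit or Spark; the function names `cholesky_solve` and `conjugate_gradient` and the example matrix are assumptions made for this sketch. The direct solver factorizes the whole matrix, while the iterative one refines an estimate step by step and can stop early with an approximate answer.

```python
import math

def cholesky_solve(A, b):
    """Solve A x = b for symmetric positive-definite A via A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    # Forward substitution: L y = b
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def conjugate_gradient(A, b, steps=3, tol=1e-12):
    """Approximately solve A x = b with at most `steps` CG iterations."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x, starting from x = 0
    p = list(r)                      # current search direction
    rs_old = sum(v * v for v in r)
    for _ in range(steps):
        if rs_old < tol:             # already converged
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x

# Hypothetical 3x3 symmetric positive-definite system, for illustration only.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]

exact = cholesky_solve(A, b)
approx = conjugate_gradient(A, b, steps=3)
print(exact)
print(approx)
```

On a 3x3 system, three Conjugate Gradient steps already agree with the direct solution up to rounding error. With fewer steps the result is only an approximation.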

The reported times per iteration are averages over 5 iterations.
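The averaging described above can be sketched as follows. This is a minimal illustration, not the timing code from benchmark_spark.py; `average_iteration_time` and `dummy_iteration` are hypothetical names, and the dummy workload stands in for one ALS training iteration.

```python
import time

def average_iteration_time(run_iteration, iterations=5):
    """Return the mean wall-clock seconds per call of run_iteration."""
    elapsed = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_iteration()
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)

# Stand-in workload; a real benchmark would run one training iteration here.
def dummy_iteration():
    sum(i * i for i in range(10_000))

avg = average_iteration_time(dummy_iteration, iterations=5)
print(f"{avg:.6f} s per iteration")
```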

last.fm 360k dataset

For the last.fm dataset at 256 factors, implicit on the CPU is 30x faster than Spark, and the GPU version of implicit is 93x faster than Spark (see spark_speed_lastfm.png).

MovieLens 20M dataset

For the ml20m dataset at 256 factors, implicit on the CPU is 8x faster than Spark, while the GPU version is 68x faster than Spark (see spark_speed_ml20m.png).

Note that for all versions this dataset was filtered down to reviews that were positive (4 stars), to simulate a truly implicit dataset.
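The filtering step can be sketched in a few lines. This is an illustration under assumptions, not the preprocessing used by the benchmark scripts: `to_implicit` is a hypothetical helper, the sample ratings are made up, and the threshold is read here as "at least 4 stars".

```python
def to_implicit(ratings, min_rating=4.0):
    """Keep only positive ratings and drop the rating value itself."""
    return [(user, item) for user, item, rating in ratings if rating >= min_rating]

# Hypothetical (user, item, rating) triples, for illustration only.
ratings = [
    (1, 10, 5.0),
    (1, 11, 2.5),
    (2, 10, 4.0),
    (2, 12, 3.5),
]

positive = to_implicit(ratings)
print(positive)  # [(1, 10), (2, 10)]
```

Dropping the rating value leaves only the fact that an interaction happened, which is what an implicit-feedback model expects as input.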

Implicit on the CPU seems to suffer somewhat here relative to the other options. There may be a single-threaded bottleneck somewhere that is worth examining later.
