Alluxio is an open-source virtual distributed file system (VDFS). It began as the research project "Tachyon" at the University of California, Berkeley's AMPLab, where it was created as part of Haoyuan Li's Ph.D. thesis,[2] advised by Professors Scott Shenker and Ion Stoica. Alluxio sits between computation and storage in the big data analytics stack. It provides a data abstraction layer for computation frameworks, enabling applications to connect to numerous storage systems through a common interface. The software is published under the Apache License.
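
The idea of a common interface over several storage systems can be illustrated with a toy mount table that maps paths in one virtual namespace to URIs in different backing stores. This is only a minimal sketch under stated assumptions: `MountTable`, its methods, and the example URIs are hypothetical names invented for illustration, and the code is not Alluxio's implementation or API.

```python
from dataclasses import dataclass, field


@dataclass
class MountTable:
    """Toy mount table: maps virtual path prefixes to backing-store URIs.

    Hypothetical illustration only; not Alluxio's actual data structure.
    """
    mounts: dict[str, str] = field(default_factory=dict)

    def mount(self, virtual_prefix: str, backing_uri: str) -> None:
        # Normalize trailing slashes so "/logs/" and "/logs" are one mount point.
        key = virtual_prefix.rstrip("/") or "/"
        self.mounts[key] = backing_uri.rstrip("/")

    def resolve(self, virtual_path: str) -> str:
        # Pick the longest mount prefix that covers the requested path.
        best = None
        for prefix in self.mounts:
            if prefix == "/":
                matches = True
            else:
                matches = (virtual_path == prefix
                           or virtual_path.startswith(prefix + "/"))
            if matches and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no mount covers {virtual_path}")
        # Append the part of the path below the mount point to the backing URI.
        remainder = virtual_path if best == "/" else virtual_path[len(best):]
        return self.mounts[best] + remainder
```

With a root mount on one (hypothetical) store and `/logs` mounted on another, an application only ever sees paths such as `/logs/2023/app.log`, while the table decides which backing system actually holds the data.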

Alluxio
Original author(s): Haoyuan Li
Developer(s): UC Berkeley AMPLab
Initial release: April 8, 2013
Stable release: v2.9.3 / March 24, 2023[1]
Repository: https://github.com/Alluxio/alluxio
Written in: Java
Operating system: macOS, Linux
Available in: Java
License: Apache License 2.0
Website: www.alluxio.io

Data-driven applications, such as data analytics, machine learning, and AI, use APIs provided by Alluxio (such as the Hadoop HDFS API, the S3 API, and the FUSE API) to access data from various storage systems at high speed. Popular frameworks running on top of Alluxio include Apache Spark, Presto, TensorFlow, Trino, Apache Hive, and PyTorch.
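
Because a FUSE interface exposes data as an ordinary file system, an application can in principle read such data with plain file I/O and no storage-specific client library. The following is a minimal sketch of that pattern, not code from the Alluxio project: the function name `count_lines` and the mount location used in the usage note are assumptions made for illustration.

```python
from pathlib import Path


def count_lines(mount_root: str, relative_path: str) -> int:
    """Count the lines of a text file located under a mounted directory.

    `mount_root` is assumed to be a directory where a FUSE file system
    is mounted; the function itself uses only standard file operations.
    """
    path = Path(mount_root) / relative_path
    with path.open("r", encoding="utf-8") as handle:
        # Stream the file line by line instead of loading it into memory.
        return sum(1 for _ in handle)
```

For example, if a FUSE mount existed at the hypothetical path `/mnt/alluxio-fuse`, a call such as `count_lines("/mnt/alluxio-fuse", "logs/app.log")` would read the file exactly as it would a local one.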

Alluxio can be deployed on-premises, in the cloud (e.g. Microsoft Azure, AWS, Google Compute Engine), or in a hybrid cloud environment. It can run on bare metal or in containerized environments such as Kubernetes, Docker, and Apache Mesos.

History


Alluxio was started by Haoyuan Li at UC Berkeley's AMPLab in 2013 and open-sourced in 2014. It had more than 1,000 contributors in 2018,[3] making it one of the most active projects in the data ecosystem.

Version  Original release date  Latest version  Release date  Status
0.2      2013-04-08             0.2.1           2013-04-25    Old version, no longer maintained
0.3      2013-10-21             0.3.0           2013-10-21    Old version, no longer maintained
0.4      2014-02-02             0.4.1           2014-02-25    Old version, no longer maintained
0.5      2014-07-20             0.5.0           2014-07-20    Old version, no longer maintained
0.6      2015-03-01             0.6.4           2015-04-23    Old version, no longer maintained
0.7      2015-07-17             0.7.1           2015-08-10    Old version, no longer maintained
0.8      2015-10-21             0.8.2           2015-11-10    Old version, no longer maintained
1.0      2016-02-23             1.0.1           2016-03-27    Old version, no longer maintained
1.1      2016-06-06             1.1.1           2016-07-04    Old version, no longer maintained
1.2      2016-07-17             1.2.0           2016-07-17    Old version, no longer maintained
1.3      2016-10-05             1.3.0           2016-10-05    Old version, no longer maintained
1.4      2017-01-12             1.4.0           2017-01-12    Old version, no longer maintained
1.5      2017-06-11             1.5.0           2017-06-11    Old version, no longer maintained
1.6      2017-09-24             1.6.1           2017-11-02    Old version, no longer maintained
1.7      2018-01-14             1.7.1           2018-03-26    Old version, no longer maintained
1.8      2018-07-07             1.8.2           2019-08-05    Old version, still maintained
2.0      2019-06-27             2.0.1           2019-09-03    Old version, still maintained
2.1      2019-11-06             2.1.2           2020-02-04    Old version, still maintained
2.2      2020-03-11             2.2.2           2020-06-24    Old version, still maintained
2.3      2020-06-30             2.3.0           2020-06-30    Old version, still maintained
2.4      2020-10-19             2.4.1           2020-11-20    Old version, still maintained
2.5      2021-03-10             2.5.0           2021-03-10    Old version, still maintained
2.6      2021-06-23             2.6.2           2021-09-17    Old version, still maintained
2.7      2021-11-16             2.7.4           2022-04-19    Old version, still maintained
2.8      2022-05-04             2.8.1           2022-08-17    Old version, still maintained
2.9      2022-11-16             2.9.3           2023-03-27    Current stable version

Enterprises that use Alluxio


The following is a list of notable enterprises that have used or are using Alluxio:


References

  1. "Releases · Alluxio/alluxio". github.com. Retrieved 2023-03-04.
  2. Li, Haoyuan (7 May 2018). Alluxio: A Virtual Distributed File System (Technical report). EECS Department, University of California, Berkeley. UCB/EECS-2018-29.
  3. Open HUB Alluxio development activity.
  4. "This New Open Source Project Is 100X Faster than Spark SQL In Petabyte-Scale Production".
  5. "Making the Impossible Possible with Tachyon: Accelerate Spark Jobs from Hours to Seconds".
  6. "China Unicom's big bet on open source".
  7. "Operationalizing Machine Learning—Managing Provenance from Raw Data to Predictions".
  8. "Cray Analytics and Alluxio – Wrangling Enterprise Storage". Archived from the original on 2019-07-14. Retrieved 2019-02-19.
  9. "Alluxio's Use and Practice in Didi".
  10. "Data Transformation in Financial Services".
  11. "ArcGIS and Alluxio - Using Alluxio to enhance ArcGIS data capability and get faster insights from all your data".
  12. "Huawei hugs open-sourcey Alluxio: Thanks for the memories". The Register.
  13. "How Alluxio is Accelerating Apache Spark Workloads". Archived from the original on 2019-07-14. Retrieved 2019-02-19.
  14. "Getting Started with Tachyon by Use Cases".
  15. "Using Alluxio as a fault-tolerant pluggable optimization component of JD.com's compute frameworks".
  16. "World's Largest Computer Maker Lenovo Selects Alluxio for Data Management of Worldwide Smartphone Data".
  17. "Enhancing the Value of Alluxio with Samsung NVMe SSDs".
  18. "Tencent Delivering Customized News to Over 100 Million Users per Month with Alluxio".
  19. "The Practice of Alluxio in Near Real-Time Data Platform at VIPShop".
  20. "Bringing Data to Life - Data Management and Visualization Techniques".