A curated list of platforms, tools, practices and resources to create and improve a DevOps culture and an SRE team in your organization.
DevOps is the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity: evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes. This speed enables organizations to better serve their customers and compete more effectively in the market.
- Cloud Platforms
- Open Source Cloud Platforms
- Operating Systems
- Distributed Filesystems
- Applications Platforms
- Container Image Registry
- Automation & Orchestration
- Continuous Integration & Delivery
- Source Code Management
- Web Servers
- SSL
- Databases
- Observability and Monitoring
- Service Discovery & Service Mesh
- Chaos Engineering
- API Gateway
- Code review
- Distributed messaging
- Programming Languages
- Chat and ChatOps
- Secret Management
- Sharing
- VPN
- Resources
Public and Private Cloud Platforms.
- Amazon Web Services (AWS) - Cloud Computing Services.
- Google Cloud Platform (GCP) - Cloud Computing Services.
- Azure - Cloud Computing Platform & Services.
- Alibaba Cloud - Integrated suite of cloud products and services.
- Oracle Cloud - Comprehensive and fully integrated stack of cloud applications and platform services.
- DigitalOcean - Helping developers easily build, test, manage, and scale applications of any size.
- Scaleway - Single way to create, deploy and scale your infrastructure in the cloud.
- Vultr - Easily deploy cloud servers, bare metal, and storage worldwide.
- VMware Cloud - Run, manage, connect and protect all of your apps on any cloud.
- IBM Cloud - Tools, data & APIs to make AI real now.
- StackPath - Platform of computing infrastructure and services built at the edge of the cloud.
- Linode - Cloud hosting built on the idea that virtual computing must be more accessible, affordable, and simple.
- Kinsta - Create and deploy web applications and databases in minutes.
Private, Public and Hybrid open source Cloud Platforms.
- OpenStack - Open source software for creating private and public clouds.
- Apache CloudStack - Designed to deploy and manage large networks of virtual machines.
- OpenNebula - Build Private Clouds and manage Data Center virtualization based on KVM, LXD and VMware.
- Eucalyptus - Building AWS-compatible private and hybrid clouds.
- DC/OS - Distributed operating system based on the Apache Mesos distributed systems kernel.
- Apache Mesos - Program against your datacenter like it’s a single pool of resources.
- LocalStack - Fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline.
Operating Systems - Server Platform.
- Ubuntu - Enterprise Open Source and Linux.
- Rocky Linux - Open-source enterprise operating system designed to be 100% bug-for-bug compatible with Red Hat Enterprise Linux.
- CoreOS - The pioneering lightweight container host.
- OSv - Versatile modular unikernel designed to run unmodified Linux applications securely on micro-VMs in the cloud.
- Atomic - Use immutable infrastructure to deploy and scale your containerized applications.
- Photon - Linux container host optimized for cloud-native applications, cloud platforms, and VMware infrastructure.
Network distributed filesystems.
- Ceph - Highly scalable object, block and file-based storage under one whole system.
- Gluster - Free and open source software scalable network filesystem.
- LINBIT - Create, remove, and replicate block storage devices for datacenter scale environments.
- XtreemFS - Fault-tolerant distributed file system for all storage needs.
- MinIO - High performance, distributed object storage system.
Application management platforms, container platforms and container management.
- OpenShift - The Kubernetes platform for big ideas.
- Dokku - Helps you build and manage the lifecycle of applications.
- Flynn - Open source platform (PaaS) for running applications in production.
- Docker - Create, deploy, and run applications by using containers.
- Docker Compose - Define and run multi-container applications with Docker.
- Docker Swarm - Docker-native clustering system.
- Kubernetes - Automating deployment, scaling, and management of containerized applications.
- LXC - Lets Linux users easily create and manage system or application containers.
- Rancher - Lets you deliver Kubernetes-as-a-Service.
- OpenVZ - Container-based virtualization for Linux.
- Singularity - Run the application from the local environment to the cloud.
- AppScale - Easy-to-manage serverless platform for building and running scalable web and mobile applications.
- Kata Containers - Building lightweight virtual machines that seamlessly plug into the containers ecosystem.
- K3s - The certified Kubernetes distribution built for IoT and Edge computing.
- Podman - A tool for managing OCI containers and pods.
- Linx - General-purpose low-code platform for building and hosting backend solutions.
Container Image registry.
- Quay - Container image registry that enables you to build, organize, distribute, and deploy containers.
- Dockyard - Container & Artifact Repository.
- Harbor - An open source trusted cloud native registry project that stores, signs, and scans content.
Tools for automation, orchestration, deployment, provisioning and configuration management.
- Ansible - Simple IT automation platform that makes your applications and systems easier to deploy.
- Salt - Automate the management and configuration of any infrastructure or application at scale.
- Puppet - Unparalleled infrastructure automation and delivery.
- Chef - Automate infrastructure and applications.
- Juju - Simplifies how you configure, scale and operate today's complex software.
- Rundeck - Runbook Automation For Modernizing Your Operations.
- StackStorm - Connects all your apps, services, and workflows. Automate DevOps your way.
- Bosh - Release engineering, deployment, and lifecycle management of complex distributed systems.
- Cloudify - Connect, Control, & Automate from core to edge: unlimited locations, clouds and devices.
- Tsuru - An extensible and open source Platform as a Service software.
- Fabric - High level Python library designed to execute shell commands remotely over SSH.
- Capistrano - A remote server automation and deployment tool.
- Mina - Really fast deployer and server automation tool.
- Terraform - Use Infrastructure as Code to provision and manage any cloud, infrastructure, or service.
- Pulumi - Modern infrastructure as code platform that allows you to use familiar programming languages and tools to build, deploy, and manage cloud infrastructure.
- Packer - Build Automated Machine Images.
- Vagrant - Development Environments Made Easy.
- Foreman - Complete lifecycle management tool for physical and virtual servers.
- Nomad - Deploy and Manage Any Containerized, Legacy, or Batch Application.
- Marathon - A production-grade container orchestration platform for DC/OS and Apache Mesos.
- OctoDNS - Managing DNS across multiple providers. DNS as code.
- ManageIQ - Manage containers, virtual machines, networks, and storage from a single platform.
- Ignite - Open Source Virtual Machine (VM) manager with a container UX and built-in GitOps management.
- Spacelift - Flexible orchestration solution for IaC development.
- Atlantis - Terraform Pull Request Automation.
- KubeVela - Modern application delivery platform that makes deploying and operating applications across today's hybrid, multi-cloud environments easier, faster and more reliable.
- Stacktape - Developer-friendly Infrastructure as Code framework built on top of AWS.
- Score - Open Source developer-centric and platform-agnostic workload specification.
Continuous Integration, Continuous Delivery and Continuous Deployment. GitOps.
- On premises
- Buildbot - automate all aspects of the software development cycle.
- GitLab CI - pipelines build, test, deploy, and monitor your code as part of a single, integrated workflow.
- Jenkins - automation server for building, deploying and automating any project.
- Drone - a Container-Native, Continuous Delivery Platform.
- Concourse - pipeline-based continuous thing-doer.
- Spinnaker - fast, safe, repeatable deployments for every Enterprise.
- GoCD - Delivery and Release Automation server.
- TeamCity - enterprise-level CI and CD.
- Bamboo - tie automated builds, tests, and releases together in a single workflow.
- Integrity - Continuous Integration server.
- Zuul - drives continuous integration, delivery, and deployment systems with a focus on project gating.
- Argo - Open Source Kubernetes native workflows, events, CI and CD.
- Strider - Continuous Deployment/Continuous Integration platform.
- Evergreen - A Distributed Continuous Integration System from MongoDB.
- werf - Open Source CI/CD tool for building Docker images & deploying them to Kubernetes using a GitOps approach.
- Flux - automatically ensures that the state of your Kubernetes cluster matches the configuration you’ve supplied in Git.
- Flagger - progressive delivery Kubernetes operator (Canary, A/B Testing and Blue/Green deployments).
- Tekton - powerful and flexible open-source framework for creating CI/CD systems.
- PipeCD - Continuous Delivery for Declarative Kubernetes, Serverless and Infrastructure Applications.
- Gitploy - Build the deployment system around GitHub in minutes.
- Public Services
- Travis CI - easily sync your projects, you’ll be testing your code in minutes.
- CircleCI - powerful CI/CD pipelines that keep code moving.
- Bitrise - CI/CD for mobile applications.
- Buildkite - run fast, secure, and scalable continuous integration pipelines on your own infrastructure.
- Cirrus CI - continuous integration system built for the era of cloud computing.
- Codefresh - GitOps automation platform for Kubernetes apps.
- GitHub Actions - automate all your software workflows, now with world-class CI/CD.
- Kraken CI - Modern CI/CD, open-source, on-premise system that is highly scalable and focused on testing.
- Earthly - Develop CI/CD pipelines locally and run them anywhere.
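Several of the GitOps tools above (Flux, for example) are described as keeping the state of a cluster in line with configuration stored in Git. The idea behind that kind of reconciliation can be sketched roughly as follows. This is a minimal illustration, not how any listed tool is implemented: the `State` type, the `plan` and `reconcile` functions, and the in-memory dictionaries standing in for a Git repository and a cluster are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-ins: a real GitOps controller would read `desired`
# from a Git repository and `actual` from the cluster API.
State = Dict[str, str]  # resource name -> image/version


def plan(desired: State, actual: State) -> List[Tuple[str, str]]:
    """Return the (action, name) steps needed to make `actual` match `desired`."""
    steps: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != desired[name]:
            steps.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            steps.append(("delete", name))
    return steps


def reconcile(desired: State, actual: State) -> State:
    """Apply the plan to a copy of `actual` and return the converged state."""
    result = dict(actual)
    for action, name in plan(desired, actual):
        if action == "delete":
            del result[name]
        else:  # "create" and "update" both copy the desired value
            result[name] = desired[name]
    return result


if __name__ == "__main__":
    desired = {"web": "nginx:1.25", "api": "api:2.0"}
    actual = {"web": "nginx:1.24", "worker": "worker:1.0"}
    for step in plan(desired, actual):
        print(step)
    # After one pass there is nothing left to do.
    print(plan(desired, reconcile(desired, actual)))
```

Running the reconciliation a second time yields an empty plan, which is the property that lets a controller repeat the loop safely on a timer.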
Source Code management, Git-repository manager, Version Control. Some of them also appear in the Code review section.
- GitHub - Helps developers store and manage their code, as well as track and control changes to their code.
- GitLab - Entire DevOps lifecycle in one application.
- Bitbucket - Gives teams one place to plan projects, collaborate on code, test, and deploy.
- Phabricator - A collection of web applications which help software companies build better software.
- Gogs - A painless self-hosted Git service.
- Gitea - A painless self-hosted Git service.
- Gitblit - Pure Java Git solution for managing, viewing, and serving Git repositories.
Web servers and reverse proxy.
- Nginx - High performance load balancer, web server and reverse proxy.
- Apache - Web server and reverse proxy.
- Caddy - Web server with automatic HTTPS.
- Cherokee - Web server for highly concurrent, secured web applications.
- Lighttpd - Optimized for speed-critical environments while remaining standards-compliant, secure and flexible.
- uWSGI - Application server container.
Tools for automating the management of SSL certificates.
- Certbot - Automate using Let’s Encrypt certificates on manually-managed websites to enable HTTPS.
- Let’s Encrypt - Free, automated, and open Certificate Authority.
- Cert Manager - K8S add-on to automate the management and issuance of TLS certificates from various issuing sources.
Relational (SQL) and non-relational (NoSQL) databases.
- Relational (SQL)
- PostgreSQL - Powerful, open source object-relational database system.
- MySQL - Open-source relational database management system.
- MariaDB - Fast, scalable and robust, with a rich ecosystem of storage engines, plugins and many other tools.
- SQLite - Small, fast, self-contained, high-reliability, full-featured, SQL database engine.
- Non-relational (NoSQL)
- Cassandra - Manage massive amounts of data, fast, without losing sleep.
- Apache HBase - Distributed, versioned, non-relational database.
- CouchDB - Database that completely embraces the web.
- Elasticsearch - Distributed, RESTful search and analytics engine capable of addressing a growing number of use cases.
- MongoDB - General purpose, document-based, distributed database built for modern applications.
- RethinkDB - Open-source database for the realtime web.
- Key-Value
- Couchbase - Distributed multi-model NoSQL document-oriented database that is optimized for interactive applications.
- LevelDB - Fast key-value storage library.
- Redis - In-memory data structure store, used as a database, cache and message broker.
- RocksDB - A library that provides an embeddable, persistent key-value store for fast storage.
- etcd - Distributed reliable key-value store for the most critical data of a distributed system.
Observability, Monitoring, Metrics/Metrics collection and Alerting tools.
- Sensu - Simple. Scalable. Multi-cloud monitoring.
- Alerta - Scalable, minimal configuration and visualization monitoring system.
- Cabot - Self-hosted, easily-deployable monitoring and alerts service.
- Amon - Modern server monitoring platform.
- Flapjack - Monitoring notification routing event processing system.
- Icinga - Monitors availability and performance, gives you simple access to relevant data and raises alerts.
- Monit - Managing and monitoring Unix systems.
- Naemon - Fast, stable and innovative while giving you a clear view of the state of your network and applications.
- Nagios - Computer-software application that monitors systems, networks and infrastructure.
- Sentry - Error monitoring that helps all software teams discover, triage, and prioritize errors in real-time.
- Shinken - Monitoring framework.
- Zabbix - Mature and effortless monitoring solution for network monitoring and application monitoring.
- Glances - Monitoring information through a curses or Web based interface.
- Healthchecks - Cron monitoring tool.
- Bolo - Building distributed, scalable monitoring systems.
- cAdvisor - Analyzes resource usage and performance characteristics of running containers.
- ElastiFlow - Network flow monitoring (Netflow, sFlow and IPFIX) with the Elastic Stack.
- Co-Pilot - System performance analysis toolkit.
- Metrics/Metrics collection
- Thundra Foresight - Visibility into CI pipeline by spotting test failures in no time.
- Prometheus - Power your metrics and alerting with a leading open-source monitoring solution.
- Collectd - The system statistics collection daemon.
- Facette - Time series data visualization software.
- Grafana - Analytics & monitoring solution for every database.
- Graphite - Store numeric time-series data and render graphs of this data on demand.
- InfluxData - Time series database.
- Netdata - Instantly diagnose slowdowns and anomalies in your infrastructure.
- Freeboard - Real-time dashboard builder for IOT and other web mashups.
- Logs Management
- Anthracite - An event/change logging/management app.
- Graylog - Free and open source log management.
- Logstash - Collect, parse, transform logs.
- Fluentd - Data collector for unified logging layer.
- Flume - Distributed, reliable, and available service for efficiently collecting, aggregating, and moving logs.
- Heka - Stream processing software system.
- Kibana - Explore, visualize, discover data.
- Loki - Horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus.
- Status
Service Discovery, Service Mesh and Failure detection tools.
- Consul - Connect and secure any service.
- Serf - Decentralized cluster membership, failure detection, and orchestration.
- Doozerd - A consistent distributed data store.
- ZooKeeper - Centralized service for configuration, naming, providing distributed synchronization, and more.
- etcd - Distributed, reliable key-value store for the most critical data of a distributed system.
- Istio - Connect, secure, control, and observe services.
- Kong - Deliver performance needed for microservices, service mesh, and cloud native deployments.
- Linkerd - Service mesh for Kubernetes and beyond.
The discipline of experimenting on a distributed system in order to build confidence in the system's capability to withstand turbulent conditions in production.
- Chaos Toolkit - The Open Source Platform for Chaos Engineering.
- Chaos Monkey - A resiliency tool that helps applications tolerate random instance failures.
- Toxiproxy - Simulate network and system conditions for chaos and resiliency testing.
- Pumba - Chaos testing, network emulation and stress testing tool for containers.
- Chaos Mesh - A Chaos Engineering Platform for Kubernetes.
- Litmus - Enables teams to identify weaknesses in infrastructures.
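The definition at the top of this section (experimenting on a system to build confidence that it withstands turbulent conditions) can be illustrated with a small fault-injection sketch. It is not the API of any tool listed above: `InjectedFault`, `with_faults`, `call_with_retries` and `run_experiment` are hypothetical names, and the "dependency" is a plain function that returns `"ok"`.

```python
import random
from typing import Callable


class InjectedFault(Exception):
    """Raised by the wrapper to simulate a failing dependency."""


def with_faults(func: Callable[[], str], failure_rate: float,
                rng: random.Random) -> Callable[[], str]:
    """Wrap `func` so that it fails with probability `failure_rate`."""
    def wrapped() -> str:
        if rng.random() < failure_rate:
            raise InjectedFault("simulated outage")
        return func()
    return wrapped


def call_with_retries(func: Callable[[], str], attempts: int) -> str:
    """The resilience mechanism under test: retry on injected faults."""
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except InjectedFault as exc:
            last_error = exc
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error


def run_experiment(failure_rate: float, attempts: int,
                   calls: int, seed: int) -> float:
    """Return the fraction of calls that succeeded despite injected faults."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    flaky = with_faults(lambda: "ok", failure_rate, rng)
    successes = 0
    for _ in range(calls):
        try:
            call_with_retries(flaky, attempts)
            successes += 1
        except RuntimeError:
            pass
    return successes / calls


if __name__ == "__main__":
    print("no retries:", run_experiment(0.5, attempts=1, calls=1000, seed=1))
    print("3 attempts:", run_experiment(0.5, attempts=3, calls=1000, seed=1))
```

Comparing the success rate with and without retries is the experiment: the hypothesis is that the retry policy keeps the caller healthy while the dependency is degraded.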
API Gateway, Service Proxy and Service Management tools.
- API Umbrella - Proxy that sits in front of your APIs, API management platform.
- Ambassador - Kubernetes-Native API Gateway built on the Envoy Proxy.
- Kong - Connect all your microservices and APIs with the industry’s most performant, scalable and flexible API platform.
- Tyk - API and service management platform.
- Cilium - API aware networking and security using BPF and XDP.
- Gloo - Feature-rich, Kubernetes-native ingress controller, and next-generation API gateway.
- Envoy - Cloud-native high-performance edge/middle/service proxy.
- Traefik - Reverse proxy and load balancer for HTTP and TCP-based applications.
Code review. A few of the Source Code Management tools have built-in code review features.
- Gerrit - Web-based team code collaboration tool.
- Review Board - Web-based collaborative code review tool.
Distributed messaging platforms and Queues software.
- RabbitMQ - Message broker.
- Kafka - Building real-time data pipelines and streaming apps.
- ActiveMQ - Multi-Protocol messaging.
- Beanstalkd - Simple, fast work queue.
- NSQ - Realtime distributed messaging platform.
- Celery - Asynchronous task queue/job queue based on distributed message passing.
- Faktory - Repository for background jobs within your application.
- NATS - Simple, secure and high performance open source messaging system.
- RestMQ - Message queue which uses HTTP as transport.
- Dkron - Distributed, fault tolerant job scheduling system.
- KubeMQ - Kubernetes-native messaging platform.
Programming languages.
- Python - Programming language that lets you work quickly and integrate systems more effectively.
- Ruby - A dynamic, open source programming language with a focus on simplicity and productivity.
- Go - An open source programming language that makes it easy to build simple, reliable, and efficient software.
Chat and ChatOps.
- Rocket.Chat - Open source team communication.
- Mattermost - Messaging platform that enables secure team collaboration.
- Zulip - Real-time chat with an email threading model.
- Riot - A universal secure chat app entirely under your control.
- ChatOps:
Security as code: sensitive credentials and secrets need to be managed, secured, maintained and rotated using automation.
- Sops - Simple and flexible tool for managing secrets.
- Vault - Manage secrets and protect sensitive data.
- Keybase - End-to-end encrypted chat and cloud storage system.
- Vault Secrets Operator - Create Kubernetes secrets from Vault for a secure GitOps based workflow.
- Git Secret - A bash-tool to store your private data inside a git repository.
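As a rough illustration of rotating secrets with automation, the sketch below replaces any secret older than a maximum age. It uses only the Python standard library and does not reflect the API of Vault or any other tool listed above: the `Secret` dataclass, the in-memory `store` dictionary, and the function names are all hypothetical.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List


@dataclass
class Secret:
    name: str
    value: str
    created_at: datetime


def needs_rotation(secret: Secret, max_age: timedelta, now: datetime) -> bool:
    """A secret is due for rotation once it is at least `max_age` old."""
    return now - secret.created_at >= max_age


def rotate(secret: Secret, now: datetime) -> Secret:
    """Return a replacement with a fresh random value and a new timestamp."""
    return Secret(secret.name, secrets.token_urlsafe(32), now)


def rotate_expired(store: Dict[str, Secret], max_age: timedelta,
                   now: datetime) -> List[str]:
    """Rotate every expired secret in `store`; return the rotated names."""
    rotated = []
    for name in sorted(store):
        if needs_rotation(store[name], max_age, now):
            store[name] = rotate(store[name], now)
            rotated.append(name)
    return rotated


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    store = {
        "db-password": Secret("db-password", "old-value", now - timedelta(days=90)),
        "api-token": Secret("api-token", "fresh-value", now - timedelta(days=1)),
    }
    print(rotate_expired(store, max_age=timedelta(days=30), now=now))
```

In practice a job like this would run on a schedule and write the new value to a secret store rather than a dictionary; passing `now` in explicitly keeps the logic easy to test.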
A collection of tools to help with sharing knowledge and telling the story.
- GitBook - Modern documentation format and toolchain using Git and Markdown.
- Docusaurus - Easy to maintain open source documentation websites.
- Docsify - A magical documentation site generator.
- MkDocs - Project documentation with Markdown.
VPN, routing and firewall.
- OpenVPN - Flexible VPN solutions to secure your data communications.
- Pritunl - Enterprise Distributed OpenVPN and IPsec Server.
- VyOS - Open source network OS that runs on a wide range of hardware, virtual machines, and cloud providers.
- Algo - Set up a personal VPN in the cloud.
- Streisand - Sets up a new VPN service nearly automatically.
- Freelan - A peer-to-peer, secure, easy-to-setup, multi-platform, open-source, highly-configurable VPN software.
- Sshuttle - Transparent proxy server that works as a poor man's VPN.
- SoftEther - An open-source, free, cross-platform, multi-protocol VPN program, developed as an academic project at the University of Tsukuba and released under the Apache License 2.0.
- Firezone - Self-hosted VPN server using WireGuard. Supports MFA, SSO, and has easy deployment options.
Books focused on DevOps, DevSecOps and Site Reliability Engineering.
- Effective DevOps: Building a Culture of Collaboration, Affinity, and Tooling at Scale
- Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation
- Hands-On Security in DevOps
- Site Reliability Engineering
- The Site Reliability Workbook
- Building Secure & Reliable Systems
- Infrastructure as Code: Managing Servers in the Cloud
- The DevOps Handbook
For a basic understanding of what you should know to become a DevOps Engineer, check the roadmap here.
Your contributions are always welcome! Please take a look at the Contribution Guidelines.