This repository contains an end-to-end example solution, based on the Computer Hardware dataset, built with Azure Data Factory, Azure Data Lake Gen 2, and the Azure Machine Learning Python SDK to ingest data from multiple data sources, build machine learning models, and serve those models as HTTP endpoints.
- User has data sources in Azure SQL Database and Azure Cosmos DB
- User ingests data from the data sources into Azure Data Lake Gen 2 with Azure Data Factory
- User performs data preparation using Azure Data Factory Wrangling Data Flow
- User trains a machine learning model using Azure Machine Learning service
- User deploys the machine learning model to Azure Container Instances using the Azure Machine Learning Python SDK
Environment preparation
- Run `AZ_SUBSCRIPTION_ID='{subscription-id}' AZ_BASE_NAME='{unique-base-name}' AZ_REGION='{azure-region}' ./build_environment.sh` to provision the Azure environment
- Through Azure Storage Explorer, upload the data files from `./data/*` to the ADLSG2 "demo-prep" container
- Through the ADF portal, execute pipeline "PL_E2E_Demo_Prep" (under the "Demp-Prep" folder) to hydrate Azure Cosmos DB and Azure SQL Database
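If you want to drive the provisioning step from a script rather than typing the one-liner, the environment-variable prefix can be assembled safely in Python. This is an illustrative helper, not part of the repo; `build_provision_command` is a hypothetical name:

```python
import shlex

def build_provision_command(subscription_id: str, base_name: str, region: str) -> str:
    """Assemble the env-var-prefixed call to build_environment.sh,
    quoting each value so spaces or shell metacharacters stay intact."""
    env = {
        "AZ_SUBSCRIPTION_ID": subscription_id,
        "AZ_BASE_NAME": base_name,
        "AZ_REGION": region,
    }
    prefix = " ".join(f"{key}={shlex.quote(value)}" for key, value in env.items())
    return f"{prefix} ./build_environment.sh"

print(build_provision_command("abc-123", "e2edemo", "eastus"))
# AZ_SUBSCRIPTION_ID=abc-123 AZ_BASE_NAME=e2edemo AZ_REGION=eastus ./build_environment.sh
```

`shlex.quote` only adds quotes when a value actually needs them, so typical region names pass through unchanged.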
- Through the ADF portal, execute pipeline "PL_E2E_MachineData" to hydrate Azure Data Lake Gen 2 and curate the raw data into the curated zone
- Through Azure Machine Learning studio (preview):
  - Upgrade the AML workspace to Enterprise edition. This is required for the advanced AutoML features that this solution uses.
  - Create a notebook VM (NBVM) with a unique VM name and VM size "STANDARD_DS3_V2"
  - Note: make sure that AML studio is scoped to the AML workspace created by the build automation
- Create a service principal using the following command and note the output (it is needed later in the AML notebook):

  ```shell
  az ad sp create-for-rbac \
    -n "{unique-sp-name}" \
    --role 'Storage Blob Data Reader' \
    --scopes /subscriptions/{subscription-id}/resourceGroups/{rg-name}/providers/Microsoft.Storage/storageAccounts/{adlsg2-name}
  ```
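`az ad sp create-for-rbac` prints its result as JSON; the fields you will need later are `appId` (client id), `password` (client secret), and `tenant`. A small Python sketch of pulling those out — the sample values and the helper name are illustrative only:

```python
import json

# Abridged shape of the `az ad sp create-for-rbac` JSON output (sample values).
sample_output = """
{
  "appId": "00000000-0000-0000-0000-000000000000",
  "displayName": "my-unique-sp-name",
  "password": "<client-secret>",
  "tenant": "11111111-1111-1111-1111-111111111111"
}
"""

def extract_sp_credentials(raw: str) -> dict:
    """Map the az CLI output fields to the names a notebook typically asks for."""
    sp = json.loads(raw)
    return {
        "client_id": sp["appId"],
        "client_secret": sp["password"],
        "tenant_id": sp["tenant"],
    }

print(extract_sp_credentials(sample_output)["client_id"])
```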
- Through the AML NBVM Jupyter:
  - Create a new terminal and clone this repository (note: `git` is pre-installed on the AML NBVM)
  - Open and walk through `azure-e2e-ml/aml/configuration.ipynb` to configure the local environment with AML configurations
    - Note: you have to replace the default values of `SUBSCRIPTION_ID`, `RESOURCE_GROUP`, `WORKSPACE_NAME`, and `WORKSPACE_REGION` with appropriate values in this notebook
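Those configuration values are plain notebook variables. An illustrative sketch of what the cell boils down to — the placeholder values and the `unconfigured` check are hypothetical additions, not taken from the notebook:

```python
# Placeholder defaults as assumed here; replace each before running the notebook.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group-name>"
WORKSPACE_NAME = "<aml-workspace-name>"
WORKSPACE_REGION = "<azure-region>"

def unconfigured(settings: dict) -> list:
    """Return the names of settings still left at an angle-bracket placeholder."""
    return [k for k, v in settings.items() if v.startswith("<") and v.endswith(">")]

print(unconfigured({
    "SUBSCRIPTION_ID": SUBSCRIPTION_ID,
    "RESOURCE_GROUP": RESOURCE_GROUP,
    "WORKSPACE_NAME": WORKSPACE_NAME,
    "WORKSPACE_REGION": WORKSPACE_REGION,
}))
```

A guard like this catches the common mistake of running the notebook with a default still in place.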
  - Open and walk through `azure-e2e-ml/aml/auto-ml-regression-hardware-performance-explanation-and-featurization.ipynb` to build and deploy the model
    - Note: the mini-widget is currently not supported in JupyterLab, so we use Jupyter to execute the notebook. A GitHub issue is open to track this.
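The deployment notebook ultimately stands up an ACI web service around an entry (scoring) script with an `init()`/`run()` contract. Below is a minimal local sketch of that contract with a stub in place of the real AutoML model, which a real script would load in `init()` via the Azure ML SDK; everything here is illustrative:

```python
import json

model = None

def init():
    """Called once when the service container starts; load the model here.
    A stub predictor stands in for the registered AutoML model."""
    global model
    model = lambda rows: [sum(row) for row in rows]

def run(raw_data: str) -> str:
    """Called once per request: JSON string in, JSON string out."""
    try:
        rows = json.loads(raw_data)["data"]
        return json.dumps({"result": model(rows)})
    except Exception as exc:  # surface bad input to the caller as JSON
        return json.dumps({"error": str(exc)})

init()
print(run(json.dumps({"data": [[1, 2, 3]]})))  # {"result": [6]}
```

Keeping `run()` tolerant of malformed payloads makes the HTTP endpoint return a useful error body instead of a bare 500.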
- Architecture diagram
- Automation: build script
- No warranties or guarantees are made or implied.
- All assets here are provided by me "as is". Use at your own risk. Validate before use.
- I am not representing my employer with these assets, and my employer assumes no liability whatsoever, and will not provide support, for any use of these assets.
- Use of the assets in this repo in your Azure environment may, and in most cases will, incur Azure usage and charges. You are completely responsible for monitoring and managing your Azure usage.
Unless otherwise noted, all assets here are authored by me. Feel free to examine, learn from, comment, and re-use (subject to the above) as needed and without intellectual property restrictions.
If anything here helps you, attribution and/or a quick note is much appreciated.