Pizz0x/RetailDataPipeline

RetailDataPipeline

Virtual Environment Configuration:

Upgrade pip, Python's package installer, to its latest version: python3 -m pip install --upgrade pip (on Windows: python.exe -m pip install --upgrade pip)

Install the pandas library: pip install pandas

Install SQLAlchemy and psycopg2, the PostgreSQL driver it uses: pip install sqlalchemy psycopg2
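Once SQLAlchemy and psycopg2 are installed, the pipeline needs a connection URL for the database. The sketch below shows one way to build such a URL; the function name, user, password, and database name ("retail") are hypothetical placeholders, not values from this repository. It uses only the standard library, so it runs even before the packages are installed.

```python
from urllib.parse import quote_plus


def build_postgres_url(user, password, host="localhost", port=5432, database="retail"):
    """Return a SQLAlchemy-style URL that selects the psycopg2 driver.

    All argument values used below are illustrative assumptions; replace
    them with the credentials of your own PostgreSQL instance.
    """
    # Percent-encode the user and password so characters such as '@' or '/'
    # do not break URL parsing.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    url = build_postgres_url("retail_user", "p@ss/word")
    print(url)
    # The URL can then be handed to SQLAlchemy, for example:
    #   from sqlalchemy import create_engine
    #   engine = create_engine(url)
```

The resulting engine can be passed to pandas functions such as read_sql or DataFrame.to_sql to move data between DataFrames and PostgreSQL.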

Activate the virtual environment: source airflow_env/bin/activate

The environment must be active to run the PostgreSQL and Airflow commands, so activate it every time you work on the project.
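Because it is easy to forget the activation step, a script can check whether it is running inside a virtual environment before doing any work. This is a minimal sketch, not part of the repository; the function name is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv or virtualenv."""
    # In a virtual environment sys.prefix points at the environment, while
    # sys.base_prefix points at the interpreter it was created from.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases set sys.real_prefix instead.
    return sys.prefix != base_prefix or hasattr(sys, "real_prefix")


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run: source airflow_env/bin/activate")
```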

Start PostgreSQL: sudo service postgresql start

Installation of Apache Airflow:

Install pip: sudo apt install python3-pip

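Install virtualenv, the tool that creates virtual environments: sudo pip3 install virtualenv

Create the virtual environment named airflow_env: virtualenv airflow_env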

Install Airflow (the quotes keep the shell from interpreting the square brackets): pip3 install "apache-airflow[gcp,sentry]"

Initialize the Airflow metadata database: airflow db init

Create an Airflow user with the Admin role: airflow users create --username admin --password admin --firstname admin --lastname admin --role Admin --email [email protected]

List the users to confirm the account was created: airflow users list

Install the necessary dependencies for interacting with PostgreSQL databases: pip install apache-airflow-providers-postgres

Run the scheduler: airflow scheduler

Run the web server: airflow webserver -p 8080
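After starting the services, it can help to confirm that they are actually listening. The following sketch checks TCP ports from Python; the function name is hypothetical, port 8080 comes from the web server command above, and 5432 is PostgreSQL's default port, which is assumed to be unchanged here.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Service names and ports are assumptions based on the commands above.
    services = {"Airflow web server": 8080, "PostgreSQL": 5432}
    for name, port in services.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```

A successful connection only shows that something is listening on the port; it does not verify that the service is healthy.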

Installation of PostgreSQL:

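Download the repository signing key to the path that the signed-by option below refers to (apt-key is deprecated and does not store the key there): sudo install -d /usr/share/postgresql-common/pgdg && sudo wget --quiet -O /usr/share/postgresql-common/pgdg/apt.postgresql.org.asc https://www.postgresql.org/media/keys/ACCC4CF8.asc

Add the PostgreSQL apt repository: sudo sh -c 'echo "deb [signed-by=/usr/share/postgresql-common/pgdg/apt.postgresql.org.asc] https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'

Refresh the package lists: sudo apt-get update

Install PostgreSQL: sudo apt-get -y install postgresql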
