How to Get Started with Apache Airflow in Docker (Windows)

 


In general terms, Apache Airflow is an open-source tool that allows us to manage, monitor, plan and schedule workflows. It is normally used as a workflow (service) orchestrator.

Prerequisites:

    1. Install Docker Desktop on your computer.

Get Started

Follow these steps to set up a test Airflow environment in Docker:

    1. The first step is to download Docker Desktop from the official website. For this article, I installed version 4.4.4.

    2. After installing Docker Desktop, we need to download a docker-compose.yaml file. One is provided in the official Airflow guide listed in the references [1].

    3. Now that we have both Docker Desktop and the YAML file, we need to create our airflow directory. Go to the following path: C:/Users/<your_user>/. Inside that directory, create a folder called docker, and inside docker create another folder called airflow.

    4. Now that we have our airflow folder, we must do the following: a) create three folders inside it called dags, plugins and logs; b) move our docker-compose.yaml file into the airflow folder.

Your directory will look like this:

    airflow/
        dags/
        logs/
        plugins/
        docker-compose.yaml

    5. Now we are ready to start our instance of Airflow in Docker. Open a PowerShell window and go to the airflow directory created above.

    6. Then, we need to run the following commands:

    docker-compose up airflow-init

    docker-compose up

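    Note: After the second command, the window will keep printing the containers' log output. We can close this window without any problem. Alternatively, running docker-compose up -d starts the containers in the background, so no window stays attached.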

    7. After that, we can open the Docker Desktop app, where we will see that a container group named airflow was created in the Containers/Apps section. This group contains 7 containers.

Your container will look like this

    8. With this, our Apache Airflow instance is ready and we can start developing our DAGs.

    9. Go to localhost:8080, log in with user "airflow" and password "airflow", and start coding.

    Note: You can put all your DAGs in the dags folder and all your plugins in the plugins folder, both created in step 4.
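
If you prefer not to create the folders from steps 3 and 4 by hand, the layout can also be scripted. The following is only a minimal Python sketch and not part of the original setup: it assumes Python 3 is installed on your machine, and the names create_airflow_dirs and AIRFLOW_SUBFOLDERS are illustrative choices of mine, not something Airflow or Docker provides.

```python
from pathlib import Path
from typing import List

# Folder names taken from step 4 of this article.
AIRFLOW_SUBFOLDERS = ("dags", "plugins", "logs")


def create_airflow_dirs(base: Path) -> List[Path]:
    """Create `base` (if missing) and the dags, plugins and logs folders in it.

    Folders that already exist are left untouched, so re-running is harmless.
    Returns the three sub-folder paths in the order listed above.
    """
    folders = []
    for name in AIRFLOW_SUBFOLDERS:
        folder = base / name
        # parents=True also creates the docker/airflow parent folders.
        folder.mkdir(parents=True, exist_ok=True)
        folders.append(folder)
    return folders
```

For the layout used in this article, the call would be create_airflow_dirs(Path.home() / "docker" / "airflow"), since Path.home() points to C:/Users/<your_user> on Windows. The docker-compose.yaml file still has to be moved into the airflow folder separately.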


References

[1] Airflow. Running Airflow in Docker.
https://airflow.apache.org/docs/apache-airflow/2.0.2/start/docker.html

