Sending Emails using Apache Airflow Email Operator

Automation reduces manual work and plays a key role in improving productivity across industries. It is one of the fastest ways for businesses to raise output and improve work efficiency. Yet many teams, unsure how to get started, never automate their tasks and keep performing them manually.

Every IT professional has a different workflow to run, from collecting data from various sources to processing it, uploading it, and creating reports, and many of these tasks are still performed manually every day. To trigger such workflows automatically and reduce that time and effort, we recommend using Apache Airflow.

Apache Airflow is an open-source platform for managing complex workflows. It lets you programmatically author, schedule, and monitor daily tasks, which makes it especially popular with Data Scientists and Data Engineers. By the end of this article, you will have a solid understanding of Apache Airflow and the steps required to automate sending emails with the Airflow EmailOperator.

What is EmailOperator?

While DAGs in Airflow define the workflow, operators are the building blocks that determine the actual work: each operator states the action to perform at a given step. Airflow ships with operators for common tasks, including:

    - PythonOperator

    - MySqlOperator

    - EmailOperator

    - BashOperator

The Airflow EmailOperator delivers email notifications to the stated recipients, whether task-related emails or alerts meant to notify users. Its main disadvantage is that the operator is not very customizable.
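As a rough sketch of how the operator is typically used (assuming Airflow 2.x, where it is imported from airflow.operators.email; the task id, recipient, and content below are placeholders, and dag refers to a DAG defined elsewhere in the file):

```python
from airflow.operators.email import EmailOperator

# Hypothetical task: the id, recipient, and content are placeholders,
# and `dag` refers to a DAG object defined elsewhere in the file.
notify = EmailOperator(
    task_id="notify",
    to="recipient@example.com",
    subject="Airflow notification",
    html_content="<p>The upstream task finished successfully.</p>",
    dag=dag,
)
```

The to parameter accepts either a single address or a list of addresses, and optional cc and bcc parameters work the same way.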


How to Send Emails using Airflow EmailOperator?

Email automation helps business stakeholders send logs on time, raise alerts on faulty files, and share results for data analysis. It also improves engagement and creates a better experience for the recipient, who receives timely notifications stating whether a data pipeline has failed or is still running. Overall, it saves time and reduces the manual burden on experts.

To test the email operator job, you add a DAG file that runs a Python function; once the Python function executes successfully, the Airflow EmailOperator sends the email to the recipient. For this to work, Apache Airflow must be installed, for example on an Ubuntu virtual machine. Follow the steps below to send an email from Airflow using the Airflow EmailOperator.

Step 1: Log in to the Gmail Account

Before you start using Airflow, change your Google Account settings to allow access from less secure apps. This step is necessary so that Google lets your code sign in. Once you are done testing, you can revert the setting for security reasons.

To change the setting, go to Google Account => Settings => Less secure app access => Turn it on.

If your Google Account no longer offers this option (Google has been phasing out less secure app access), create an App Password with 2-Step Verification enabled and use it as the SMTP password instead.

Python ships with the smtplib module, which defines an SMTP client session object that can send mail to any machine on the internet running an SMTP or ESMTP listener.
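Airflow's default SMTP email backend builds on this module. As a standalone sketch (the addresses and app password below are placeholders), sending a message through Gmail's SMTP server looks like this:

```python
import smtplib
from email.message import EmailMessage

# Placeholders: replace the addresses and the app password with your own.
message = EmailMessage()
message["Subject"] = "Test alert"
message["From"] = "sender@example.com"
message["To"] = "recipient@example.com"
message.set_content("Hello from smtplib.")

with smtplib.SMTP("smtp.gmail.com", 587) as server:
    server.starttls()                                   # upgrade to an encrypted connection
    server.login("sender@example.com", "app-password")  # authenticate with the Gmail account
    server.send_message(message)
```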

Step 2: Enable IMAP for the SMTP

- Go to the settings using the gear symbol in your Gmail Account.

- Now, click on the 'Forwarding and POP/IMAP' tab under settings.

- Lastly, select the 'Enable IMAP' radio button in the 'IMAP access' sub-section.


Step 3: Update SMTP details in Airflow

In this step, you update the SMTP details that the Airflow EmailOperator will use. If you installed Airflow with Docker Compose, these go into the docker-compose.yaml file.

- Now using any editor, open the docker-compose.yaml file.

- Add the following configuration under the environment section of that file.
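A sketch of the SMTP settings, assuming the official docker-compose.yaml layout where the Airflow services share a common environment block; the Gmail address and app password are placeholders:

```yaml
environment:
  # SMTP settings for the Airflow email backend (placeholders: use your own account).
  AIRFLOW__SMTP__SMTP_HOST: smtp.gmail.com
  AIRFLOW__SMTP__SMTP_STARTTLS: 'True'
  AIRFLOW__SMTP__SMTP_SSL: 'False'
  AIRFLOW__SMTP__SMTP_USER: your_email@gmail.com
  AIRFLOW__SMTP__SMTP_PASSWORD: your_app_password
  AIRFLOW__SMTP__SMTP_PORT: '587'
  AIRFLOW__SMTP__SMTP_MAIL_FROM: your_email@gmail.com
```

These variables map to the [smtp] section of airflow.cfg, so the same values can be set there for a non-Docker installation.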


- Use the following command to create a DAG file in the /airflow/dags folder:
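For example (the file name send_email_dag.py is a placeholder; adjust the path to wherever your dags folder lives):

```bash
touch /airflow/dags/send_email_dag.py
```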


- Once the file is created, it is time to write the DAG itself.

Step 4: Import modules for the Workflow

You now need to import Python dependencies for the workflow. You can refer to the following code:
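A typical set of imports for this workflow (assuming Airflow 2.x module paths, which differ from the older airflow.operators.*_operator layout) would be:

```python
from datetime import datetime, timedelta

# Airflow 2.x module paths; older releases used airflow.operators.python_operator, etc.
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.operators.email import EmailOperator
```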


Step 5: Define the default arguments

Next up, you can define the default and DAG-specific arguments:
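For example, a minimal default_args dictionary might look like the following; the owner, start date, and retry values are arbitrary placeholders:

```python
default_args = {
    "owner": "airflow",
    "depends_on_past": False,
    "start_date": datetime(2023, 1, 1),  # a fixed date in the past so the DAG can be scheduled
    "email_on_failure": False,
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}
```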


Step 6: Instantiate a DAG

In this step, give the DAG a name, pass in the default arguments, and configure its schedule.
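A sketch, assuming a placeholder dag_id of send_email_demo and a daily schedule:

```python
dag = DAG(
    dag_id="send_email_demo",
    default_args=default_args,
    description="Run a Python task and send a notification email",
    schedule_interval="@daily",  # renamed to `schedule` in newer Airflow releases
    catchup=False,               # do not back-fill runs for past dates
)
```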


Step 7: Setting up Tasks

This step involves setting up the workflow tasks by instantiating the operators. The task code is shown below.
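A minimal sketch of the two tasks; the callable, task ids, recipient address, and email content are placeholders to adapt to your use case:

```python
def print_message():
    # Placeholder callable standing in for your real processing logic.
    print("The data pipeline step finished successfully.")


run_python_task = PythonOperator(
    task_id="run_python_task",
    python_callable=print_message,
    dag=dag,
)

send_email = EmailOperator(
    task_id="send_email",
    to="recipient@example.com",
    subject="Airflow task notification",
    html_content="<p>run_python_task completed successfully.</p>",
    dag=dag,
)
```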


Step 8: Set Dependencies

Set dependencies between the tasks that need to be executed; on its own, a DAG file only organizes the tasks. Define the dependencies as follows to form a complete data pipeline:
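With the placeholder tasks above, the Python task runs first and the email is sent only after it succeeds:

```python
# Run the Python task first, then send the notification email.
run_python_task >> send_email
```

Equivalently, run_python_task.set_downstream(send_email) expresses the same ordering.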


Step 9: Task Verification

- Now, to check the log files, select the Log tab; a list of active tasks will show up on your screen.


- The task output is displayed in the log. Follow the same steps for the send_email task.

- When the email is sent, the send_email task's log shows the corresponding output.



