Apache Airflow VM by Anarion Technologies
Apache Airflow is an open-source platform for automating and orchestrating complex workflows. Originally developed by Airbnb, Airflow provides a framework for defining, scheduling, and monitoring workflows as code. Users define workflows as Directed Acyclic Graphs (DAGs), which describe the tasks to execute and the order in which they run. Each task in a DAG is configured on its own and can declare dependencies on other tasks, which makes Airflow well suited to managing complex data pipelines.
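To make the dependency idea concrete, here is a minimal sketch in plain Python. It does not use Airflow's own API. It relies on the standard-library graphlib module (Python 3.9 or later) to compute a valid run order for a hypothetical three-task pipeline; the task names extract, transform, and load are illustrative assumptions. Airflow's scheduler does considerably more than this, but the ordering constraint it enforces is the same.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a task, each value is the set of
# tasks that must finish before that task can start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def run_order(deps):
    """Return one valid execution order for the tasks in deps.

    Raises graphlib.CycleError if the graph contains a cycle, which is
    why a workflow graph has to be acyclic.
    """
    return list(TopologicalSorter(deps).static_order())


print(run_order(dependencies))  # prints ['extract', 'transform', 'load']
```

A task appears in the result only after everything it depends on, so adding a dependency changes the order without any change to the tasks themselves.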
One of the key advantages of Airflow is its flexibility and scalability. Workflows can be easily modified, versioned, and deployed across different environments, ensuring consistency and reliability. With its Python-based architecture, Airflow allows developers to write custom plugins, operators, and hooks to integrate with a wide range of systems and services. This extensibility makes it suitable for a variety of use cases, from simple data extraction and transformation tasks to sophisticated machine learning pipelines.
Airflow’s web-based user interface offers monitoring and visualization capabilities that let users track the progress of their workflows in real time, inspect logs, and troubleshoot failures. It also supports alerting and notifications, so that issues can be addressed promptly. Additionally, Airflow can manage workflows across distributed systems, which makes it useful in environments where tasks run on multiple nodes, whether on-premises or in the cloud.
Overall, Apache Airflow is a critical component in the toolkit of data engineers, DevOps professionals, and developers who need a reliable and scalable solution for managing the flow of data and the execution of tasks across complex systems. Its community-driven development ensures that it continues to evolve, with new features and integrations regularly added to meet the needs of modern data infrastructure.
To subscribe to this product from Azure Marketplace and initiate an instance using the Azure compute service, follow these steps:
1. Navigate to Azure Marketplace and subscribe to the desired product.
2. Search for “virtual machines” and select “Virtual machines” under Services.
3. Click on “Add” in the Virtual machines page, which will lead you to the Create a virtual machine page.
4. In the Basics tab:
- Ensure the correct subscription is chosen under Project details.
- Create a new resource group by selecting “Create new resource group” and naming it “myResourceGroup.”
5. Under Instance details:
- Enter “myVM” as the Virtual machine name.
- Choose “East US” as the Region.
- Select “Ubuntu 18.04 LTS” as the Image.
- Leave other settings as default.
6. For Administrator account:
- Pick “SSH public key.”
- Provide your user name and paste your public key, ensuring no leading or trailing white spaces.
7. Under Inbound port rules > Public inbound ports:
- Choose “Allow selected ports.”
- Select “SSH (22)” and “HTTP (80)” from the drop-down.
8. Keep the remaining settings at their defaults and click on “Review + create” at the bottom of the page.
9. The “Create a virtual machine” page will display the details of the VM you’re about to create. Once ready, click on “Create.”
10. The deployment process will take a few minutes. Once it’s finished, proceed to the next section.
To connect to the virtual machine:
1. Access the overview page of your VM and click on “Connect.”
2. On the “Connect to virtual machine” page:
- Keep the default options for connecting via IP address over port 22.
- A connection command for logging in will be displayed. Click the button to copy the command. Here’s an example of what the SSH connection command looks like, where the user name and IP address are placeholders for your own values:
```
ssh <username>@<public-ip-address>
```
3. Open the same bash shell that you used to generate your SSH key pair. You can reopen the Cloud Shell by selecting >_ again or by going to https://shell.azure.com/bash.
4. Paste the SSH connection command into the shell to initiate an SSH session.
Usage/Deployment Instructions
Anarion Technologies – Airflow
Note: Search for the product on Azure Marketplace and click on “Get it now”
Click on Continue
Click on Create
On the Create a virtual machine page, enter or select appropriate values for zone, machine type, resource group, and so on, as per your choice.
After the virtual machine has been created, the option Go to Resource Group appears.
Click Go to Resource Group
Click on the Network Security Group: airflow-nsg
Click on Inbound Security Rule
Click on Add
Add the port:
In the Destination port ranges field (where the default value is 8080), enter 8080
Select TCP as the Protocol
Set Action to Allow
Click on Add
Click on Refresh
Copy the Public IP Address
SSH into the VM from a terminal and run the following commands:
$ sudo su
$ sudo apt update
$ source airflow_env/bin/activate
$ airflow scheduler
Open a new terminal, connect to the VM over SSH again, and run the following commands:
$ sudo su
$ sudo apt update
$ source airflow_env/bin/activate
$ airflow webserver
Copy the Public IP Address
In your browser, open the Airflow web interface by navigating to your VM’s public IP address on port 8080:
http://<instance IP address>:8080
Login credentials
Username: admin
Password: admin
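If the page does not load, it can help to check from your own machine whether port 8080 on the VM accepts connections at all. The following is a minimal sketch using only the Python standard library; the function names port_is_open and airflow_url are hypothetical helpers introduced here, and 203.0.113.10 in the comment is a documentation-only example address, not a real server. A failed check usually points to the inbound security rule or to the webserver not running, though this script cannot tell which.

```python
import socket


def port_is_open(host, port=8080, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def airflow_url(host, port=8080):
    """Build the web interface URL for the given host and port."""
    return f"http://{host}:{port}"


# Example call (replace the address with your VM's public IP address):
#   port_is_open("203.0.113.10")
```

This only tests TCP reachability. It does not log in or confirm that the service answering on the port is Airflow.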
Thank you!