Azure Databricks lets you spin up clusters and build quickly in a fully managed Apache Spark environment with the global scale and availability of Azure. And of course, for any production-level solution, monitoring is a critical aspect.
Azure Databricks comes with robust monitoring capabilities for custom application metrics, streaming query events, and application log messages. It allows you to push this monitoring data to different logging services.
In this article, we will look at the setup required to send application logs and metrics from Microsoft Azure Databricks to a Log Analytics workspace.
After cloning repository please open the terminal in the respective path
Please run the command as follows
docker run -it --rm -v %cd%/spark-monitoring:/spark-monitoring -v "%USERPROFILE%/.m2":/root/.m2 maven:3.6.1-jdk-8 /spark-monitoring/build.sh
chmod +x spark-monitoring/build.sh docker run -it --rm -v `pwd`/spark-monitoring:/spark-monitoring -v "$HOME/.m2":/root/.m2 maven:3.6.1-jdk-8 /spark-monitoring/build.sh
dbfs configure –token
It will ask for Databricks workspace URL and Token
Use the personal access token that was generated when setting up the prerequisites
You can get the URL from
Azure portal > Databricks service > Overview
dbfs mkdirs dbfs:/databricks/spark-monitoring
Open the file /src/spark-listeners/scripts/spark-monitoring.sh
Now add the Log Analytics Workspace ID and Key
Use Databricks CLI to copy the modified script
dbfs cp <local path to spark-monitoring.sh> dbfs:/databricks/spark-monitoring/spark-monitoring.sh
Use Databricks CLI to copy all JAR files generated
dbfs cp --overwrite --recursive <local path to target folder> dbfs:/databricks/spark-monitoring/
5 Under “Advanced Options”, click on the “Init Scripts” tab. Go to the last line under the
“Init Scripts section” and select “DBFS” under the “destination” dropdown. Enter
“dbfs:/databricks/spark-monitoring/spark-monitoring.sh” in the text box. Click the
6 Click the “create cluster” button to create the cluster. Next, click on the “start” button to start the cluster.
Now you can run the jobs in the cluster and can get the logs in the Log Analytics workspace
We hope this article helps you set up the right configurations to send application logs and metrics from Azure Databricks to your Log Analytics workspace.
CloudIQ is a leading Cloud Consulting and Solutions firm that helps businesses solve today’s problems and plan the enterprise of tomorrow by integrating intelligent cloud solutions. We help you leverage the technologies that make your people more productive, your infrastructure more intelligent, and your business more profitable.
626 120th Ave NE, B102, Bellevue,
Chennai One IT SEZ,
Module No:5-C, Phase ll, 2nd Floor, North Block, Pallavaram-Thoraipakkam 200 ft road, Thoraipakkam, Chennai – 600097
Get in touch
Please contact us using the form below