Automating PowerBI Refreshes using Python Notebook in Databricks

PowerBI is a collection of analytics services by Microsoft that helps turn your data into insights. It consists of different elements that work together.

PowerBI Desktop : A windows application where the user can create reports & visualization.

PowerBI Service : A SaaS to view reports and dashboards. Also you can create and manage dataflows (collection of entities/tables). Refresh of these dataflows and datasets of published reports can be scheduled in PowerBI service or they can be executed with the help of powerbi rest API

Even though you can schedule refresh of datasets, dataflows in PowerBI service, using REST API you can integrate the refreshes to your ETL or other workflows.

STEPS

  1. Creating an App in Azure Portal and Secutiry Group. Client id associated with the app will be required for authentication. You can either do this via portal.azure.com or with PowerShell.
#Install Module required to sign into Azure Directory
Install-Module AzureAD
#Sign in Azure
Connect-AzureAD
# Create a new Azure AD web application
$app = New-AzureADApplication -DisplayName "AppName" -Homepage "https://localhost:44322" -ReplyUrls "https://localhost:44322"
# Creates a service principal
$sp = New-AzureADServicePrincipal -AppId $app.AppId
# Get the service principal key
$key = New-AzureADServicePrincipalPasswordCredential -ObjectId $sp.ObjectId
# Create an Azure AD security group
$group = New-AzureADGroup -DisplayName <Group display name> -SecurityEnabled $true
#Add the Service principal created in above step to the Sec. group
Add-AzureADGroupMember -ObjectId $($group.ObjectId) -RefObjectId $($sp.ObjectId)

2. PowerBI Admin needs to enable Service principal access in Admin Portal and add security group created in Step1.

3. Add Security Group to your workspace. (PowerShell)

Login-PowerBI  #Interactive LoginAdd-PowerBIWorkspaceUser -Id $workspaceid -AccessRight Member -PrincipalType Group -Identifier $group.ObjectId

workspaceid can be obtained from the powerbi service link

https://app.powerbi.com/groups/<workspace_id>/list/dashboards

Databricks Notebook:

  1. Installation of adal library to authenticate
!pip install — upgrade pip 
!pip install adal # for authentication
import adal

2. Token creation by passing Client_id and client_secret which is created in Azure Portal → App Registration →Certificate & Assets . (Copy the value)

tenant_id = "<tenant_id>"
authority_url = 'https://login.microsoftonline.com/'+tenant_id+'/'
resource_url = 'https://analysis.windows.net/powerbi/api'
client_secret = "client_secret>"
client_id = "<client_id>"
context = adal.AuthenticationContext(authority=authority_url,
validate_authority=True,
api_version=None)
token = context.acquire_token_with_client_credentials(resource_url, client_id, client_secret)
access_token = token.get('accessToken')

3. Import requests, json library and call GET Method to get the list of all the dataflows in the workspace in scope.

import requests
import json
workspaceID = "<workspace_id>"
dataflow_url = 'https://api.powerbi.com/v1.0/myorg/groups/' + workspaceID + '/dataflows'
header = {'Authorization': f'Bearer {access_token}','Content-Type':'application/json'}
dataflows = json.loads(requests.get(url=dataflow_url, headers=header).content)['value']

4. Loop through list of dataflows and refresh them.

Make sure to give parameter ‘refreshRequest’:’y’ else it will through Error 400. This is not required while using Invoke-RestMethod in PowerShell. I was struggling a lot with this error when migrating from powershell to Python. (Thanks to Postman App https://www.postman.com/ )

for dataflow in dataflows:
dataflow_name = dataflow['name']
dataflowid = dataflow['objectId']
url = 'https://api.powerbi.com/v1.0/myorg/groups/' + workspaceID + '/dataflows/' +dataflowid+ '/refreshes'
payload= {'refreshRequest': 'y'}
data = json.dumps(payload)
response = requests.request("POST", url, headers=header, data=data)
if response.status_code ==200:
print (dataflow_name+ " refresh started")
else:
print("Error code :" + str(response.status_code))

Data Warehouse Developer