Skip to content

matbdev/bi-rs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MS-PowerBI Practical Project: RS Transparency Data Analysis

Note: This project was developed as an academic assignment for a Business Intelligence discipline.

📊 Project Intent

The main objective of this work is to demonstrate the practical application of business intelligence concepts by transforming raw public records into interactive graphical reports to generate valuable insights. It focuses on the public transparency data from the state of Rio Grande do Sul (Brazil), aiming to show that true transparency goes beyond merely publishing open data—it requires structuration, analysis, and accessible dashboards.

🗂️ Data Sources

The data was collected from the Portal da Transparência do Governo do Estado do Rio Grande do Sul (Open Data section). The analysis covers the following themes:

  • Revenues (Receitas): Consolidated data on state revenues.
  • Contracts (Contratos): Real estate lease contracts.
  • Payroll (Servidores): Monthly data on executive branch personnel (January 2023 to February 2026).
  • Daily Allowances (Diárias dos Servidores): Consolidated travel allowance data for the years 2023, 2024, and 2025.

Project Structure

bi-rs/
├── README.md
├── requirements.txt
├── assets/                          # Project images and documentation assets
├── dashboards/
│   ├── RS Transparency Data Analysis.pbip             # Power BI Project file
│   ├── RS Transparency Data Analysis.Report/          # Report definitions
│   └── RS Transparency Data Analysis.SemanticModel/   # Semantic model (DAX, Measures)
└── data-engineering/
    ├── workflow.py                  # Main orchestration script
    ├── bronze/
    │   ├── __init__.py
    │   └── load_files.py            # Raw data ingestion
    └── gold/
        ├── __init__.py
        └── gold_utils.py            # Data transformation & aggregation

Data Pipeline & Architecture

Due to the massive volume of data, the project implements a local Data Engineering pipeline based on the Medallion Architecture, orchestrated with Python and the Polars library (leveraging its lazy evaluation to avoid RAM exhaustion):

  • Raw Data: Raw .csv files as extracted from the portal. Automation (like CLI unzip) was used to optimize the extraction of multiple files.
  • Bronze Layer (data-engineering/bronze/):** Raw data ingested, trans-coded to UTF-8, and converted into highly optimized, compressed .parquet formats via stream processing.
  • Silver Layer: Formally skipped due to local storage capacity constraints, shifting transformations to the analytical consolidation phase.
  • Gold Layer (data-engineering/gold/):** The final step aggregates historical data mapping segments, yielding master tables (e.g., servidores.parquet, receita.parquet) that connect directly to Power BI.

How to Reproduce the Pipeline

  1. Create and activate a Virtual Environment:
    python -m venv venv
    # On Windows: .\venv\Scripts\activate
    # On Linux/Mac: source venv/bin/activate
  2. Install dependencies:
    pip install -r requirements.txt
  3. Run the workflow to ingest raw data and generate processed Parquet files:
    python data-engineering/workflow.py

📈 Power BI Development (Front-end)

Quick Start: Opening the Dashboard

To quickly open the Power BI project (.pbip) from your terminal in WSL, you can use the pbi utility script:

  1. Clone the util-terminal-scripts repository:

    git clone https://github.com/matbdev/util-terminal-scripts.git ~/.terminal-scripts
  2. Add it to your PATH (add this line to your ~/.bashrc or ~/.zshrc):

    export PATH="$HOME/.terminal-scripts:$PATH"
  3. Make scripts executable:

    chmod +x ~/.terminal-scripts/*
  4. From this project's dashboards/ directory, simply run:

    pbi

    This will instantly open the .pbip file in Power BI Desktop, bypassing WSL-to-Windows permission issues.

Dashboard Architecture

The visualization layer connects directly to the .parquet Gold files. It utilizes dynamic Power Query parameters (CaminhoDaBase) so the project can be seamlessly evaluated on different local machines.

The BI dashboard consists of four analytical panels featuring extensive DAX modeling and UI/UX design:

  1. Payroll (Servidores): Highlights gross costs vs. net costs passed to the employee. Includes Top N rankings of highest remunerations.
  2. Daily Allowances (Diárias): Crosses seasonal volume vs. average ticket, and maps temporal and financial effort concentrated in top destinations.
  3. Revenues (Receitas): Tracks budget execution metrics (handling negative current revenue deductions effectively) and features a Sankey Diagram to visualize the flow of cash. Also contains a drill-down matrix with 8 hierarchy classification levels.
  4. Contracts (Contratos): Manages the state's leased real estate portfolio, detailing contract durations and distinct cost modalities (rent, condo, IPTU).

To ensure the interface is self-explanatory, descriptive text tooltips were developed to provide users not just with numbers, but with methodological context on demand, maintaining educational value and high analytical governance.

🖼️ Dashboard Screenshots

Note: The dashboard screenshots below are presented in Portuguese (PT-BR) as they were originally developed for a Brazilian regional context.

1. Payroll (Servidores)

Payroll Overview

Income Distribution Temporal Analysis Tooltip Payroll Tooltip

2. Daily Allowances (Diárias)

Daily Allowances Overview

Quantity and Average Trip Duration Heatmap Tooltip Daily Allowances Tooltip

3. Revenues (Receitas)

Revenues Overview

Revenue Details Matrix (8 Classification Levels) Revenues Details

4. Contracts (Contratos de Locação)

Leasing Contracts

About

An end-to-end Business Intelligence project analyzing public transparency data from Rio Grande do Sul (Brazil). Features a data engineering pipeline using Python (Polars) and interactive Power BI dashboards.

Topics

Resources

Stars

Watchers

Forks

Contributors

Languages