spark-sql

Here are 971 public repositories matching this topic...

getredash / redash

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

visualization javascript mysql python bigquery bi spark dashboard athena analytics postgresql snowflake mariadb business-intelligence redash redshift databricks hacktoberfest spark-sql

Updated Apr 19, 2026
Python

apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

kubernetes sql spark hive hadoop jdbc thrift data-lake hacktoberfest spark-sql

Updated Apr 24, 2026
Scala

dotnet / spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Updated Mar 2, 2026
C#

almond-sh / almond

A Scala kernel for Jupyter

scala spark jupyter repl jupyter-notebook jupyter-kernels spark-sql

Updated Apr 23, 2026
Scala

apache / gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

arrow clickhouse simd vectorization spark-sql velox

Updated Apr 24, 2026
Scala

databricks / LearningSparkV2

The GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

spark apache-spark mllib structured-streaming spark-sql spark-mllib mlflow delta-lake

Updated Jan 28, 2025
Scala

oeljeklaus-you / UserActionAnalyzePlatform

Big data platform for e-commerce user behavior analysis

java spark hadoop sparkjava accumulator spark-sql kyro

Updated Nov 16, 2022
Java

ploomber / jupysql

Better SQL in Jupyter. 📊

mysql python bigquery postgres data-science sql presto hive jupyter clickhouse sqlite snowflake data-engineering redshift tsql spark-sql trino duckdb polars

Updated Jan 4, 2026
Python

kevinschaich / pyspark-cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

documentation data-science data docs spark reference guide pyspark cheatsheet cheat quickstart references guides cheatsheets spark-sql pyspark-tutorial

Updated Feb 21, 2023

qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark

performance scala spark simulation cluster scheduler scheduling performance-metrics performance-tuning performance-visualization performance-analysis sparkjava spark-job spark-applications spark-sql spark-mllib spark-ml

Updated Jun 26, 2024
Scala

japila-books / spark-sql-internals

The Internals of Spark SQL

spark apache-spark book internals spark-sql mkdocs-material

Updated Jan 25, 2026

zsvoboda / ngods-stocks

New Generation Opensource Data Stack Demo

python spark metabase cube dbt iceberg spark-sql datahub trino dagster trinodb

Updated Feb 6, 2023
Jupyter Notebook

DataWithBaraa / databricks_bootcamp_2026

End-to-end Data Lakehouse project built on Databricks, following the Medallion Architecture (Bronze, Silver, Gold). Covers real-world data engineering and analytics workflows using Spark, PySpark, SQL, Delta Lake, and Unity Catalog. Designed for learning, portfolio building, and job interviews.

python ai spark apache-spark etl pyspark data-engineering data-analytics databricks data-pipeline spark-sql lakehouse data-lakehouse unity-catalog protfolio-project medallion-architecture data-engineering-project

Updated Jan 19, 2026
Jupyter Notebook

microsoft / data-accelerator

Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

Updated Apr 24, 2026
C#

cuebook / cuelake

Use SQL to build ELT pipelines on a data lakehouse.

sql apache-spark etl pipelines data-engineering data-lake data-transfer delta data-integration upsert elt data-pipeline datalake data-ingestion spark-sql zeppelin-notebook apache-iceberg lakehouse incremental-updates

Updated May 25, 2022
JavaScript

jaceklaskowski / spark-workshop

Apache Spark™ and Scala Workshops

workshop spark apache-spark spark-sql spark-mllib spark-structured-streaming spark-workshops

Updated Jul 29, 2024
HTML

Qbeast-io / qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!

scala big-data spark sampling datasource spark-sql data-lakehouse

Updated Jan 24, 2025
Scala

Chabane / bigdata-playground

A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL

Updated Feb 1, 2019
TypeScript

bluishglc / bdp

A prototype big data platform; the source code for the book Big Data Platform Architecture and Prototype

redis demo kafka spark prototype bigdata spark-streaming quickstart sparksql oozie sqoop spark-sql spark-streaming-examples sqoop-import spark-demo middle-end middle-office spark-examples

Updated Aug 12, 2020
Java

mc2-project / opaque-sql

An encrypted data analytics platform

security machine-learning privacy spark analytics enclave spark-sql

Updated Mar 29, 2023
Scala
