Kotlin · Jolanrensen · Jun 2, 2026 · May 29, 2026 · Jun 2, 2026 · Jun 2, 2026
diff --git a/build-logic/src/main/kotlin/dfbuild.buildExampleProjects.gradle.kts b/build-logic/src/main/kotlin/dfbuild.buildExampleProjects.gradle.kts
@@ -29,6 +29,8 @@ val versionsToSync =
         "maven",
         "kotlinDatetime",
         "log4j",
+        "spark3",
+        "kotlin-spark",
         "spark4",
         "kotlin-dl",
     )

diff --git a/docs/StardustDocs/topics/concepts/concepts.md b/docs/StardustDocs/topics/concepts/concepts.md
@@ -44,7 +44,7 @@ This is why it was designed to be hierarchical and allows nesting of columns and
 * [**Interoperable**](collectionsInterop.md) — convertable with Kotlin data classes and collections.
   This also means conversion to/from other libraries' data structures is usually quite straightforward!
   See our [examples](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources) 
-  for some conversions between DataFrame and [Apache Spark](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark), [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik), and [JetBrains Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed).
+  for some conversions between DataFrame and [Apache Spark](https://github.com/Kotlin/dataframe/tree/master/examples/projects/kotlin-spark), [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik), and [JetBrains Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed).
 * **Generic** — can store objects of any type, not only numbers or strings.
 * **Typesafe** — the Kotlin DataFrame library provides a mechanism of on-the-fly [**generation of extension properties**](extensionPropertiesApi.md) 
 that correspond to the columns of a dataframe. 

diff --git a/docs/StardustDocs/topics/dataSources/Integrations.md b/docs/StardustDocs/topics/dataSources/Integrations.md
@@ -19,7 +19,7 @@ Below is a list of example integrations with other data frameworks.
 These examples demonstrate how to bridge Kotlin DataFrame with external libraries or APIs.
 
 - [Kotlin Exposed](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed)
-- [Apache Spark (with/without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark)
+- [Apache Spark (with/without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/projects/kotlin-spark)
 - [Multik](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik)
 
 You can use these examples as templates to create your own integrations

diff --git a/docs/StardustDocs/topics/guides/Guides-And-Examples.md b/docs/StardustDocs/topics/guides/Guides-And-Examples.md
@@ -58,7 +58,7 @@ and make working with your data both convenient and type-safe.
 * [Using Unsupported Data Sources](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/src/main/kotlin/org/jetbrains/kotlinx/dataframe/examples):
   — A guide by examples. While these might one day become proper integrations of DataFrame, for now,
   we provide them as examples for how to make such integrations yourself.
-    * [Apache Spark Interop (With and Without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/spark)
+    * [Apache Spark Interop (With and Without Kotlin Spark API)](https://github.com/Kotlin/dataframe/tree/master/examples/projects/kotlin-spark)
     * [Multik Interop](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/multik)
     * [JetBrains Exposed Interop](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/exposed)
     * [Hibernate ORM](https://github.com/Kotlin/dataframe/tree/master/examples/idea-examples/unsupported-data-sources/hibernate)

diff --git a/examples/README.md b/examples/README.md
@@ -21,7 +21,7 @@ They show how to convert to and from Kotlin DataFrame and their respective table
     for an example of using Kotlin DataFrame with [Exposed](https://github.com/JetBrains/Exposed).
   * **Hibernate**: See the [hibernate folder](./idea-examples/unsupported-data-sources/hibernate)
     for an example of using Kotlin DataFrame with [Hibernate](https://hibernate.org/orm/).
-  * **Apache Spark**: See the [spark folder](./idea-examples/unsupported-data-sources/spark)
+  * **Apache Spark**: See the [kotlin-spark folder](./projects/kotlin-spark)
     for an example of using Kotlin DataFrame with [Spark](https://spark.apache.org/) and with the [Kotlin Spark API](https://github.com/JetBrains/kotlin-spark-api).
   * **Multik**: See the [multik folder](./idea-examples/unsupported-data-sources/multik)
     for an example of using Kotlin DataFrame with [Multik](https://github.com/Kotlin/multik).

diff --git a/examples/projects/dev/kotlin-spark/.editorconfig b/examples/projects/dev/kotlin-spark/.editorconfig
@@ -0,0 +1,41 @@
+root = true
+
+[*]
+charset = utf-8
+end_of_line = lf
+insert_final_newline = true
+indent_style = space
+indent_size = 4
+max_line_length = 120
+
+[*.json]
+indent_size = 2
+
+[{*.yaml,*.yml}]
+indent_size = 2
+
+[*.ipynb]
+insert_final_newline = false
+
+[*.{kt,kts}]
+ij_kotlin_code_style_defaults = KOTLIN_OFFICIAL
+
+#  Disable wildcard imports entirely
+ij_kotlin_name_count_to_use_star_import = 2147483647
+ij_kotlin_name_count_to_use_star_import_for_members = 2147483647
+ij_kotlin_packages_to_use_import_on_demand = unset
+
+ktlint_code_style = ktlint_official
+ktlint_experimental = enabled
+ktlint_standard_filename = disabled
+ktlint_standard_no-empty-first-line-in-class-body = disabled
+ktlint_class_signature_rule_force_multiline_when_parameter_count_greater_or_equal_than = 4
+ktlint_function_signature_rule_force_multiline_when_parameter_count_greater_or_equal_than = 4
+ktlint_standard_chain-method-continuation = disabled
+ktlint_ignore_back_ticked_identifier = true
+ktlint_standard_multiline-expression-wrapping = disabled
+ktlint_standard_when-entry-bracing = disabled
+ktlint_standard_expression-operand-wrapping = disabled
+
+[{*/build/**/*,**/*keywords*/**,**/*.Generated.kt,**/*$Extensions.kt,**/BuildConfig.kt}]
+ktlint = disabled
diff --git a/examples/projects/dev/kotlin-spark/README.md b/examples/projects/dev/kotlin-spark/README.md
@@ -0,0 +1,18 @@
+# Apache Spark
+
+Showcase of how to use DataFrame with [Apache Spark](https://spark.apache.org/) and
+the [Kotlin Spark API](https://github.com/JetBrains/kotlin-spark-api).
+
+Even though Spark is not officially supported as a data source in DataFrame,
+this project shows how to convert from and to Spark tables.
+
+This project uses the
+[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).
+
+We recommend using an up-to-date IntelliJ IDEA for the best experience,
+as well as the latest Kotlin plugin version.
+
+> [!WARNING]
+> Proper functionality in IntelliJ IDEA requires version 2025.2 or newer.
+
+[Download this Example](https://github.com/Kotlin/dataframe/raw/example-projects-archives/kotlin-spark.zip)
diff --git a/...orted-data-sources/spark/build.gradle.kts → ...rojects/dev/kotlin-spark/build.gradle.kts b/...orted-data-sources/spark/build.gradle.kts → ...rojects/dev/kotlin-spark/build.gradle.kts
@@ -1,28 +1,25 @@
 import org.jetbrains.kotlin.gradle.dsl.JvmTarget
 
 plugins {
-    application
-    kotlin("jvm")
-
-    // uses the 'old' Gradle plugin instead of the compiler plugin for now
-    id("org.jetbrains.kotlinx.dataframe")
+    alias(libs.plugins.kotlin.jvm)
+    alias(libs.plugins.kotlin.dataframe)
+    alias(libs.plugins.ktlint.gradle)
 
-    // only mandatory if `kotlin.dataframe.add.ksp=false` in gradle.properties
-    id("com.google.devtools.ksp")
+    application
 }
 
 repositories {
-    mavenLocal() // in case of local dataframe development
     mavenCentral()
 }
 
 dependencies {
-    // implementation("org.jetbrains.kotlinx:dataframe:X.Y.Z")
-    implementation(project(":"))
+    implementation(libs.dataframe)
 
-    // (kotlin) spark support
+    // (Kotlin) Spark SQL (Spark 3.3.2)
     implementation(libs.kotlin.spark)
-    compileOnly(libs.spark)
+    compileOnly(libs.spark.sql)
+
+    // Logging to keep Spark quiet
     implementation(libs.log4j.core)
     implementation(libs.log4j.api)
 }
@@ -64,6 +61,7 @@ val runSparkUntypedDataset by tasks.registering(JavaExec::class) {
 }
 
 kotlin {
+    jvmToolchain(11)
     compilerOptions {
         jvmTarget = JvmTarget.JVM_11
         freeCompilerArgs.add("-Xjdk-release=11")

diff --git a/examples/projects/dev/kotlin-spark/gradle.properties b/examples/projects/dev/kotlin-spark/gradle.properties
@@ -0,0 +1,5 @@
+org.gradle.jvmargs=-Xmx1g -Dfile.encoding=UTF-8
+kotlin.code.style=official
+# Disabling incremental compilation will no longer be necessary
+# when https://youtrack.jetbrains.com/issue/KT-66735 is resolved.
+kotlin.incremental=false
diff --git a/examples/projects/dev/kotlin-spark/gradle/libs.versions.toml b/examples/projects/dev/kotlin-spark/gradle/libs.versions.toml
@@ -0,0 +1,24 @@
+[versions]
+kotlin = "2.3.21"
+dataframe = "1.0.0-Beta5"
+ktlint-gradle = "14.0.1"
+ktlint = "1.8.0"
+log4j = "2.25.4"
+
+# check the versions down in the [libraries] section too!
+kotlin-spark = "1.2.4"
+spark3 = "3.3.2"
+
+[libraries]
+dataframe = { module = "org.jetbrains.kotlinx:dataframe", version.ref = "dataframe" }
+log4j-core = { group = "org.apache.logging.log4j", name = "log4j-core", version.ref = "log4j" }
+log4j-api = { group = "org.apache.logging.log4j", name = "log4j-api", version.ref = "log4j" }
+kotlin-spark = { group = "org.jetbrains.kotlinx.spark", name = "kotlin-spark-api_3.3.2_2.13", version.ref = "kotlin-spark" }
+spark-sql = { group = "org.apache.spark", name = "spark-sql_2.13", version.ref = "spark3" }
+
+[plugins]
+kotlin-jvm = { id = "org.jetbrains.kotlin.jvm", version.ref = "kotlin" }
+ktlint-gradle = { id = "org.jlleitschuh.gradle.ktlint", version.ref = "ktlint-gradle" }
+
+# The Kotlin DataFrame Compiler plugin is the same version as the Kotlin plugin.
+kotlin-dataframe = { id = "org.jetbrains.kotlin.plugin.dataframe", version.ref = "kotlin" }
diff --git a/examples/projects/dev/kotlin-spark/gradle/wrapper/gradle-wrapper.jar b/examples/projects/dev/kotlin-spark/gradle/wrapper/gradle-wrapper.jar
diff --git a/examples/projects/dev/kotlin-spark/gradle/wrapper/gradle-wrapper.properties b/examples/projects/dev/kotlin-spark/gradle/wrapper/gradle-wrapper.properties
@@ -0,0 +1,7 @@
+distributionBase=GRADLE_USER_HOME
+distributionPath=wrapper/dists
+distributionUrl=https\://services.gradle.org/distributions/gradle-9.5.0-bin.zip
+networkTimeout=10000
+validateDistributionUrl=true
+zipStoreBase=GRADLE_USER_HOME
+zipStorePath=wrapper/dists
diff --git a/examples/projects/dev/kotlin-spark/settings.gradle.kts b/examples/projects/dev/kotlin-spark/settings.gradle.kts
@@ -0,0 +1,18 @@
+pluginManagement {
+    repositories {
+        maven("https://packages.jetbrains.team/maven/p/kt/dev/")
+        mavenCentral()
+        gradlePluginPortal()
+    }
+}
+plugins {
+    id("org.gradle.toolchains.foojay-resolver-convention") version "1.0.0"
+}
+rootProject.name = "kotlin-spark"
+
+// region generated-config
+
+// substitutes dependencies provided by the root project
+includeBuild("../../../..")
+
+// endregion
diff --git a/...xamples/kotlinSpark/compatibilityLayer.kt → ...xamples/kotlinSpark/compatibilityLayer.kt b/...xamples/kotlinSpark/compatibilityLayer.kt → ...xamples/kotlinSpark/compatibilityLayer.kt
diff --git a/...rame/examples/kotlinSpark/typedDataset.kt → ...rame/examples/kotlinSpark/typedDataset.kt b/...rame/examples/kotlinSpark/typedDataset.kt → ...rame/examples/kotlinSpark/typedDataset.kt
@@ -13,7 +13,7 @@ import org.jetbrains.kotlinx.dataframe.api.print
 import org.jetbrains.kotlinx.dataframe.api.schema
 import org.jetbrains.kotlinx.dataframe.api.std
 import org.jetbrains.kotlinx.dataframe.api.toDataFrame
-import org.jetbrains.kotlinx.dataframe.api.toList
+import org.jetbrains.kotlinx.dataframe.api.toListOf
 import org.jetbrains.kotlinx.spark.api.withSpark
 
 /**
@@ -60,14 +60,17 @@ fun main() = withSpark {
     ageStats.print(columnTypes = true, borders = true)
 
     // and when we want to convert a DataFrame back to Spark, we can do the same trick via a typed List
-    val sparkDatasetAgain = dataframe.toList().toDS()
+    // When using the compiler plugin, specify the target data class explicitly:
+    // the local type generated by the plugin is not an instantiable data class.
+    val sparkDatasetAgain = dataframe.toListOf<Person>().toDS()
     sparkDatasetAgain.printSchema()
     sparkDatasetAgain.show()
 }
 
 @DataSchema
 data class Name(val firstName: String, val lastName: String)
 
+// The @DataSchema annotation is optional for this specific example, but is generally recommended
 @DataSchema
 data class Person(
     val name: Name,

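Note on the `toList()` to `toListOf<Person>()` change above: a minimal sketch of the same round trip without Spark, assuming only the `org.jetbrains.kotlinx:dataframe` dependency from this project's version catalog. The flat `Person` class and the sample values here are hypothetical and simpler than the nested `Person`/`Name` classes in `typedDataset.kt`.

```kotlin
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.toDataFrame
import org.jetbrains.kotlinx.dataframe.api.toListOf

// Hypothetical flat schema, used only for this illustration.
@DataSchema
data class Person(val name: String, val age: Int)

fun main() {
    val people = listOf(Person("Alice", 30), Person("Bob", 25))

    // List -> DataFrame: one column per constructor property.
    val dataframe = people.toDataFrame()

    // DataFrame -> List: naming the target class explicitly means the rows are
    // materialized as Person instances, whatever the static row type of `dataframe` is.
    val roundTripped: List<Person> = dataframe.toListOf<Person>()

    check(roundTripped == people) { "Round trip should preserve all rows" }
    println(roundTripped)
}
```

The resulting `List<Person>` is what the example then hands to `toDS()` (Kotlin Spark API) or `spark.createDataset(...)` (plain Spark).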
diff --git a/...me/examples/kotlinSpark/untypedDataset.kt → ...me/examples/kotlinSpark/untypedDataset.kt b/...me/examples/kotlinSpark/untypedDataset.kt → ...me/examples/kotlinSpark/untypedDataset.kt
diff --git a/...rame/examples/spark/compatibilityLayer.kt → ...rame/examples/spark/compatibilityLayer.kt b/...rame/examples/spark/compatibilityLayer.kt → ...rame/examples/spark/compatibilityLayer.kt
diff --git a/.../dataframe/examples/spark/typedDataset.kt → .../dataframe/examples/spark/typedDataset.kt b/.../dataframe/examples/spark/typedDataset.kt → .../dataframe/examples/spark/typedDataset.kt
@@ -18,7 +18,7 @@ import org.jetbrains.kotlinx.dataframe.api.print
 import org.jetbrains.kotlinx.dataframe.api.schema
 import org.jetbrains.kotlinx.dataframe.api.std
 import org.jetbrains.kotlinx.dataframe.api.toDataFrame
-import org.jetbrains.kotlinx.dataframe.api.toList
+import org.jetbrains.kotlinx.dataframe.api.toListOf
 import java.io.Serializable
 
 /**
@@ -78,7 +78,9 @@ fun main() {
     ageStats.print(columnTypes = true, borders = true)
 
     // and when we want to convert a DataFrame back to Spark, we can do the same trick via a typed List
-    val sparkDatasetAgain = spark.createDataset(dataframe.toList(), beanEncoderOf())
+    // When using the compiler plugin, specify the target data class explicitly:
+    // the local type generated by the plugin is not an instantiable data class.
+    val sparkDatasetAgain = spark.createDataset(dataframe.toListOf<Person>(), beanEncoderOf())
     sparkDatasetAgain.printSchema()
     sparkDatasetAgain.show()
 
@@ -93,6 +95,7 @@ data class Name
     @JvmOverloads
     constructor(var firstName: String = "", var lastName: String = "") : Serializable
 
+// The @DataSchema annotation is optional for this specific example, but is generally recommended
 @DataSchema
 data class Person
     @JvmOverloads

diff --git a/...ataframe/examples/spark/untypedDataset.kt → ...ataframe/examples/spark/untypedDataset.kt b/...ataframe/examples/spark/untypedDataset.kt → ...ataframe/examples/spark/untypedDataset.kt
diff --git a/examples/projects/kotlin-spark/.editorconfig b/examples/projects/kotlin-spark/.editorconfig
@@ -0,0 +1,41 @@
+root = true
+
+[*]
+charset = utf-8
+end_of_line = lf
+insert_final_newline = true
+indent_style = space
+indent_size = 4
+max_line_length = 120
+
+[*.json]
+indent_size = 2
+
+[{*.yaml,*.yml}]
+indent_size = 2
+
+[*.ipynb]
+insert_final_newline = false
+
+[*.{kt,kts}]
+ij_kotlin_code_style_defaults = KOTLIN_OFFICIAL
+
+#  Disable wildcard imports entirely
+ij_kotlin_name_count_to_use_star_import = 2147483647
+ij_kotlin_name_count_to_use_star_import_for_members = 2147483647
+ij_kotlin_packages_to_use_import_on_demand = unset
+
+ktlint_code_style = ktlint_official
+ktlint_experimental = enabled
+ktlint_standard_filename = disabled
+ktlint_standard_no-empty-first-line-in-class-body = disabled
+ktlint_class_signature_rule_force_multiline_when_parameter_count_greater_or_equal_than = 4
+ktlint_function_signature_rule_force_multiline_when_parameter_count_greater_or_equal_than = 4
+ktlint_standard_chain-method-continuation = disabled
+ktlint_ignore_back_ticked_identifier = true
+ktlint_standard_multiline-expression-wrapping = disabled
+ktlint_standard_when-entry-bracing = disabled
+ktlint_standard_expression-operand-wrapping = disabled
+
+[{*/build/**/*,**/*keywords*/**,**/*.Generated.kt,**/*$Extensions.kt,**/BuildConfig.kt}]
+ktlint = disabled
diff --git a/examples/projects/kotlin-spark/README.md b/examples/projects/kotlin-spark/README.md
@@ -0,0 +1,18 @@
+# Apache Spark
+
+Showcase of how to use DataFrame with [Apache Spark](https://spark.apache.org/) and
+the [Kotlin Spark API](https://github.com/JetBrains/kotlin-spark-api).
+
+Even though Spark is not officially supported as a data source in DataFrame,
+this project shows how to convert from and to Spark tables.
+
+This project uses the
+[Kotlin DataFrame Compiler Plugin](https://kotlin.github.io/dataframe/compiler-plugin.html).
+
+We recommend using an up-to-date IntelliJ IDEA for the best experience,
+as well as the latest Kotlin plugin version.
+
+> [!WARNING]
+> Proper functionality in IntelliJ IDEA requires version 2025.2 or newer.
+
+[Download this Example](https://github.com/Kotlin/dataframe/raw/example-projects-archives/kotlin-spark.zip)