1 change: 1 addition & 0 deletions docs/StardustDocs/d.tree
@@ -45,6 +45,7 @@
<toc-element topic="schemasInheritance.md"/>
<toc-element topic="Data-Schemas-In-Kotlin-Notebook.md"/>
<toc-element topic="schemasImportOpenApiJupyter.md"/>
<toc-element topic="Data-Schemas-And-Extension-Properties-Troubleshooting.md"/>
<toc-element topic="Gradle-Plugin.md">
<toc-element topic="schemasGradle.md"/>
<toc-element topic="schemasImportSqlGradle.md"/>
12 changes: 12 additions & 0 deletions docs/StardustDocs/topics/extensionPropertiesApi.md
@@ -291,3 +291,15 @@ However, you can work around this by casting back to the original schema:
df.add("name") { "branchName" }
.filter { it.cast<BranchData>().profit > 0 }
```

## Troubleshooting

Sometimes you can get an exception with a message containing

```plain text
..exception in generated DataFrame extension property..
```

This may be caused by incompatible schema usage or incorrectly defined column types.

See [](Data-Schemas-And-Extension-Properties-Troubleshooting.md) for more information.
@@ -0,0 +1,83 @@
# Data Schemas and Extension Properties Troubleshooting

Sometimes you can get an exception with a message containing

```plain text
..exception in generated DataFrame extension property..
```

This means there is a runtime error while accessing a [`DataFrame` extension property](extensionPropertiesApi.md),
generated by the [Compiler Plugin](Compiler-Plugin.md) or in the [Kotlin Notebook](SetupKotlinNotebook.md).
Collaborator

*in Kotlin Notebook (the name of the product), or *in a (Kotlin) notebook (the concept)


Such errors are caused by generating extension properties for data schemas that are not compatible with the
Collaborator

maybe "mismatch" is a better fit here

[`DataFrame`](DataFrame.md), [`DataRow`](DataRow.md), etc.
In most cases, the schema contains columns of the wrong names or types.
Collaborator

columns with an incorrect name or type

Collaborator

For example:


```kotlin
@DataSchema
interface Schema {
    val age: String
}
```

```kotlin
val df = dataFrameOf("age" to columnOf(17, 32, 26)).cast<Schema>()

// Compiles correctly but fails at runtime
df.filter { age > 20 }
```
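
To see the mismatch in the example above, it can help to compare the declared schema with the types the dataframe actually holds. The following is a minimal sketch that uses the String API (`df["age"]`) so it does not depend on generated extension properties; `runtimeTypeOf` is a hypothetical helper introduced here for illustration, not part of the library.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.api.*
import kotlin.reflect.KType

// Hypothetical helper: returns the runtime KType stored for the given column.
fun runtimeTypeOf(df: AnyFrame, column: String): KType = df[column].type()

fun main() {
    val df = dataFrameOf("age" to columnOf(17, 32, 26))

    // Prints the runtime schema, which can be compared with the declared `Schema`.
    println(df.schema())

    // Prints the runtime type of the single column in question.
    println(runtimeTypeOf(df, "age"))
}
```

If the printed type differs from the property type declared in the data schema, the generated extension property for that column is likely to fail when accessed.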

## Possible reasons

### Incompatible manually defined data schema

If you define initad data schema manually,
Collaborator

the initial

make sure your data schema is compatible with the [`DataFrame`](DataFrame.md).
Collaborator

I think the concept "the dataframe" is better here

Collaborator

and you're describing just a solution here, not the reason it fails. I'd make a clear distinction for first the reason it fails and then second how to solve it


* Use [special methods](DataSchemaGenerationMethods.md) to generate the data schema code
instead of defining the data schema manually.
* Use [`.cast<Schema>()`](cast.md) with `verify=true` to verify that `Schema` is compatible.
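
The verification step from the list above can be sketched as follows. This is a minimal illustration, not a definitive recipe: `isCompatible` is a hypothetical helper, and the sketch assumes that `cast<Schema>(verify = true)` throws when a column type does not match the schema, without relying on a specific exception class.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.annotations.DataSchema
import org.jetbrains.kotlinx.dataframe.api.*

@DataSchema
interface Schema {
    val age: String
}

// Hypothetical helper: true if the dataframe passes verification against `Schema`.
fun isCompatible(df: AnyFrame): Boolean =
    runCatching { df.cast<Schema>(verify = true) }.isSuccess

fun main() {
    // `age` holds Int values, while `Schema` declares it as String.
    val ints = dataFrameOf("age" to columnOf(17, 32, 26))
    println(isCompatible(ints))

    // `age` holds String values, matching the declared schema.
    val strings = dataFrameOf("age" to columnOf("17", "32", "26"))
    println(isCompatible(strings))
}
```

Verifying right after the cast moves the failure to the point where the schema is applied, rather than to a later property access.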

### Incorrect column types after `DataFrame` creation

Sometimes the runtime schema is wrong itself,
Collaborator

*schema itself is wrong

because column types differ from the actual column value types.
This can happen when reading a DataFrame from files or databases.

> Such cases are most probably bugs! Please report them on [GitHub Issues](https://github.com/Kotlin/dataframe/issues).
{style="warning"}

Possible workarounds:
Collaborator

here it's better :) first you describe the problem/cause and then the workaround


* Specify the correct type using [`.replace {}`](replace.md) and `ValueColumn.changeType()`:
Collaborator

may be important to mention the two different ways a column can have a "type". There's the internal KType, used by runtime functions, and there's the compile-time type, visible in the IDE, which is also sometimes used in runtime when you refer to a column in an inline reified function, but usually it's visual only

Collaborator

oh and this workaround only works if you're working with a value column, obviously ;P


Collaborator

korro?

```kotlin
// `wrongTypeCol` and `ActualType` are placeholders for your column and its real value type
df.replace { wrongTypeCol }.with { it.asValueColumn().changeType(typeOf<ActualType>()) }
```

* Use [`.inferType { columns }`](inferType.md) to infer the correct types
for the selected columns from the actual values.
**It can take a long time and use up a lot of resources for large dataframes!**
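
The `inferType` workaround from the list above can be sketched like this. It is a minimal example under stated assumptions: the column is built by hand with a deliberately wrong runtime type (`Any`) to imitate a misreported type, the `inferType("age")` overload taking column names is assumed to be available in your DataFrame version, and `fixTypes` is a hypothetical helper.

```kotlin
import org.jetbrains.kotlinx.dataframe.AnyFrame
import org.jetbrains.kotlinx.dataframe.DataColumn
import org.jetbrains.kotlinx.dataframe.api.*
import kotlin.reflect.typeOf

// Hypothetical helper: re-infers the type of the `age` column from its actual values.
fun fixTypes(df: AnyFrame): AnyFrame = df.inferType("age")

fun main() {
    // Imitate a column whose declared runtime type (Any) is wider than its values (Int).
    val age = DataColumn.createValueColumn("age", listOf<Any>(17, 32, 26), typeOf<Any>())
    val df = dataFrameOf(age)
    println(df["age"].type())

    // After inference, the column type is derived from the values themselves.
    val fixed = fixTypes(df)
    println(fixed["age"].type())
}
```

Restricting inference to the affected columns keeps the cost down on large dataframes, since every value of each selected column has to be inspected.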

#### Problems with type affinity in SQLite

Because of [SQLite type affinity](https://sqlite.org/datatype3.html),
the column type defined by JDBC may differ from the actual values in the column.
Collaborator

can you link to JDBC docs?

This problem often occurs when reading data from an SQLite database with columns of custom types.

You can provide types for such columns manually:

Collaborator

korro :)

```Kotlin
import org.jetbrains.kotlinx.dataframe.io.db.Sqlite
import kotlin.reflect.typeOf

val sqliteCustom = Sqlite.withCustomTypes(
    mapOf(
        "LONGVARCHAR" to typeOf<String>(),
        "LONGINT" to typeOf<Long>()
    )
)
val df = DataFrame.readSqlTable(
    connection, "table_name", dbType = sqliteCustom
)
```