125 changes: 91 additions & 34 deletions Readme.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# netchdf
_last updated: 7/16/2025_
_last updated: 7/18/2025_

This is a rewrite in Kotlin of parts of the devcdm and netcdf-java libraries.

Expand All @@ -10,6 +10,32 @@

Please contact me if you'd like to help out. Especially needed are test datasets from all the important data archives!!

<!-- TOC -->
* [netchdf](#netchdf)
* [Building](#building)
* [What version of the JVM, Kotlin, and Gradle?](#what-version-of-the-jvm-kotlin-and-gradle)
* [Why this library?](#why-this-library-)
* [Why do we need another library besides the standard reference libraries?](#why-do-we-need-another-library-besides-the-standard-reference-libraries)
* [What's wrong with the standard reference libraries?](#whats-wrong-with-the-standard-reference-libraries)
* [Why Kotlin?](#why-kotlin)
* [What about performance?](#what-about-performance)
* [Goals and scope](#goals-and-scope)
* [Non-goals](#non-goals)
* [Testing](#testing)
* [Code Coverage](#code-coverage)
* [Testing against the reference libraries](#testing-against-the-reference-libraries)
* [Data Model notes](#data-model-notes)
* [Type Safety and Generics](#type-safety-and-generics)
* [Cdl Names](#cdl-names)
* [Datatype](#datatype)
* [Typedef](#typedef)
* [Dimension](#dimension)
* [Compare with HDF5 data model](#compare-with-hdf5-data-model)
* [Compare with HDF4 data model](#compare-with-hdf4-data-model)
* [Compare with HDF-EOS data model](#compare-with-hdf-eos-data-model)
* [Elevator blurb](#elevator-blurb)
<!-- TOC -->

### Building

* Download Java 21 JDK and set JAVA_HOME.
Expand All @@ -26,6 +52,20 @@
* [Building and Running native library](docs/Building.md)
* [Building and Running ncdump](cli/Readme.md)

#### What version of the JVM, Kotlin, and Gradle?

We use the latest LTS (long term support) Java version, and will not be explicitly supporting older versions.
Currently that is Java 21.

We also use the latest stable version of Kotlin that is compatible with the Java version. Currently that is Kotlin 2.1.

Gradle is our build system. We will use the latest stable version of Gradle compatible with our Java and Kotlin versions.
Currently that is Gradle 8.14.

For now, you must download and build the library yourself. Eventually we will publish it to Maven Central.
The IntelliJ IDE is highly recommended for all JVM development.


### Why this library?

The scientific data stored in NetCDF and HDF file formats must remain forever readable.
Expand Down Expand Up @@ -121,31 +161,38 @@
It's not a goal to provide remote access to files.


### What version of the JVM, Kotlin, and Gradle?

We will always use the latest LTS (long term support) Java version, and will not be explicitly supporting older versions.
Currently that is Java 21.
### Testing

We also use the latest stable version of Kotlin that is compatible with the Java version. Currently that is Kotlin 2.1.
Currently most of the test files do not live in the GitHub repo because they are too big.
Eventually we will make them available as a separate download.

Gradle is our build system. We will use the latest stable version of Gradle compatible with our Java and Kotlin versions.
Currently that is Gradle 8.14.
There are four levels of testing:

For now, you must download and build the library yourself. Eventually we will publish it to Maven Central.
The IntelliJ IDE is highly recommended for all JVM development.
1. Unit testing that doesn't require reading files.
2. Testing with files in core/commonTest/data. These are fast and are run in a GitHub Action.
3. Testing with files in TestFiles.testData in module testfiles. These are medium fast (< 11 min wallclock).
4. Testing with files in TestFiles.testData in module testclibs. These are slow.

Currently we have 1500+ test files in the core and testdata modules:

### Testing
````
hdf-eos2 = 440 files
hdf-eos5 = 18 files
hdf4 = 32 files
hdf5 = 175 files
netcdf3 = 664 files
netcdf3.2 = 81 files
netcdf3.5 = 1 files
netcdf4 = 119 files

total # files = 1530
````

We use the Java [Foreign Function & Memory API](https://docs.oracle.com/en/java/javase/21/core/foreign-function-and-memory-api.html)
for testing against the Netcdf, HDF5, and HDF4 C libraries.
With these tools we can be confident that our library gives the same results as the reference libraries.
We will continue to add representative samples of recent files for improved testing and code coverage.

Currently using
* HDF5 library version: 1.14.6.
* netcdf-c library version 4.10.0-development of May 23 2025
#### Code Coverage

Currently we have this test coverage from core/test:
Currently we have this test coverage from the core and testfiles modules:

````
cdm 88% (1560/1764) LOC
Expand All @@ -154,28 +201,38 @@
netcdf3 77% (230/297) LOC
````

The core library has ~6500 LOC.
7/18/2025
````
cdm.api 94% (532/567) LOC
cdm.array 95% (662/698) LOC
cdm.iosp 68% (146/213) LOC
cdm.layout 89% (277/310) LOC
cdm.util 76% (106/139) LOC
hdf4 82% (1638/2008) LOC
hdf5 80% (2740/3417) LOC
netcdf3 80% (213/266) LOC

all 83% (6314/7618) LOC
````

The core library has ~7600 LOC.
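
The "all" row of the 7/18/2025 table can be re-derived from the per-package rows. The following is a minimal sketch of that arithmetic; `PackageCoverage`, `packages`, and `overallPercent` are hypothetical names used only for illustration and are not part of netchdf or of the coverage tool.

```kotlin
import kotlin.math.roundToInt

// Hypothetical holder for one row of the coverage table (not a netchdf class).
data class PackageCoverage(val name: String, val covered: Int, val total: Int)

// Covered and total LOC per package, copied from the 7/18/2025 table.
val packages = listOf(
    PackageCoverage("cdm.api", 532, 567),
    PackageCoverage("cdm.array", 662, 698),
    PackageCoverage("cdm.iosp", 146, 213),
    PackageCoverage("cdm.layout", 277, 310),
    PackageCoverage("cdm.util", 106, 139),
    PackageCoverage("hdf4", 1638, 2008),
    PackageCoverage("hdf5", 2740, 3417),
    PackageCoverage("netcdf3", 213, 266),
)

// Overall coverage is total covered LOC over total LOC, rounded to a whole percent.
fun overallPercent(rows: List<PackageCoverage>): Int {
    val covered = rows.sumOf { it.covered }
    val total = rows.sumOf { it.total }
    return (100.0 * covered / total).roundToInt()
}
```

Summing the rows gives 6314 covered out of 7618 LOC, which is where the ~7600 LOC figure comes from.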

#### Testing against the reference libraries

More and deeper test coverage is provided in the testclibs module, which compares netchdf metadata and data against
the Netcdf, HDF5, and HDF4 C libraries. The clibs module is not part of the released netchdf library and is
the Netcdf, HDF5, and HDF4 C libraries. Note that the clibs module is not part of the released netchdf library and is
only supported for test purposes.

Currently we have 1470 test files in the core test suite:
We use the Java [Foreign Function & Memory API](https://docs.oracle.com/en/java/javase/21/core/foreign-function-and-memory-api.html)
for testing against the Netcdf, HDF5, and HDF4 C libraries.
With these tools we can be confident that our library gives the same results as the reference libraries.

````
hdf-eos2 = 267 files
hdf-eos5 = 18 files
hdf4 = 205 files
hdf5 = 113 files
netcdf3 = 664 files
netcdf3.2 = 81 files
netcdf3.5 = 1 files
netcdf4 = 121 files

total # files = 1470
````
We need to get representative samples of recent files for improved testing and code coverage.
Currently using
* HDF5 library version: 1.14.6.
* netcdf-c library version 4.10.0-development of May 23 2025
* HDF-4 library version: ???

To run these tests, you must install the C libraries on your computer and add them to the LD_LIBRARY_PATH.

### Data Model notes

Expand Down Expand Up @@ -205,7 +262,7 @@
* Netcdf-4/HDF5 library encodes CHAR values as HDF5 string type with elemSize = 1, so we use that convention to detect
legacy CHAR variables in HDF5 files. (NC_CHAR should not be used in Netcdf-4, use NC_UBYTE or NC_STRING.)
* Netcdf-4/HDF5 String variables may be fixed or variable length. For fixed Strings, we set the size of Datatype.STRING to
the fixed size. For both fixed and variable length Strings, the string will be truncated at the first zero byte, if any.
* HDF4 does not have a STRING type, but does have signed and unsigned CHAR, and signed and unsigned BYTE.
We map both signed and unsigned to Datatype.CHAR and handle it as above (Attributes are Strings, Variables are UBytes).
* _Datatype.STRING_ always appears to be variable length to the user, regardless of whether the data in the file is variable or fixed length.
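
The truncation rule described above (the String ends at the first zero byte, and the zero is not part of the result) can be sketched on a plain `ByteArray`. This is an illustration only: `truncateAtNull` is a hypothetical name, it assumes UTF-8 content, and it is not the library's `makeStringFromBytes` implementation.

```kotlin
// Hypothetical helper, not part of netchdf: decode bytes as UTF-8,
// stopping at the first zero byte if there is one.
fun truncateAtNull(bytes: ByteArray): String {
    val firstZero = bytes.indexOf(0.toByte())
    // indexOf returns -1 when there is no zero byte; then use the whole array.
    val end = if (firstZero < 0) bytes.size else firstZero
    return bytes.decodeToString(0, end)
}
```

For a fixed-length field, this means that padding after the terminator is dropped, so fixed and variable length data look the same to the caller.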
Expand Down
4 changes: 2 additions & 2 deletions core/src/commonMain/kotlin/com/sunya/cdm/array/ArrayString.kt
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ fun ByteArray.makeStringFromBytes(): String {
* If there is a null (zero) value in the array, the String will end there.
* The null is not returned as part of the String.
*/
internal fun ArrayByte.makeStringFromBytes(charset : Charset = Charsets.UTF8): String {
fun ArrayByte.makeStringFromBytes(charset : Charset = Charsets.UTF8): String {
var count = 0
for (c in this) {
if (c.toInt() == 0) {
Expand All @@ -73,7 +73,7 @@ internal fun ArrayByte.makeStringFromBytes(charset : Charset = Charsets.UTF8): S
return this.values.decodeToString(charset, 0, count)
}

internal fun ArrayUByte.makeStringFromBytes(charset : Charset = Charsets.UTF8): String {
fun ArrayUByte.makeStringFromBytes(charset : Charset = Charsets.UTF8): String {
var count = 0
for (c in this) {
if (c.toInt() == 0) {
Expand Down
2 changes: 2 additions & 0 deletions core/src/commonMain/kotlin/com/sunya/cdm/array/ArrayULong.kt
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@ import com.sunya.cdm.layout.TransferChunk
@OptIn(ExperimentalUnsignedTypes::class)
class ArrayULong(shape : IntArray, datatype : Datatype<*>, val values: ULongArray) : ArrayTyped<ULong>(datatype, shape) {

constructor(shape : IntArray, values: ULongArray) : this(shape, Datatype.ULONG, values)

override fun iterator(): Iterator<ULong> = BufferIterator()
private inner class BufferIterator : AbstractIterator<ULong>() {
private var idx = 0
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -86,7 +86,7 @@ class StructureMember<T>(orgName: String, val datatype : Datatype<T>, val offset
val value = value(sdata)
if (value is ArrayTyped<*>) return value

if (value is String) return ArrayString(intArrayOf(1), listOf(value as String))
if (value is String) return ArrayString(intArrayOf(1), listOf(value))

return when (datatype) {
Datatype.BYTE -> ArrayByte(intArrayOf(1), ByteArray(1) { value as Byte })
Expand Down
Binary file added core/src/commonTest/data/jhdf/external_link.hdf5
Binary file added core/src/commonTest/data/jhdf/globalheaps_test.hdf5
Binary file added core/src/commonTest/data/jhdf/hdf_v14_test1.hdf5
Binary file added core/src/commonTest/data/jhdf/hdf_v14_test2.hdf5
Binary file added core/src/commonTest/data/jhdf/isssue-523.hdf5
Binary file added core/src/commonTest/data/jhdf/lz4_datasets.hdf5
Binary file added core/src/commonTest/data/jhdf/test_file.hdf5
Binary file added core/src/commonTest/data/jhdf/test_file2.hdf5
Binary file added core/src/commonTest/data/jhdf/test_file_ext.hdf5
Binary file added core/src/commonTest/data/jhdf/utf8-fixed-length.hdf5
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

class TestArrayFloat {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

class TestArrayInt {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

class TestArrayLong {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,16 +3,13 @@ package com.sunya.cdm.array
import com.sunya.cdm.api.*
import com.sunya.cdm.layout.IndexND
import com.sunya.cdm.layout.IndexSpace
import com.sunya.netchdf.testutil.propTestFastConfig
import com.sunya.netchdf.testutil.propTestSlowConfig
import com.sunya.netchdf.testutil.runTest
import io.kotest.property.Arb
import io.kotest.property.arbitrary.int
import io.kotest.property.arbitrary.string
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

class TestArrayTyped {

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

@OptIn(ExperimentalUnsignedTypes::class)
class TestArrayUByte {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

@OptIn(ExperimentalUnsignedTypes::class)
class TestArrayUInt {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

@OptIn(ExperimentalUnsignedTypes::class)
class TestArrayULong {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ import io.kotest.property.arbitrary.int
import io.kotest.property.checkAll
import kotlin.test.*
import kotlin.math.max
import kotlin.test.*

@OptIn(ExperimentalUnsignedTypes::class)
class TestArrayUShort {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ class TestChunkerIntersection {
assertEquals(expected.contentToString(), result.contentToString())
}

@Test
// @Test
fun intersectRight() {
// A rectangular subsection of indices, going from start to start + shape, relative to varShape
// class IndexSpace(startIn : LongArray, shapeIn : LongArray) {
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
package com.sunya.netchdf.testutil
package com.sunya.cdm.util

import kotlin.test.*
import kotlin.test.Test

class TestMisc {

Expand Down
38 changes: 21 additions & 17 deletions core/src/commonTest/kotlin/com/sunya/netchdf/hdf5/H5enumTest.kt
Original file line number Diff line number Diff line change
@@ -1,15 +1,19 @@
package com.sunya.netchdf.hdf5

import com.sunya.cdm.api.*
import com.sunya.cdm.api.CompoundTypedef
import com.sunya.cdm.api.Datatype
import com.sunya.cdm.api.EnumTypedef
import com.sunya.cdm.api.convertEnums
import com.sunya.cdm.array.ArrayStructureData
import com.sunya.cdm.array.ArrayTyped
import com.sunya.netchdf.openNetchdfFile
import com.sunya.netchdf.testutil.readNetchdfData
import com.sunya.netchdf.testfiles.testData

import kotlin.test.*
import com.sunya.netchdf.testutil.testData
import kotlin.test.Test
import kotlin.test.assertContains
import kotlin.test.assertContentEquals
import kotlin.test.assertEquals


class H5enumTest {

Expand All @@ -25,6 +29,13 @@ class H5enumTest {
}
}

@Test
fun testReadNetchdfData() {
files().forEach { filename ->
readNetchdfData(filename, null, null, true, true)
}
}

@Test
fun testEnumAttribute() {
val filename = testData + "devcdm/netcdf4/tst_enums.nc"
Expand All @@ -34,7 +45,7 @@ class H5enumTest {

val att = myfile.rootGroup().attributes.find{ it.name == "brady_attribute"}!!
println("brady_attribute = $att")
assertEquals(Datatype.ENUM1, att.datatype)
assertEquals(Datatype.Companion.ENUM1, att.datatype)
assertContentEquals(listOf(0.toUByte(), 3.toUByte(), 8.toUByte()), att.values)
assertEquals(listOf("Mike", "Marsha", "Alice"), att.convertEnums())

Expand All @@ -50,7 +61,7 @@ class H5enumTest {
println("--- ${myfile!!.type()} $filename ")
println(myfile.cdl())
val v = myfile.rootGroup().variables.find{ it.name == "EnumTest"}!!
assertEquals(Datatype.ENUM4, v.datatype)
assertEquals(Datatype.Companion.ENUM4, v.datatype)
val data = myfile.readArrayData(v)
println("EnumTest data = $data")
val expect = listOf(0,1,2,3,4,0,1,2,3,4)
Expand All @@ -73,15 +84,15 @@ class H5enumTest {
println("--- ${myfile!!.type()} $filename ")
println(myfile.cdl())
val v = myfile.rootGroup().variables.find{ it.name == "EnumCmpndTest"}!!
assertEquals(Datatype.COMPOUND, v.datatype)
assertEquals(Datatype.Companion.COMPOUND, v.datatype)
val typedef = v.datatype.typedef as CompoundTypedef
val member = typedef.members.find { it.name == "color_name"}!!

val mtypedef = member.datatype.typedef as EnumTypedef

val sdataArray = myfile.readArrayData(v)
println("EnumCmpndTest data = $sdataArray")
assertEquals(Datatype.COMPOUND, sdataArray.datatype)
assertEquals(Datatype.Companion.COMPOUND, sdataArray.datatype)
val dtypedef = v.datatype.typedef as CompoundTypedef
assertEquals(typedef, dtypedef)

Expand All @@ -91,8 +102,8 @@ class H5enumTest {
println("sdata = $sdata")
val wtf : ArrayTyped<*> = member.values(sdata)
println("value = $wtf")
assertEquals((idx % 5).toUInt(), wtf.first())
assertEquals(expectNames[idx % 5], wtf.convertEnums().first())
assertEquals(mtypedef.convertEnum(idx % 5), wtf.first())
// assertEquals(expectNames[idx % 5], wtf.convertEnums().first())
}
}
}
Expand All @@ -106,11 +117,4 @@ class H5enumTest {
readNetchdfData(filename, null, null, true, false)
}

@Test
fun testReadNetchdfData() {
files().forEach { filename ->
readNetchdfData(filename, null, null, true, true)
}
}

}