Merged
18 changes: 18 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,18 @@
name: codespell

on:
  pull_request:
    paths:
      - Readme.md

permissions: {}

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: codespell-project/actions-codespell@v2
        with:
          path: Readme.md
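
For readers unfamiliar with what this kind of check reports, the following is a minimal Python sketch of a dictionary-based scan that emits messages in the `word ==> suggestion` form seen in the annotation in this diff. It is not codespell's implementation: the three-entry `CORRECTIONS` table and the helper names `find_misspellings` and `format_hit` are hypothetical, made up for illustration, and the real tool ships its own much larger dictionaries.

```python
import re

# Hypothetical mini-dictionary, for illustration only.
# This is an assumption, not codespell's actual word list.
CORRECTIONS = {
    "hasnt": "hasn't",
    "dont": "don't",
    "whats": "what's",
}

# A "word" here is a run of letters and apostrophes, so "hasn't" is
# read as one token and does not match the "hasnt" entry.
WORD = re.compile(r"[A-Za-z']+")


def find_misspellings(text):
    """Return (line_number, word, suggestion) for each dictionary hit."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in WORD.finditer(line):
            word = match.group(0)
            suggestion = CORRECTIONS.get(word.lower())
            if suggestion is not None:
                hits.append((line_number, word, suggestion))
    return hits


def format_hit(path, hit):
    """Render one hit as 'path:line: word ==> suggestion'."""
    line_number, word, suggestion = hit
    return f"{path}:{line_number}: {word} ==> {suggestion}"


sample = (
    "Currently, the libraries are not thread safe.\n"
    "It is disappointing that it hasnt been fixed.\n"
)
report = [format_hit("Readme.md", hit) for hit in find_misspellings(sample)]
for message in report:
    print(message)
```

A scan of this kind only catches tokens that are never valid words. A string such as `its` is a correct word in its own right, so a pure dictionary lookup like this one cannot flag it.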
13 changes: 6 additions & 7 deletions Readme.md
@@ -38,17 +38,16 @@

### Why do we need another library besides the standard reference libraries?

-Its necessary to have independent implementations of any standard. If you don't have multiple implementations, its
-easy for the single implementer to mistake the implementation for the actual standard. Its easy to hide problems
+It's necessary to have independent implementations of any standard. If you don't have multiple implementations, it's
+easy for the single implementer to mistake the implementation for the actual standard. It's easy to hide problems
that are actually in the standard by adding work-arounds in the code, instead of documenting problems and creating new
versions of the standard with clear fixes. For Netcdf/Hdf, the standard is the file formats, along with their semantic
descriptions. The API is language and library specific, and is secondary to the standard.

Having multiple implementations is a huge win for the reference library, in that bugs are more quickly found, and
ambiguities more quickly identified.


-### Whats wrong with the standard reference libraries?
+### What's wrong with the standard reference libraries?

The reference libraries are well maintained but complex. They are coded in C, which is a difficult language to master
and keep bug free, with implications for memory safety and security. The libraries require various machine and OS dependent
@@ -64,12 +63,12 @@

HDF-EOS uses an undocumented "Object Descriptor Language (ODL)" text format, which adds a dependency on the SDP Toolkit
and possibly other libraries. These toolkits also provide functionality such as handling projections and coordinate system
-conversions, and arguably its impossible to process HDF-EOS without them. So the value added here by an independent
+conversions, and arguably it's impossible to process HDF-EOS without them. So the value added here by an independent
library for data access is less clear. For now, we will provide a "best-effort" to expose the internal
contents of the file.

Currently, the Netcdf-4 and HDF5 libraries are not thread safe, not even for read-only applications.
This is a serious limitation for high performance, scalable applications, and it is disappointing that it hasnt been fixed.

Check failure on line 71 in Readme.md (GitHub Actions / Check for spelling errors): hasnt ==> hasn't
See [Toward Multi-Threaded Concurrency in HDF5](https://www.hdfgroup.org/wp-content/uploads/2022/05/Toward-MT-HDF5.pdf),
and [RFC:Multi-Thread HDF5](https://support.hdfgroup.org/releases/hdf5/documentation/rfc/RFC_multi_thread.pdf) for more information.

@@ -112,7 +111,7 @@

The library will be thread-safe for reading multiple files concurrently.

-We are focussing on earth science data, and dont plan to support other uses except as a byproduct.
+We are focussing on earth science data, and don't plan to support other uses except as a byproduct.

The core module will remain pure Kotlin with very minimal dependencies and no write capabilities. In particular,
there will be no dependency on the reference C libraries (except for testing).
@@ -219,7 +218,7 @@

#### Compare with HDF5 data model
* Creation order is ignored
-* We dont include soft (aka symbolic) links in a group, as these point to an existing dataset (variable).
+* We don't include soft (aka symbolic) links in a group, as these point to an existing dataset (variable).
* Opaque: hdf5 makes arrays of Opaque all the same size, which gives up some of its usefulness. If there's a need,
we will allow Opaque(*) indicating that the sizes can vary.
* Attributes can be of type REFERENCE, with value the full path name of the referenced dataset.