This project has a ready code, but it needs more testing, bugfixes and general improvements.
It is not production ready yet.
The wasmbzip2 is a libbzip2 JavaScript wrapper library with binaries compiled to WebAssembly.
The high-level API is a JavaScript API created on top of a low-level API. It provides convenient way of compressing and decompressing bzip2 data.
...TODO sample code
setBinary(binary)
Sets the WebAssembly binary. The method does not exists in variants with embedded binary.
It must be called before BZ2.create().
binary
WebAssembly binary that is associated with current variant of wasmbzpi2. The binary can be in any form thatWebAssembly.compile()orWebAssembly.compileStreaming()accepts.
async create(memoryMax, wasi)
Creates new instance of BZ2 class that contains instance of libbzip2 WebAssembly module.
memoryMaxoptional
Maximum memory that WebAssembly module can allocate, unlimited by default.wasioptional
A WASI system calls. It is used only inlibbzip2-stdiovariant.- Return value
APromisethat resolves to aBZ2object.
version
A libbzip2 version string.
compress(input, maxOutputSize, level)
Compress single array at once and returns the result.
input
An array-like input data.maxOutputSizeoptional
A maximum size of the output. By default, it is calculated based on input for worst case compression ratio.leveloptional
A compression level from 1 to 9. Default is 9.- Return value
AUint8Arraycontaining compressed data. - Throws
BZ2.BufferErrorif output is bigger thanmaxOutputSize.
BZ2.Errorif libbzip2 error occurred.
compress(input, output, level)
Compress single array at once and writes to an output array.
input
An array-like input data.output
AUint8Arraywhere compressed data will be written. Worst case output size is the input size + 1% + 600 bytes.leveloptional
A compression level from 1 to 9. Default is 9.- Return value
Number of bytes written to the output array. - Throws
BZ2.BufferErrorif output does not fit into the output array.
BZ2.Errorif libbzip2 error occurred.
decompress(input, maxOutputSize, lowMem)
Decompress single array at once and returns the result.
input
An array-like input data.maxOutputSize
A maximum size of the output. Internal buffer of that size is allocated, so this value cannot be extremely high.lowMemoptional
If it is present and it istrue, a slower algorithm will be used that consumes less memory.- Return value
AUint8Arraycontaining decompressed data. - Throws
BZ2.BufferErrorif output is bigger thanmaxOutputSize.
BZ2.Errorif libbzip2 error occurred.
decompress(input, output, lowMem)
Decompress single array at once and writes to an output array.
input
An array-like input data.output
AUint8Arraywhere compressed data will be written.lowMemoptional
If it is present and it istrue, a slower algorithm will be used that consumes less memory.- Return value
Number of bytes written to the output array. - Throws
BZ2.BufferErrorif output does not fit into the output array.
BZ2.Errorif libbzip2 error occurred.
It is an abstract base class for BZ2.Compression and BZ2.Decompression classes.
totalInput
A total number of consumed bytes.
totalOutput
A total number of produced bytes.
finished
A boolean indicating that entire output stream was read.
getInput()
Gets Uint8Array that is located at the internal input buffer. Providing it to compress() or decompress() method avoids unnecessary copying of the memory. This method will return the same array if called more than once without compress() or decompress() call. If the internal input buffer is full, zero-length array is returned. The returned array is valid until the next operation on the libbzip2 library instance.
reset()
Resets the state of the object allowing compression or decompression of an another stream.
dispose()
Release WebAssembly memory occupied by this object.
WARNING!!!
To avoid WebAssembly memory leaks in theBZ2object call this method whenBZ2.CompressionorBZ2.Decompressionobject is no longer needed.
This class extends BZ2.Processing class and inherits all its members.
new BZ2.Compression(bz2, inputBufferSize, outputBufferSize, level)
bz2
ABZ2object that will handle the compression.inputBufferSizeoptional
A size of the internal input buffer, default is 128K.outputBufferSizeoptional
A size of the internal output buffer, default is 128K.leveloptional
A compression level from 1 to 9, default is 9.
compress(input, inputOffset, inputLength, output, outputOffset, outputLength)
Compress part of a data.
input
AUint8Arraywith the input bytes. Anullshould be passed at the end to indicate end of data.inputOffsetoptional
A start offset ininputarray, zero by default.inputLengthoptional
A length of a portion of theinputarray. Default is until the end of the array.outputoptional
AUint8Arraywhere the output will be written. By default, output will in the return value.outputOffsetoptional, forbidden ifoutputis missing
A start offset inoutputarray, zero by default.outputLengthoptional, forbidden ifoutputis missing
A length of a portion of theoutputarray. Default is until the end of the array.- Return value
An object with following properties:bytesRead
A number of bytes read from theinput.bytesWritten
A number of bytes written to theoutput.finished
A boolean indicating that entire output stream was read.outputoptional
AUint8Arraywith the output data. It exists only ifoutputparameter was not provided.
This class extends BZ2.Processing class and inherits all its members.
new BZ2.Decompression(bz2, inputBufferSize, outputBufferSize, lowMem)
bz2
ABZ2object that will handle the decompression.inputBufferSizeoptional
A size of the internal input buffer, default is 128K.outputBufferSizeoptional
A size of the internal output buffer, default is 128K.lowMemoptional
If it is present and it istrue, a slower algorithm will be used that consumes less memory.
decompress(input, inputOffset, inputLength, output, outputOffset, outputLength)
Decompress part of a data.
input
AUint8Arraywith the input bytes. Anullcan be passed at the end to indicate end of data. Passingnullis optional, but it allows detecting unexpected end of compressed data.inputOffsetoptional
A start offset ininputarray, zero by default.inputLengthoptional
A length of a portion of theinputarray. Default is until the end of the array.outputoptional
AUint8Arraywhere the output will be written. By default, output will in the return value.outputOffsetoptional, forbidden ifoutputis missing
A start offset inoutputarray, zero by default.outputLengthoptional, forbidden ifoutputis missing
A length of a portion of theoutputarray. Default is until the end of the array.- Return value
An object with following properties:bytesRead
A number of bytes read from theinput.bytesWritten
A number of bytes written to theoutput.finished
A boolean indicating that entire output stream was read.outputoptional
AUint8Arraywith the output data. It exists only ifoutputparameter was not provided.
The low-level API exposes libbzip2 API and adds some utility functions and classes. It it more advanced and requires basic knowledge of C and WebAssembly.
Following sample shows how to use low-level API.
TODO: Run this sample
// Prepare input data
let input = (new TextEncoder()).encode('Hello World!');
// Create library instance
let bz2 = await BZ2.create();
// Allocate memory
let inptr = bz2.malloc(1024);
let outptr = bz2.malloc(1024);
let lenptr = bz2.malloc(4);
// Fill the input memory
(new Uint8Array(bz2.memory.buffer, inptr, 1024)).set(input);
(new DataView(bz2.memory.buffer, lenptr, 4)).setUint32(0, 1024, true);
// Execute compression function
let res = bz2.bzBuffToBuffCompress(out, lenptr, inptr, input.length, 9, 0, 0);
// Show the results
let len = (new DataView(bz2.memory.buffer, lenptr, 4)).getUint32(0, true);
console.log(`Compression result ${res}, output:`);
console.log(new Uint8Array(bz2.memory.buffer, outptr, len));
// Deallocate the memory
bz2.free(inptr);
bz2.free(outptr);
bz2.free(lenptr);You can find details about API in:
Below table shows the C symbols and how they are accessed from the wasmbzip2. The bz2 object is an instance of BZ2 class. The verbosity parameter is ignored. The pointers are 32-bit long. Using low-level functions requires direct access to WebAssembly.Memory via bz2.memory property.
| libbzip2/stdlib | wasmbzip2 |
|---|---|
malloc |
bz2.malloc |
free |
bz2.free |
realloc |
bz2.realloc |
BZ2_bzCompressInit |
bz2.bzCompressInit |
BZ2_bzCompress |
bz2.bzCompress |
BZ2_bzCompressEnd |
bz2.bzCompressEnd |
BZ2_bzDecompressInit |
bz2.bzDecompressInit |
BZ2_bzDecompress |
bz2.bzDecompress |
BZ2_bzDecompressEnd |
bz2.bzDecompressEnd |
BZ2_bzBuffToBuffCompress |
bz2.bzBuffToBuffCompress |
BZ2_bzBuffToBuffDecompress |
bz2.bzBuffToBuffDecompress |
BZ2_bzlibVersion |
bz2.bzlibVersion |
BZ_RUN |
BZ2.RUN |
BZ_FLUSH |
BZ2.FLUSH |
BZ_FINISH |
BZ2.FINISH |
BZ_OK |
BZ2.OK |
BZ_RUN_OK |
BZ2.RUN_OK |
BZ_FLUSH_OK |
BZ2.FLUSH_OK |
BZ_FINISH_OK |
BZ2.FINISH_OK |
BZ_STREAM_END |
BZ2.STREAM_END |
BZ_SEQUENCE_ERROR |
BZ2.SEQUENCE_ERROR |
BZ_PARAM_ERROR |
BZ2.PARAM_ERROR |
BZ_MEM_ERROR |
BZ2.MEM_ERROR |
BZ_DATA_ERROR |
BZ2.DATA_ERROR |
BZ_DATA_ERROR_MAGIC |
BZ2.DATA_ERROR_MAGIC |
BZ_IO_ERROR |
BZ2.IO_ERROR |
BZ_UNEXPECTED_EOF |
BZ2.UNEXPECTED_EOF |
BZ_OUTBUFF_FULL |
BZ2.OUTBUFF_FULL |
BZ_CONFIG_ERROR |
BZ2.CONFIG_ERROR |
bz_stream |
BZ2.stream (details below) |
WARNING! Call
bz2.free,bz2.bzCompressEndandbz2.bzDecompressEndwhen needed to avoid memory leaks in thebz2object.
JavaScript has no structures like the C does, so libbzip2 bz_stream structure is wrapped by a BZ2.stream class.
new BZ2.stream(bz2)- Create a new structure in thebz2instance memory. The memory is dynamically allocated, so it must be deallocated with thedispose()method.
dispose()- Deallocate memory occupied by this structure. After the call, the structure is unusable.WARNING! Call it when needed to avoid memory leaks in the
BZ2instance object.clear()- Fill the memory with zeros.
ptr- This pointer to this structure. You have to provide this value to low-level API functions.view-DataViewof the wasm memory occupied by this structure- Other properties mapped from
bz_streamstructure fields are summarized in following table.
| libbzip2 field | BZ2.stream property |
Comment |
|---|---|---|
next_in |
next_in |
|
avail_in |
avail_in |
|
total_in_lo32/hi32 |
total_in |
read-only |
next_out |
next_out |
|
avail_out |
avail_out |
|
total_out_lo32/hi32 |
total_out |
read-only |
state |
not exposed - internal libbzip2 field | |
bzalloc, bzfree, opaque |
not exposed - wasmbzip2 is always using standard allocator |
wasmbzip2 have a few variants of the binaries.
In most cases, you probably need the basic one: libbzip2.
If you want just compression or decompression you can use a smaller binary libbzip2-cmp or libbzip2-dec.
Variant libbzip2-dbg contains additional debug information.
Variants with -emb suffix contains WebAssembly binary embedded in the source file,
so they do not need for BZ2.setBinary() call.
| Variant | compression | decompression | debug | embedded binary |
|---|---|---|---|---|
libbzip2 |
YES | YES | - | - |
libbzip2-emb |
YES | YES | - | YES |
libbzip2-cmp |
YES | - | - | - |
libbzip2-cmp-emb |
YES | - | - | YES |
libbzip2-dec |
- | YES | - | - |
libbzip2-dec-emb |
- | YES | - | YES |
libbzip2-dbg |
YES | YES | YES | - |
libbzip2-dbg-emb |
YES | YES | YES | YES |
wasmbzip2 also provides libbzip2-stdio variant which is not well tested, because it is not very useful in a browser context. It exports libbzip2 API for .bz2 file manipulation. See the libbzip2-stdio.js source code for more details.
TODO