[FEATURE] Allow output \0 terminated frames (for WebSocket streaming support) by pszemus · Pull Request #2105 · CCExtractor/ccextractor

pszemus · 2026-02-10T15:57:06Z

In raising this pull request, I confirm the following (please check boxes):

I have read and understood the contributors guide.
I have checked that another pull request for this purpose does not exist.
I have considered, and confirmed that this submission will be valuable to others.
I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
I give this submission freely, and claim no ownership to its content.
I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

I have never used CCExtractor.
I have used CCExtractor just a couple of times.
I absolutely love CCExtractor, but have not contributed previously.
I am an active contributor to CCExtractor.

When streaming subtitles (particularly DVBSUB) from ccextractor to WebSocket endpoints via tools like websocat, multi-line subtitles cause issues. Each line is sent as a separate message, resulting in only the last line being visible at the receiving end.

For example, using the following pipeline:

ccextractor --udp <src_stream_address> --codec dvbsub --out=txt --stdout --forceflush | websocat ws://<endpoint-uri>

multi-line subtitle frames are sent line-by-line, losing all but the final line.

This PR introduces the --null-terminated option, which appends a null character (\0) as a frame delimiter after each complete subtitle frame (whether single or multi-line). This enables proper frame boundaries for streaming scenarios.

Then, it'll be possible to create the following pipeline:

ccextractor --udp <src_stream_address> --codec dvbsub --out=txt --null-terminated --stdout --forceflush | websocat -0 ws://<endpoint-uri>

With this change, websocat's -0 flag can properly parse complete subtitle frames using the null delimiter (see websocat documentation).
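
To illustrate the framing a consumer relies on, here is a minimal sketch in C of splitting a received byte buffer into NUL-delimited frames. It is not code from this PR or from websocat; the names `split_nul_frames`, `frame_cb`, and `count_frame` are hypothetical and exist only for this example. Line breaks inside a frame are passed through untouched, and only the `\0` byte marks a frame boundary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one frame without its trailing NUL. */
typedef void (*frame_cb)(const char *frame, size_t len, void *user);

/*
 * Walk buf[0..len) and invoke cb once per NUL-terminated frame.
 * Returns the number of bytes consumed. Bytes after the last NUL belong to
 * an incomplete frame and are left for the caller to keep until more data
 * arrives.
 */
static size_t split_nul_frames(const char *buf, size_t len, frame_cb cb, void *user)
{
    size_t start = 0;

    while (start < len) {
        const char *end = memchr(buf + start, '\0', len - start);
        if (end == NULL)
            break; /* no delimiter yet: incomplete frame */

        size_t frame_len = (size_t)(end - (buf + start));
        if (cb != NULL)
            cb(buf + start, frame_len, user);

        start += frame_len + 1; /* skip past the delimiter */
    }
    return start;
}

/* Example callback: counts frames through the user pointer. */
static void count_frame(const char *frame, size_t len, void *user)
{
    (void)frame;
    (void)len;
    ++*(size_t *)user;
}
```

Returning the consumed byte count lets a reader that receives data in arbitrary chunks carry the unconsumed tail over to the next read instead of emitting a partial frame.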

Benefits:

Enables reliable WebSocket streaming of subtitles without data loss
Maintains backward compatibility (opt-in feature)
Follows established patterns for null-terminated stream processing
Simple, focused change that solves a real-world use case

Please compare the following two output files. With --null-terminated enabled, newlines within multi-line subtitles are preserved and every frame ends with \0.

--out=webvtt:
ccextractor_webvtt.txt
--out=txt --null-terminated:
ccextractor_txt_null-terminated.txt

cfsmp3

Good feature with a clear real-world use case. The implementation is clean and properly wired through both C and Rust. However, the --null-terminated flag currently only works for DVB bitmap subtitles, not for text-based captions (CEA-608/708). This needs to be fixed before merging.

The problem

In src/lib_ccx/ccx_encoders_transcript.c, you replaced encoded_crlf with encoded_end_frame in only one place — the bitmap subtitle path at line 92:

// write_cc_bitmap_as_transcript() — line 92 — ✅ changed
write_wrapped(context->out->fh, context->encoded_end_frame, context->encoded_end_frame_length);

But the text subtitle path (write_cc_buffer_as_transcript) still uses encoded_crlf in two write() calls that also need updating:

// Line 206 — ❌ not changed (end of each subtitle line)
ret = write(context->out->fh, context->encoded_crlf, context->encoded_crlf_length);

// Line 328 — ❌ not changed (end of each subtitle block)
ret = write(context->out->fh, context->encoded_crlf, context->encoded_crlf_length);

encoded_crlf is also used on lines 77 and 90 for parsing and splitting tokens. Those should probably stay as-is, since they detect line breaks within the input rather than write output.

How to verify

I tested with a CEA-608 stream:

./ccextractor input.ts --txt --stdout --null-terminated 2>/dev/null | xxd | head -30

The output contains only 0d 0a (CRLF) — zero null bytes. The flag has no effect for text-based captions.
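
The same check can be done programmatically instead of reading the hex dump by eye. The following is a minimal sketch in C that tallies NUL bytes and CRLF pairs in a captured output buffer; `delim_counts` and `count_delims` are hypothetical names for this example and are not part of CCExtractor.

```c
#include <stddef.h>

/* Hypothetical helper type: tallies the delimiters found in captured output. */
struct delim_counts {
    size_t nul;  /* number of 0x00 bytes */
    size_t crlf; /* number of 0x0d 0x0a pairs */
};

/* Scan buf[0..len) once and count NUL bytes and CRLF pairs. */
static struct delim_counts count_delims(const unsigned char *buf, size_t len)
{
    struct delim_counts c = {0, 0};

    for (size_t i = 0; i < len; i++) {
        if (buf[i] == 0x00)
            c.nul++;
        else if (buf[i] == 0x0d && i + 1 < len && buf[i + 1] == 0x0a)
            c.crlf++;
    }
    return c;
}
```

For the behaviour described above, a capture taken with the flag set would report `nul == 0` and a non-zero `crlf` count, which is the symptom of the flag having no effect.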

What to fix

In src/lib_ccx/ccx_encoders_transcript.c, replace encoded_crlf with encoded_end_frame on lines 206 and 328 (the two write() calls in write_cc_buffer_as_transcript). Leave lines 77 and 90 alone — those are input parsing, not output.

Note: you'll also need to update the ret < context->encoded_crlf_length comparisons on lines 207 and 329 to use encoded_end_frame_length accordingly.
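
The reason the comparison has to change together with the write can be sketched as follows. This is an illustration only: `enc_ctx_sketch` is a hypothetical stand-in that models just the fields named in this review, and `write_end_frame` is a hypothetical helper, not a function in ccx_encoders_transcript.c. The point is that the short-write check must use the length of the buffer that was actually written; if it kept comparing against encoded_crlf_length, a successful write of a shorter terminator would be reported as a failure.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the encoder context; the real struct differs
 * and only the fields mentioned in the review are modeled here. */
struct enc_ctx_sketch {
    int fh;                                 /* output file descriptor */
    const unsigned char *encoded_end_frame; /* terminator bytes */
    unsigned int encoded_end_frame_length;  /* terminator length in bytes */
};

/*
 * Write the frame terminator and check the result against the length of
 * the terminator itself. Returns 0 on success, -1 on error or short write.
 */
static int write_end_frame(const struct enc_ctx_sketch *ctx)
{
    ssize_t ret = write(ctx->fh, ctx->encoded_end_frame,
                        ctx->encoded_end_frame_length);

    if (ret < 0 || (size_t)ret < ctx->encoded_end_frame_length)
        return -1;
    return 0;
}
```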

pszemus · 2026-02-16T13:35:00Z

Thanks @cfsmp3, I've fixed the missing code paths.
With my test file, setting --null-terminated now changes the output from:

00000000: 5745 4c4c 2c20 4920 4755 4553 5320 594f  WELL, I GUESS YO
00000010: 5520 434f 554c 4420 5341 5920 5448 4154  U COULD SAY THAT
00000020: 0d0a 4920 4341 5245 2e2e 2e42 4543 4155  ..I CARE...BECAU
00000030: 5345 2049 2042 524f 5547 4854 2059 4f55  SE I BROUGHT YOU
00000040: 0d0a 494e 544f 2054 4849 5320 574f 524c  ..INTO THIS WORL
00000050: 442e 0d0a

to:

00000000: 5745 4c4c 2c20 4920 4755 4553 5320 594f  WELL, I GUESS YO
00000010: 5520 434f 554c 4420 5341 5920 5448 4154  U COULD SAY THAT
00000020: 0049 2043 4152 452e 2e2e 4245 4341 5553  .I CARE...BECAUS
00000030: 4520 4920 4252 4f55 4748 5420 594f 5500  E I BROUGHT YOU.
00000040: 494e 544f 2054 4849 5320 574f 524c 442e  INTO THIS WORLD.
00000050: 00

ccextractor-bot · 2026-02-16T14:18:18Z

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 0626bb5...:

Report Name	Tests Passed
Broken	13/13
CEA-708	14/14
DVB	6/7
DVD	3/3
DVR-MS	2/2
General	25/27
Hardsubx	1/1
Hauppage	3/3
MP4	3/3
NoCC	10/10
Options	81/86
Teletext	21/21
WTV	13/13
XDS	34/34

Your PR breaks these cases:

ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2...
ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

Congratulations: Merging this PR would fix the following tests:

ccextractor --out=spupng c83f765c66..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

ccextractor-bot · 2026-02-16T14:42:29Z

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 0626bb5...:

Report Name	Tests Passed
Broken	13/13
CEA-708	14/14
DVB	7/7
DVD	3/3
DVR-MS	2/2
General	27/27
Hardsubx	1/1
Hauppage	3/3
MP4	3/3
NoCC	10/10
Options	85/86
Teletext	21/21
WTV	13/13
XDS	34/34

Your PR breaks these cases:

ccextractor --out=spupng c83f765c66...

Congratulations: Merging this PR would fix the following tests:

ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b..., Last passed: Never
ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

[feat] Allow output \0 terminated frames

ff9c160

pszemus force-pushed the null-terminated-frames branch from bdf3aa1 to ff9c160 Compare February 11, 2026 15:42

Fix rust FromCType

6d7c192

cfsmp3 requested changes Feb 15, 2026

use encoded_end_frame for text-based captions

7d23b42

pszemus and others added 2 commits February 16, 2026 14:41

Merge branch 'CCExtractor:master' into null-terminated-frames

9aa0cb4

add changelog entry

ad1dd83


pszemus wants to merge 5 commits into CCExtractor:master from pszemus:null-terminated-frames

3 participants
