Discussion:
[Freetel-codec2] Codec2 Container Format
s***@gmx.de
9 years ago
Permalink
Hello,
I had some problems decoding these files here:
https://lowbitnet.wordpress.com/2016/02/09/ultra-low-bit-audio-books/

Having to try different compression modes is not really user-friendly, so
I came up with the idea that codec2 needs a short container format, well,
actually just a 4-byte header:
"C2%c%c",(char)header_version,(char)mode

I think that should be the default when creating a .c2 file on the command
line, so I modified c2enc to automatically add this header if --raw or -r
is not given.
I also added the mode "auto" for c2dec to decode this container format.
What is probably most controversial is that I modified codec2.h to
have some easily readable compression modes instead of increasing numbers:
;mode  codec2 mode (nominal bit rate in bit/s)
;45    450
;70    700
;71    700B
;120   1200
;130   1300
;140   1400
;160   1600
;240   2400
;250   3200

Usage:
c2enc 700B infile.raw test.c2
c2dec auto test.c2 outfile.raw

c2enc 1200 infile.raw test.c2 --raw
c2dec 1200 test.c2 outfile.raw
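
A rough sketch of how the proposed 4-byte header could be written and read back, in Python for brevity. The magic "C2" and the mode bytes come from the post above; the header version value of 1 and the function names are assumptions made for illustration, not part of the actual patch.

```python
import struct

MAGIC = b"C2"
HEADER_VERSION = 1  # assumed; the post does not state a version number

# Mode byte -> mode name, as listed in the post above.
MODES = {45: "450", 70: "700", 71: "700B", 120: "1200", 130: "1300",
         140: "1400", 160: "1600", 240: "2400", 250: "3200"}


def build_header(mode, version=HEADER_VERSION):
    """Return the 4-byte header: 'C2', version byte, mode byte."""
    if mode not in MODES:
        raise ValueError(f"unknown mode byte: {mode}")
    return struct.pack("<2sBB", MAGIC, version, mode)


def parse_header(data):
    """Return (version, mode_name), or None if no valid header is present."""
    if len(data) < 4 or data[:2] != MAGIC:
        return None
    _, version, mode = struct.unpack("<2sBB", data[:4])
    if mode not in MODES:
        return None
    return version, MODES[mode]
```

A decoder in "auto" mode would call parse_header() on the first four bytes and fall back to an explicit mode when it returns None. Note that a headerless file could start with the bytes "C2" by chance, so the check is a heuristic.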


Feedback Welcome, thanks.
Simon the Sorcerer
s***@gmx.de
9 years ago
Permalink
Why do you think this should be within something that's labeled codec2,
vs the codec being specified in the enclosing format, which could
presumably specify gsm13k or something instead?
It's because Codec2 needs this parameter to decode correctly, and it
wouldn't be able to read it from a foreign container. I suppose it's
established practice for a pure audio format to be readable by the
decoder itself; compare .vorbis or .mp3, which also needn't be wrapped
in ogg or other containers for the decoder to decode them correctly.
That also makes sense because otherwise any program implementing codec2
would have to know how to parse the correct parameters from the
container. That is a big obstacle to getting codec2 into current
libraries like ffmpeg or gstreamer, as it is a lot of code for just one
simple format. My patch and header definition would be just a small step
towards such an inclusion. The argument format of c2enc / c2dec is still
pretty ugly, as it depends heavily on the order of parameters and is
anything but standard. The header option should normally live directly in
codec2.h/codec2.c; my implementation in c2enc/c2dec is just a quick
workaround for now.
Do you see this extra header being used everywhere codec2 is used, or
just in files? Would it be sent in freedv?
Well, I see it presumably in files, because freedv probably has its own
method to negotiate the bitrate and therefore could use the -r switch,
if it is using the c2enc program at all. I admit I haven't looked into
the freedv code itself, so I can't say whether the header would be better
than freedv's current method of negotiating the bitrate.
Tomas Härdin
9 years ago
Permalink
...
ffmpeg developer here. If you pick a relatively sane container it isn't
a huge issue from the libavformat side of things. Picking an existing
container might make some things easier (due to avoiding reinventing
things), but it'll make other things hard (making sure you're writing
files to spec). You'd have to be careful. For instance .mov has 4 bytes
overhead per frame to support seeking (stco atom), which for a
low-bitrate codec is a bit silly

.mp3 and .ogg have one common problem: seeking in VBR mode. I'm not sure
if codec2 will ever have a VBR mode, but it is something you have to
keep in mind. Oh, and audio isn't the only case: good old DV is
"containerless", with a fixed size per frame (144000 bytes for PAL DV25,
120000 for NTSC). Seeking is trivial: just fseek() to
frame_number*frame_size, once you know what DV variant you're dealing
with. But again, this only works if you're 100% sure the codec is CBR

As for handling codec specific data, that's what extradata is for. It's
the container-agnostic way of handling such things, and one of the
reasons why ffmpeg is able to remux between different containers

In short, you have to think about the two major use cases:
streaming/piping (nonseekable) and playing files stored on disk
(seekable, cuttable). You can easily fulfill both if your codec is CBR
Post by s***@gmx.de
Do you see this extra header being used everywhere codec2 is used, or
just in files? Would it be sent in freedv?
Well, I see it presumably in files, because freedv probably has its own
method to negotiate the bitrate and therefore could use the -r switch,
if it is using the c2enc program at all. I admit I haven't looked into
the freedv code itself, so I can't say whether the header would be better
than freedv's current method of negotiating the bitrate.
This is more an operator issue if I understand correctly

/Tomas
s***@gmx.de
9 years ago
Permalink
...
So is there a container format that you could suggest that doesn't have
much overhead?
Post by Tomas Härdin
As for handling codec specific data, that's what extradata is for. It's
the container-agnostic way of handling such things, and one of the
reasons why ffmpeg is able to remux between different containers
Actually I can't find much information about extradata. How does it
work? So you would suggest using that instead of a file header?
Post by Tomas Härdin
streaming/piping (nonseekable) and playing files stored on disk
(seekable, cuttable). You can easily fulfill both if your codec is CBR
Actually a custom container format for Codec2 with VBR would be
simple: one could add about 2 bytes of overhead per chunk of up to 255
frames. That could even be seekable if one of the two bytes were
XOR-linked, or by using 3 bytes instead. The question is whether VBR
makes sense, i.e. whether the codec2 encoder could become intelligent
enough to know when to use higher bitrates. That could probably be
pretty good for breaks in speech, e.g. between words or sentences..!?

/Simon
Tomas Härdin
9 years ago
Permalink
...
Well, you're always going to have some overhead if you can't compute
offsets a priori. The simplest workaround is probably to lump a whole
bunch of frames together
Post by s***@gmx.de
Post by Tomas Härdin
As for handling codec specific data, that's what extradata is for. It's
the container-agnostic way of handling such things, and one of the
reasons why ffmpeg is able to remux between different containers
Actually I can't find much information about extradata. How does it
work? So you would suggest using that instead of a file header?
Extradata is whatever metadata you need to interpret the raw codec data.
One example is SPS/PPS in H.264. You put such data in the header of your
file because you don't want to waste space repeating it all over the place
...
Having to parse a whole file just to be able to seek in it is no fun,
which is why seeking in .mp3 and .ogg is such a pain. Having an index
would be highly preferable. I have a feeling we can cross the VBR bridge
when we get to it, so to say. I'm not sure I've heard David mention it
even once, so let's not worry too much about it

One potentially good, simple choice occurs to me: .wav. You can put
compressed audio in it, there's space reserved for extradata, and it has
widespread support and it's hard to mess up writing them. You can even
stream them. You just need to come up with a suitable TWOCC ('c2' perhaps)
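
To make the .wav suggestion concrete, here is a sketch of a WAVEFORMATEX-style "fmt " chunk that carries extradata after the cbSize field. The format tag below is hypothetical (the ASCII bytes 'c' and '2' read as a little-endian 16-bit value) and is not a registered tag; the 8 kHz mono defaults, the extradata layout, and the function names are likewise assumptions.

```python
import struct

C2_FORMAT_TAG = 0x3263  # hypothetical TWOCC 'c2'; not a registered format tag


def build_fmt_chunk(bit_rate, block_align, extradata,
                    sample_rate=8000, channels=1):
    """Build a 'fmt ' chunk: WAVEFORMATEX fields followed by extradata."""
    body = struct.pack("<HHIIHHH",
                       C2_FORMAT_TAG,     # wFormatTag
                       channels,          # nChannels
                       sample_rate,       # nSamplesPerSec
                       bit_rate // 8,     # nAvgBytesPerSec
                       block_align,       # nBlockAlign: bytes per coded frame
                       0,                 # wBitsPerSample: 0 for compressed data
                       len(extradata))    # cbSize: number of extra bytes
    body += extradata
    chunk = b"fmt " + struct.pack("<I", len(body)) + body
    if len(body) % 2:
        chunk += b"\x00"  # RIFF chunks are padded to an even length
    return chunk


def parse_fmt_extradata(chunk):
    """Return the extradata bytes stored after the WAVEFORMATEX fields."""
    (size,) = struct.unpack_from("<I", chunk, 4)
    body = chunk[8:8 + size]
    (cb_size,) = struct.unpack_from("<H", body, 16)
    return body[18:18 + cb_size]
```

For example, build_fmt_chunk(3200, 8, b"\x01\xfa") describes a 3200 bit/s stream with 8-byte blocks and two bytes of extradata (here an invented version byte and mode byte). A complete file would still need the surrounding RIFF and data chunks.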

/Tomas
s***@gmx.de
9 years ago
Permalink
Post by Tomas Härdin
Extradata is whatever metadata you need to interpret the raw codec data.
[...] You put such data in the header of your
file [...]
OK, then that's just what my header did, but I will still have to
reconsider it, as I'll explain below.
Post by Tomas Härdin
Having to parse a whole file just to be able to seek in it is no fun,
which is why seeking in .mp3 and .ogg is such a pain. Having an index
would be highly preferable. I have a feeling we can cross the VBR bridge
when we get to it, so to say. I'm not sure I've heard David mention it
even once, so let's not worry too much about it
Well, an index would be difficult for continuous streams. Also, my
suggestion needn't read the whole file but could skip chunks of about 5 s
(20 ms codec2 frames) to 10 s (40 ms codec2 frames); I think such seeking
would be bearable.
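
One possible reading of the chunk-skipping idea, as a sketch only: each chunk starts with a 2-byte header (frame count, mode byte) followed by that many frames, and a seek hops from chunk header to chunk header without reading any frame data. The chunk layout and the FRAME_INFO table are assumptions for illustration; the thread does not specify them.

```python
import io

# mode byte -> (bytes per frame, frame duration in ms); illustrative values
FRAME_INFO = {250: (8, 20), 160: (8, 40)}


def write_chunk(out, mode, frames):
    """Write one chunk: count byte, mode byte, then up to 255 frames."""
    frame_bytes, _ = FRAME_INFO[mode]
    if not 1 <= len(frames) <= 255:
        raise ValueError("a chunk holds 1 to 255 frames")
    if any(len(f) != frame_bytes for f in frames):
        raise ValueError("frame size does not match the mode")
    out.write(bytes([len(frames), mode]))
    for frame in frames:
        out.write(frame)


def seek_to_time(stream, target_ms):
    """Position the stream at the chunk containing target_ms.

    Returns the start time of that chunk in ms, or None if the target
    lies past the end. Only the 2-byte chunk headers are read.
    """
    elapsed = 0
    while True:
        pos = stream.tell()
        header = stream.read(2)
        if len(header) < 2:
            return None
        count, mode = header
        frame_bytes, frame_ms = FRAME_INFO[mode]
        duration = count * frame_ms
        if elapsed + duration > target_ms:
            stream.seek(pos)  # rewind to the start of this chunk
            return elapsed
        stream.seek(count * frame_bytes, io.SEEK_CUR)
        elapsed += duration
```

A full chunk of 255 frames lasts 255 x 20 ms = 5.1 s or 255 x 40 ms = 10.2 s, which matches the 5 s to 10 s granularity mentioned above.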
Post by Tomas Härdin
One potentially good, simple choice occurs to me: .wav. You can put
compressed audio in it, there's space reserved for extradata, and it has
widespread support and it's hard to mess up writing them. You can even
stream them. You just need to come up with a suitable TWOCC ('c2' perhaps)
Hm, I've looked into .wav. It seems it could indeed be used for CBR codec2
files, but I couldn't find out how to store continuous streams, as it
always seems to need the size of the compressed and uncompressed data
beforehand.


The whole discussion about a container format is probably too early, as
I just analyzed the output of c2enc and it seems not to be in the best
shape for putting into a container yet. There are many spare bits and
repetitions, so even with Huffman coding a lot of space could be saved.
E.g. 700B seems to have 5 of 32 bits that are just zero in every
frame, and 700 even has 6 of 32 bits. Such alignment seems ridiculous for
an ultra-low-bitrate audio codec, so is there already a tool for removing
the spare bits, or why are they left in?
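
A check like the one described can be sketched as follows: OR and AND all frames together, and any bit position that is zero in the OR (or one in the AND) never changes. This is an illustration of the idea, not the tool that was actually used; the function names are invented.

```python
def constant_bit_masks(data, frame_size):
    """Return (always_zero, always_one) bit masks, one byte per frame byte."""
    frame_count = len(data) // frame_size
    if frame_count == 0:
        raise ValueError("need at least one complete frame")
    or_acc = bytearray(frame_size)            # bits set in any frame
    and_acc = bytearray([0xFF] * frame_size)  # bits set in every frame
    for i in range(frame_count):
        frame = data[i * frame_size:(i + 1) * frame_size]
        for j, byte in enumerate(frame):
            or_acc[j] |= byte
            and_acc[j] &= byte
    always_zero = bytes(~b & 0xFF for b in or_acc)
    return always_zero, bytes(and_acc)


def count_bits(mask):
    """Number of set bits in a mask."""
    return sum(bin(b).count("1") for b in mask)
```

Running constant_bit_masks() over a .c2 file with frame_size=4 and passing the first mask to count_bits() would give the number of bits per 32-bit frame that are zero throughout the file. A short or very uniform recording can overstate the count, so a long and varied input is the safer test.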

I also did a small test with binary VBR (replacing breaks by RLE). It seems
that could really increase the compression ratio with small loss, so I'd
really suggest including metadata for VBR in codec2; then this whole
discussion about extradata for bitrate would be obsolete anyway.

/Simon
Tomas Härdin
9 years ago
Permalink
...
Yes, there's still plenty of correlation in the bitstream which suggests
compression improvements are still possible. Just load
adventuresherlockholmes_01_doyle.c2 in GIMP as a 700x1024 8-bit raw
image and patterns are obvious. You could chalk such things up to codec2
being made for running live with considerable packet loss, so a bit of
redundancy is actually a good thing

Use cases like compressing audio books might call for a separate mode.
You could also get some gains by preprocessing the data, like squelching
quiet areas completely. Any codec would produce highly repetitive output
in such areas. Then you could use a general-purpose compressor like gzip
or bzip2 to take care of the repetitions. That way you could perhaps get
away with using a general-purpose codec like Opus
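
The squelch-then-compress idea can be illustrated on raw 16-bit PCM without any codec in the loop. This is only a sketch: the block length of 160 samples (20 ms at an assumed 8 kHz), the peak threshold, and the synthetic noise are all invented for the example, and zlib stands in for gzip.

```python
import random
import struct
import zlib


def squelch(samples, block=160, threshold=100):
    """Zero every block whose peak absolute amplitude is below threshold."""
    out = list(samples)
    for start in range(0, len(out), block):
        chunk = out[start:start + block]
        if max(abs(s) for s in chunk) < threshold:
            out[start:start + len(chunk)] = [0] * len(chunk)
    return out


def packed_size(samples):
    """Size in bytes of the samples as 16-bit little-endian PCM after zlib."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return len(zlib.compress(raw))


# Synthetic low-level background noise standing in for a pause in speech.
rng = random.Random(0)
quiet = [rng.randint(-50, 50) for _ in range(1600)]
```

Comparing packed_size(quiet) with packed_size(squelch(quiet)) shows how much a general-purpose compressor gains once the pause is exactly repetitive. The same effect is what one would hope to see in the coded bitstream after squelching the input.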

/Tomas
David Rowe
9 years ago
Permalink
Hello Simon,

700 has been deprecated....

The source (codec2.c) has bit allocations information in various
function headers.

--natural describes the bit ordering, natural versus Gray coded; softdec
is soft decision information (floating point "bits") useful for the
digital voice (radio) modes.

All good reasons to have a separate application targeted for storage.....

- David
...
Bruce Perens
9 years ago
Permalink
There are extra bits for a slow, unreliable text stream.
David Rowe
9 years ago
Permalink
Hi there,

I didn't realise people were using codec 2 for storage, had only thought
it would be of novelty value in that application. Is anyone using codec
2 in real world storage applications?

I like the idea of a header, but for my work I need headerless and I
have a bunch of scripts I don't want to change.

Can I suggest we make it headerless by default?

Alternatively - how about a separate application for this use case,
leave c2enc/c2dec for digital voice work?

- David
...
glen english
9 years ago
Permalink
David Rowe
9 years ago
Permalink
Hello Simon,

The Trellis decoding work on Codec 2 is used for protection against
channel errors, rather than compression.

Exploiting remaining redundancy in the codec parameters sounds
interesting, pls let us know how you go. This generally makes the Codec
more sensitive to bit errors and so unsuitable for digital radio
applications, but OK for storage. Lots of experimental quantisation
techniques in quantise.c

In this era of 1 cent megabytes I can't think of a use case where such
tight storage of voice is required, but hey, maybe one will pop up!

Cheers,

David
...
glen english
9 years ago
Permalink
Hi Simon
I am going to see what 3200 sounds like with a bit of bastardization:

- send LSPs for both 10 ms subframes (rather than just one of the 10 ms
subframes of the 20 ms block)
- and not bother interpolating them in the decoder (because I won't
have to)

And in doing that I'll also analyse the interpolator error of the
existing 3200 codec and see if I am "barking up the wrong tree", i.e. go
looking for the differences and error levels.

regards
Sorry, meant 3200 bit/s not kbit/s :D
Just did a quick comparison of speex and codec2: speex at 2000 bit/s
sounds much like codec2 mode 1200, except that codec2 is a bit dull.
Codec2 mode 1600 sounds much better than speex 2000.
Also, speex at 5000 bit/s sounds a bit worse than codec2 mode 3200, so
he is probably really better off with codec2 if he can't get up to 6
kbit/s opus.
Greets
Simon
glen english
9 years ago
Permalink
Hi Simon
I've never done much listening to codec2 @ 3200 and I have done today
and I'm very impressed.

glen
...
--
-
Glen English
RF Communications and Electronics Engineer

CORTEX RF
&
Pacific Media Technologies Pty Ltd

ABN 40 075 532 008

PO Box 5231 Lyneham ACT 2602, Australia.
au mobile : +61 (0)418 975077
David Rowe
9 years ago
Permalink
Hi Glen,

I'm not sure where voting fits in. Can you describe the use
case/repeater topology please?

Low bit rate speech codecs have memory, so you need a continuous
sequence of frames from one encoder.

- David
...
glen english
9 years ago
Permalink
Hi David

I've run some more runs on 3200; actually pretty good if clean, and I
can clean it. Yes, big step to others at 6000+ bps but that's out of my
territory.
That question was a separate topic from the voting. Unrelated.

Voting was a variation topic. That's where you have multiple receivers
from multiple sites, and you have to choose the best one. Currently in
my SDR I divide the (analog) audio up into 8 ms frames and apply an SNR
value to each one. The voting at the other end is simple: use the frame
from the best input. Timing is of course sorted out.
Now that's all fine when you have plenty of link bandwidth.
For low bit rate links one really needs to encode and reduce the
bitrate. Hence use a speech coder.
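
The simple per-frame vote described above can be sketched like this. It assumes the receiver streams are already time-aligned and that every frame arrives tagged with its SNR; the data layout and the function name are made up for the example.

```python
def vote(receivers):
    """Pick, for each time slot, the frame from the receiver with the best SNR.

    receivers: list of streams, each a list of (snr_db, frame) tuples,
    one tuple per time slot, all streams aligned in time.
    Returns a list of (receiver_index, frame) tuples.
    """
    chosen = []
    for slot in zip(*receivers):
        # On a tie, max() keeps the first (lowest-index) receiver.
        best = max(range(len(slot)), key=lambda i: slot[i][0])
        chosen.append((best, slot[best][1]))
    return chosen
```

For two receivers, vote([[(10.0, b"a0"), (3.0, b"a1")], [(7.0, b"b0"), (12.0, b"b1")]]) takes the first slot from receiver 0 and the second from receiver 1. This works for independent PCM frames; switching between separately encoded speech-coder streams is a different matter, since each coder carries state from frame to frame.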

I.e. encode the analog receiver signals with a speech coder.

One way would be to decode all the frames (to uncoded PCM) and pick and
choose based on the metric supplied with the frames from the receiver.
But my point was that choosing a different decoded stream each codec frame
sounded funny, because the coders don't precisely encode the same
receiver audio the same way, with the same coefficients etc. I.e. 8
receiver sources and 8 decoders, 7 of them ignored each frame, one
used. I'd need to force the encoders to all code the same audio the
same way. Am I making sense? It is the same audio, just with a
different amount of residual noise on it, and perhaps slightly
different spectral energy, depending on how the noise reduction was
feeling...
This agreement on how to code would have to be done on a frame-by-frame
basis. In that case, the voter would simply have to decode the frame
with the best metric (rather than decode all of them), and it could,
because the state would be the same for all encoders and decoders.

But in writing this I have sort of answered my own question.
:-)
if all the speech inputs are going to be the same , then why bother
voting- well some will be better quality input than others.

I can do all this with simulations on the PC.

This is part of the new VK1 linking system. (but that will be a simple
speech codec- no voting) for the repeater links. Most likely now Codec2
at 3200.

regards




g
...
glen english
9 years ago
Permalink
I should have added- the link-codec development base will be CODEC2

regards
David Rowe
9 years ago
Permalink
So is there actually a valid use case for storage with Codec 2? The
general trend is for cheaper memory/disk. I'm having trouble figuring out
where it would be useful.

Re the modes, a lot of them are just experimental and not in wide use.
700 is dead, and 700B probably won't last through 2016 as there will be
a better replacement. The mode in widest use is 1300, which ends up
being FreeDV 1600 once FEC is added.

There is probably an argument for a much smaller number of modes in the
release code, as compared to the dev branch. Lots of experimental cruft
in -dev that has developed over the years.

- David
...
Butrus Damaskus
9 years ago
Permalink
Post by David Rowe
So is there actually a valid use case for storage with Codec 2? The
general trend is for cheaper memory/disk. I'm having trouble figuring out
where it would be useful.
Well, I think that there will always be a valid use case for storage with
any codec, e.g. for some embedded devices and so on.

You can also think of archiving amateur-radio conversations without having
to re-encode the stream to another format.
Post by David Rowe
Re the modes, a lot of them are just experimental and not in wide use.
700 is dead, and 700B probably won't last through 2016 as there will be
a better replacement. The mode in widest use is 1300, which ends up
being FreeDV 1600 once FEC is added.
Yes, but I would find it _very_ important to decide which modes are
"stable". Those modes should then stay in the codec2 library "forever",
so that there is enough stability for people who would base their
projects on codec2.
...
Butrus Damaskus
9 years ago
Permalink
Post by David Rowe
Hi there,
I didn't realise people were using codec 2 for storage, had only thought
it would be of novelty value in that application. Is anyone using codec
2 in real world storage applications?
Hi David!

I was thinking about using codec2 for my collection of "audio-books", but
the problem is that codec2 is very much a 'moving target' ATM.

Actually I dream about having codec2 (library + some basic
encoding/decoding tools) shipped as a standard part of Debian Linux (or
any other Linux distribution). Another thing would be to add codec2 to
the open-source "rockbox" firmware.

But for this to come true one really has to solve the conflict between the
need to develop codec2 further and the need to be "backward-compatible",
i.e. to be able to identify which "flavour" of codec2 we are using and use
the right version. IMHO this is true for using codec2 in transmission as
well as for using codec2 as an archival format (audio books, or even
archiving amateur-radio conversations), or even for using codec2 for
Asterisk trunking.
Post by David Rowe
I like the idea of a header, but for my work I need headerless and I
have a bunch of scripts I don't want to change.
Can I suggest we make it headerless by default?
No, no, no, please! After 10 years you will never know which version of
codec2 was used.
Post by David Rowe
Alternatively - how about a separate application for this use case,
leave c2enc/c2dec for digital voice work?
I'd be fine with it.

Petr
...
David Rowe
9 years ago
Permalink
Hi Petr,

There is a stable release of Codec 2 and modes like 1300 have been
stable for 4 years. However I am also looking at new modes in the 600
bit/s range for digital voice work.

Cheers,

David
...