jmvalin: (Default)
[personal profile] jmvalin
Recently, I was curious about how CELT and Vorbis differ in the way they allocate bits. Now, CELT's bit allocation is really explicit, with a fixed number of bits per band. This is not quite the case for Vorbis, so a comparison isn't straightforward. What I did was run some audio (a mono version of the audio I used in my previous post) through Vorbis and measure the SNR as a function of frequency. By dividing the SNR by 6 dB/bit, I can get the (approximate) bit allocation. The result (smoothed a bit) is shown below for encoding quality -1 to 10.
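The SNR-to-bits conversion described above can be sketched as follows. This is only an illustration, not the script used for the plots: the function names are made up, and it assumes you already have the signal energy and the coding-noise energy for each band. It uses 20*log10(2), about 6.02 dB, where the post rounds to 6 dB/bit.

```python
import math

# One extra bit of resolution buys roughly 6.02 dB of SNR (20*log10(2)).
DB_PER_BIT = 20.0 * math.log10(2.0)


def snr_db(signal_energy, noise_energy):
    """SNR in dB from signal and coding-noise energies (hypothetical helper)."""
    return 10.0 * math.log10(signal_energy / noise_energy)


def bits_from_snr(snr):
    """Approximate bits per coefficient implied by an SNR given in dB."""
    return snr / DB_PER_BIT


def allocation_from_energies(signal_energies, noise_energies):
    """Per-band approximate bit allocation from per-band energies."""
    return [bits_from_snr(snr_db(s, n))
            for s, n in zip(signal_energies, noise_energies)]
```

A band whose SNR is negative comes out with a negative "allocation" here, which is one reason the result is only an approximation.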



Now, these are the curves currently used by CELT for its bit allocation:




Among the differences are:
1) The Vorbis allocation lines for different rates are nearly parallel, meaning that starting from a certain allocation, bits are added or removed nearly uniformly across frequency when changing the bit-rate.
2) Vorbis allocates a lot of bits to very low frequencies, and then there is a sharp drop-off around 400 Hz.
3) In the mid-high range, the Vorbis allocation is much flatter than CELT's.
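One way to put a number on "nearly parallel" is to subtract the allocation curve at one rate from the curve at another and see how much that difference varies across bands. The sketch below assumes each curve is a list of bits per band. The helper names are hypothetical, and the sample values are invented for illustration, not measurements.

```python
def rate_offsets(low_rate_bits, high_rate_bits):
    """Per-band difference in bits between two allocation curves."""
    return [hi - lo for lo, hi in zip(low_rate_bits, high_rate_bits)]


def spread(values):
    """Range of the per-band differences; 0 means perfectly parallel curves."""
    return max(values) - min(values)


# Invented example curves (bits per band), NOT measured data.
low = [5.0, 3.0, 2.5, 2.0]
high = [6.1, 4.0, 3.6, 3.0]
offsets = rate_offsets(low, high)
print("per-band offsets:", offsets)
print("spread:", spread(offsets))
```

A small spread relative to the mean offset would indicate that changing the rate mostly shifts the whole curve up or down.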

Now I tend to trust that the Vorbis allocation has been decently tuned, so the question is whether the differences in allocation are due to fundamental differences between Vorbis and CELT or just to bad tuning of CELT so far. I suspect there's a bit of both. I've actually created an exp_vorbis_tuning branch to find out. I just took the Vorbis data and turned that into CELT bit allocation data just to see what it would do. I expected something terrible, but it actually sounds quite decent. In some circumstances, it sounds a bit worse than the original CELT tuning, but I think in other cases it actually sounds better. More investigation needed...

Date: 2010-09-14 12:46 am (UTC)
From: [identity profile] xiphmont.livejournal.com
You should look at the differences between long block/short block tuning in Vorbis as they're very different. The primary reason for allocating as many midbass and low-midrange bits as Vorbis does has to do with the window leaking DC and near-DC energy quite a ways up into the midbass and midrange. This effect is substantially worse in short blocks.

The dropoff you see from DC is following the log-energy transfer of the long-block window pretty closely. It's not a coincidence, and the tuning/testing agrees with theory here.

It's one reason that, if I were revising Vorbis, I'd start by parametrically coding near-DC energy and not trying to rely on the windowed MDCT to handle it. It pollutes the analysis and coding of everything up to the low midrange.
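To get a rough feel for the leakage described in this comment, the frequency response of the Vorbis window can be evaluated directly. The sketch below uses the window formula from the Vorbis I specification and a plain DFT of the window as a proxy for how far DC energy spreads. It is not the full MDCT analysis, and the 2048-sample block size and 44.1 kHz sample rate are assumptions chosen for illustration.

```python
import cmath
import math


def vorbis_window(n_samples):
    """Vorbis window: sin(pi/2 * sin^2(pi * (n + 0.5) / N))."""
    return [math.sin(0.5 * math.pi
                     * math.sin(math.pi * (n + 0.5) / n_samples) ** 2)
            for n in range(n_samples)]


def window_gain_db(window, bin_offset):
    """Window response in dB relative to DC, bin_offset cycles per window."""
    n_samples = len(window)
    dc = sum(window)
    acc = sum(w * cmath.exp(-2j * math.pi * bin_offset * n / n_samples)
              for n, w in enumerate(window))
    # Guard against log10(0) if an offset lands exactly on a spectral null.
    return 20.0 * math.log10(max(abs(acc), 1e-300) / dc)


BLOCK = 2048           # assumed long-block size
SAMPLE_RATE = 44100.0  # assumed sample rate in Hz
window = vorbis_window(BLOCK)
for offset in (0, 2, 4, 8, 16, 32):
    hz = offset * SAMPLE_RATE / BLOCK
    print(f"{hz:8.1f} Hz: {window_gain_db(window, offset):7.1f} dB")
```

Running the same loop with a shorter block (for example 256 samples) stretches the response over a wider frequency range, which fits the point above that the effect is worse in short blocks.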
