[personal profile] jmvalin

This is a follow-up on the first LPCNet demo. In this new demo, we turn LPCNet into a very low-bitrate neural speech codec (see submitted paper) that's actually usable on current hardware and even on phones. It's the first time a neural vocoder is able to run in real-time using just one CPU core on a phone (as opposed to a high-end GPU). The resulting bitrate — just 1.6 kb/s — is about 10 times less than what wideband codecs typically use. The quality is much better than existing very low bitrate vocoders and comparable to that of more traditional codecs using a higher bitrate.


Crash on raspi

Date: 2019-03-30 07:43 am (UTC)
From: (Anonymous)
Hello,

Thank you for the new codec. I have tested it on my desktop without AVX2 and like the quality/space tradeoff. Decoding is painfully slow without AVX2, but that is expected. I tried to cross-compile for a Raspberry Pi 3 and get a crash in sincos. Do you know what I'm doing wrong? I'm using gcc-8.3.0 and had to do a bit of work to get it to compile:
CC=arm-unknown-linux-gnueabi-gcc-8.3.0 CFLAGS='-O3 -ggdb -march=armv8-a -mfpu=neon' ../configure --target=arm-unknown-linux-gnueabi --host=x86_64-pc-linux-gnu
I'm on commit 343e35.

Possible to parallelize?

Date: 2019-04-04 02:16 pm (UTC)
From: (Anonymous)
Hello, jmvalin!

Is it possible to parallelize the neural network (or even build it in a decent FPGA)?

Thanks!

Re: Possible to parallelize?

Date: 2019-11-22 12:42 am (UTC)
From: (Anonymous)
Since it processes 4 frames in sequence, would it not be easiest to give each 10 ms frame its own core, then merge the results on a fifth core and do the rest of the calculation there?
If this worked, it could mean real-time operation on a Raspberry Pi.

Best regards
- Martin

Re: Possible to parallelize?

Date: 2020-06-08 05:14 pm (UTC)
From: (Anonymous)
Is it possible to parallelize the encoding and decoding on the GPU with WebGL?

Opus

Date: 2019-04-10 11:18 pm (UTC)
From: (Anonymous)
In your "left to do" paragraph, you mention that this could maybe be used in Opus. Could this technology also be used to enhance music at lower bitrates, say 20 kb/s, and achieve quality similar to a higher bitrate of, say, 64 kb/s? Also, are there any samples of what music would sound like? (I know this is for speech so far, but it would be interesting to hear what happens to music at 1.6 kb/s.) Cheers, Kirk

Re: Opus

Date: 2019-04-11 05:13 am (UTC)
From: (Anonymous)
I was thinking the same, but thought I would double-check. It would be great if this ends up in Opus and could be used for, say, a podcast, where speech/music detection switches between the current Opus wideband codec at a higher bitrate for music and the LPCNet stream at 1.6 kb/s for speech. I hope that makes sense. It should yield very small files for distribution. Great work, by the way. Extremely impressive result so far.

Impressively practical

Date: 2020-02-19 05:00 pm (UTC)
From: (Anonymous)
We're currently trying to use LPCNet as a vocoder in a voice conversion related application, and it is working brilliantly. Really impressed by the latency and speed, and how sensible and practically usable the interface is. Amazing work.

Can't get the same result

Date: 2020-02-25 04:14 am (UTC)
From: (Anonymous)
Hi, I downloaded the reference audio for the test results on the website and then used your open-source code (commit 7d8d216) to encode and decode it. But the decoding result I got is different from the ‘LPCNet 1.6kbs’ file downloaded from the website. What could be the reason?

Re: Can't get the same result

Date: 2020-02-25 08:02 am (UTC)
From: (Anonymous)
Thanks for your reply. It's not just a difference in values; the two results sound quite different. I measured PESQ-WB: the website's file scores 1.777 and mine scores 1.319. My own decoding results have a lot of audible glitches.

PESQ.exe +16000 +wb ref.pcm ref_dec.pcm
PESQ.exe +16000 +wb ref.pcm lpcnq.pcm

I don't know if it is because of how I compiled it, since a lot of warnings are generated, such as:

In file included from src/lpcnet_dec.c:38:0:
src/pitch.h:37:1: warning: C++ style comments are not allowed in ISO C90 [enabled by default]
//#include "modes.h"
^
src/pitch.h:37:1: warning: (this will be reported only once per input file) [enabled by default]
src/lpcnet_dec.c: In function 'decode_packet':
src/lpcnet_dec.c:96:3: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
unpacker bits;
^
src/lpcnet_dec.c:108:3: warning: C++ style comments are not allowed in ISO C90 [enabled by default]
//fprintf(stdout, "%d %d %d %d %d %d %d %d %d\n", c0_id, main_pitch, modulation, corr_id, vq_end[0], vq_end[1], vq_end[2], vq_mid, interp_id);
^
src/lpcnet_dec.c:108:3: warning: (this will be reported only once per input file) [enabled by default]
src/lpcnet_dec.c:135:3: warning: ISO C90 forbids mixed declarations and code [-Wpedantic]
float sign = 1;
^
CC src/lpcnet_enc.lo
In file included from src/lpcnet_enc.c:38:0:
src/pitch.h:37:1: warning: C++ style comments are not allowed in ISO C90 [enabled by default]
//#include "modes.h"
^
src/pitch.h:37:1: warning: (this will be reported only once per input file) [enabled by default]
src/lpcnet_enc.c:46:1: warning: C++ style comments are not allowed in ISO C90 [enabled by default]
//#define NB_FEATURES (2*NB_BANDS+3+LPC_ORDER)

PS:
The voice bandwidth of the "MELP 2.4kbps" downloaded from the website is 4kHz, but the MELP 2.4kbps program I downloaded from the network generates a voice bandwidth of 8kHz.

Re: Can't get the same result

Date: 2020-02-26 01:47 am (UTC)
From: (Anonymous)
The computer is a Windows 10 system, and the virtual machine installed therein is ubuntu14.

Re: Can't get the same result

Date: 2020-02-26 08:10 am (UTC)
From: (Anonymous)
The system's input is single-channel speech sampled at 16 kHz. Can single-channel input sampled at 8 kHz also be tried?

Re: Can't get the same result

Date: 2020-02-26 08:47 am (UTC)
From: (Anonymous)
I found the reason: my input was 2-channel audio. Thank you for your patience.
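Since the thread traces the glitches to 2-channel input being fed to a tool that expects single-channel 16 kHz speech, a downmix step before encoding avoids the problem. The following is a minimal sketch, assuming the input is raw interleaved 16-bit PCM (left, right, left, right, ...); the helper name `stereo_to_mono_s16` is hypothetical and not part of the LPCNet sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from LPCNet): average interleaved stereo
   samples (L R L R ...) into a mono buffer of `frames` samples. */
static void stereo_to_mono_s16(const int16_t *stereo, int16_t *mono, size_t frames)
{
    size_t i;
    assert(stereo != NULL && mono != NULL);
    for (i = 0; i < frames; i++) {
        /* Sum in 32 bits so two full-scale samples cannot overflow. */
        int32_t sum = (int32_t)stereo[2*i] + (int32_t)stereo[2*i + 1];
        mono[i] = (int16_t)(sum / 2);
    }
}
```

The mono buffer can then be written out as raw PCM and passed to the encoder in place of the original stereo file.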

Re: Can't get the same result

Date: 2020-02-25 08:40 am (UTC)
From: (Anonymous)
I removed the warnings, but the results are still different.

Re: Can't get the same result

Date: 2020-02-25 11:00 am (UTC)
From: (Anonymous)
PESQ-WB may be meaningless here, but the results I get still have a lot of audible glitches.

The residual sequence

Date: 2020-11-18 10:43 am (UTC)
From: (Anonymous)
Hi jmvalin, I want to ask you a question about the code on GitHub. I think the code below computes the residual sequence, e(t)=s(t)-s'(t), but the code uses "+": sum += st->lpc[j]*st->pitch_mem[j]. So I guess st->lpc actually holds "-st->lpc"? Is that right? And is the residual sequence then filtered by H(z)=1+0.7z^-1?

for (i=0;i<FRAME_SIZE;i++) {
  int j;
  float sum = aligned_in[i];
  for (j=0;j<LPC_ORDER;j++)
    sum += st->lpc[j]*st->pitch_mem[j]; // st->lpc is -st->lpc?
  RNN_MOVE(st->pitch_mem+1, st->pitch_mem, LPC_ORDER-1);
  st->pitch_mem[0] = aligned_in[i];
  st->exc_buf[PITCH_MAX_PERIOD+i] = sum + .7*st->pitch_filt;
  st->pitch_filt = sum;
  //printf("%f\n", st->exc_buf[PITCH_MAX_PERIOD+i]);
}
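
The sign question above comes down to which convention the coefficients follow. If they are stored as the coefficients of the analysis filter A(z) = 1 + a_1 z^-1 + ... + a_N z^-N, then adding a_k*s(t-k) to s(t) already yields the residual, which is the same as subtracting a prediction whose predictor coefficients are -a_k. Whether st->lpc follows that convention is an assumption here; this thread does not confirm it. The sketch below is a self-contained illustration of that reading, with the same 1 + 0.7 z^-1 filtering step as the quoted loop; the names `ResidualState`, `lpc_residual`, and `MAX_LPC_ORDER` are hypothetical and not from the LPCNet sources.

```c
#include <assert.h>
#include <string.h>

#define MAX_LPC_ORDER 32  /* illustrative bound, not LPCNet's LPC_ORDER */

/* Hypothetical state, loosely mirroring pitch_mem and pitch_filt above. */
typedef struct {
    float mem[MAX_LPC_ORDER]; /* mem[0] is the most recent input sample */
    float prev_res;           /* previous unfiltered residual */
} ResidualState;

/* Assumes a[] holds a_1..a_order of A(z) = 1 + sum_k a_k z^-k, so the
   residual is x(t) + sum_k a_k x(t-k). The output is that residual
   passed through H(z) = 1 + 0.7 z^-1. */
static void lpc_residual(ResidualState *st, const float *a, int order,
                         const float *x, float *out, int n)
{
    int i, j;
    assert(order > 0 && order <= MAX_LPC_ORDER);
    for (i = 0; i < n; i++) {
        float res = x[i];
        for (j = 0; j < order; j++)
            res += a[j] * st->mem[j];
        /* Shift the history by one sample and store the newest input. */
        memmove(st->mem + 1, st->mem, (size_t)(order - 1) * sizeof(float));
        st->mem[0] = x[i];
        out[i] = res + 0.7f * st->prev_res;
        st->prev_res = res;
    }
}
```

With order 1 and a_1 = -1 (that is, predicting each sample as equal to the previous one), a constant input leaves a residual only at the first sample, and the 0.7 tap then spreads it into the second output sample.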

Opus and LPCNet

Date: 2021-04-08 08:15 pm (UTC)
From: (Anonymous)
Hi Jean-Marc,

I noticed you moved from Mozilla to Amazon. Are you still planning to make this part of Opus? I think a codec of this type is very valuable on devices with less capable processors. I would love to hear how you see your work moving forward and maybe even what help and resources you would need to make this happen.

Cheers,

Mark Vletter
