Mercurial > hg > audiostuff
comparison spandsp-0.0.6pre17/src/spandsp/plc.h @ 4:26cd8f1ef0b1
import spandsp-0.0.6pre17
| author | Peter Meerwald <pmeerw@cosy.sbg.ac.at> |
|---|---|
| date | Fri, 25 Jun 2010 15:50:58 +0200 |
| parents | |
| children |
comparison
equal
deleted
inserted
replaced
| 3:c6c5a16ce2f2 | 4:26cd8f1ef0b1 |
|---|---|
| 1 /* | |
| 2 * SpanDSP - a series of DSP components for telephony | |
| 3 * | |
| 4 * plc.h | |
| 5 * | |
| 6 * Written by Steve Underwood <steveu@coppice.org> | |
| 7 * | |
| 8 * Copyright (C) 2004 Steve Underwood | |
| 9 * | |
| 10 * All rights reserved. | |
| 11 * | |
| 12 * This program is free software; you can redistribute it and/or modify | |
| 13 * it under the terms of the GNU Lesser General Public License version 2.1, | |
| 14 * as published by the Free Software Foundation. | |
| 15 * | |
| 16 * This program is distributed in the hope that it will be useful, | |
| 17 * but WITHOUT ANY WARRANTY; without even the implied warranty of | |
| 18 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
| 19 * GNU Lesser General Public License for more details. | |
| 20 * | |
| 21 * You should have received a copy of the GNU Lesser General Public | |
| 22 * License along with this program; if not, write to the Free Software | |
| 23 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. | |
| 24 * | |
| 25 * $Id: plc.h,v 1.21 2009/02/10 13:06:47 steveu Exp $ | |
| 26 */ | |
| 27 | |
| 28 /*! \file */ | |
| 29 | |
| 30 #if !defined(_SPANDSP_PLC_H_) | |
| 31 #define _SPANDSP_PLC_H_ | |
| 32 | |
| 33 /*! \page plc_page Packet loss concealment | |
| 34 \section plc_page_sec_1 What does it do? | |
| 35 The packet loss concealment module provides a synthetic fill-in signal, to minimise | |
| 36 the audible effect of lost packets in VoIP applications. It is not tied to any | |
| 37 particular codec, and could be used with almost any codec which does not | |
| 38 specify its own procedure for packet loss concealment. | |
| 39 | |
| 40 Where a codec specific concealment procedure exists, that algorithm is usually built | |
| 41 around knowledge of the characteristics of the particular codec. It will, therefore, | |
| 42 generally give better results for that particular codec than this generic concealer will. | |
| 43 | |
| 44 The PLC code implements an algorithm similar to the one described in Appendix 1 of G.711. | |
| 45 However, the G.711 algorithm is optimised for 10ms packets. Few people use such small | |
| 46 packets. 20ms is a much more common value, and longer packets are also quite common. The | |
| 47 algorithm has been adjusted with this in mind. Also, the G.711 approach causes an | |
| 48 algorithmic delay, and requires significant buffer manipulation when there is no packet | |
| 49 loss. The algorithm used here avoids this. It causes no delay, and achieves comparable | |
| 50 quality with normal speech. | |
| 51 | |
| 52 Note that both this algorithm, and the one in G.711 are optimised for speech. For most kinds | |
| 53 of music a much slower decay on bursts of lost packets give better results. | |
| 54 | |
| 55 \section plc_page_sec_2 How does it work? | |
| 56 While good packets are being received, the plc_rx() routine keeps a record of the trailing | |
| 57 section of the known speech signal. If a packet is missed, plc_fillin() is called to produce | |
| 58 a synthetic replacement for the real speech signal. The average mean difference function | |
| 59 (AMDF) is applied to the last known good signal, to determine its effective pitch. | |
| 60 Based on this, the last pitch period of signal is saved. Essentially, this cycle of speech | |
| 61 will be repeated over and over until the real speech resumes. However, several refinements | |
| 62 are needed to obtain smooth pleasant sounding results. | |
| 63 | |
| 64 - The two ends of the stored cycle of speech will not always fit together smoothly. This can | |
| 65 cause roughness, or even clicks, at the joins between cycles. To soften this, the | |
| 66 1/4 pitch period of real speech preceeding the cycle to be repeated is blended with the last | |
| 67 1/4 pitch period of the cycle to be repeated, using an overlap-add (OLA) technique (i.e. | |
| 68 in total, the last 5/4 pitch periods of real speech are used). | |
| 69 | |
| 70 - The start of the synthetic speech will not always fit together smoothly with the tail of | |
| 71 real speech passed on before the erasure was identified. Ideally, we would like to modify | |
| 72 the last 1/4 pitch period of the real speech, to blend it into the synthetic speech. However, | |
| 73 it is too late for that. We could have delayed the real speech a little, but that would | |
| 74 require more buffer manipulation, and hurt the efficiency of the no-lost-packets case | |
| 75 (which we hope is the dominant case). Instead we use a degenerate form of OLA to modify | |
| 76 the start of the synthetic data. The last 1/4 pitch period of real speech is time reversed, | |
| 77 and OLA is used to blend it with the first 1/4 pitch period of synthetic speech. The result | |
| 78 seems quite acceptable. | |
| 79 | |
| 80 - As we progress into the erasure, the chances of the synthetic signal being anything like | |
| 81 correct steadily fall. Therefore, the volume of the synthesized signal is made to decay | |
| 82 linearly, such that after 50ms of missing audio it is reduced to silence. | |
| 83 | |
| 84 - When real speech resumes, an extra 1/4 pitch period of synthetic speech is blended with the | |
| 85 start of the real speech. If the erasure is small, this smoothes the transition. If the erasure | |
| 86 is long, and the synthetic signal has faded to zero, the blending softens the start up of the | |
| 87 real signal, avoiding a kind of "click" or "pop" effect that might occur with a sudden onset. | |
| 88 | |
| 89 \section plc_page_sec_3 How do I use it? | |
| 90 Before audio is processed, call plc_init() to create an instance of the packet loss | |
| 91 concealer. For each received audio packet that is acceptable (i.e. not including those being | |
| 92 dropped for being too late) call plc_rx() to record the content of the packet. Note this may | |
| 93 modify the packet a little after a period of packet loss, to blend real synthetic data smoothly. | |
| 94 When a real packet is not available in time, call plc_fillin() to create a sythetic substitute. | |
| 95 That's it! | |
| 96 */ | |
| 97 | |
| 98 /*! Minimum allowed pitch (66 Hz) */ | |
| 99 #define PLC_PITCH_MIN 120 | |
| 100 /*! Maximum allowed pitch (200 Hz) */ | |
| 101 #define PLC_PITCH_MAX 40 | |
| 102 /*! Maximum pitch OLA window */ | |
| 103 #define PLC_PITCH_OVERLAP_MAX (PLC_PITCH_MIN >> 2) | |
| 104 /*! The length over which the AMDF function looks for similarity (20 ms) */ | |
| 105 #define CORRELATION_SPAN 160 | |
| 106 /*! History buffer length. The buffer much also be at leat 1.25 times | |
| 107 PLC_PITCH_MIN, but that is much smaller than the buffer needs to be for | |
| 108 the pitch assessment. */ | |
| 109 #define PLC_HISTORY_LEN (CORRELATION_SPAN + PLC_PITCH_MIN) | |
| 110 | |
| 111 /*! | |
| 112 The generic packet loss concealer context. | |
| 113 */ | |
| 114 typedef struct | |
| 115 { | |
| 116 /*! Consecutive erased samples */ | |
| 117 int missing_samples; | |
| 118 /*! Current offset into pitch period */ | |
| 119 int pitch_offset; | |
| 120 /*! Pitch estimate */ | |
| 121 int pitch; | |
| 122 /*! Buffer for a cycle of speech */ | |
| 123 float pitchbuf[PLC_PITCH_MIN]; | |
| 124 /*! History buffer */ | |
| 125 int16_t history[PLC_HISTORY_LEN]; | |
| 126 /*! Current pointer into the history buffer */ | |
| 127 int buf_ptr; | |
| 128 } plc_state_t; | |
| 129 | |
| 130 | |
| 131 #if defined(__cplusplus) | |
| 132 extern "C" | |
| 133 { | |
| 134 #endif | |
| 135 | |
| 136 /*! Process a block of received audio samples for PLC. | |
| 137 \brief Process a block of received audio samples for PLC. | |
| 138 \param s The packet loss concealer context. | |
| 139 \param amp The audio sample buffer. | |
| 140 \param len The number of samples in the buffer. | |
| 141 \return The number of samples in the buffer. */ | |
| 142 SPAN_DECLARE(int) plc_rx(plc_state_t *s, int16_t amp[], int len); | |
| 143 | |
| 144 /*! Fill-in a block of missing audio samples. | |
| 145 \brief Fill-in a block of missing audio samples. | |
| 146 \param s The packet loss concealer context. | |
| 147 \param amp The audio sample buffer. | |
| 148 \param len The number of samples to be synthesised. | |
| 149 \return The number of samples synthesized. */ | |
| 150 SPAN_DECLARE(int) plc_fillin(plc_state_t *s, int16_t amp[], int len); | |
| 151 | |
| 152 /*! Initialise a packet loss concealer context. | |
| 153 \brief Initialise a PLC context. | |
| 154 \param s The packet loss concealer context. | |
| 155 \return A pointer to the the packet loss concealer context. */ | |
| 156 SPAN_DECLARE(plc_state_t *) plc_init(plc_state_t *s); | |
| 157 | |
| 158 /*! Release a packet loss concealer context. | |
| 159 \param s The packet loss concealer context. | |
| 160 \return 0 for OK. */ | |
| 161 SPAN_DECLARE(int) plc_release(plc_state_t *s); | |
| 162 | |
| 163 /*! Free a packet loss concealer context. | |
| 164 \param s The packet loss concealer context. | |
| 165 \return 0 for OK. */ | |
| 166 SPAN_DECLARE(int) plc_free(plc_state_t *s); | |
| 167 | |
| 168 #if defined(__cplusplus) | |
| 169 } | |
| 170 #endif | |
| 171 | |
| 172 #endif | |
| 173 /*- End of file ------------------------------------------------------------*/ |
