US8095362B2 - Method and system for reducing effects of noise producing artifacts in a speech signal - Google Patents
- Publication number
- US8095362B2 (U.S. application Ser. No. 12/284,805)
- Authority
- US
- United States
- Prior art keywords
- gain
- level
- speech
- subframe
- signal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active, expires
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G3/00—Gain control in amplifiers or frequency changers without distortion of the input signal
- H03G3/20—Automatic control
- H03G3/30—Automatic control in amplifiers having semiconductor devices
- H03G3/3089—Control of digital or coded signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/04—Time compression or expansion
- G10L21/043—Time compression or expansion by changing speed
- G10L21/045—Time compression or expansion by changing speed using thinning out or insertion of a waveform
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G3/00—Gain control in amplifiers or frequency changers without distortion of the input signal
- H03G3/20—Automatic control
- H03G3/30—Automatic control in amplifiers having semiconductor devices
- H03G3/34—Muting amplifier when no signal is present or when only weak signals are present, or caused by the presence of noise signals, e.g. squelch systems
- H03G3/341—Muting when no signals or only weak signals are present
Definitions
- the present invention relates generally to speech coding. More particularly, the present invention relates to reducing the effects of noise-producing artifacts in a voice codec.
- Speech compression may be used to reduce the number of bits that represent the speech signal thereby reducing the bandwidth needed for transmission.
- speech compression may result in degradation of the quality of decompressed speech.
- a higher bit rate will result in higher quality, while a lower bit rate will result in lower quality.
- modern speech compression techniques can produce decompressed speech of relatively high quality at relatively low bit rates.
- modern coding techniques attempt to represent the perceptually important features of the speech signal, without preserving the actual speech waveform.
- Speech compression systems commonly called codecs, include an encoder and a decoder and may be used to reduce the bit rate of digital speech signals. Numerous algorithms have been developed for speech codecs that reduce the number of bits required to digitally encode the original speech while attempting to maintain high quality reconstructed speech.
- FIG. 1 illustrates conventional speech decoding system 100 , which includes excitation decoder 110 , synthesis filter 120 and post-processor 130 .
- decoding system 100 receives encoded speech bitstream 102 over a communication medium (not shown) from an encoder, where decoding system 100 may be part of a mobile communication device, a base station or other wireless or wireline communication device that is capable of receiving encoded speech bitstream 102 .
- Decoding system 100 operates to decode encoded speech bitstream 102 and generate speech signal 132 in the form of a digital signal. Speech signal 132 may then be converted to an analog signal by a digital-to-analog converter (not shown).
- the analog output of the digital-to-analog converter may be received by a receiver (not shown) that may be a human ear, a magnetic tape recorder, or any other device capable of receiving an analog signal.
- a digital recording device, a speech recognition device, or any other device capable of receiving a digital signal may receive speech signal 132 .
- Excitation decoder 110 decodes encoded speech bitstream 102 according to the coding algorithm and bit rate of encoded speech bitstream 102 , and generates decoded excitation 112 .
- Synthesis filter 120 may be a short-term prediction filter that generates synthesized speech 122 based on decoded excitation 112 .
- Post-processor 130 may include filtering, signal enhancement, noise reduction, amplification, tilt correction and other similar techniques capable of improving the perceptual quality of synthesized speech 122 .
- Post-processor 130 may decrease the audible noise without noticeably degrading synthesized speech 122 . Decreasing the audible noise may be accomplished by emphasizing the formant structure of synthesized speech 122 or by suppressing the noise in the frequency regions that are perceptually not relevant for synthesized speech 122 .
- in variable-rate speech coders, perceptually important parts of speech (e.g., voiced speech, plosives, or voiced onsets) are coded with a higher number of bits, while less important parts of speech (e.g., unvoiced parts or silence between words) are coded with a lower number of bits.
- Noise suppression improves the quality of the reconstructed voice signal and helps variable-rate speech coders distinguish voice parts from noise parts. Noise suppression also helps low bit-rate speech encoders produce higher quality output by improving the perceptual speech quality.
- noise suppression techniques remove noise by spectral subtraction methods in the frequency domain.
- a voice activity detector (VAD) determines in the time-domain whether a frame of the signal includes speech or noise. The noise frames are analyzed in the frequency-domain to determine characteristics of the noise signal. From these characteristics, the spectra from noise frames are subtracted from the spectra of the speech frames, providing a clean speech signal in the speech frames.
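The spectral subtraction scheme described above can be sketched as follows. This is an illustrative example only, not code from the patent: the per-bin magnitude arrays, the function name, and the spectral-floor parameter are assumptions.

```c
#include <stddef.h>

/* Illustrative sketch of magnitude spectral subtraction (not the patent's
   own code). noisy_mag holds per-bin spectral magnitudes of a speech frame,
   noise_mag holds the noise characteristics estimated from noise-only
   frames; the unmodified phases would be reused when transforming back to
   the time domain. floor_gain is an assumed spectral floor that limits
   over-subtraction. */
static void spectral_subtract(const float *noisy_mag, const float *noise_mag,
                              float *clean_mag, size_t bins, float floor_gain)
{
    for (size_t i = 0; i < bins; i++) {
        float m = noisy_mag[i] - noise_mag[i];   /* subtract noise spectrum */
        float floor = floor_gain * noisy_mag[i]; /* keep a small residual */
        clean_mag[i] = (m > floor) ? m : floor;
    }
}
```

The floor keeps each bin from going negative (or to zero), which in practice reduces the "musical noise" artifacts of plain subtraction.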
- time-domain noise attenuation may be applied to improve the quality of a speech signal.
- a speech coding system with time-domain noise attenuation is described in U.S. application Ser. No. 09/782,791, filed Feb. 13, 2001, which is hereby incorporated by reference in its entirety
- the gains from linear prediction speech coding are adjusted by a gain factor to suppress background noise.
- the speech coding system uses frequency-domain noise suppression along with time-domain voice attenuation to further reduce the background noise.
- a preprocessor suppresses noise in the digitized signal using a VAD and frequency-domain noise suppression.
- a windowed frame including the identified frame of about 10 ms is transformed into the frequency domain.
- Spectral magnitudes of the noisy speech signal are then modified to reduce the noise level according to an estimated SNR, and the modified spectral magnitudes are combined with the unmodified spectral phases.
- the modified spectrum is then transformed back to the time-domain.
- An analysis-by-synthesis scheme chooses the best representation for several parameters such as an adjusted fixed-codebook gain, a fixed codebook index, a lag parameter, and the adjusted gain parameter of the long-term predictor.
- the gains may be adjusted by a gain factor prior to quantization.
- NSR has a value of about 1 when only background noise is detected in the frame, and when speech is detected in the frame, NSR is the square root of the background noise energy divided by the signal energy in the frame.
- the present invention is directed to a method of reducing the effects of noise producing artifacts in silence areas of a speech signal for use by a speech decoding system.
- the method comprises obtaining a plurality of incoming samples of a speech subframe; summing an absolute value of an energy level for each of the plurality of incoming samples to generate a total input level (gain_in); smoothing the total input level to generate a smoothed level (Level_in_sm); determining that the speech subframe is in a silence area based on the total input level, the smoothed level and a spectral tilt parameter; defining a gain using k1*(Level_in_sm/1024)+(1−k1), where k1 is a function of the spectral tilt parameter; and modifying an energy level of the speech subframe using the gain.
- the method also comprises summing an absolute value of an energy level for each of the plurality of outgoing samples, prior to the modifying, to generate a total output level (gain_out); determining an initial gain using (gain_in/gain_out); and modifying the gain using the initial gain to generate a modified gain (g0), where the modifying comprises multiplying sig_out for each of the plurality of outgoing samples by a smoothed gain (g_sm), wherein g_sm is obtained using iterations from 0 to n−1 of (previous g_sm*0.95+g0*0.05), where n is the number of samples, and previous g_sm is zero (0) prior to the first iteration.
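The smoothed-gain update described in this aspect can be sketched in floating point as follows; this is a minimal illustration, and the appendices give the embodiment's actual fixed- and floating-point implementations.

```c
/* Sketch of the smoothed-gain application: g_sm is zero prior to the
   first iteration and is updated per outgoing sample as
   previous g_sm*0.95 + g0*0.05 before scaling that sample. */
static void apply_smoothed_gain(float *sig_out, int n, float g0)
{
    float g_sm = 0.0f;               /* zero before the first iteration */
    for (int i = 0; i < n; i++) {
        g_sm = 0.95f * g_sm + 0.05f * g0;
        sig_out[i] *= g_sm;
    }
}
```

Because g_sm starts at zero and moves toward g0 by 5% per sample, the gain change fades in over the subframe instead of being applied as a step.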
- a method of reducing the effects of noise producing artifacts in a speech signal comprises obtaining a plurality of incoming samples representative of a speech subframe; summing an energy level for each of the plurality of samples to generate a total input level; comparing the total input level with a predetermined threshold; setting a gain value as a function of the total input level, wherein the gain value is between zero (0) and one (1), and wherein the function results in a lower gain value when the total input level is indicative of a silence area than when the total input level is indicative of a non-silence area; and multiplying the plurality of samples representative of the speech subframe by the gain value.
- the setting divides the total input level by the predetermined threshold if the total input level is not greater than the predetermined threshold, and the setting sets the gain value to one (1) if the total input level is greater than the predetermined threshold.
- the summing sums an absolute value of the energy level for each of the plurality of samples to generate the total input level.
- the method is performed by a speech decoding system. In yet another aspect, the method is performed by a speech encoding system.
- the method further comprises determining whether the speech signal is a narrowband signal or a wideband signal; and performing the obtaining, the summing, the comparing, the setting and the multiplying only if the determining determines that the speech signal is the narrowband signal.
- the method further comprises detecting a transition of the speech signal between a narrowband signal and a wideband signal; and gradually changing the gain value based on the transition.
- FIG. 1 illustrates a block diagram of a conventional decoding system for decoding and post-processing of encoded speech bitstream
- FIG. 2 illustrates a block diagram of a speech post-processor, according to one embodiment of the present application.
- FIG. 3 illustrates a flow diagram of a post-processing method for use by the speech post-processor of FIG. 2 , according to one embodiment of the present application.
- FIG. 2 illustrates a block diagram of speech post-processor 220 , according to one embodiment of the present application.
- speech post-processor 220 receives incoming signal (sig_in) 210 and generates outgoing signal (sig_out) 230 after post-processing of sig_in 210 to reduce the audible effects of artifacts in the silence areas of sig_in 210 .
- FIG. 3 which illustrates an example flow diagram of post-processing method 300 for use by speech post-processor 220
- subframe energy level calculator 222 receives sig_in 210, at step 310, and calculates a sum of the absolute energy level of each sample of a subframe of sig_in 210, which may be defined by: L=Σ|Ŝ(n)| Equation 1,
- where L is the subframe energy level, Ŝ(n) designates sig_in 210, and n indexes the samples of the subframe.
- subframe energy level comparator 224 receives the subframe energy level (L) from subframe energy level calculator 222 , and at step 320 , subframe energy level comparator 224 compares the subframe energy level (L) with a predetermined threshold (TH), e.g. 1,024, for a determination of whether the subframe energy level (L) is indicative of a silence area.
- Output of subframe energy level comparator 224 is then received by subframe energy level modifier 226. If subframe energy level modifier 226 determines that the subframe energy level (L) is greater than the predetermined threshold (TH), at step 320, post-processing method 300 moves to step 330, which is indicative of a non-silence area of speech. At step 330, a gain value (g) is set to one (1). On the other hand, if subframe energy level modifier 226 determines that the subframe energy level (L) is not greater than the predetermined threshold (TH), at step 320, post-processing method 300 moves to step 340, which is indicative of a silence area of speech. At step 340, the gain value (g) is set according to the result of the subframe energy level (L) divided by the predetermined threshold (TH), where 0<=g<=1, as shown below: g=L/TH Equation 2.
- post-processing method 300 then moves to step 350, where subframe energy level modifier 226 modifies the subframe energy level (L), to reduce the effects of artifacts in the silence areas of post-processor outgoing signal (sig_out) 230, for example, by multiplying each sample of the subframe by the gain value (g), as shown in step 350, which is defined by: g*Ŝ(n) Equation 3.
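Equations 1 through 3 above amount to the following floating-point sketch. This is illustrative only; the appendices show the embodiment's actual code, and the threshold value 1024 is the example value given in the text.

```c
#include <stdlib.h>

#define TH 1024 /* predetermined threshold from the text */

/* Sketch of post-processing method 300: sum the absolute value of each
   sample in the subframe (Equation 1), set g to 1 for non-silence or to
   L/TH for silence (Equation 2). */
static float subframe_gain(const short *sig, int n)
{
    long L = 0;
    for (int i = 0; i < n; i++)
        L += labs((long)sig[i]);  /* Equation 1: subframe energy level */
    if (L > TH)
        return 1.0f;              /* step 330: non-silence area */
    return (float)L / (float)TH;  /* step 340, Equation 2: 0 <= g <= 1 */
}

/* Scale the subframe by the gain value (Equation 3: g*S(n)). */
static void scale_subframe(short *sig, int n, float g)
{
    for (int i = 0; i < n; i++)
        sig[i] = (short)(g * sig[i]);
}
```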
- the embodiments of FIG. 2 and FIG. 3 are implemented in a speech decoder; however, in other embodiments, the present invention may also be implemented by an encoder.
- although equation 2 shows that g is a function of L for silence areas, g may also be a function of L in non-silence areas (L>TH) in other embodiments.
- g is set to one (1) in non-silence areas (L>TH), such that ⁇ (n) remains unmodified after the operation of equation 3 in non-silence areas.
- equation 2 shows that g is defined by the function of L/TH, other functions of L may be utilized by other embodiments.
- Appendices A and B show an implementation of one embodiment of the present invention using “C” programming language in fixed-point and floating-point, respectively.
- the signal energy is reduced after detecting a low-level silence signal.
- the signal level before speech post-processing may be defined as: gain_in=Σ|sig_in(i)|, summed over i=0 to Lsub−1 Equation 4,
- Lsub is the subframe size or the number of speech samples for each subframe
- sig_in( ) is the signal before performance of speech post-processing.
- equations 4 and 5 may be performed by subframe energy level calculator 222 of post-processor 220 in FIG. 2 .
- subframe energy level comparator 224 may determine Sil_Deci, according to equation 6.
- the condition (gain_in<2*Level_in_sm) in equation 6 is indicative of the non-existence of big peaks in the signal.
- the initial post-filtered signal level may be calculated by subframe energy level calculator 222, as follows: gain_out=Σ|sig_out(i)|, summed over i=0 to Lsub−1 Equation 8,
- the initial gain for adjusting the post-filtered signal energy can be determined by subframe energy level modifier 226 as: g0=gain_in/gain_out Equation 9,
- subframe energy level modifier 226 will apply a gain adjustment to the post-filtered signal, as follows, where g_sm is the smoothed gain:
- sig_out is modified by multiplying sig_out for each of the plurality of outgoing samples by a smoothed gain (g_sm), wherein g_sm is obtained using iterations from 0 to n−1 of (previous g_sm*0.95+g0*0.05), where n is the number of samples or the subframe size, and previous g_sm is zero (0) prior to the first iteration.
- the above-described silence gain reduction is only performed for the narrowband (0-4 kHz) speech signal in the decoder, and not for the wideband (4-8 kHz) speech signal.
- other embodiments of the present invention may include encoder and/or wideband implementations.
- the gain may be gradually changed or adjusted rather than abruptly applied (on a transition from wideband to narrowband) or removed (on a transition from narrowband to wideband) for reducing the effects of the artifacts in the silence areas. Switching between narrowband and wideband is further described in U.S. Patent Application Ser. No. 60/784,384, filed Mar. 20, 2006, entitled "Seamless Speech Band Transition and Pitch Track Smoothing," which is hereby incorporated by reference in its entirety.
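One way such a gradual change could look is a per-subframe ramp toward a target gain. This sketch is not from the patent text; the 0.9/0.1 smoothing constants and the function name are assumptions.

```c
/* Illustrative sketch: instead of switching the silence-reduction gain
   abruptly at a narrowband/wideband transition, move the applied gain a
   fraction of the way toward its target on each subframe. */
static float ramp_gain(float current, float target)
{
    return 0.9f * current + 0.1f * target; /* assumed smoothing constants */
}
```

Calling this once per subframe spreads the gain change over many subframes, avoiding an audible step when the band switches.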
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Noise Elimination (AREA)
- Reduction Or Emphasis Of Bandwidth Of Signals (AREA)
- Soundproofing, Sound Blocking, And Sound Damping (AREA)
Description
L=Σ|Ŝ(n)| Equation 1,
where L is the subframe energy level, Ŝ(n) designates sig_in, and n indexes the samples of the subframe. If L is not greater than the predetermined threshold TH, the gain value is set as:
g=L/TH Equation 2,
and each sample of the subframe is then modified as:
g*Ŝ(n) Equation 3.
gain_in=Σ|sig_in(i)|, summed over i=0 to Lsub−1 Equation 4,
where Lsub is the subframe size or the number of speech samples for each subframe, and sig_in( ) is the signal before performance of speech post-processing. Next, the smoothed level of sig_in is calculated by:
Level_in_sm=0.75*Level_in_sm+0.25*gain_in Equation 5,
where the initial value of Level_in_sm is zero (0). In one embodiment, equations 4 and 5 may be performed by subframe
Sil_Deci=(Level_in_sm<1024) && (gain_in<2*Level_in_sm) && (parcor0<512./32768); Equation 6,
where, Sil_Deci=1 is indicative of low level silence detection. In one embodiment, subframe
if ((Sil_Deci==1) && (Level_in_sm<gain_in))
gain_in=Level_in_sm; Equation 7.
where the initial post-filtered signal level is calculated as:
gain_out=Σ|sig_out(i)|, summed over i=0 to Lsub−1 Equation 8,
and the initial gain for adjusting the post-filtered signal energy can be determined by subframe
energy level modifier 226 as:
g0=gain_in/gain_out Equation 9,
gain=k1*(Level_in_sm/1024)+(1−k1); Equation 10,
where 0<=gain<=1, and k1 (0<=k1<=1) is a function of parcor0, and:
k1=(512./32768)−parcor0;
if (k1>(2047./32768))
k1=(2047./32768);
k1=k1/(2047./32768);
if ( Sil_Deci==1 ) {
g0 = g0 * gain;
for (i=0; i<Lsub; i++) {
g_sm = g_sm*0.95 + g0*0.05;
sig_out(i) = sig_out(i)*g_sm;
}
}
APPENDIX A
/***************************************************************************/
/***************************************************************************/
/* Fixed-Point Silence Cleaning */
/***************************************************************************/
/***************************************************************************/
Word16 Level_in_sm=1024; /* temporarily put this variable here */
Word16 PostNB=0; /* temporarily set to 0; real value : 0<=PostNB<=1 */
/*----------------------------------------------------------------------------
* G729EV_G729_scale_st - control of the subframe gain
* gain[n] = G729EV_G729_AGC_FAC * gain[n−1] + (1 − AGC_FAC) g_in/g_out
*----------------------------------------------------------------------------
*/
static void G729EV_G729_scale_st(Word16 *sig_in, /* input : postfilter input signal */
Word16 *sig_out, /* in/out: postfilter output signal */
Word16 *gain_prec, /* in/out: last value of gain for subframe */
#ifdef SILENCE_CLEANING
Word16 parcor0,
Word16 PostNB,
Word32 *Level_in_sm
#endif
)
{
Word32 L_acc, L_temp;
Word16 i;
Word16 scal_in, scal_out;
Word16 s_g_in, s_g_out, temp, sh_g0, g0;
#ifdef SILENCE_CLEANING
Word16 gain;
Word16 Cond;
#endif
/* compute input gain */
L_acc = 0L;
#ifdef WMOPS
move32( );
#endif
for (i = 0; i < G729EV_G729_L_SUBFR; i++)
{
L_temp = L_abs(L_deposit_l(sig_in[i]));
L_acc = L_add(L_acc, L_temp);
}
#ifdef SILENCE_CLEANING
/* Smooth level */
*Level_in_sm=L_add(L_shr(*Level_in_sm, 1), L_shr(*Level_in_sm, 2));
*Level_in_sm=L_add(*Level_in_sm, L_shr(L_acc, 2));
/* Detect silence*/
Cond = (*Level_in_sm<1024) && (L_acc<L_shl(*Level_in_sm, 1)) && (parcor0<512);
/* If silence is detected, replace the original level with smoothed level*/
if (Cond == 1)
L_acc = *Level_in_sm;
#endif
#ifdef WMOPS
test( );
#endif
if (L_acc == 0L)
{
g0 = 0;
#ifdef WMOPS
move16( );
#endif
}
else
{
scal_in = norm_l(L_acc);
L_acc = L_shl(L_acc, scal_in);
s_g_in = extract_h(L_acc); /* normalized */
/* Compute output gain */
L_acc = 0L;
#ifdef WMOPS
move32( );
#endif
for (i = 0; i < G729EV_G729_L_SUBFR; i++)
{
L_temp = L_abs(L_deposit_l(sig_out[i]));
L_acc = L_add(L_acc, L_temp);
}
#ifdef WMOPS
test( );
#endif
if (L_acc == 0L)
{
*gain_prec = 0;
#ifdef WMOPS
move16( );
#endif
return;
}
scal_out = norm_l(L_acc);
L_acc = L_shl(L_acc, scal_out);
s_g_out = extract_h(L_acc); /* normalized */
sh_g0 = add(scal_in, 1);
sh_g0 = sub(sh_g0, scal_out); /* scal_in − scal_out + 1 */
#ifdef WMOPS
test( );
#endif
if (sub(s_g_in, s_g_out) < 0)
{
g0 = div_s(s_g_in, s_g_out); /* s_g_in/s_g_out in Q15 */
}
else
{
temp = sub(s_g_in, s_g_out); /* sufficient since normalized */
g0 = shr(div_s(temp, s_g_out), 1);
g0 = add(g0, (Word16) 0x4000); /* s_g_in/s_g_out in Q14 */
sh_g0 = sub(sh_g0, 1);
}
/* L_gain_in/L_gain_out in Q14 */
/* overflows if L_gain_in > 2 * L_gain_out */
g0 = shr(g0, sh_g0); /* sh_g0 may be >0, <0, or =0 */
#ifdef SILENCE_CLEANING
if ( Cond==1 )
{ /* Apply a gain reduction for silence; the gain is defined as
gain = (Level_in_sm/MAX_SILENCE_LEVEL)*k1 + (1−k1);
k1 (0<=k1<=1) is a function of PARCOR0 */
/* k1 in Q15*/
temp=sub(512, parcor0);
if (temp>2047) temp=2047;
temp=shl(temp, 4);
/* gain = (Level_in_sm/MAX_SILENCE_LEVEL) in Q15 */
if (*Level_in_sm>1023) gain = 1023;
else gain = extract_l(*Level_in_sm);
gain = shl(gain, 5);
/* gain = gain*k1 + 1−k1*/
gain = mult_r(gain, temp);
gain = add(gain, sub(32767, temp));
gain = mult_r(gain, sub(32767, PostNB));
gain = add(gain, PostNB);
g0 = mult_r(g0, gain);
}
#endif
g0 = mult_r(g0, G729EV_G729_AGC_FAC1); /* L_gain_in/L_gain_out *
AGC_FAC1 */
}
/* gain(n) = G729EV_G729_AGC_FAC gain(n−1) + G729EV_G729_AGC_FAC1
gain_in/gain_out */
/* sig_out(n) = gain(n) sig_out(n) */
gain = *gain_prec;
for (i = 0; i < G729EV_G729_L_SUBFR; i++)
{
temp = mult_r(G729EV_G729_AGC_FAC, gain);
gain = add(temp, g0); /* in Q14 */
L_temp = L_mult(gain, sig_out[i]);
L_temp = L_shl(L_temp, 1);
sig_out[i] = round(L_temp);
}
*gain_prec = gain;
#ifdef WMOPS
move16( );
move16( );
#endif
return;
}
APPENDIX B
/***************************************************************************/
/***************************************************************************/
/* Floating-Point Silence Cleaning */
/***************************************************************************/
/***************************************************************************/
REAL Level_in_sm=1024.; /* temporarily put this variable here */
REAL PostNB=0.; /* temporarily set to 0; real value: 0<=PostNB<=1 */
/**---------------------------------------------------------------------------
* Function G729EV_G729_scale_st - control of the subframe gain
* gain[n] = G729EV_G729_AGC_FAC * gain[n−1] + (1 − G729EV_G729_AGC_FAC)
g_in/g_out
*----------------------------------------------------------------------------
*/
void
G729EV_G729_scale_st (REAL *sig_in, /**< input : postfilter input signal */
REAL *sig_out, /**< in/out: postfilter output signal */
REAL *gain_prec /**< in/out: last value of gain for subframe */
#ifdef SILENCE_CLEANING
,
REAL parcor0,
REAL PostNB,
REAL *Level_in_sm
#endif
)
{
int i;
REAL gain_in, gain_out;
REAL g0, gain;
#ifdef SILENCE_CLEANING
short Cond;
REAL k1;
#endif
/* compute input gain */
gain_in = (REAL) 0.;
for (i = 0; i < G729EV_G729_L_SUBFR; i++)
{
gain_in += (REAL) fabs (sig_in[i]);
}
#ifdef SILENCE_CLEANING
/* Smooth level */
*Level_in_sm = 0.75*(*Level_in_sm) + 0.25*gain_in;
/* Detect silence*/
Cond = (*Level_in_sm < 1024.) && (gain_in < *Level_in_sm*2.) && (parcor0<512./32768);
/* If silence is detected, replace the original level with smoothed level*/
if (Cond == 1)
gain_in = *Level_in_sm;
#endif
if (gain_in == (REAL) 0.)
{
g0 = (REAL) 0.;
}
else
{
/* Compute output gain */
gain_out = (REAL) 0.;
for (i = 0; i < G729EV_G729_L_SUBFR; i++)
{
gain_out += (REAL) fabs (sig_out[i]);
}
if (gain_out == (REAL) 0.)
{
*gain_prec = (REAL) 0.;
return;
}
g0 = gain_in / gain_out;
#ifdef SILENCE_CLEANING
if ( Cond==1 )
{ /* Apply a gain reduction for silence; the gain is defined as
gain = (Level_in_sm/MAX_SILENCE_LEVEL)*k1 + (1−k1);
k1 (0<=k1<=1) is a function of PARCOR0 */
/*k1*/
k1=(512./32768) − parcor0;
if (k1>(2047./32768)) k1= (2047./32768);
k1 /= (2047./32768);
/* gain = (Level_in_sm/MAX_SILENCE_LEVEL)*/
if (*Level_in_sm>1023) gain = 1.;
else gain = *Level_in_sm/1024.;
gain = gain*k1 + 1−k1;
gain = gain *(1.−PostNB) + PostNB;
g0 *= gain;
}
#endif
g0 *= G729EV_G729_AGC_FAC1;
}
/* compute gain(n) = G729EV_G729_AGC_FAC gain(n−1) + (1−
G729EV_G729_AGC_FAC)gain_in/gain_out */
/* sig_out(n) = gain(n) sig_out(n) */
gain = *gain_prec;
for (i = 0; i < G729EV_G729_L_SUBFR; i++)
{
gain *= G729EV_G729_AGC_FAC;
gain += g0;
sig_out[i] *= gain;
}
*gain_prec = gain;
return;
}
Claims (34)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/284,805 US8095362B2 (en) | 2006-03-20 | 2008-09-24 | Method and system for reducing effects of noise producing artifacts in a speech signal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/385,553 US7454335B2 (en) | 2006-03-20 | 2006-03-20 | Method and system for reducing effects of noise producing artifacts in a voice codec |
US12/284,805 US8095362B2 (en) | 2006-03-20 | 2008-09-24 | Method and system for reducing effects of noise producing artifacts in a speech signal |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/385,553 Continuation US7454335B2 (en) | 2006-03-20 | 2006-03-20 | Method and system for reducing effects of noise producing artifacts in a voice codec |
Publications (2)
Publication Number | Publication Date |
---|---|
US20090070106A1 US20090070106A1 (en) | 2009-03-12 |
US8095362B2 true US8095362B2 (en) | 2012-01-10 |
Family
ID=38519014
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/385,553 Active 2027-04-13 US7454335B2 (en) | 2006-03-20 | 2006-03-20 | Method and system for reducing effects of noise producing artifacts in a voice codec |
US12/284,805 Active 2028-02-01 US8095362B2 (en) | 2006-03-20 | 2008-09-24 | Method and system for reducing effects of noise producing artifacts in a speech signal |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/385,553 Active 2027-04-13 US7454335B2 (en) | 2006-03-20 | 2006-03-20 | Method and system for reducing effects of noise producing artifacts in a voice codec |
Country Status (7)
Country | Link |
---|---|
US (2) | US7454335B2 (en) |
EP (2) | EP1997101B1 (en) |
AT (1) | ATE491262T1 (en) |
BR (1) | BRPI0621563A2 (en) |
DE (1) | DE602006018791D1 (en) |
ES (1) | ES2356503T3 (en) |
WO (1) | WO2007111645A2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160163326A1 (en) * | 2010-07-02 | 2016-06-09 | Dolby International Ab | Pitch filter for audio signals |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7454335B2 (en) * | 2006-03-20 | 2008-11-18 | Mindspeed Technologies, Inc. | Method and system for reducing effects of noise producing artifacts in a voice codec |
JP4912036B2 (en) * | 2006-05-26 | 2012-04-04 | 富士通株式会社 | Directional sound collecting device, directional sound collecting method, and computer program |
KR101235830B1 (en) | 2007-12-06 | 2013-02-21 | 한국전자통신연구원 | Apparatus for enhancing quality of speech codec and method therefor |
CN100550133C (en) | 2008-03-20 | 2009-10-14 | 华为技术有限公司 | A kind of audio signal processing method and device |
CN101483042B (en) * | 2008-03-20 | 2011-03-30 | 华为技术有限公司 | Noise generating method and noise generating apparatus |
WO2010091555A1 (en) * | 2009-02-13 | 2010-08-19 | 华为技术有限公司 | Stereo encoding method and device |
US8447595B2 (en) * | 2010-06-03 | 2013-05-21 | Apple Inc. | Echo-related decisions on automatic gain control of uplink speech signal in a communications device |
US8924200B2 (en) * | 2010-10-15 | 2014-12-30 | Motorola Mobility Llc | Audio signal bandwidth extension in CELP-based speech coder |
JP5085769B1 (en) * | 2011-06-24 | 2012-11-28 | 株式会社東芝 | Acoustic control device, acoustic correction device, and acoustic correction method |
CN102568473A (en) * | 2011-12-30 | 2012-07-11 | 深圳市车音网科技有限公司 | Method and device for recording voice signals |
CN102592592A (en) * | 2011-12-30 | 2012-07-18 | 深圳市车音网科技有限公司 | Voice data extraction method and device |
US9208798B2 (en) | 2012-04-09 | 2015-12-08 | Board Of Regents, The University Of Texas System | Dynamic control of voice codec data rate |
US9002030B2 (en) * | 2012-05-01 | 2015-04-07 | Audyssey Laboratories, Inc. | System and method for performing voice activity detection |
ES2626977T3 (en) * | 2013-01-29 | 2017-07-26 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus, procedure and computer medium to synthesize an audio signal |
US9484043B1 (en) * | 2014-03-05 | 2016-11-01 | QoSound, Inc. | Noise suppressor |
EP3573059B1 (en) * | 2018-05-25 | 2021-03-31 | Dolby Laboratories Licensing Corporation | Dialogue enhancement based on synthesized speech |
CN115512687B (en) * | 2022-11-08 | 2023-02-17 | 之江实验室 | Voice sentence-breaking method and device, storage medium and electronic equipment |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5544250A (en) | 1994-07-18 | 1996-08-06 | Motorola | Noise suppression system and method therefor |
US5550924A (en) | 1993-07-07 | 1996-08-27 | Picturetel Corporation | Reduction of background noise for speech enhancement |
US6453289B1 (en) | 1998-07-24 | 2002-09-17 | Hughes Electronics Corporation | Method of noise reduction for speech codecs |
US20040148166A1 (en) | 2001-06-22 | 2004-07-29 | Huimin Zheng | Noise-stripping device |
US20050177364A1 (en) | 2002-10-11 | 2005-08-11 | Nokia Corporation | Methods and devices for source controlled variable bit-rate wideband speech coding |
US7454335B2 (en) * | 2006-03-20 | 2008-11-18 | Mindspeed Technologies, Inc. | Method and system for reducing effects of noise producing artifacts in a voice codec |
-
2006
- 2006-03-20 US US11/385,553 patent/US7454335B2/en active Active
- 2006-10-23 BR BRPI0621563-7A patent/BRPI0621563A2/en not_active IP Right Cessation
- 2006-10-23 EP EP06817327A patent/EP1997101B1/en not_active Not-in-force
- 2006-10-23 DE DE602006018791T patent/DE602006018791D1/en active Active
- 2006-10-23 EP EP10179672A patent/EP2290815B1/en not_active Not-in-force
- 2006-10-23 WO PCT/US2006/041434 patent/WO2007111645A2/en active Search and Examination
- 2006-10-23 AT AT06817327T patent/ATE491262T1/en not_active IP Right Cessation
- 2006-10-23 ES ES06817327T patent/ES2356503T3/en active Active
-
2008
- 2008-09-24 US US12/284,805 patent/US8095362B2/en active Active
Non-Patent Citations (1)
Title |
---|
Coding of Speech at 8 kbit/s Using Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-ACELP), International Telecommunication Union, ITU-T Recommendation G.729, pp. 1-35 (Mar. 1996). |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160163326A1 (en) * | 2010-07-02 | 2016-06-09 | Dolby International Ab | Pitch filter for audio signals |
US9552824B2 (en) * | 2010-07-02 | 2017-01-24 | Dolby International Ab | Post filter |
US9558753B2 (en) * | 2010-07-02 | 2017-01-31 | Dolby International Ab | Pitch filter for audio signals |
US9558754B2 (en) | 2010-07-02 | 2017-01-31 | Dolby International Ab | Audio encoder and decoder with pitch prediction |
US9595270B2 (en) * | 2010-07-02 | 2017-03-14 | Dolby International Ab | Selective post filter |
US9830923B2 (en) | 2010-07-02 | 2017-11-28 | Dolby International Ab | Selective bass post filter |
US9858940B2 (en) | 2010-07-02 | 2018-01-02 | Dolby International Ab | Pitch filter for audio signals |
US10236010B2 (en) | 2010-07-02 | 2019-03-19 | Dolby International Ab | Pitch filter for audio signals |
US10811024B2 (en) | 2010-07-02 | 2020-10-20 | Dolby International Ab | Post filter for audio signals |
US11183200B2 (en) | 2010-07-02 | 2021-11-23 | Dolby International Ab | Post filter for audio signals |
US11610595B2 (en) | 2010-07-02 | 2023-03-21 | Dolby International Ab | Post filter for audio signals |
Also Published As
Publication number | Publication date |
---|---|
EP2290815B1 (en) | 2013-02-20 |
EP2290815A3 (en) | 2011-09-07 |
EP1997101A2 (en) | 2008-12-03 |
EP2290815A2 (en) | 2011-03-02 |
BRPI0621563A2 (en) | 2012-07-10 |
DE602006018791D1 (en) | 2011-01-20 |
ES2356503T3 (en) | 2011-04-08 |
WO2007111645A2 (en) | 2007-10-04 |
EP1997101B1 (en) | 2010-12-08 |
US7454335B2 (en) | 2008-11-18 |
WO2007111645A3 (en) | 2008-10-02 |
EP1997101A4 (en) | 2010-01-06 |
ATE491262T1 (en) | 2010-12-15 |
US20070219791A1 (en) | 2007-09-20 |
US20090070106A1 (en) | 2009-03-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8095362B2 (en) | Method and system for reducing effects of noise producing artifacts in a speech signal | |
US6862567B1 (en) | Noise suppression in the frequency domain by adjusting gain according to voicing parameters | |
US7020605B2 (en) | Speech coding system with time-domain noise attenuation | |
US6604070B1 (en) | System of encoding and decoding speech signals | |
US6574593B1 (en) | Codebook tables for encoding and decoding | |
US6581032B1 (en) | Bitstream protocol for transmission of encoded voice signals | |
RU2596584C2 (en) | Coding of generalised audio signals at low bit rates and low delay | |
RU2262748C2 (en) | Multi-mode encoding device | |
JP4166673B2 (en) | Interoperable vocoder | |
US5752222A (en) | Speech decoding method and apparatus | |
KR100488080B1 (en) | Multimode speech encoder | |
EP2774145B1 (en) | Improving non-speech content for low rate celp decoder | |
KR20000075936A (en) | A high resolution post processing method for a speech decoder | |
US10672411B2 (en) | Method for adaptively encoding an audio signal in dependence on noise information for higher encoding accuracy | |
JP5291004B2 (en) | Method and apparatus in a communication network | |
AU2003262451B2 (en) | Multimode speech encoder | |
AU766830B2 (en) | Multimode speech encoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: MINDSPEED TECHNOLOGIES, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GAO, YANG;SHLOMOT, EYAL;REEL/FRAME:021660/0954 Effective date: 20080826 |
|
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT Free format text: SECURITY INTEREST;ASSIGNOR:MINDSPEED TECHNOLOGIES, INC.;REEL/FRAME:032495/0177 Effective date: 20140318 |
|
AS | Assignment |
Owner name: MINDSPEED TECHNOLOGIES, INC., CALIFORNIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:032861/0617 Effective date: 20140508 Owner name: GOLDMAN SACHS BANK USA, NEW YORK Free format text: SECURITY INTEREST;ASSIGNORS:M/A-COM TECHNOLOGY SOLUTIONS HOLDINGS, INC.;MINDSPEED TECHNOLOGIES, INC.;BROOKTREE CORPORATION;REEL/FRAME:032859/0374 Effective date: 20140508 |
|
FPAY | Fee payment |
Year of fee payment: 4 |
|
AS | Assignment |
Owner name: MINDSPEED TECHNOLOGIES, LLC, MASSACHUSETTS Free format text: CHANGE OF NAME;ASSIGNOR:MINDSPEED TECHNOLOGIES, INC.;REEL/FRAME:039645/0264 Effective date: 20160725 |
|
AS | Assignment |
Owner name: MACOM TECHNOLOGY SOLUTIONS HOLDINGS, INC., MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MINDSPEED TECHNOLOGIES, LLC;REEL/FRAME:044791/0600 Effective date: 20171017 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
FEPP | Fee payment procedure |
Free format text: 11.5 YR SURCHARGE- LATE PMT W/IN 6 MO, LARGE ENTITY (ORIGINAL EVENT CODE: M1556); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 12 |