Voice and Audio Compression for Wireless Communications, 2nd Edition

ISBN: 978-0-470-51581-5
Hardcover
880 pages
September 2007, Wiley-IEEE Press
List Price: US $288.00
Government Price: US $165.72

About the Authors xxi

Other Wiley and IEEE Press Books on Related Topics xxiii

Preface and Motivation xxv

Acknowledgements xxxv

I Speech Signals and Waveform Coding 1

1 Speech Signals and an Introduction to Speech Coding 3

1.1 Motivation of Speech Compression 3

1.2 Basic Characterisation of Speech Signals 4

1.3 Classification of Speech Codecs 8

1.4 Waveform Coding 11

1.5 Chapter Summary 28

2 Predictive Coding 29

2.1 Forward-Predictive Coding 29

2.2 DPCM Codec Schematic 30

2.3 Predictor Design 31

2.4 Adaptive One-word-memory Quantisation 39

2.5 DPCM Performance 40

2.6 Backward-adaptive Prediction 42

2.7 The 32 kbps G.721 ADPCM Codec 47

2.8 Subjective and Objective Speech Quality 53

2.9 Variable-rate G.726 and Embedded G.727 ADPCM 54

2.10 Rate-distortion in Predictive Coding 62

2.11 Chapter Summary 67

II Analysis-by-Synthesis Coding 69

3 Analysis-by-Synthesis Principles 71

3.1 Motivation 71

3.2 Analysis-by-Synthesis Codec Structure 72

3.3 The Short-term Synthesis Filter 73

3.4 Long-term Prediction 76

3.5 Excitation Models 85

3.6 Adaptive Short-term and Long-term Post-Filtering 88

3.7 Lattice-based Linear Prediction 90

3.8 Chapter Summary 97

4 Speech Spectral Quantisation 99

4.1 Log-area Ratios 99

4.2 Line Spectral Frequencies 103

4.3 Vector Quantisation of Spectral Parameters 115

4.4 Spectral Quantisers for Wideband Speech Coding 123

4.5 Chapter Summary 138

5 Regular Pulse Excited Coding 139

5.1 Theoretical Background 139

5.2 The 13 kbps RPE-LTP GSM Speech Encoder 146

5.3 The 13 kbps RPE-LTP GSM Speech Decoder 151

5.4 Bit-sensitivity of the 13 kbps GSM RPE-LTP Codec 153

5.5 Application Example: A Tool-box Based Speech Transceiver 154

5.6 Chapter Summary 157

6 Forward-Adaptive Code Excited Linear Prediction 159

6.1 Background 159

6.2 The Original CELP Approach 160

6.3 Fixed Codebook Search 163

6.4 CELP Excitation Models 165

6.5 Optimisation of the CELP Codec Parameters 174

6.6 The Error Sensitivity of CELP Codecs 192

6.7 Application Example: A Dual-mode 3.1 kBd Speech Transceiver 204

6.8 Multi-slot PRMA Transceiver 218

6.9 Chapter Summary 223

7 Standard Speech Codecs 225

7.1 Background 225

7.2 The US DoD FS-1016 4.8 kbps CELP Codec 225

7.3 The 7.95 kbps Pan-American Speech Codec – Known as IS-54 DAMPS Codec 231

7.4 The 6.7 kbps Japanese Digital Cellular System’s Speech Codec 235

7.5 The Qualcomm Variable Rate CELP Codec 237

7.6 Japanese Half-rate Speech Codec 245

7.7 The Half-rate GSM Speech Codec 253

7.8 The 8 kbps G.729 Codec 257

7.9 The Reduced Complexity G.729 Annex A Codec 278

7.10 The 12.2 kbps Enhanced Full-rate GSM Speech Codec 282

7.11 The Enhanced Full-rate 7.4 kbps IS-136 Speech Codec 287

7.12 The ITU G.723.1 Dual-rate Codec 292

7.13 Advanced Multirate JD-CDMA Transceiver 302

7.14 Chapter Summary 327

8 Backward-adaptive Code Excited Linear Prediction 331

8.1 Introduction 331

8.2 Motivation and Background 331

8.3 Backward-adaptive G728 Codec Schematic 334

8.4 Backward-adaptive G728 Coding Algorithm 336

8.5 Reduced-rate G728-like Codec: Variable-length Excitation Vector 351

8.6 The Effects of Long-term Prediction 354

8.7 Closed-loop Codebook Training 359

8.8 Reduced-rate G728-like Codec: Constant-length Excitation Vector 364

8.9 Programmable-rate 8–4 kbps Low-delay CELP Codecs 365

8.10 Backward-adaptive Error Sensitivity Issues 381

8.11 A Low-delay Multimode Speech Transceiver 388

8.12 Chapter Summary 392

III Wideband Speech, MPEG-4 Audio and Their Transmission 393

9 Wideband Speech Coding 395

9.1 Sub-band-ADPCM Wideband Coding at 64 kbps 395

9.2 Wideband Transform-coding at 32 kbps 413

9.3 Sub-band-split Wideband CELP Codecs 416

9.4 Fullband Wideband ACELP Coding 420

9.5 A Turbo-coded Burst-by-burst Adaptive Wideband Speech Transceiver 425

9.6 Turbo-detected Unequal Error Protection Irregular Convolutional Coded AMR-WB Transceivers 442

9.7 The AMR-WB+ Audio Codec 454

9.8 Chapter Summary 466

10 MPEG-4 Audio Compression and Transmission 469

10.1 Overview of MPEG-4 Audio 469

10.2 General Audio Coding 471

10.3 Speech Coding in MPEG-4 Audio 495

10.4 MPEG-4 Codec Performance 503

10.5 MPEG-4 Space–time Block Coded OFDM Audio Transceiver 505

10.6 Turbo-detected Space–time Trellis Coded MPEG-4 Audio Transceivers 516

10.7 Turbo-detected Space–time Trellis Coded MPEG-4 Versus AMR-WB Speech Transceivers 525

10.8 Chapter Summary 534

IV Very Low-rate Coding and Transmission 537

11 Overview of Low-rate Speech Coding 539

11.1 Low-bitrate Speech Coding 539

11.2 Linear Predictive Coding Model 553

11.3 Speech Quality Measurements 557

11.4 Speech Database 560

11.5 Chapter Summary 563

12 Linear Predictive Vocoder 565

12.1 Overview of a Linear Predictive Vocoder 565

12.2 Line Spectrum Frequencies Quantisation 566

12.3 Pitch Detection 571

12.4 Unvoiced Frames 583

12.5 Voiced Frames 584

12.6 Adaptive Postfilter 585

12.7 Pulse Dispersion Filter 588

12.8 Results for Linear Predictive Vocoder 592

12.9 Chapter Summary 597

13 Wavelets and Pitch Detection 599

13.1 Conceptual Introduction to Wavelets 599

13.2 Introduction to Wavelet Mathematics 602

13.3 Preprocessing the Wavelet Transform Signal 607

13.4 Voiced–unvoiced Decision 610

13.5 Wavelet-based Pitch Detector 612

13.6 Chapter Summary 619

14 Zinc Function Excitation 621

14.1 Introduction 621

14.2 Overview of Prototype Waveform Interpolation Zinc Function Excitation 622

14.3 Zinc Function Modelling 627

14.4 Pitch Detection 631

14.5 Voiced Speech 635

14.6 Excitation Interpolation Between Prototype Segments 639

14.7 Unvoiced Speech 645

14.8 Adaptive Postfilter 645

14.9 Results for Single Zinc Function Excitation 646

14.10 Error Sensitivity of the 1.9 kbps PWI-ZFE Coder 649

14.11 Multiple Zinc Function Excitation 654

14.12 A Sixth-rate, 3.8 kbps GSM-like Speech Transceiver 661

14.13 Chapter Summary 665

15 Mixed-multiband Excitation 667

15.1 Introduction 667

15.2 Overview of Mixed-multiband Excitation 668

15.3 Finite Impulse Response Filter 671

15.4 Mixed-multiband Excitation Encoder 673

15.5 Mixed-multiband Excitation Decoder 676

15.6 Performance of the Mixed-multiband Excitation Coder 680

15.7 A Higher Rate 3.85 kbps Mixed-multiband Excitation Scheme 686

15.8 A 2.35 kbps Joint-detection-based CDMA Speech Transceiver 691

15.9 Chapter Summary 699

16 Sinusoidal Transform Coding Below 4 kbps 701

16.1 Introduction 701

16.2 Sinusoidal Analysis of Speech Signals 702

16.3 Sinusoidal Synthesis of Speech Signals 704

16.4 Low-bitrate Sinusoidal Coders 705

16.5 Incorporating Prototype Waveform Interpolation 709

16.6 Encoding the Sinusoidal Frequency Component 710

16.7 Determining the Excitation Components 712

16.8 Quantising the Excitation Parameters 720

16.9 Sinusoidal Transform Decoder 728

16.10 Speech Coder Performance 730

16.11 Chapter Summary 736

17 Conclusions on Low-rate Coding 737

17.1 Summary 737

17.2 Listening Tests 738

17.3 Summary of Very-low-rate Coding 739

17.4 Further Research 741

18 Comparison of Speech Codecs and Transceivers 743

18.1 Background to Speech Quality Evaluation 743

18.2 Objective Speech Quality Measures 744

18.3 Subjective Measures 752

18.4 Comparison of Subjective and Objective Measures 753

18.5 Subjective Speech Quality of Various Codecs 755

18.6 Error Sensitivity Comparison of Various Codecs 757

18.7 Objective Speech Performance of Various Transceivers 757

18.8 Chapter Summary 764

19 The Voice over Internet Protocol 765

19.1 Introduction 765

19.2 Session Initiation Protocol 766

19.3 H.323 Standards 774

19.4 Real-time Transport Protocol 778

19.5 Conclusion 781

A Constructing the Quadratic Spline Wavelets 783

B Zinc Function Excitation 787

C Probability Density Function for Amplitudes 793

Bibliography 797

Index 825

Author Index 834
