19D031GT - Speech Technologies

Course specification
Course title Speech Technologies
Acronym 19D031GT
Study programme Electrical Engineering and Computing
Module Telecommunications
Type of study doctoral studies
Lecturer (for classes)
Lecturer/Associate (for practice)
    Lecturer/Associate (for OTC)
      ECTS 9.0 Status elective
      Prerequisite Passed examination in Fundamentals of speech communications
      The goal The goal is to master end-to-end (E2E) architectures, speech synthesis, and self-supervised learning (SSL) methods. Through research, the focus is on critical analysis and on developing solutions for robustness, speech biometrics, and the extraction of paralinguistic information. Students develop the capacity to design innovative speech systems and produce high-quality scientific publications.
      The outcome Students will be able to design E2E and diffusion models for speech recognition and synthesis. They will master SSL techniques for feature extraction and for paralinguistic tasks in low-resource scenarios. They will develop the ability to create original solutions in speech biometrics and to prepare scientific papers that meet top international standards.
      Contents
      Contents of lectures Physiology and acoustics of speech, elements of linguistics, psychoacoustics, speech perception, and psycholinguistics. Theories and systems in speech synthesis and recognition. Methods of language and speaker recognition (biometric and forensic applications). Strategies in the design of human-computer dialogue. Specific applications of these technologies in multimodal communications.
      Contents of exercises Application of various software tools for speech signal processing, and further development of the acquired theoretical and practical knowledge through seminars and/or projects.
      Literature
      1. Jurafsky, D., & Martin, J. H. (2024). Speech and Language Processing (3rd Edition Draft). (Original title)
      2. Tan, X. (2022). Neural Speech Synthesis. Springer Nature. (Original title)
      3. Tan, X., Qin, T., Soong, F., & Liu, T. Y. (2021). A Survey on Neural Speech Synthesis. Microsoft Research Asia. (Published in IEEE Access/arXiv). (Original title)
      4. Li, J. (2020). Recent Advances in End-to-End Automatic Speech Recognition. Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SIP), Cambridge University Press. (Original title)
      5. Mohamed, A., Lee, H. Y., Borgholt, L., et al. (2022). Self-Supervised Speech Representation Learning: A Review. IEEE Journal of Selected Topics in Signal Processing. (Original title)
      Number of hours per week during the semester/trimester/year
      Lectures Exercises OTC Study and Research Other classes
      8
      Methods of teaching Consultations, seminar work and/or participation in projects.
      Knowledge score (maximum points 100)
      Pre-exam obligations        Points    Final exam          Points
      Activities during lectures  20        Test paper          0
      Practical lessons           0         Oral examination    30
      Projects
      Colloquia                   0
      Seminars                    50