- Speaker recognition: verification vs identification
- Speaker embeddings: i-vectors, d-vectors, x-vectors (TDNN-based), ECAPA-TDNN
- Speaker diarisation ("who spoke when"): clustering-based approaches, end-to-end neural diarisation (EEND)
- Audio classification: environmental sounds, music genre, audio tagging
- Audio event detection: Sound Event Detection (SED), AudioSet
- Acoustic scene classification
- Audio embeddings: VGGish, PANNs, Audio Spectrogram Transformer (AST)
- Music information retrieval: beat tracking, chord recognition, source separation basics
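
The verification vs identification distinction in the first item can be illustrated with a minimal sketch. It assumes speaker embeddings have already been extracted by one of the models listed above and are compared with cosine similarity, which is a common but not the only scoring choice. The function names, the toy 3-dimensional vectors, and the 0.7 threshold are all hypothetical and chosen only for illustration; real embeddings typically have hundreds of dimensions, and thresholds are tuned on held-out data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(test_emb, claimed_emb, threshold=0.7):
    """Verification (1:1): accept or reject a claimed identity.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return cosine(test_emb, claimed_emb) >= threshold


def identify(test_emb, enrolled):
    """Closed-set identification (1:N): pick the best-scoring enrolled speaker."""
    return max(enrolled, key=lambda name: cosine(test_emb, enrolled[name]))


# Toy embeddings, made up for this example.
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.95],
}
test = [0.8, 0.2, 0.1]

best_match = identify(test, enrolled)             # 1:N decision
accepted_as_bob = verify(test, enrolled["bob"])   # 1:1 decision
```

Verification answers a yes/no question about one claimed speaker, so it needs a decision threshold. Closed-set identification ranks all enrolled speakers and needs no threshold, although an open-set variant would add one to reject unknown speakers.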