Convolutional operators in the time-frequency domain
DOCTORAL THESIS
of PSL Research University (Université de recherche Paris Sciences et Lettres)
Prepared at École normale supérieure

Doctoral school no. […]: Sciences Mathématiques de Paris Centre
Specialty: Computer Science

Defended by Vincent Lostanlen on […] February […]
Supervised by Stéphane Mallat

JURY
Geoffroy Peeters, STMS, Ircam, Université Pierre et Marie Curie, CNRS (reviewer)
Gaël Richard, LTCI, TELECOM ParisTech, Université Paris-Saclay, CNRS (reviewer)
Hervé Glotin, LSIS, AMU, Université de Toulon, ENSAM, CNRS (jury president)
Mathieu Lagrange, IRCCyN, École centrale de Nantes, CNRS (jury member)
Shihab Shamma, LSP, École normale supérieure, CNRS (jury member)

CONVOLUTIONAL OPERATORS IN THE TIME-FREQUENCY DOMAIN
Vincent Lostanlen
Département d'informatique, École normale supérieure

In memoriam Jean-Claude Risset

ABSTRACT

In the realm of machine listening, audio classification is the problem of automatically retrieving the source of a sound according to a predefined taxonomy. This dissertation addresses audio classification by designing signal representations which satisfy appropriate invariants while preserving inter-class variability.

First, we study time-frequency scattering, a representation which extracts modulations at various scales and rates, in a similar way to idealized models of spectrotemporal receptive fields in auditory neuroscience. We report state-of-the-art results in the classification of urban and environmental sounds, thus outperforming short-term audio descriptors and deep convolutional networks.

Secondly, we introduce spiral scattering, a representation which combines wavelet convolutions along time, along log-frequency, and across octaves, thus following the geometry of the Shepard pitch spiral, which makes one full turn at every octave. We study voiced sounds as a nonstationary source-filter model, where both the source and the filter are transposed in frequency through time, and show that spiral scattering disentangles and linearizes these transpositions. In practice, spiral scattering reaches state-of-the-art results in musical instrument classification of solo recordings.

Aside from audio classification, time-frequency scattering and spiral scattering can be used as summary statistics for audio texture synthesis. We find that, unlike the previously existing temporal scattering transform, time-frequency scattering is able to capture the coherence of spectrotemporal patterns, such as those arising in bioacoustics or speech, up to a scale of about […] ms. Based on this analysis-synthesis framework, an artistic collaboration with composer Florian Hecker has led to the creation of five computer music pieces.

PUBLICATIONS

Lostanlen, V. and S. Mallat. "Transformée en scattering sur la spirale temps-chroma-octave". In: Actes du GRETSI.

Andén, J., V. Lostanlen, and S. Mallat. "Joint Time-frequency Scattering for Audio Classification". In: Proceedings of the IEEE Conference on Machine Learning for Signal Processing (MLSP). Received a Best Paper award.

Lostanlen, V. and S. Mallat. "Wavelet Scattering on the Pitch Spiral". In: Proceedings of the International Conference on Digital Audio Effects (DAF-x).

Lostanlen, V. and C. Cella. "Deep Convolutional Networks on the Shepard Pitch Spiral