SYNOPSIS SERIATION: A COMPUTER MUSIC PIECE MADE WITH TIME–FREQUENCY SCATTERING AND INFORMATION GEOMETRY

Vincent Lostanlen, LS2N, CNRS, Nantes, France
Florian Hecker, Edinburgh College of Art, The University of Edinburgh, Edinburgh, UK

ABSTRACT

This article presents Synopsis Seriation (2021), a musical work generated with the help of the computer. The central idea consists in reorganizing fragments of tracks from a pre-existing multichannel work so as to produce a stereo stream. We call "seriation" the search for the greatest timbral similarity between successive fragments within each channel, as well as between the left and right channels. Yet, since the number of permutations of a set is the factorial of its cardinality, the space of possible sequences is too vast to be explored directly by a human. Instead, we formalize seriation as an NP-complete optimization problem of the "traveling salesperson" type and present an evolutionary algorithm which yields an approximate solution to it. Within this framework, we define the timbre dissimilarity between two fragments by means of tools from wavelet analysis (time–frequency scattering) and from information geometry (Jensen–Shannon divergence). For this piece, we ran the seriation algorithm on a corpus of four works by Florian Hecker, including Formulation (2015). The record label Editions Mego, Vienna, released Synopsis Seriation on CD, together with a booklet of infographics on time–frequency scattering designed in partnership with the design studio NORM, Zurich.

1. INTRODUCTION

In mathematics, the seriation problem seeks to arrange the elements of a finite set U into a sequence u_1, ..., u_N in such a way that the distances d(u_i, u_j) are small if and only if |i − j| is also small [12]. Seriation bears a resemblance to the traveling salesperson problem (TSP), which aims to minimize the average distance d(u_i, u_{i+1}) between adjacent elements in the sequence.

Drawing inspiration from these mathematical ideas, the piece Synopsis Seriation (2021, see Figure 1) consists of a sequence of musical parts whose ordering in time reflects similarity in timbre. The set U corresponds to an unstructured collection of musical material: in our case, various pre-existing creations gathered under the name of Seriation Input. Seriation Input amounts to 283 minutes of audio in total, comprising hundreds of musical parts.

Figure 1. Album cover of Synopsis Seriation, released in March 2021 by Editions Mego, Vienna. The CD imprint represents the time–frequency scattering transform of the piece, which serves as a feature for its segmentation and structuring. Graphical design by NORM, Zurich. Website: https://editionsmego.com/release/EMEGO-256

The search space of all possible sequences is too vast to be explored manually. Indeed, the number of possible arrangements of U is equal to N! = N × (N − 1) × ... × 2. This number exceeds one million for N > 10 and one billion for N > 13. Coping with such a combinatorial explosion thus requires the help of the computer.

In this article, we describe the algorithmic workflow which has led to the synthesis of Synopsis Seriation. On a conceptual level, the workflow involves a virtual agent which "listens" to Synopsis Input, segments it into temporal parts, and ultimately rearranges those parts so as to maximize the auditory similarity between adjacent parts.
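To make the combinatorial stakes concrete, the short Python sketch below scores every possible ordering of a handful of segments by the sum of timbre dissimilarities between adjacent segments and keeps the best one. The dissimilarity matrix D is random and purely hypothetical; exhaustive search of this kind is only feasible for very small N, which is precisely why the piece relies on the evolutionary heuristic described in Section 4 rather than on brute force.

    import itertools
    import numpy as np

    # Hypothetical pairwise timbre dissimilarities between N = 6 segments.
    N = 6
    rng = np.random.default_rng(0)
    A = rng.random((N, N))
    D = (A + A.T) / 2            # symmetric dissimilarity matrix
    np.fill_diagonal(D, 0.0)     # a segment is identical to itself

    def path_cost(order):
        """Sum of dissimilarities between adjacent segments in the sequence."""
        return sum(D[i, j] for i, j in zip(order[:-1], order[1:]))

    # Brute force over all N! = 720 orderings. Around N = 13, the number of
    # orderings already exceeds six billion, hence the need for heuristics.
    best = min(itertools.permutations(range(N)), key=path_cost)
    print(best, path_cost(best))

An evolutionary solver explores the same search space, but by mutating and recombining candidate orderings instead of enumerating them all.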
One originality of our approach is that the virtual agent operates purely in the audio domain, without resorting to an external notation system such as MIDI or MusicXML. Furthermore, the agent does not assume that the input follows a traditional structure of repeated sections, such as verse-chorus or AABA forms. Lastly, the agent assigns each part of Seriation Input to either the left or the right channel of a stereophonic output by optimizing a joint objective of temporal consistency and binaural (left-right) consistency.

Figure 2. Flowchart of the computational stages involved in the synthesis of Synopsis Seriation: time–frequency scattering (Section 2), information geometry (Section 3), and evolutionary computing (Section 4). Time–frequency scattering is the acoustic frontend, information geometry performs sequential changepoint detection, and evolutionary computing solves a variant of the traveling salesperson problem (TSP). See the paragraph below for details.

Our proposed seriation procedure is akin to a family of digital audio effects known as concatenative synthesis [24]. Generally speaking, concatenative synthesis operates by assembling short audio segments taken from a large corpus so as to achieve a certain similarity objective. In this sense, our choice of audio descriptor (time–frequency scattering) and segmentation algorithm (generalized likelihood ratios) could potentially apply to real-time concatenative synthesis frameworks, such as CataRT [23]. However, we note that CataRT produces sounds according to a local target specification, expressed in terms of sound descriptors or via an example sound. By contrast, Synopsis Seriation does not rely on a predefined target; instead, it formulates a global problem of combinatorial optimization (the TSP) and arranges all segments of Synopsis Input accordingly. This formulation guarantees a one-to-one mapping between the audio material in Synopsis Input and Synopsis Seriation.

The flowchart in Figure 2 summarizes the technical components of Synopsis Seriation. To begin with, Section 2 presents the acoustic frontend of the virtual listening agent: namely, time–frequency scattering. Time–frequency scattering is an operator whose architecture resembles spectrotemporal receptive fields (STRF) in auditory neurophysiology and convolutional neural networks (convnets) in deep learning. Section 3 presents the algorithm which segments the Synopsis Input audio stream into parts. This algorithm is a numerical application of information geometry, a field of research at the intersection of statistical modeling and differential geometry. Section 4 presents the algorithm which rearranges the segmented parts and produces the stereophonic piece Synopsis Seriation. This algorithm is massively parallel and converges by evolutionary optimization. Section 5 presents the CD booklet of Synopsis Seriation, which contains computer-generated visualizations of time–frequency scattering as well as graphical design creations that summarize the functioning of the virtual listening agent. Lastly, Section 6 discusses the link between Synopsis Seriation and prior works on the spatiotemporal structuring of music, notably Iannis Xenakis's Diatope.
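The following sketch chains the three stages of Figure 2 on a synthetic signal. It is a deliberately naive stand-in, not the method of the paper: the frontend is a plain magnitude spectrogram rather than time–frequency scattering, the changepoint detector merely thresholds spectral flux rather than applying information geometry, and the reordering is a greedy nearest-neighbor pass rather than evolutionary optimization. All function names and parameters are assumptions made for illustration.

    import numpy as np

    def toy_frontend(x, frame=1024, hop=512):
        """Magnitude spectrogram, a crude stand-in for time-frequency scattering."""
        windows = [x[i:i + frame] * np.hanning(frame)
                   for i in range(0, len(x) - frame, hop)]
        return np.abs(np.fft.rfft(np.stack(windows), axis=1))

    def toy_changepoints(F):
        """Indices of frames whose spectral flux exceeds twice the mean flux."""
        flux = np.linalg.norm(np.diff(F, axis=0), axis=1)
        return np.flatnonzero(flux > 2 * flux.mean()) + 1

    def toy_seriation(segments):
        """Greedy nearest-neighbor ordering of segments by mean-spectrum distance."""
        centroids = [seg.mean(axis=0) for seg in segments]
        order, remaining = [0], set(range(1, len(segments)))
        while remaining:
            last = centroids[order[-1]]
            nxt = min(remaining, key=lambda j: np.linalg.norm(centroids[j] - last))
            order.append(nxt)
            remaining.remove(nxt)
        return order

    # Synthetic input: two sine tones followed by white noise, one second each.
    sr = 22050
    tones = [np.sin(2 * np.pi * f * np.arange(sr) / sr) for f in (440, 880)]
    noise = 0.5 * np.random.default_rng(0).normal(size=sr)
    x = np.concatenate(tones + [noise])

    F = toy_frontend(x)
    segments = np.split(F, toy_changepoints(F), axis=0)
    print(toy_seriation(segments))   # one permutation of the detected segments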
2. TIME–FREQUENCY SCATTERING

Time–frequency scattering comprises three stages. The first stage is a constant-Q transform (CQT) followed by pointwise complex modulus. The second stage is a convolutional operator in the time–frequency domain, with wavelets in time and in log-frequency, again followed by pointwise complex modulus. The third stage is a local averaging of every scattering coefficient over the time dimension.

Figure 3. Interference pattern between wavelets ψ_α(t) and ψ_β(log₂ λ) in the time–frequency domain (t, log₂ λ), for different combinations of amplitude modulation rate α and frequency modulation scale β. Darker shades of red (resp. blue) indicate higher positive (resp. lower negative) values of the real part.

2.1. Constant-Q wavelet transform

We build a filter bank of Morlet wavelets of center frequency λ > 0 and quality factor Q = 12 via the equation

\psi_\lambda(t) = \lambda \exp\!\left(-\frac{\lambda^2 t^2}{2Q^2}\right) \times \big(\exp(2\pi i \lambda t) - \kappa\big),   (1)

where the corrective term κ guarantees that each ψ_λ has one vanishing moment, i.e., a null average. We discretize the center frequency variable as λ = ξ 2^(−j/Q), where j is an integer and ξ is a constant. In this way, there are exactly Q wavelets per octave in the filter bank. To make sure that the filter bank covers the Fourier domain unitarily, the center frequency of the first wavelet (j = 0) should lie at the midpoint between the center frequency of the second wavelet (j = 1) and the center frequency of the complex conjugate of the first wavelet, hence:

\xi = \frac{1}{2}\left(2^{-1/Q}\,\xi + (f_s - \xi)\right) = \frac{f_s}{3 - 2^{-1/Q}},   (2)

where f_s denotes the sampling frequency. The CD standard f_s = 44.1 kHz yields ξ = 21 448 Hz. We set the number of wavelets to J = 96, hence a range of J/Q = 8 octaves below ξ. The minimum frequency is 2^(−8) ξ ≈ 84 Hz.

Let the asterisk symbol (∗) denote the convolution product. Given a signal x(t) of finite energy, we define its CQT as the following time–frequency representation:

U_1 x(t, \lambda) = \left| x \ast \psi_\lambda \right|(t) = \left| \int_{\mathbb{R}} x(\tau)\, \psi_\lambda(t - \tau)\, \mathrm{d}\tau \right|,   (3)

indexed by time t and wavelet center frequency λ.

2.2. Spectrotemporal receptive field

For the second layer of the joint time–frequency scattering transform, we define two wavelet filter banks: one over the time dimension and one over the log-frequency dimension. In both cases, we set the wavelet profile to Morlet (see Equation 1) and the quality factor to Q = 1. With a slight abuse of notation, we denote these wavelets by ψ_α(t) and ψ_β(log λ), even though they do not have the same shape as the wavelets ψ_λ(t) of the first layer, whose quality factor is equal to Q = 12. Frequencies α, hereafter called amplitude modulation rates, are measured in Hertz (Hz) and discretized as a geometric progression proportional to ξ and indexed by an integer n. Frequencies β, hereafter called frequency modulation scales, are measured in cycles per octave (c/o) and discretized, with both signs (±), as a geometric progression proportional to Q⁻¹ and indexed by an integer n. The edge case β = 0 corresponds
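As a companion to Equations (1)–(3), here is a minimal NumPy sketch of the first scattering stage only. It is an illustrative reimplementation under simplifying assumptions (fixed wavelet support, no subsampling, FFT-based "same" convolution), not the code used to produce the piece.

    import numpy as np

    fs = 44100                        # CD sampling rate (Hz)
    Q, J = 12, 96                     # quality factor and number of wavelets
    xi = fs / (3 - 2 ** (-1 / Q))     # Equation (2): approximately 21448 Hz

    def morlet(lam, support=2 ** 16):
        """Sampled Morlet wavelet of center frequency `lam` in Hz, after Equation (1).
        The corrective term kappa enforces a zero average; it is numerically
        negligible for Q = 12 but kept for completeness."""
        t = (np.arange(support) - support // 2) / fs
        kappa = np.exp(-2 * np.pi ** 2 * Q ** 2)
        return (lam * np.exp(-lam ** 2 * t ** 2 / (2 * Q ** 2))
                * (np.exp(2j * np.pi * lam * t) - kappa))

    def convolve_same(x, h):
        """FFT-based convolution, trimmed to the length of x."""
        n = len(x) + len(h) - 1
        y = np.fft.ifft(np.fft.fft(x, n) * np.fft.fft(h, n))
        start = (len(h) - 1) // 2
        return y[start:start + len(x)]

    def scattering_first_layer(x):
        """U1: modulus of the wavelet convolutions of Equation (3), one row per wavelet."""
        return np.stack([np.abs(convolve_same(x, morlet(xi * 2 ** (-j / Q))))
                         for j in range(J)])

    # Example: first-layer coefficients of a 0.25-second linear chirp.
    t = np.arange(int(0.25 * fs)) / fs
    x = np.sin(2 * np.pi * (200 * t + 4000 * t ** 2))   # 200 Hz up to about 2200 Hz
    U1 = scattering_first_layer(x)
    print(U1.shape)                                      # (96, 11025)

The second and third stages, omitted here, would convolve the rows and columns of U1 with the modulation wavelets ψ_α and ψ_β of Section 2.2 and then average over time.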