Schedule as of Oct 11, 2022 - subject to change

Default Time Zone is EDT - Eastern Daylight Time

Thursday, October 27 • 12:15pm - 12:30pm
1D Convolutional Layers to Create Frequency-Based Spectral Features for Audio Networks

Time-frequency transformations and spectral representations of audio signals are commonly used in machine learning applications. Training networks on frequency features such as the Mel-Spectrogram or Chromagram has proven more effective and convenient than training on time samples. In practical realizations, these features are created on a different processor and/or pre-computed and stored on disk, which requires additional effort and makes it difficult to experiment with various combinations. In this paper, we provide a PyTorch framework for creating spectral features and time-frequency transformations using the built-in trainable conv1d() layer. This allows the features to be computed on the fly as part of a larger network and makes it easier to experiment with various parameters. Our work extends existing work in the literature in two ways: first, by adding more of these features; second, by allowing training either from initialized kernels or from random values that converge to the desired solution. The code is written as a template of classes and scripts that users may integrate into their own PyTorch classes for various applications.
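
To make the idea in the abstract concrete, here is a minimal sketch of a magnitude spectrogram computed by a trainable 1D convolution. It is not the authors' code: the class name ConvSTFT and the parameters n_fft, hop, and init_fourier are hypothetical, chosen only for illustration. It assumes a Hann window and no padding, with kernels either set to windowed Fourier basis functions or left at PyTorch's default random initialization.

```python
import math
import torch
import torch.nn as nn


class ConvSTFT(nn.Module):
    """Hypothetical sketch: magnitude spectrogram from a trainable Conv1d."""

    def __init__(self, n_fft=64, hop=16, init_fourier=True):
        super().__init__()
        self.n_bins = n_fft // 2 + 1
        # One output channel per cosine (real) and sine (imaginary) kernel.
        self.conv = nn.Conv1d(1, 2 * self.n_bins, kernel_size=n_fft,
                              stride=hop, bias=False)
        if init_fourier:
            n = torch.arange(n_fft, dtype=torch.float32)
            k = torch.arange(self.n_bins, dtype=torch.float32).unsqueeze(1)
            angle = 2 * math.pi * k * n / n_fft            # (n_bins, n_fft)
            window = torch.hann_window(n_fft, periodic=True)
            kernels = torch.cat([torch.cos(angle), -torch.sin(angle)]) * window
            with torch.no_grad():
                # Conv1d weights have shape (out_channels, in_channels, kernel).
                self.conv.weight.copy_(kernels.unsqueeze(1))
        # With init_fourier=False the kernels keep their random initial values
        # and would have to be learned during training.

    def forward(self, x):
        # x: (batch, samples) -> (batch, n_bins, frames)
        y = self.conv(x.unsqueeze(1))
        real, imag = y[:, :self.n_bins], y[:, self.n_bins:]
        # Small constant keeps the gradient of sqrt finite at zero.
        return torch.sqrt(real ** 2 + imag ** 2 + 1e-12)
```

Because the kernels are ordinary Conv1d weights, a layer like this can sit at the front of a larger network and be frozen or fine-tuned like any other layer. Passing init_fourier=False corresponds to the random-start option mentioned in the abstract, under the assumption that a suitable training objective drives the kernels toward the desired transform.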

Elias Nemer

Audio Engineer
Elias Nemer is involved in various aspects of audio signal processing related to AR and VR. He is currently an audio engineer at Meta Platforms. He previously worked at Cirrus Logic in areas related to ML for audio classification. At Broadcom, he developed audio algorithms for mobile...

Thursday October 27, 2022 12:15pm - 12:30pm EDT
Online Papers
  Applications in Audio
  • badge type: ALL ACCESS or ONLINE