
CallMark: A Web-based User Interface for Vocal Annotations

Introduction

Figure 1: CallMark Graphical User Interface (GUI). Labeled panels: (a) Species/Individual identification, (b) Track management controls, (c) Spectrogram parameter adjustment, (d) Audio waveform and spectrogram display, (e) Label panel for annotations, (f) Global configuration settings, (g) Label editing options, and (h.1-3) Navigation icons and functionality controls.

CallMark, our custom web application for annotating vocalizations, is inspired by existing tools such as Whombat (Balvanera et al., 2023), NEAL (Gibbons et al., 2022), Pyrenote (Perry et al., 2021b), Arbimon (Aide et al., 2013), AvianZ (Marsland et al., 2019), Koe (Fukuzawa et al., 2020), Sonic Visualiser (Cannam et al., 2010), Kaleidoscope, Label Studio, Raven, Praat (Boersma and Weenink, 2003), Audacity, and Adobe Audition. Compared with these tools, CallMark combines a unique set of features that we consider important for optimizing the precision of vocal annotations and for reducing existing annotation biases across species and research fields.

For example, unlike existing platforms, CallMark supports a hierarchical labeling system that lets users annotate at several levels, including species, individual, and call type. Assistive features, such as draggable lines that mark the onset, offset, and frequency range of a vocalization, enable precise spectro-temporal annotations. Users can zoom in to the millisecond scale for precise audio analysis. Flexible functions for adding, removing, and editing labels enhance productivity and scalability.
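To make the hierarchical labels and spectro-temporal bounds concrete, here is a minimal Python sketch of what a single annotation record could look like. The class name, field names, units, and example values are assumptions chosen for illustration; they are not CallMark's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One labeled vocalization (hypothetical schema, not CallMark's own)."""
    species: str        # top level of the label hierarchy
    individual: str     # second level
    call_type: str      # third level
    onset_s: float      # start of the vocalization, in seconds
    offset_s: float     # end of the vocalization, in seconds
    min_freq_hz: float  # lower edge of the frequency range
    max_freq_hz: float  # upper edge of the frequency range

    def __post_init__(self):
        # Reject boxes with zero or negative extent in time or frequency.
        if self.offset_s <= self.onset_s:
            raise ValueError("offset must come after onset")
        if self.max_freq_hz <= self.min_freq_hz:
            raise ValueError("max frequency must exceed min frequency")

    @property
    def duration_ms(self) -> float:
        """Length of the annotated segment in milliseconds."""
        return (self.offset_s - self.onset_s) * 1000.0

    @property
    def label_path(self) -> str:
        """Hierarchical label written as species/individual/call type."""
        return f"{self.species}/{self.individual}/{self.call_type}"


# Hypothetical example values.
example = Annotation("marmoset", "M1", "phee", 1.250, 1.875, 6000.0, 9000.0)
print(example.label_path, example.duration_ms)
```

Keeping the three label levels as separate fields, rather than as one combined string, makes it straightforward to filter or group annotations at any level of the hierarchy.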

CallMark offers customizable spectrogram display options, including log-mel and constant-Q displays, with adjustable parameters such as hop length, number of spectrogram columns, sampling rate, and frequency range. For example, constant-Q spectrograms, whose contrast and brightness can be adjusted, minimize temporal blur at high frequencies.
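Hop length, number of spectrogram columns, and sampling rate are linked, which the following sketch illustrates. It assumes the common centered-frame convention, in which one frame is centered on every hop-th sample; CallMark's own framing may differ, and the function names are hypothetical.

```python
def spectrogram_columns(n_samples: int, hop_length: int) -> int:
    """Number of spectrogram columns under centered framing (assumed convention)."""
    return 1 + n_samples // hop_length


def hop_for_columns(n_samples: int, n_columns: int) -> int:
    """Hop length (in samples) that yields roughly n_columns columns."""
    return max(1, n_samples // max(1, n_columns - 1))


def column_duration_ms(hop_length: int, sampling_rate: int) -> float:
    """Time step between adjacent spectrogram columns, in milliseconds."""
    return 1000.0 * hop_length / sampling_rate


# Example: a 10-second clip sampled at 48 kHz with a hop of 480 samples.
sampling_rate = 48_000
n_samples = 10 * sampling_rate

print(spectrogram_columns(n_samples, 480))     # 1001 columns
print(column_duration_ms(480, sampling_rate))  # 10.0 ms per column
print(hop_for_columns(n_samples, 1001))        # 480 samples
```

A smaller hop length gives finer time steps at the cost of more columns to compute and render, which is the trade-off these parameters expose.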

CallMark handles multi-channel annotation, enabling simultaneous analysis across diverse data sources, a feature absent in most other tools. It also supports audio playback within the browser and the import and export of CSV files for easy data management.
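As an illustration of CSV-based annotation exchange, the sketch below writes a few annotation rows to CSV and reads them back using only the Python standard library. The column names and example rows are assumptions made for this example; CallMark's actual CSV layout may differ.

```python
import csv
import io

# Hypothetical column layout, not necessarily CallMark's export format.
FIELDS = ["channel", "species", "individual", "call_type",
          "onset_s", "offset_s", "min_freq_hz", "max_freq_hz"]

# How to parse each numeric column when reading the file back.
NUMERIC = {"channel": int, "onset_s": float, "offset_s": float,
           "min_freq_hz": float, "max_freq_hz": float}


def export_annotations(rows, handle):
    """Write annotation dicts to an open text handle as CSV with a header."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def import_annotations(handle):
    """Read annotation dicts from CSV, converting numeric columns."""
    return [
        {key: NUMERIC.get(key, str)(value) for key, value in row.items()}
        for row in csv.DictReader(handle)
    ]


# Hypothetical example rows on two channels.
rows = [
    {"channel": 0, "species": "marmoset", "individual": "M1",
     "call_type": "phee", "onset_s": 1.25, "offset_s": 1.875,
     "min_freq_hz": 6000.0, "max_freq_hz": 9000.0},
    {"channel": 1, "species": "marmoset", "individual": "M2",
     "call_type": "trill", "onset_s": 2.5, "offset_s": 2.75,
     "min_freq_hz": 5000.0, "max_freq_hz": 8000.0},
]

buffer = io.StringIO(newline="")  # stands in for a file on disk
export_annotations(rows, buffer)
buffer.seek(0)
restored = import_annotations(buffer)
print(restored == rows)
```

Storing the channel index alongside each row keeps annotations from different channels in one file while still allowing them to be separated on import.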

CallMark seamlessly integrates a machine learning pipeline for vocal activity detection.

Feature Comparison

Table 1 compares CallMark's features with those of other popular annotation software.

Table 1: Comparison of CallMark with Other Annotation Software
Features CallMark (ours) Whombat NEAL Pyrenote Arbimon AvianZ Koe Sonic Visualiser Kaleidoscope Label Studio Raven Praat Audacity Adobe Audition
Open-source
Browser-based
Temporal Annotation
Frequency Annotation
Spectrogram Interface
Adjustable Spectrogram Computation (Mel, Constant-Q)
Adjustable Spectrogram Visualization (Contrast, Brightness)
Multi-channel-aligned Annotation
Flexible Labeling System (add/remove/edit labels)
Integrated Vocal Activity Detector
In-App Training
Export Annotations

A General Guide to Using CallMark

Try CallMark

Try using CallMark to annotate an example recording: Link Here