
CallMark: A Web-based User Interface for Vocal Annotations
Introduction

The functionality of CallMark, our custom web application for annotating vocalizations, is inspired by existing tools such as Whombat (Balvanera et al., 2023), NEAL (Gibbons et al., 2022), Pyrenote (Perry et al., 2021b), Arbimon (Aide et al., 2013), AvianZ (Marsland et al., 2019), Koe (Fukuzawa et al., 2020), Sonic Visualiser (Cannam et al., 2010), Kaleidoscope, Label Studio, Raven, Praat (Boersma and Weenink, 2003), Audacity, and Adobe Audition. Compared with these tools, CallMark combines a set of features that we consider important for optimizing the precision of vocal annotations and for reducing existing annotation biases between species and research fields.
For example, unlike existing platforms, CallMark supports a hierarchical labeling system that lets users annotate at several levels, including species, individual, and call type. Assistive features such as draggable lines that mark vocalization onsets and offsets, as well as the frequency range, enable precise spectro-temporal annotations. Users can zoom in to the millisecond scale for fine-grained inspection of the audio. Functions for adding, removing, and editing labels keep the label set flexible as a project grows.
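The hierarchical labeling described above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions: the class `Annotation`, its field names, and the helper `group_by_hierarchy` are hypothetical and are not taken from CallMark's code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """One labeled vocalization (all field names are hypothetical)."""
    onset_s: float                        # start time in seconds
    offset_s: float                       # end time in seconds
    species: str                          # top level of the label hierarchy
    individual: str                       # middle level
    call_type: str                        # lowest level
    min_freq_hz: Optional[float] = None   # optional lower frequency bound
    max_freq_hz: Optional[float] = None   # optional upper frequency bound

    def __post_init__(self):
        if self.offset_s <= self.onset_s:
            raise ValueError("offset must come after onset")

    @property
    def duration_ms(self) -> float:
        return (self.offset_s - self.onset_s) * 1000.0


def group_by_hierarchy(annotations):
    """Nest annotations as species -> individual -> call type -> list."""
    tree = {}
    for a in annotations:
        (tree.setdefault(a.species, {})
             .setdefault(a.individual, {})
             .setdefault(a.call_type, [])
             .append(a))
    return tree


# Illustrative data only; the labels and times are made up.
annotations = [
    Annotation(0.120, 0.185, "zebra finch", "bird_01", "distance call", 500.0, 8000.0),
    Annotation(0.300, 0.342, "zebra finch", "bird_01", "tet"),
    Annotation(1.050, 1.400, "marmoset", "m_07", "phee"),
]
tree = group_by_hierarchy(annotations)
print(sorted(tree))
```

Keeping the three label levels as separate fields, rather than one combined string, makes it straightforward to filter or regroup annotations at any level of the hierarchy.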
CallMark offers customizable spectrogram display options, including log-mel and constant-Q displays, with adjustable parameters such as hop length, number of spectrogram columns, sampling rate, and frequency range. Constant-Q spectrograms, for example, minimize temporal blur at high frequencies, and their contrast and brightness can be adjusted.
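The reduced temporal blur at high frequencies follows from how a constant-Q transform sizes its analysis windows. As a rough illustration, the sketch below uses the textbook relation N_k = Q · sr / f_k with Q = 1 / (2^(1/b) − 1), where b is the number of bins per octave. The function name and parameter values are assumptions chosen for this example; how CallMark computes its constant-Q spectrograms is not specified here.

```python
import math


def constant_q_window_lengths(sr, fmin, n_bins, bins_per_octave=12):
    """Return (center_frequency_hz, window_length_samples) for each bin.

    Uses the textbook constant-Q relation N_k = Q * sr / f_k, with
    Q = 1 / (2 ** (1 / bins_per_octave) - 1).
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bins = []
    for k in range(n_bins):
        f_k = fmin * 2.0 ** (k / bins_per_octave)
        if f_k >= sr / 2:
            break  # stop at the Nyquist frequency
        bins.append((f_k, math.ceil(q * sr / f_k)))
    return bins


# Hypothetical settings: 44.1 kHz audio, four octaves starting at 500 Hz.
sr = 44100
bins = constant_q_window_lengths(sr=sr, fmin=500.0, n_bins=49)
for f_k, n_k in (bins[0], bins[-1]):
    print(f"{f_k:8.1f} Hz -> {n_k} samples ({1000 * n_k / sr:.1f} ms)")
```

Because the window length is inversely proportional to the center frequency, the highest bin in this example uses a window about 16 times shorter than the lowest bin, whereas a fixed-window spectrogram applies the same window length at every frequency.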
CallMark handles multi-channel annotation, enabling analysis across several data sources simultaneously, a feature absent in most other tools. Additionally, it supports audio playback within the browser and the import and export of CSV files for easy data management.
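To make the CSV workflow concrete, here is a minimal round-trip sketch using Python's standard `csv` module. The column names (`onset_s`, `offset_s`, `species`, `individual`, `call_type`, `channel`) are assumptions made for illustration; the actual CallMark CSV schema may differ.

```python
import csv
import io

# Hypothetical column names; the actual CallMark CSV schema may differ.
FIELDS = ["onset_s", "offset_s", "species", "individual", "call_type", "channel"]
NUMERIC = {"onset_s": float, "offset_s": float, "channel": int}


def export_annotations(rows, fh):
    """Write annotation dicts to an open text file handle as CSV."""
    writer = csv.DictWriter(fh, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def import_annotations(fh):
    """Read annotation dicts back, restoring numeric column types."""
    rows = []
    for row in csv.DictReader(fh):
        for key, cast in NUMERIC.items():
            row[key] = cast(row[key])
        rows.append(row)
    return rows


# Illustrative data only: two calls on two different audio channels.
rows = [
    {"onset_s": 0.12, "offset_s": 0.185, "species": "zebra finch",
     "individual": "bird_01", "call_type": "distance call", "channel": 0},
    {"onset_s": 0.3, "offset_s": 0.342, "species": "zebra finch",
     "individual": "bird_02", "call_type": "tet", "channel": 1},
]
buffer = io.StringIO(newline="")  # in-memory stand-in for a CSV file
export_annotations(rows, buffer)
buffer.seek(0)
restored = import_annotations(buffer)
print(restored == rows)
```

Storing the channel index in its own column is one simple way to keep annotations from several channels in a single file while still allowing per-channel filtering.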
CallMark seamlessly integrates a machine learning pipeline for vocal activity detection.
Feature Comparison
The table below provides a detailed comparison of CallMark with other popular annotation software.
Features | CallMark (ours) | Whombat | NEAL | Pyrenote | Arbimon | AvianZ | Koe | Sonic Visualiser | Kaleidoscope | Label Studio | Raven | Praat | Audacity | Adobe Audition |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Open-source | ✓ | ✓ | ✓ | ✓ | − | ✓ | ✓ | ✓ | − | ✓ | − | ✓ | ✓ | − |
Browser-based | ✓ | ✓ | ✓ | ✓ | ✓ | − | ✓ | − | − | ✓ | − | − | − | − |
Temporal Annotation | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Frequency Annotation | ✓ | ✓ | ✓ | − | − | ✓ | − | ✓ | − | − | ✓ | ✓ | ✓ | − |
Spectrogram Interface | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | − | ✓ | ✓ | ✓ | ✓ |
Adjustable Spectrogram Computation (Mel, Constant-Q) | ✓ | − | − | − | − | ✓ | − | ✓ | − | − | ✓ | ✓ | ✓ | ✓ |
Adjustable Spectrogram Visualization (Contrast, Brightness) | ✓ | − | ✓ | − | − | ✓ | ✓ | ✓ | ✓ | − | ✓ | ✓ | ✓ | ✓ |
Multi-channel-aligned Annotation | ✓ | − | − | − | − | − | − | − | − | − | ✓ | − | − | − |
Flexible Labeling System (add/remove/edit labels) | ✓ | ✓ | ✓ | − | ✓ | ✓ | ✓ | ✓ | ✓ | − | ✓ | ✓ | ✓ | ✓ |
Integrated Vocal Activity Detector | ✓ | − | − | − | ✓ | ✓ | ✓ | − | − | − | ✓ | − | − | − |
In-App Training | ✓ | − | − | − | − | ✓ | − | − | − | − | − | − | − | − |
Export Annotations | ✓ | ✓ | ✓ | − | − | − | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |