TartanAviation: Image, Speech, and Trajectory Datasets for Terminal Airspace Operations
We introduce TartanAviation, an open-source multi-modal dataset focused on terminal-area airspace operations. TartanAviation provides a holistic view of the airport environment by concurrently collecting image, speech, and ADS-B trajectory data using setups installed inside airport boundaries. The datasets were collected at both towered and non-towered airfields across multiple months to capture diversity in aircraft operations, seasons, aircraft types, and weather conditions. In total, TartanAviation provides 3.1M images, 3374 hours of Air Traffic Control speech data, and 661 days of ADS-B trajectory data. The data was filtered, processed, and validated to create a curated dataset. In addition to the dataset, we also open-source the code-base used to collect and pre-process the dataset, further enhancing accessibility and usability. We believe this dataset has many potential use cases and would be particularly valuable for integrating AI and machine learning technologies into air traffic control systems and for advancing the adoption of autonomous aircraft in the airspace.
Dataset
In this work, we introduce TartanAviation, a multi-modal dataset collected at towered and non-towered terminal areas within the US. The TartanAviation dataset covers three primary modalities of data, all collected concurrently: trajectory positions that capture the spatial and temporal information of aircraft movements, video flight sequences collected with static cameras installed within terminal areas, and audio communications that document the voice interactions between pilots and air traffic controllers. While prior datasets in the aviation domain have focused on specific modalities like speech or vision, TartanAviation aims to provide a more holistic view of terminal airspace operations across various data modalities. Additionally, while previous datasets focus on large commercial airports, TartanAviation focuses on smaller regional airports within the Greater Pittsburgh area. Regional airports serve a wide range of aircraft and mission profiles, providing a richer and more diverse data stream. We specifically focus on two airports: the towered Allegheny County Airport (ICAO:KAGC) and the non-towered Pittsburgh-Butler Regional Airport (ICAO:KBTP).
Download
The scripts to record, post-process, and download each modality are publicly available at https://github.com/castacks/TartanAviation.git.
Data Structure
Image Data
The image dataset is split across 550 independent sequences. We define a sequence as all of the data recorded during a single event where the camera recordings were started and stopped. The vision data folder contains multiple zipped files, each associated with a particular camera recording for that sequence. Each zipped sequence folder has multiple files as presented in the table below.
| Extension | Nomenclature | Content |
|---|---|---|
| .zip | <camera_id>_<timestamp> | Zip folder containing sequence data |
| →.mp4 | <camera_id>_<timestamp> | Video File |
| →.avi | <camera_id>_<timestamp>_sink_verified | Video File with embedded labels |
| →.srt | <camera_id>_<timestamp>_subtitle | Raw timestamps of the recorded video |
| →.pkl | <camera_id>_<timestamp>_sink_adsb | Raw ADS-B dictionary |
| →.pkl | <camera_id>_<timestamp>_acft_sink | Raw aircraft type data |
| →.zip | <camera_id>_<timestamp>_labels | Zip folder containing the image labels |
| →→.label | <frame_number>.label | Text file containing label data |
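As a minimal sketch of how the label archive in the table could be read, the helper below collects every `.label` entry from a `_labels` zip into a dictionary keyed by the frame-number part of the file name. The function name `read_labels` is hypothetical, and the UTF-8 decoding is an assumption; the internal format of each label file is not described here, so the text is returned unparsed.

```python
import zipfile
from pathlib import PurePosixPath


def read_labels(labels_zip):
    """Return {frame_name: raw_label_text} for every .label entry in the archive.

    labels_zip may be a path or a binary file-like object.
    Assumption: label files are UTF-8 text; their contents are left unparsed.
    """
    labels = {}
    with zipfile.ZipFile(labels_zip) as archive:
        for name in archive.namelist():
            entry = PurePosixPath(name)
            if entry.suffix == ".label":
                # Key on the file stem, i.e. the <frame_number> part of the name.
                labels[entry.stem] = archive.read(name).decode("utf-8")
    return labels
```

The keys are kept as strings rather than converted to integers, since any zero-padding of frame numbers is not specified.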
In addition to the video files and labels, we also provide ADS-B data for each sequence. When downloaded with the provided script, the files are organized as follows:
image_data
└── 1_2023-02-22-15-21-49
    ├── 1_2023-02-22-15-21-49_sink
    │   └── ...
    ├── 1_2023-02-22-15-21-49.mp4
    ├── 1_2023-02-22-15-21-49_labels.zip
    ├── 1_2023-02-22-15-21-49_acft_sink.pkl
    ├── 1_2023-02-22-15-21-49_sink_adsb.pkl
    ├── 1_2023-02-22-15-21-49_sink_verified.avi
    ├── 1_2023-02-22-15-21-49_subtitle.srt
    └── ...
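The naming convention above makes it possible to derive every file path from the sequence folder name alone. The following is a rough sketch under that assumption; `SUFFIXES`, `sequence_paths`, and `load_pickle` are names invented for this example and are not part of the TartanAviation code-base. The structure of the unpickled objects is not documented here, so they are returned as-is.

```python
import pickle
from pathlib import Path

# Suffixes copied from the file listing; the dictionary keys are hypothetical names.
SUFFIXES = {
    "video": ".mp4",
    "labels": "_labels.zip",
    "aircraft_types": "_acft_sink.pkl",
    "adsb": "_sink_adsb.pkl",
    "verified_video": "_sink_verified.avi",
    "timestamps": "_subtitle.srt",
}


def sequence_paths(sequence_dir):
    """Map each file kind to its expected path inside one sequence folder."""
    sequence_dir = Path(sequence_dir)
    stem = sequence_dir.name  # e.g. "1_2023-02-22-15-21-49"
    return {kind: sequence_dir / f"{stem}{suffix}" for kind, suffix in SUFFIXES.items()}


def load_pickle(path):
    """Load one .pkl file. Only unpickle files obtained from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

For example, `load_pickle(sequence_paths("image_data/1_2023-02-22-15-21-49")["adsb"])` would return the raw ADS-B dictionary for that sequence, provided the folder has been downloaded.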
Trajectory and Weather Data
TartanAviation provides both raw and processed data for each airport. Raw data is separated into individual folders for each day of collection. Each raw data folder has CSVs with fields detailed in Table \ref{tab:adsb}. The processed files are available as comma-separated TXT files with fields described in Table \ref{tab:adsb}.
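A small sketch of how the per-day raw folders might be traversed is shown below. It assumes only what is stated above: one folder per collection day, each holding CSV files. The function names are hypothetical, and because the field names are not listed in this section, rows are returned as plain lists of strings with no assumption about whether the first row is a header.

```python
import csv
from pathlib import Path


def read_rows(path):
    """Read one comma-separated file into a list of non-empty rows."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle) if row]


def rows_by_day(raw_root):
    """Map each per-day folder name to the combined rows of its CSV files."""
    result = {}
    day_dirs = sorted(p for p in Path(raw_root).iterdir() if p.is_dir())
    for day_dir in day_dirs:
        rows = []
        for csv_path in sorted(day_dir.glob("*.csv")):
            rows.extend(read_rows(csv_path))
        result[day_dir.name] = rows
    return result
```

Since the processed files are also comma-separated, `read_rows` should work on them unchanged.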
Speech Data
Both the raw and filtered audio files are included in the dataset. The filtered data is organized in a directory structure by location, year, month, and day. Each day is individually zipped and contains audio files in WAV format, each with an accompanying text file that contains the start, end, and total time of the audio clip. When downloaded with the provided script, the files are organized as follows:
audio_data
├── kbtp
│   ├── 2020
│   │   ├── 01
│   │   ├── ...
│   │   └── 12
│   │       ├── 1.wav
│   │       ├── 1.txt
│   │       └── ...
│   ├── ...
│   └── 2022
└── kagc
    ├── 2021
    ├── ...
    └── 2023
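To illustrate how the paired `.wav` and `.txt` files could be consumed, here is a minimal sketch that matches each clip with its text file and measures its duration using Python's standard `wave` module. The function names are hypothetical, and the code assumes the clips are uncompressed PCM WAV files, which is what the `wave` module can read. The layout of the text file is not specified here, so it is left unparsed.

```python
import wave
from pathlib import Path


def paired_clips(clip_dir):
    """Return (wav_path, txt_path) pairs for clips that have both files."""
    pairs = []
    for wav_path in sorted(Path(clip_dir).glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if txt_path.exists():
            pairs.append((wav_path, txt_path))
    return pairs


def clip_seconds(wav_path):
    """Duration of a PCM WAV file in seconds (frame count / sample rate)."""
    with wave.open(str(wav_path), "rb") as clip:
        return clip.getnframes() / clip.getframerate()
```

Note that `sorted` orders the paths lexicographically, so `10.wav` sorts before `2.wav`; sort on the numeric stem instead if clip order matters.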
Data Collection and Processing
More details about the data collection and processing can be found in our paper.
Additional Info
Citation
@article{patrikar2024tartanaviation,
title={TartanAviation: Image, Speech, and ADS-B Trajectory Datasets for Terminal Airspace Operations},
author={Jay Patrikar and Joao Dantas and Brady Moon and Milad Hamidi and Sourish Ghosh and Nikhil Keetha and Ian Higgins and Atharva Chandak and Takashi Yoneyama and Sebastian Scherer},
year={2024},
eprint={2403.03372},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/pdf/2403.03372.pdf}
}
Contact
Joao P. A. Dantas - jpdantas@gmail.com
Brady Moon - bradygmoon@cmu.edu
Acknowledgments
This work is supported by Mitsubishi Heavy Industries (MHI) project #A025279. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE1745016.