Published: Jan 25, 2024

TartanAviation: Image, Speech, and Trajectory Datasets for Terminal Airspace Operations

We introduce TartanAviation, an open-source multi-modal dataset focused on terminal-area airspace operations. TartanAviation provides a holistic view of the airport environment by concurrently collecting image, speech, and ADS-B trajectory data using setups installed inside airport boundaries. The datasets were collected at both towered and non-towered airfields across multiple months to capture diversity in aircraft operations, seasons, aircraft types, and weather conditions. In total, TartanAviation provides 3.1M images, 3374 hours of Air Traffic Control speech data, and 661 days of ADS-B trajectory data. The data was filtered, processed, and validated to create a curated dataset. In addition to the dataset, we also open-source the codebase used to collect and pre-process the dataset, further enhancing accessibility and usability. We believe this dataset has many potential use cases and would be particularly valuable in allowing AI and machine learning technologies to be integrated into air traffic control systems and in advancing the adoption of autonomous aircraft in the airspace.

Figure 1: Our custom data collection setup installed at the Allegheny County Airport with its approximate location within the airport premises with respect to the runway geometry. The setup recorded images, audio, and aircraft trajectory data.

Dataset

In this work, we introduce TartanAviation, a multi-modal dataset collected at towered and non-towered terminal areas within the US. The TartanAviation dataset covers three primary, concurrently collected modalities of data: trajectory positions for capturing the spatial and temporal information of aircraft movements, video flight sequences collected with static cameras installed within terminal areas, and audio communications to document the voice interactions between pilots and air traffic controllers. While prior datasets in the aviation domain have focused on specific modalities like speech or vision, TartanAviation aims to provide a more holistic view of terminal airspace operations across various data modalities. Additionally, while previous datasets focus on large commercial airports, TartanAviation focuses on smaller regional airports within the Greater Pittsburgh area. Regional airports serve a multitude of different aircraft and mission profiles, providing a richer and more diverse data stream. We specifically focus on two airports: the towered Allegheny County Airport (ICAO: KAGC) and the non-towered Pittsburgh-Butler Regional Airport (ICAO: KBTP).

Figure 2: Our custom setup hardware with the camera and ADS-B antenna mounts. Also shown is the data collection pipeline with the associated sensor suite and automatic logic that triggers camera and speech recordings.

Download

The scripts to record, post-process, and download each modality are publicly available at https://github.com/castacks/TartanAviation.git.

Data Structure

Image Data

The image dataset is split across 550 independent sequences. We define a sequence as all of the data recorded during a single event where the camera recordings were started and stopped. The vision data folder contains multiple zipped files, each associated with a particular camera recording for that sequence. Each zipped sequence folder has multiple files as presented in the table below.

Extension	Nomenclature	Content
.zip	<camera_id>_<timestamp>	Zip folder containing sequence data
→ .mp4	<camera_id>_<timestamp>	Video file
→ .avi	<camera_id>_<timestamp>_sink_verified	Video file with embedded labels
→ .srt	<camera_id>_<timestamp>_subtitle	Raw timestamps of the recorded video
→ .pkl	<camera_id>_<timestamp>_sink_adsb	Raw ADS-B dictionary
→ .pkl	<camera_id>_<timestamp>_acft_sink	Raw aircraft type data
→ .zip	<camera_id>_<timestamp>_labels	Zip folder containing the image labels
→→ .label	<frame_number>.label	Text file containing label data
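
The naming scheme in the table can be handled programmatically. The following is a minimal Python sketch, not part of the TartanAviation codebase: the helper names `parse_sequence_name` and `expected_files` are hypothetical, and the timestamp layout `%Y-%m-%d-%H-%M-%S` is an assumption inferred from the example sequence name `1_2023-02-22-15-21-49` used on this page. It also assumes the camera ID itself contains no underscore.

```python
from datetime import datetime
from typing import List, NamedTuple


class SequenceName(NamedTuple):
    camera_id: str
    start: datetime


def parse_sequence_name(name: str) -> SequenceName:
    """Split '<camera_id>_<timestamp>' into its two parts."""
    # Assumption: the camera ID has no underscore, so the first '_' is the separator.
    camera_id, _, stamp = name.partition("_")
    # Assumption: timestamp layout inferred from '2023-02-22-15-21-49'.
    return SequenceName(camera_id, datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S"))


def expected_files(name: str) -> List[str]:
    """File names the table lists for one sequence."""
    return [
        f"{name}.mp4",
        f"{name}_sink_verified.avi",
        f"{name}_subtitle.srt",
        f"{name}_sink_adsb.pkl",
        f"{name}_acft_sink.pkl",
        f"{name}_labels.zip",
    ]


seq = parse_sequence_name("1_2023-02-22-15-21-49")
print(seq.camera_id, seq.start.isoformat())  # 1 2023-02-22T15:21:49
```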

In addition to the video files and labels, we also provide ADS-B data for each sequence. When downloaded with the provided script, they are organized as follows:

image_data
├── 1_2023-02-22-15-21-49
│   ├── 1_2023-02-22-15-21-49_sink
│   │   └── ...
│   ├── 1_2023-02-22-15-21-49.mp4
│   ├── 1_2023-02-22-15-21-49_labels.zip
│   ├── 1_2023-02-22-15-21-49_acft_sink.pkl
│   ├── 1_2023-02-22-15-21-49_sink_adsb.pkl
│   ├── 1_2023-02-22-15-21-49_sink_verified.avi
│   └── 1_2023-02-22-15-21-49_subtitle.srt
└── ...
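
Given the layout above, a downloaded sequence folder can be sanity-checked before use. The sketch below is illustrative only: `inspect_sequence` is a hypothetical helper, and it deliberately makes no assumption about what the pickles or label files contain, reporting only the Python type of each unpickled object and the number of `.label` entries in the labels archive. Unpickling may also require whatever third-party packages the objects were created with.

```python
import pickle
import zipfile
from pathlib import Path

# File suffixes listed for each sequence on this page.
SUFFIXES = (".mp4", "_sink_verified.avi", "_subtitle.srt",
            "_sink_adsb.pkl", "_acft_sink.pkl", "_labels.zip")


def inspect_sequence(seq_dir: Path) -> dict:
    """Summarize one sequence folder without assuming file contents."""
    name = seq_dir.name
    summary = {"name": name, "missing": [], "pickles": {}, "num_labels": None}

    # Record which of the expected files are absent.
    for suffix in SUFFIXES:
        if not (seq_dir / f"{name}{suffix}").exists():
            summary["missing"].append(f"{name}{suffix}")

    # Only unpickle files from a source you trust: pickle can run arbitrary code.
    for suffix in ("_sink_adsb.pkl", "_acft_sink.pkl"):
        path = seq_dir / f"{name}{suffix}"
        if path.exists():
            with path.open("rb") as fh:
                summary["pickles"][path.name] = type(pickle.load(fh)).__name__

    # Count per-frame label files inside the labels archive.
    labels = seq_dir / f"{name}_labels.zip"
    if labels.exists():
        with zipfile.ZipFile(labels) as zf:
            summary["num_labels"] = sum(n.endswith(".label") for n in zf.namelist())
    return summary
```

Calling `inspect_sequence(Path("image_data/1_2023-02-22-15-21-49"))` would then return a dictionary describing that one folder.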

Trajectory and Weather Data

TartanAviation provides both raw and processed data for each airport. Raw data is separated into individual folders for each day of collection. Each raw data folder contains CSV files, and the processed files are available as comma-separated TXT files. The fields of both formats are described in the ADS-B fields table of our paper.
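
Because both the raw CSVs and the processed TXT files are comma-separated, Python's standard `csv` module can read either. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the raw root is assumed to contain one sub-folder per collection day, and no column names or header row are assumed, so rows are returned as plain lists of strings.

```python
import csv
from pathlib import Path
from typing import Dict, List


def raw_files_by_day(raw_root: Path) -> Dict[str, List[Path]]:
    """Map each per-day folder name to the CSV files it contains."""
    return {
        day_dir.name: sorted(day_dir.glob("*.csv"))
        for day_dir in sorted(raw_root.iterdir())
        if day_dir.is_dir()
    }


def read_comma_separated(path: Path) -> List[List[str]]:
    """Read a comma-separated file (CSV or TXT) into rows of strings."""
    with path.open(newline="") as fh:
        # Skip empty lines; field meanings are left to the caller.
        return [row for row in csv.reader(fh) if row]
```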

Figure 3: Log-normed trajectory histograms from ADS-B aircraft position reports, KAGC on the left and KBTP on the right.

Speech Data

Both the raw and filtered audio files are included in the dataset. The filtered data is organized in a directory structure by location, year, month, and day. Each day is individually zipped and contains audio files in WAV format, each accompanied by a text file that contains the start, end, and total time of the audio clip. When downloaded with the provided script, the files are organized as follows:

audio_data
│
├───kbtp
│   ├───2020
│   │   ├───01
│   │   ├───...
│   │   └───12
│   │       │   1.wav
│   │       │   1.txt
│   │       │   ...
│   ├───...
│   └───2022
│
└───kagc
    ├───2021
    ├───...
    └───2023
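
Since each WAV clip sits next to a text file with the same stem, the two can be paired by walking the tree. The sketch below is an illustration rather than the project's own tooling: `iter_clips` and `wav_duration_seconds` are hypothetical helpers, the duration is computed from the WAV header instead of from the text file (whose exact layout is not specified here), and the daily archives are assumed to be unzipped already.

```python
import wave
from pathlib import Path
from typing import Iterator, Optional, Tuple


def wav_duration_seconds(path: Path) -> float:
    """Clip length from the WAV header: frames divided by sample rate."""
    with wave.open(str(path), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def iter_clips(root: Path) -> Iterator[Tuple[Path, Optional[Path], float]]:
    """Yield (wav file, matching txt file or None, duration in seconds)."""
    for wav_path in sorted(root.rglob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")  # e.g. 1.wav -> 1.txt
        yield (wav_path,
               txt_path if txt_path.exists() else None,
               wav_duration_seconds(wav_path))
```

For example, `sum(d for _, _, d in iter_clips(Path("audio_data/kbtp"))) / 3600` would give the total hours of audio found under that airport's folder.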

Data Collection and Processing

More details about the data collection and processing can be found in our paper.

Additional Info

Citation

@article{patrikar2024tartanaviation,
	title={TartanAviation: Image, Speech, and ADS-B Trajectory Datasets for Terminal Airspace Operations}, 
	author={Jay Patrikar and Joao Dantas and Brady Moon and Milad Hamidi and Sourish Ghosh and Nikhil Keetha and Ian Higgins and Atharva Chandak and Takashi Yoneyama and Sebastian Scherer},
	year={2024},
	eprint={2403.03372},
	archivePrefix={arXiv},
	primaryClass={cs.LG},
	url={https://arxiv.org/pdf/2403.03372.pdf}
}

Contact

Jay Patrikar - jaypat@cmu.edu

Joao P. A. Dantas - jpdantas@gmail.com

Brady Moon - bradygmoon@cmu.edu

Acknowledgments

This work is supported by Mitsubishi Heavy Industries (MHI) project #A025279. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE1745016.