FMOX

Introduction

This repo provides the Fast Moving Object (FMO) datasets (FMOv2, TbD-3D, TbD and Falling Objects, all available at https://cmp.felk.cvut.cz/fmo/) together with additional ground-truth information in JSON format (our new metadata is called FMOX), used for benchmarking trackers.

📌 If you use this repo in your research or applications, please cite our papers related to this work (published at IMVIP 2025 and AICS 2025).

Benchmarking SAM2-based trackers on FMOX (AICS 2025)

[Paper AICS 2025] [Code] [Arxiv]

Several object tracking pipelines extending Segment Anything Model 2 (SAM2) have been proposed in the past year; their approach is to follow and segment the object from a single exemplar template provided by the user on an initialization frame. We propose to benchmark these high-performing trackers (SAM2, EfficientTAM, DAM4SAM and SAMURAI) on datasets containing fast moving objects (FMOs), specifically designed to be challenging for tracking approaches. The goal is to better understand the current limitations of state-of-the-art trackers by providing more detailed insights into their behavior. We show that, overall, DAM4SAM and SAMURAI perform well on the more challenging sequences.

@inproceedings{AICS2025-Aktas,
  title={Benchmarking SAM2-based Trackers on FMOX},
  author={Senem Aktas and Charles Markham and John McDonald and Rozenn Dahyot},
  booktitle={33rd International Conference on Artificial Intelligence and Cognitive Science (AICS 2025)},
  address={Dublin, Ireland},
  year={2025},
  month={December},
  pages={1-12},
  doi={10.48550/arXiv.2512.09633},
  url={https://cvmlmu.github.io/FMOX/},
  note={Paper: https://cvmlmu.github.io/FMOX/FMOXaics2025.pdf, arXiv: https://arxiv.org/abs/2512.09633},
}

The R code AICS2025.qmd (which generates AICS2025.html), used to draw some of the figures in the paper, is also provided in this repo.

Benchmarking EfficientTAM on FMO datasets (IMVIP 2025)

[Paper IMVIP 2025] [Code] [Arxiv]

In this repo, we extend the Fast Moving Object (FMO) datasets (FMOv2, TbD-3D, TbD and Falling Objects, all available at https://cmp.felk.cvut.cz/fmo/) with additional ground-truth information in JSON format (our new metadata is called FMOX). The FMOX JSON format allows seamless compatibility with various machine learning frameworks, making it easier for developers and researchers to use the datasets in their applications. With FMOX, we test a recently proposed foundational model for tracking (EfficientTAM), showing that its performance compares well with the pipelines originally developed for these FMO datasets.

The scripts provided in this repo allow you to download all FMO datasets, create the JSON metadata, and assess object tracking with EfficientTAM using the TIoU metric.
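
A minimal sketch of the idea behind such an evaluation is shown below: a per-frame IoU between predicted and ground-truth boxes in the `[x1, y1, x2, y2]` format used by FMOX, averaged over a sequence. This is an illustrative simplification only; the repo's own implementation lives in calciou.py and efficientam_evaluation.py.

```python
# Illustrative sketch only: frame-averaged IoU for axis-aligned boxes in
# [x1, y1, x2, y2] format; see FMOX-code/use-FMOX/calciou.py for the
# implementation actually used in the benchmark.

def iou_xyxy(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def average_iou(pred_boxes, gt_boxes):
    """Average per-frame IoU over paired predicted/ground-truth boxes."""
    scores = [iou_xyxy(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(scores) / len(scores) if scores else 0.0
```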

@inproceedings{FMOX_AKTAS2025,
  title={Benchmarking EfficientTAM on FMO datasets},
  author={Senem Aktas and Charles Markham and John McDonald and Rozenn Dahyot},
  booktitle={Irish Machine Vision and Image Processing Conference (IMVIP 2025)},
  address={Ulster University, Derry-Londonderry, Northern Ireland},
  year={2025},
  month={September},
  pages={59-66},
  doi={10.48550/arXiv.2509.06536},
  url={https://cvmlmu.github.io/FMOX/},
}

Installation

Getting started

git clone --branch main https://github.com/CVMLmu/FMOX.git
cd FMOX
# for conda, create the environment using:
conda env create -n fmo_data_env -f environment.yml
conda activate fmo_data_env

Notebooks

The following notebooks can be run in that environment:

- FMOX-code/create-FMOX/create_jsons_main.ipynb
- FMOX-code/use-FMOX/fmox_main.ipynb

Repo tree structure

│   environment.yml
│   LICENSE
│   README.md
│
└───FMOX-code
    │   download_datasets.py
    │   __init__.py
    │
    ├───create-FMOX
    │   │   combine_all_mask_to_single_img.py
    │   │   create_fmov2_json.py
    │   │   create_jsons_main.ipynb
    │   │   create_tbd_json.py
    │   │   main.py
    │   │   rle_to_seg_mask_img.py
    │   │   tbd_visualize_bboxes.py
    │   │
    │   └───dataset_loader
    │           create_json_via_benchmark_loader.py
    │           loaders_helpers.py
    │           reporters.py
    │
    ├───EfficientTAM-Jsons
    │       efficienttam_All4.json
    │       efficienttam_falling.json
    │       efficientTam_fmov2.json
    │       efficienttam_tbd3d.json
    │       efficienttam_tdb.json
    │
    ├───FMOX-Jsons
    │       FMOX_All4.json
    │       FMOX_fall_and_tbd3d.json
    │       FMOX_fmov2.json
    │       FMOX_tbd.json
    │       FMOX_tbd_whole_sequence.json
    │
    └───use-FMOX
        │   access_json_bboxes.py
        │   calciou.py
        │   csv_to_graphics.py
        │   efficientam_evaluation.py
        │   EfficientTAM_averageTIoU.csv
        │   FMOX_all4_json_to_CSV.py
        │   FMOX_All4_statistics.csv
        │   fmox_main.ipynb
        │   fmox_main.py
        │   size_label_bar.png
        │   size_label_count.py
        │   vis_trajectory.py
        │   __init__.py
        │
        └───efficientTAM_traj_vis
                efficientTAM_traj_Falling_Object_v_box_GTgamma.jpg
                (...)

Additional Information

The following results are shared in this repo (created with fmox_main.ipynb):

FMOX Object Size Categories

The sizes of the objects in the public FMO datasets were calculated and "object size levels" were assigned. A total of five distinct levels are defined as below:

| Extremely Tiny | Tiny | Small | Medium | Large |
|---|---|---|---|---|
| [1 × 1, 8 × 8) | [8 × 8, 16 × 16) | [16 × 16, 32 × 32) | [32 × 32, 96 × 96) | [96 × 96, ∞) |

Table: FMOX object size categories.
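
As an illustration, the minimal sketch below assigns one of the five levels from a bounding box's width and height, assuming the table bounds are read as pixel areas (e.g. Tiny covers 8 × 8 ≤ w × h < 16 × 16); size_label_count.py in the repo contains the rule actually used:

```python
# Minimal sketch: map an object's bounding-box width/height to an FMOX size
# level. Assumes the table thresholds are pixel areas (e.g. "tiny" means
# 8*8 <= w*h < 16*16); see FMOX-code/use-FMOX/size_label_count.py for the
# exact rule used when building the annotations.

def size_category(w: int, h: int) -> str:
    area = w * h
    if area < 8 * 8:
        return "extremely tiny"
    if area < 16 * 16:
        return "tiny"
    if area < 32 * 32:
        return "small"
    if area < 96 * 96:
        return "medium"
    return "large"

print(size_category(84, 74))  # "medium", matching the sample annotation below
```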

Structure of FMOX

{
  "databases": [
    {
      "dataset_name": "Falling_Object",
      "version": "1.0",
      "description": "Falling_Object annotations.",
      "sub_datasets": [
        {
          "subdb_name": "v_box_GTgamma",
          "images": [
            {
              "img_index": 1,
              "image_file_name": "00000027.png",
              "annotations": [
                {
                  "bbox_xyxy": [161, 259, 245, 333],
                  "object_wh": [84, 74],
                  "size_category": "medium"
                }
              ]
            },
            {
              "img_index": 2,
              "image_file_name": "00000028.png",
              "annotations": ["bbox_xyxy": [.....], "object_wh": [.....], "size_category": "...." ]
            }
          ]
        }
      ]
    }
  ]
}
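
Reading the annotations back is then a matter of walking this structure; the minimal sketch below iterates over every bounding box (the keys follow the example above, and the file path is one of the provided FMOX JSONs):

```python
import json

# Walk an FMOX JSON file and print every annotation; the keys follow the
# structure shown above.
with open("FMOX-code/FMOX-Jsons/FMOX_All4.json") as f:
    fmox = json.load(f)

for db in fmox["databases"]:
    for sub in db["sub_datasets"]:
        for img in sub["images"]:
            for ann in img["annotations"]:
                print(db["dataset_name"], sub["subdb_name"],
                      img["image_file_name"], ann["bbox_xyxy"],
                      ann["size_category"])
```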

Average TIoU Performance Comparison

This table compares the average TIoU $(\uparrow)$ performance of various studies on the FMO datasets. EfficientTAM was evaluated with the FMOX JSON annotations. The best results are indicated with $^*$ and the second-best with $^{**}$.

| Datasets | DeFMO [Rozumnyi et al., 2021] | FMODetect [Rozumnyi et al., 2021] | TbD [Kotera et al., 2019] | TbD-3D [Rozumnyi et al., 2020] | EfficientTAM [Xiong et al., 2024] |
|---|---|---|---|---|---|
| Falling Object | 0.684** | N/A | 0.539 | 0.539 | 0.7093* |
| TbD | 0.550** | (a) 0.519 / (b) 0.715* | 0.542 | 0.542 | 0.4546 |
| TbD-3D | 0.879* | N/A | 0.598 | 0.598 | 0.8604** |

(a) Real-time, with trajectories estimated by the network. (b) With the proposed deblurring. N/A indicates "Not defined".

Acknowledgments

This GitHub repo was created and developed by Senem Aktas, and tested by Rozenn Dahyot.
This research was supported by funding through the Maynooth University Hume Doctoral Awards.
We would like to thank the authors of the FMO datasets for making their datasets available.

License

This code is available under the MIT License.