This repository contains the code for the paper "One Map to Find Them All: Real-time Open-Vocabulary Mapping for Zero-shot Multi-Object Navigation". We provide a dockerized environment to run the code ...
Follow the following guide to set up the environment. python gradio_demo/gradio_run.py --frame_interval 1 --num_frames 16 --pretrained_model_name_or_path checkpoints ...
Abstract: In modern video coding standards, block-based inter prediction is widely adopted, which brings high compression efficiency. However, in natural videos, there are usually multiple moving ...
Abstract: Aiming at the specific characteristics of flying bird objects in surveillance video, such as the typically non-obvious features in single-frame images, small size in most instances, and ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results