We propose an encoder-decoder for open-vocabulary semantic segmentation comprising a hierarchical encoder-based cost map generation and a gradual fusion decoder. We introduce a category early ...
Google Gemma 4 12B, released June 3, is an open-weight multimodal model that processes text, images, audio, and video in a ...
Abstract: Future pedestrian trajectory prediction offers great prospects for many practical applications. Most existing methods focus on social interaction among pedestrians but ignore the factors ...
We present Representation Autoencoders (RAE), a class of autoencoders that utilize pretrained, frozen representation encoders such as DINOv2 and SigLIP2 as encoders with trained ViT decoders. RAE can ...
Abstract: Turbo code is an error correction coding scheme close to the Shannon limit, usually used in wireless data transmission. Based on the parallel Turbo code ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results