Google Gemma 4 12B, released June 3, is an open-weight multimodal model that processes text, images, audio, and video in a ...
Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models. Authors: Xiao Liu, Jiaxiang Liu, Boci Peng, Boren Hu, Yusong Wang, Xiwen Chen, Prayag Tiwari, Liming Zhang, Mingkun Xu ...
The project automatically fetches the latest papers from arXiv based on keywords. The subheadings in the README file represent the search keywords. Only the most recent articles for each keyword are ...
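
The keyword-based fetch described above could be sketched roughly as follows. This is not the project's actual code, which is not shown here. It is a minimal illustration that assumes the public arXiv Atom API at `export.arxiv.org/api/query` with its documented `search_query`, `sortBy`, `sortOrder`, `start`, and `max_results` parameters. The function names `build_query_url`, `parse_feed`, and `fetch_latest`, and the default of 10 results, are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Assumption: the public arXiv API endpoint, which returns an Atom feed.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(keyword, max_results=10):
    """Build a query URL asking for the newest submissions matching a keyword."""
    params = {
        "search_query": f'all:"{keyword}"',  # phrase search across all fields
        "sortBy": "submittedDate",
        "sortOrder": "descending",           # most recent first
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_feed(atom_xml):
    """Extract title, authors, date, and link from an Atom feed string."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        raw_title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        papers.append({
            # Titles in the feed may contain line breaks; collapse whitespace.
            "title": " ".join(raw_title.split()),
            "authors": [
                a.findtext("atom:name", default="", namespaces=ATOM_NS)
                for a in entry.findall("atom:author", ATOM_NS)
            ],
            "published": entry.findtext("atom:published", default="", namespaces=ATOM_NS),
            "link": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
        })
    return papers


def fetch_latest(keyword, max_results=10):
    """Fetch and parse the newest papers for one keyword (performs a network call)."""
    with urlopen(build_query_url(keyword, max_results)) as response:
        return parse_feed(response.read())
```

A README generator along these lines might call `fetch_latest` once per keyword and write each result list under a subheading named after that keyword. Keeping URL construction and feed parsing separate from the network call lets both be checked offline.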