Julian Spravil
I’m a PhD student in the Autonomous Intelligent Systems Group at the University of Bonn and a research engineer at the Fraunhofer Institute for Intelligent Analysis and Information Systems. My research interests include multimodal and multilingual machine learning as well as human-centered applications.

Research
My research focuses on multimodal machine learning: building systems that combine vision, language, and audio to enable real-time assistance for people with impairments and to support robust multilingual interaction. Representative papers are highlighted.
-
Florenz: Scaling Laws for Systematic Generalization in Vision-Language Models
Julian Spravil, Sebastian Houben, Sven Behnke
arXiv preprint, March 2025
Florenz demonstrates that scaling vision-language models enables systematic generalization to unseen task-language pairs, allowing multilingual performance without direct supervision.
-
HyenaPixel: Global Image Context with Convolutions
Julian Spravil, Sebastian Houben, Sven Behnke
European Conference on Artificial Intelligence (ECAI) 2024
HyenaPixel builds on the Hyena operator, extending it to bidirectional and 2D processing to capture global image context with large convolutions, enabling transformer-level accuracy without attention.