Jonas Hübotter

PhD Student at ETH Zurich. I work on Test-Time Training and Reinforcement Learning.


I am a PhD student in the Learning and Adaptive Systems Group at ETH Zurich working with Andreas Krause. Prior to this, I obtained a Master’s degree in Theoretical Computer Science and Machine Learning from ETH Zurich and a Bachelor’s degree in Computer Science and Mathematics from the Technical University of Munich. I am a recipient of the ETH Medal.

My research aims to leverage foundation models for solving hard tasks through specialization and reinforcement learning. Beyond this, I have broad interests including (approximate) probabilistic inference, optimization, and online learning.

Always feel free to reach out to me with things you find exciting.

Contact: jhuebotter@ethz.ch · Google Scholar · GitHub · LinkedIn

Announcements

Jul, 2025 COLM 2025: Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging has been accepted! We will also present our work on test-time scaling via prefix-confidence at the SCALR workshop.
May, 2025 ICML 2025: Active Fine-Tuning of Multi-Task Policies has been accepted! We will also present our work on test-time offline RL at the PUT workshop and our work on curricula for sparse-reward RL at the EXAIT workshop.
Feb, 2025 Very excited to share notes on Probabilistic AI that I have been writing with Andreas Krause!
Jan, 2025 ICLR 2025: Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs has been accepted!
Jan, 2025 AISTATS 2025: LITE: Efficiently Estimating Gaussian Probability of Maximality has been accepted!

Selected Publications

  1. DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
    Leander Diaz-Bone*, Marco Bagatella*, Jonas Hübotter*, and 1 more author
    arXiv preprint arXiv:2505.19850, 2025
  2. COLM ’25
    Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging
    Ryo Bertolissi*, Jonas Hübotter*, Ido Hakimi, and 1 more author
    In Conference on Language Modeling (COLM), 2025
  3. ICLR ’25 Best Paper
    Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
    Jonas Hübotter, Sascha Bongni, Ido Hakimi, and 1 more author
    In International Conference on Learning Representations (ICLR), 2025
    Best Paper Award at NeurIPS Workshop on Fine-Tuning in Modern Machine Learning, 2024.
  4. NeurIPS ’24 Oral
    Transductive Active Learning: Theory and Applications
    Jonas Hübotter, Bhavya Sukhija, Lenart Treven, and 2 more authors
    In Advances in Neural Information Processing Systems (NeurIPS), 2024
    Oral Presentation at ICML Workshop on Aligning Reinforcement Learning Experimentalists and Theorists, 2024.

Latest Talks

Jan 13, 2026
Invited Talk
Test-Time Training Agents for Deep Exploration
BLISS Speaker Series, Berlin
Nov 18, 2025
Invited Lecture
Test-Time Training and Adaptation
“Foundation Models and Generative AI” (hosted by Prof. Charlotte Bunne), EPFL, Lausanne
Nov 5, 2025
Invited Talk
Test-Time Training Agents to Solve Challenging Problems
heidelberg.ai Speaker Series (hosted by Carsten Lüth), Heidelberg
Aug 12, 2025
Invited Talk
Test-Time Training for Hard Tasks
Google Paradigms of Intelligence team (hosted by Johannes von Oswald), Zurich
Jul 10, 2025
Invited Talk
Towards Solving Hard Problems via Test-Time Training 📝
1st Prague Workshop on Neural Networks and Reasoning, Prague

Supervision

I have had the privilege of advising several BSc and MSc students on their theses and semester projects. Some of these projects have led to publications.

  • Dennis Jüni (MSc): Meta Test-Time Training for Image Classification (with Frederike Lübeck)
  • Matthias Otth (MSc): Efficient Fine-Tuning and Test-Time Training of Large Language Models for Reasoning Tasks (with Ido Hakimi, SCALR@COLM '25)
  • Leander Diaz-Bone (MSc): Directed Goal-Conditioned Reinforcement Learning (with Marco Bagatella)
  • Ryo Bertolissi (BSc): Test-Time Model Merging for Mixture of Local Experts (with Ido Hakimi, COLM '25)
  • Nicolas Menet (MSc): Efficiently Estimating Gaussian Probability of Maximality (with Parnian Kassraie, AISTATS '25)
  • Sascha Bongni (BSc): Active Fine-Tuning of Large Language Models (ICLR '25)
  • Pablo Lahmann (MSc): Safe Control as Inference (with Yarden As)
  • Anh Duc Nguyen (BSc): Safe Bayesian Optimization without Regret

You can find a list of potential projects in our research group here. If you would like to work with me, please send me an email describing your area of interest, and attach your CV and up-to-date transcripts.