Jonas Hübotter

PhD Student at ETH Zurich. I work on Test-Time Training and Reinforcement Learning.


I am a PhD student in the Learning and Adaptive Systems Group at ETH Zurich working with Andreas Krause. Prior to this, I obtained a Master’s degree in Theoretical Computer Science and Machine Learning from ETH Zurich and a Bachelor’s degree in Computer Science and Mathematics from the Technical University of Munich. I am a recipient of the ETH Medal.

My research aims to leverage foundation models for solving hard tasks through specialization and reinforcement learning. Beyond this, I have broad interests including (approximate) probabilistic inference, optimization, and online learning.

Always feel free to reach out to me with things you find exciting.

Contacts: jhuebotter@ethz.ch, Google Scholar, GitHub, LinkedIn

Announcements

Jul, 2025 COLM 2025: Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging has been accepted! We will also present our work on test-time scaling via prefix-confidence at the SCALR workshop.
May, 2025 ICML 2025: Active Fine-Tuning of Multi-Task Policies has been accepted!
Feb, 2025 Very excited to share notes on Probabilistic AI that I have been writing with Andreas Krause!
Jan, 2025 ICLR 2025: Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs has been accepted!
Jan, 2025 AISTATS 2025: LITE: Efficiently Estimating Gaussian Probability of Maximality has been accepted!

Selected Publications

  1. DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning
    Leander Diaz-Bone*, Marco Bagatella*, Jonas Hübotter*, and 1 more author
    arXiv preprint arXiv:2505.19850, 2025
  2. COLM ’25
    Local Mixtures of Experts: Essentially Free Test-Time Training via Model Merging
    Ryo Bertolissi*, Jonas Hübotter*, Ido Hakimi, and 1 more author
    In Conference on Language Modeling, 2025
  3. ICLR ’25 Best Paper
    Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs
    Jonas Hübotter, Sascha Bongni, Ido Hakimi, and 1 more author
    In International Conference on Learning Representations, 2025
    Best Paper Award at NeurIPS Workshop on Fine-Tuning in Modern Machine Learning, 2024.
  4. NeurIPS ’24 Oral
    Transductive Active Learning: Theory and Applications
    Jonas Hübotter, Bhavya Sukhija, Lenart Treven, and 2 more authors
    In Advances in Neural Information Processing Systems, 2024
    Oral Presentation at ICML Workshop on Aligning Reinforcement Learning Experimentalists and Theorists, 2024.

Talks

  • Towards Solving Hard Problems via Test-Time Training (slides)
    Invited Talk, 1st Prague Workshop on Neural Networks and Reasoning, Prague, 10 Jul 2025.
  • Interview on Test-Time Model Merging with Nnamdi Iregbulem (Lightspeed Venture Partners), May 2025.
  • Efficiently Learning at Test-Time with LLMs via Transductive Active Learning (recording, slides)
    Invited Talk, Trillion Parameter Consortium (TPC) Seminar Series, 5 Mar 2025.
  • Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs (recording, slides)
    Contributed Talk, NeurIPS Workshop on Fine-Tuning in Modern Machine Learning, Vancouver, 14 Dec 2024.
  • Interview with Machine Learning Street Talk (MLST) podcast, Nov 2024.
  • Transductive Active Learning for Fine-Tuning Large (Language) Models (slides)
    Invited Talk, Machine Learning and Modelling Seminar, Czech Academy of Sciences, Prague, 21 Nov 2024.
  • Efficiently Learning at Test-Time with LLMs (recording, slides)
    Invited Talk, Zurich AI Meetup, Zurich, 3 Dec 2024.
    Invited Talk, Tufa Labs AI Meetup, Zurich, 29 Oct 2024.
  • Transductive Active Learning with Application to Safe Bayesian Optimization (recording, slides)
    Contributed Talk, ICML Workshop on Aligning Reinforcement Learning Experimentalists and Theorists, Vienna, 26 Jul 2024.
  • Active Fine-Tuning of Large Neural Networks (slides)
    Contributed Talk, Machine Learning Seminar, ETH Zurich, 18 Apr 2024.

Supervision

I have had the privilege of advising several BSc and MSc students during their theses and semester projects. Some of these projects have led to publications.

  • Matthias Otth: Efficient Fine-Tuning and Test-Time Training of Large Language Models for Reasoning Tasks (with Ido Hakimi, SCALR@COLM '25)
  • Leander Diaz-Bone: Directed Goal-Conditioned Reinforcement Learning (with Marco Bagatella)
  • Ryo Bertolissi: Test-Time Model Merging for Mixture of Local Experts (with Ido Hakimi, COLM '25)
  • Nicolas Menet: Efficiently Estimating Gaussian Probability of Maximality (with Parnian Kassraie, AISTATS '25)
  • Sascha Bongni: Active Fine-Tuning of Large Language Models (ICLR '25)
  • Pablo Lahmann: Safe Control as Inference (with Yarden As)
  • Anh Duc Nguyen: Safe Bayesian Optimization without Regret

You can find a list of potential projects of our research group here. If you want to work with me, please send me an email describing your area of interest. Please also attach your CV and up-to-date transcripts.