Yonghoon Dong
  • Trust Region Q-Adjoint Matching: Stable Off-Policy RL for Flow Policies

    A new stable off-policy fine-tuning algorithm for pretrained flow-based policies, combining trust-region principles with stochastic optimal control.

    26 min read   ·   May 25, 2026

    2026   ·   tags: off-policy, flow-matching, trust-region, robotics   ·   category: rl

© Copyright 2026 Yonghoon Dong.