Henry Hao-Tang Tsui

henrytsu@andrew.cmu.edu

Hi! I’m Hao-Tang Tsui. Feel free to call me Henry!

I work on computer vision, vision–language models, and trying to build a usable bridge between pixels and words. I’m currently a Master’s student in Computer Vision at Carnegie Mellon University, working with Prof. Deva Ramanan on vision–language benchmarking.

Previously, I was a research assistant at Academia Sinica (YOLO-Lab) with Prof. Mark Liao, where I worked on YOLO-related research and re-released YOLO under the MIT license. I earned my B.S. in Electrical Engineering from National Yang Ming Chiao Tung University, collaborating with Prof. Hong-Han Shuai and Prof. Wen-Huang Cheng.

I enjoy turning research ideas into clean code, benchmarks, and occasionally opinions—ideally ones that connect vision and language a little better than before.

news

Dec 31, 2025 My code YOLO-MIT is now available on GitHub! 🎉 🚀
Aug 11, 2025 Started my Master of Science in Computer Vision at Carnegie Mellon University.
Jan 23, 2025 My paper YOLO-RD was accepted by ICLR 2025! ✨ 🥳
Jul 01, 2024 My second-author paper TrajPrompt was accepted by ECCV 2024! 🎉

selected publications

  1. Ongoing
    SIGMA: Static Scene Reconstruction via Inpainting and Geometry-first Motion Aggregation for Monocular RGB Videos
    Hao-Tang Tsui*, Ethan Lai*, and Yu-Rou Tuan*
    Dec 2025
  2. TrajPrompt: Aligning Color Trajectory with Vision-Language Representations
    Li-Wu Tsao, Hao-Tang Tsui, Yu-Rou Tuan, and 5 more authors
    In Proceedings of the European Conference on Computer Vision (ECCV), Oct 2024
  3. YOLO-RD: Introducing Relevant and Compact Explicit Knowledge to YOLO by Retriever-Dictionary
    Hao-Tang Tsui, Chien-Yao Wang, and Hong-Yuan Mark Liao
    In Proceedings of the International Conference on Learning Representations (ICLR), Apr 2025