I am a postdoctoral researcher in the KAIST ALIN Lab under the supervision of Jinwoo Shin, and I also work as a research intern at RLWRLD.
Before joining the team, I was a postdoctoral researcher in the THOTH team at Inria Grenoble-Rhône-Alpes, advised by Karteek Alahari, after completing my Ph.D. at the POSTECH Computer Vision Lab under the supervision of Prof. Minsu Cho.
My current research interest is Vision-Language-Action (VLA) models, especially VLA model architectures for robot manipulation.
Publications
- ContextVLA: Vision-Language-Action Model with Amortized Multi-frame Context [code]
Huiwon Jang, Sihyun Yu, Heeseung Kwon, Hojin Jeon, Younggyo Seo(*), Jinwoo Shin(*) (* equal contribution), In ICLR 2026 (under review)
- Exploring High-order Self-Similarity for Video Understanding [code]
Manjin Kim, Heeseung Kwon, Karteek Alahari, Minsu Cho, In ICLR 2026 (under review)
- Lightweight Structure-Aware Attention for Visual Understanding [code]
Heeseung Kwon, Francisco M. Castro, Manuel J. Marin-Jimenez, Nicolas Guil, Karteek Alahari, In IJCV 2025
- Relational Self-Attention: What’s missing in Attention for Video Understanding [Project page] [code]
Manjin Kim(*), Heeseung Kwon(*), Chungyu Wang, Suha Kwak, Minsu Cho (* equal contribution), In NeurIPS 2021
- Relational Embedding for Few-shot Classification [Project page] [code]
Dahyun Kang, Heeseung Kwon, Juhong Min, Minsu Cho, In ICCV 2021
- Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition [Project page] [code]
Heeseung Kwon(*), Manjin Kim(*), Suha Kwak, Minsu Cho (* equal contribution), In ICCV 2021
- IntegralAction: Pose-driven Feature Integration for Robust Human Action Recognition in Videos [code]
Gyeongsik Moon(*), Heeseung Kwon(*), Kyoung Mu Lee, Minsu Cho (* equal contribution), In CVPR Workshop 2021
- MotionSqueeze: Neural Motion Feature Learning for Video Understanding [Project page] [code]
Heeseung Kwon, Manjin Kim, Suha Kwak, Minsu Cho, In ECCV 2020
- Temporal U-Nets for Video Summarization with Scene and Action Recognition
Heeseung Kwon, Woohyun Shim, Minsu Cho, In ICCV Workshop 2019 (2nd place in the CoVieW 2019 challenge)
- Video Understanding via Convolutional Temporal Pooling Network and Multimodal Feature Fusion
Heeseung Kwon, Suha Kwak, Minsu Cho, In MM Workshop 2019 (1st place in the CoVieW 2018 challenge)
- First Person Action Recognition via Two-stream ConvNet with Long-term Fusion Pooling
Heeseung Kwon, Yeonho Kim, Jin S. Lee, Minsu Cho, In PRL 2018
Education