Hi, I’m Anton!

I’m currently a Machine Learning Engineer and Researcher at Samsung Research working on multimodal AI, conversational agents, and machine translation. In my six years of industry experience, I’ve worked in roles ranging from software engineering to data science on products that reach millions of users worldwide. I earned my Bachelor’s degree in Computer Science cum laude from the University of the Philippines - Los Baños and have continued my research in industry, where I also mentor and teach others.

My research interests lie at the intersection of Computer Vision and Natural Language Processing, particularly:

Vision-Language Models: While LLMs are capable of text-modality tasks such as conversation and question answering, I believe that AGI should be multimodal; as such, I am particularly interested in the reasoning and planning capabilities of VLMs. To this end, I am currently collaborating with SEACrowd on the SEA-VL project to improve AI’s understanding of Southeast Asian culture. I am a major contributor to the SEA-VL dataset, which aims to bridge the resource gap in culturally relevant Southeast Asian vision-language datasets and has resulted in a paper presented at ACL 2025. I am also working with the SEA-VL team to train models that understand Southeast Asian cultures and concepts as part of SEA-VL Phase 2.
Generative Methods for Low-Resource Tasks: Tasks such as machine translation and scene text recognition have established datasets for high-resource cases; however, I want to address the question of whether existing generative models can generate synthetic data for the low-resource cases of these problems. My previous work with colleagues in the WMT24 Low Resource Languages of Spain Shared Task and Indic MT Task is an example of this.
Deep Learning Applications: At Samsung, I am always on the lookout for ways to use recent advances in deep learning to improve user experience. This has so far resulted in two pending patent applications (WO2024128677A1 and PH1/2022/00050589).

You can reach me by email at anton.rufino1125@gmail.com or on LinkedIn if you’re interested in collaborating or chatting about image generation, machine translation, synthetic data, or vision-language modeling!

Manuel Antonio Rufino