Hi, I’m Anton!
I’m currently a Machine Learning Engineer and Researcher at Samsung Research working on multimodal AI and image generation. In my seven years of industry experience, I’ve worked in numerous roles ranging from software engineering to data science for products that reach millions of users worldwide. I earned my Bachelor’s degree in Computer Science cum laude from the University of the Philippines - Los Baños and have continued my research in industry, where I also mentor and teach others.
My research interests lie in the intersection of Computer Vision and Natural Language Processing, particularly:
- Real-World Visual Understanding for VLMs: I am interested in identifying limitations and biases in Vision-Language Models (VLMs) that hinder real-world visual perception and understanding. My recent work on the SEA-VL project, a collaboration with SEACrowd and several other researchers, focuses on addressing the underrepresentation of Southeast Asia in the field of vision-language. This project has resulted in a dataset and a method for adapting models to the region without sacrificing global capabilities.
- Generative Methods for Low-Resource Tasks: Tasks such as machine translation and scene text recognition have established datasets for high-resource cases; however, I want to address the question of whether existing generative models can generate synthetic data for the low-resource cases of these problems. The work my colleagues and I did for the WMT24 Low Resource Languages of Spain Shared Task and Indic MT Task is an example of this.
- Deep Learning Applications: At Samsung, I am always on the lookout for ways to use recent advances in deep learning to improve user experience. This has so far resulted in two pending patent applications (WO2024128677A1 and PH1/2022/00050589).
You can reach out to me via my email anton.rufino1125@gmail.com or LinkedIn if you’re interested in collaborating or chatting about vision-language models, image generation, and synthetic data!
