Multimodal AI Engineer
AI & Data Science
Full-time
Hybrid
Multimodal AI Engineers develop AI systems that process and generate multiple data modalities, including text, images, audio, video, and structured data, within unified model architectures. They work on vision-language models, audio-language models, and cross-modal alignment techniques, and they integrate multimodal capabilities into products such as visual question answering, document understanding, and video analysis systems. This specialty has become critical as foundation models like GPT-4o and Claude 3 expand their multimodal capabilities.