The rapid integration of Generative AI technologies, and in particular their multimodality, presents an exciting frontier for tackling complex and lengthy tasks. However, it also opens the door to new security risks in the cybersecurity landscape. In this talk, we delve into the security of multimodal large language models (LLMs) by demonstrating a range of image-based attacks on multimodal systems, illustrating the vulnerabilities these models may possess. Throughout the talk, we focus on building image-based heat-maps that highlight the points most sensitive to a successful attack, and we explore those attention points through the eyes of an attacker, in other words showing why they are a magnet for bad actors. This research not only helps explain why a given set of points led to a successful attack, but also broadens how researchers and engineers think about developing new detectors and security mechanisms in a blue ocean such as the one multimodality presents.
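To make the heat-map idea concrete, here is a minimal sketch of one common way such a sensitivity map can be produced: gradient-based saliency over the input image. This is an illustrative assumption, not the speakers' actual method; a torchvision ResNet stands in for the vision encoder of a multimodal LLM, and a random tensor stands in for a real image.

```python
# Minimal sketch (assumptions: ResNet stand-in for the multimodal model's vision
# encoder, random placeholder image, gradient saliency as the sensitivity measure).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None).eval()                     # stand-in vision encoder
image = torch.rand(1, 3, 224, 224, requires_grad=True)    # placeholder input image

logits = model(image)
score = logits.max()          # score whose sensitivity to the pixels we measure
score.backward()              # populates image.grad with d(score)/d(pixel)

# Aggregate per-pixel gradient magnitude into a single-channel heat-map:
# high values mark regions where small pixel changes move the score the most,
# i.e. the regions an attacker would perturb first.
heatmap = image.grad.detach().abs().sum(dim=1, keepdim=True)   # (1, 1, 224, 224)
heatmap = heatmap / (heatmap.max() + 1e-8)                     # normalize to [0, 1]
heatmap = F.avg_pool2d(heatmap, kernel_size=8)                 # smooth for display
print(heatmap.shape)          # torch.Size([1, 1, 28, 28])
```

The bright regions of such a map are the "attention points" the talk examines from an attacker's perspective: concentrating an adversarial perturbation there tends to require far smaller changes than perturbing the image uniformly.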
Room: Room 2
Mon, Oct 27th, 9:00 - 9:30