Multimodal and Embodied AI for Digital Humanities

Research Seminar

Lorenzo Baraldi

Organized by Digital Visual Studies at the Bibliotheca Hertziana in Rome

Recent progress in the Computer Vision and Natural Language Processing communities has made it possible to connect Vision and Language in a variety of tasks that lie at the intersection of Vision, Language, and Embodied AI. These tasks range from retrieving images or parts of images given textual queries, to generating meaningful descriptions of images, answering questions, and navigating agents in unseen environments via natural language instructions.

This integration has grown to the point that it is becoming pervasive in the literature and a fundamental tool for developing AI algorithms. This talk will give a comprehensive overview of these advances, with a particular focus on their applications to the Cultural Heritage and Digital Humanities domains. It will also present the research activities currently carried out at the AImageLab research group and in the Interdepartmental Centre on Digital Humanities of the University of Modena and Reggio Emilia.

Lorenzo Baraldi

Lorenzo Baraldi is a Tenure Track Assistant Professor at the University of Modena and Reggio Emilia. He works under the supervision of Prof. Rita Cucchiara on Deep Learning, video analysis, and Multimedia, and teaches courses on Computer Vision, AI for Automotive, and Image Processing. His research has covered Egocentric Vision and Gesture Recognition, Temporal Video Segmentation and Retrieval, Saliency, Video Captioning, Visual-Semantic alignment, and Embodied AI. He is the author of more than 70 publications in international journals and conferences, and an Associate Editor of Pattern Recognition Letters and of Frontiers in Artificial Intelligence. He has been elected as a Scholar in the ELLIS society, the European Laboratory for Learning and Intelligent Systems. Since 2021, he has served as deputy director of the Interdepartmental Centre on Digital Humanities of the University of Modena and Reggio Emilia. In 2017, he worked at the Facebook AI Research laboratory in Paris, under the supervision of Hervé Jégou, where he developed the video copy detection algorithm that currently runs in production on the social network.

Participation

On-site participation without registration.
Online participation via Zoom with preregistration:

Zoom link coming soon