Loading…
Attending this event?
For more information on the FIAT/IFTA World Conference, visit the FIAT/IFTA website.
Wednesday October 16, 2024 4:30pm - 5:00pm EEST
In the past years, Large Language Models (LLMs) have been increasingly developed and used for various Natural Language Processing (NLP) tasks. As an example, LLMs can be employed for content classification without a need for manual human annotation or exhaustive training datasets. However, such LLMs are associated with high computational cost during inference, preventing their wide adoption in the audiovisual industry, such as large scale audiovisual collections.

In this context, we want to automatically classify broadcast news into topics. Topic classification can be useful for audiovisual content retrieval or monitoring, but remains a challenging problem, particularly due to the difficult task of segmenting continuous feed into homogenous extracts. To solve this issue, we propose a framework applied on French TV and radio news where we transcribe a dataset of 11.7k hours, broadcasted in 2023 on 21 channels with a State-of-the-Art transcription model. A LLM is used in few-shot conversation mode to obtain a topic classification on those transcriptions. We define a topic classification scheme based on the IPTC categories.

Using the generated LLM annotations, we explore the finetuning of a specialized smaller classification model. To evaluate the performances of these models, and to estimate the subjectivity of the topic categorization task, we construct and annotate a test set of 03h44m. We demonstrate that while LLM's inference costs makes them prohibitive for large scale analysis of audiovisual collections, they can be used to generate synthetic datasets used to train less complex models (students) associated to much smaller inference time and better classification performances.

Finally, using an automatic gender classification tool, we compute the speaking time per gender depending on the topic, to determine if some subjects are predominantly reserved to specific genders. We show that women are notably under-represented in subjects such as sports and politics.
Speakers
VP

Valentin Pelloin

INA
Valentin Pelloin joined INA's research team at the end of 2023. His main research objectives are related to Natural Language Processing and Spoken Language Technologies, namely Automatic Speech Recognition (ASR), speech and speaker recognition, and end-to-end information extraction... Read More →
Wednesday October 16, 2024 4:30pm - 5:00pm EEST
Hotel Sheraton Bucharest - Arizona

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Share Modal

Share this link via

Or copy link