Projects – SMarT

TRADEF (TRAcking and DEtecting Fake news). From December 2023 to November 2026

Leader: K. Smaïli

The consortium is composed of Loria and LIA.

TRADEF is a project accepted in the framework of the ASTRID call targeting cognitive warfare. We propose to address several forms of disinformation, in particular fake news and deepfakes. The idea is to rapidly detect the emergence of a fake in textual, audio, or video form on social networks, and to follow its propagation through those networks. Unlike Botsentinel, which classifies Twitter accounts as trustworthy or not by storing and monitoring them daily, TRADEF detects the emergence of a “fake” and tracks it over time. At any given moment, the potential rumor is analyzed and assigned a confidence measure, and it is tracked through social networks in the reference language as well as in networks that use other languages. The score of a suspicious piece of information changes over time according to the data it encounters. This data can be cross-referenced with audio or video data that refutes or confirms the credibility of the information being processed. The videos used to expose a fake can themselves be deepfakes, so we must examine them carefully by developing robust deepfake detection methods. Indeed, according to various international evaluation campaigns, these methods achieve high identification rates on the benchmark data used, but the results degrade drastically on new data, as will be shown later in this project. Finally, the project introduces an explanatory dimension: the process that led to affirming or denying the status of an event at a given moment can be explained.
Given the experience of the participating teams in deep learning and in language modeling of standard Arabic and its dialects, we propose to track and identify fakes and potentially harmful information in Arabic social networks. This raises further scientific challenges, such as code-switching, the variability of Arabic dialects, named entity recognition in continuous speech, neural methods for low-resource languages, and the explainability of the results.
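To make the idea of a confidence measure that evolves over time more concrete, here is a minimal sketch in Python. It is not the method used in TRADEF: the log-odds update, the class names `Evidence` and `TrackedClaim`, and the `reliability` weight are all assumptions chosen for illustration. Each new piece of evidence either supports or refutes the tracked claim, and its weight can be lowered when the source is doubtful, for example a video suspected of being a deepfake.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Evidence:
    timestamp: int      # hypothetical time step at which the evidence was observed
    supports: bool      # True if it corroborates the claim, False if it refutes it
    reliability: float  # hypothetical weight in (0, 1]; lower for doubtful sources


@dataclass
class TrackedClaim:
    text: str
    log_odds: float = 0.0  # 0.0 corresponds to a confidence of 0.5
    history: list = field(default_factory=list)

    def update(self, ev: Evidence) -> float:
        """Shift the log-odds by the evidence weight and return the new confidence."""
        delta = ev.reliability if ev.supports else -ev.reliability
        self.log_odds += delta
        score = 1.0 / (1.0 + math.exp(-self.log_odds))
        # Keep the trajectory so that the evolution of the score can be inspected later.
        self.history.append((ev.timestamp, score))
        return score


# Toy usage with invented evidence (not project data).
claim = TrackedClaim("example rumor")
for ev in [
    Evidence(timestamp=0, supports=True, reliability=0.8),
    Evidence(timestamp=1, supports=False, reliability=0.3),  # e.g. a possibly fake video
    Evidence(timestamp=2, supports=False, reliability=1.0),
]:
    claim.update(ev)

for timestamp, score in claim.history:
    print(timestamp, round(score, 3))
```

Storing the full history rather than only the latest score is a deliberate choice in this sketch: it keeps the trace of which evidence moved the score, which is the kind of information an explanation of the final decision would need.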

AMIS (Access to Multilingual Information and Opinions) is a CHIST-ERA project funded by the EU. From December 2015 to November 2018

Leader: K. Smaïli

The consortium is composed of AGH (Kraków – Poland), LIA (Avignon – France), LORIA (Nancy – France) and Deusto (Spain).

With the growth of information in media such as TV programs and the internet, a new issue arises: how can a user access information that is expressed in a foreign language? The idea of the project is to develop a multilingual comprehension aid that works without any human intervention. We would like to help people understand broadcast news presented in a foreign language and compare it with the corresponding news available in their mother tongue. Understanding is approached in this project as giving access to any information, whatever the language in which it is presented. With the development of the internet and satellite TV, tens of thousands of shows and news broadcasts are available in different languages. Yet even highly educated people rarely speak more than two or three languages, and the majority speak only one, so most TV and radio programs and most information on the internet remain inaccessible to most people. And yet one would like to listen to news in one's own language and compare it with what has been said on the same topic in another language. For instance, how is the topic of AIDS presented in Saudi Arabia and in the USA? What is the opinion of The Jerusalem Post about Yasser Arafat, and how is he presented in Al-Quds (a Palestinian newspaper)? To give access to varied and sometimes opposing information, we propose to develop AMIS (Access to Multilingual Information and Opinions). AMIS thus makes it possible to see another side of the story of an event. The understanding process is considered here to be the comprehension of the main ideas of a video. The best way to achieve this is to summarize the video in order to give access to the essential information.
Hence, AMIS focuses on the most relevant information by summarizing it and translating it for the user if necessary. Another aspect of AMIS is to compare two summaries produced by the system, in two languages on the same topic and whatever their medium (video, audio, or text), and to present the differences between their contents in terms of information, sentiments, opinions, etc.

TRAM (TRanslation of Arabic Music), a project funded by AUF in the framework of “Projets de Coopération Scientifique Inter-Universitaire”. From 2016 to 2018.

Leader: K. Smaïli

The consortium is composed of Loria and Jordan University.

The objective of TRAM is to show the feasibility of automatic accompaniment of Arab vocal improvisation. The idea is to propose an automatic instrumental response to an Arab singer performing a Mawwal (or Istikhbar). The originality of the project lies in investigating an approach based on Machine Translation (MT) to study the accompaniment of Arab vocal improvisation. This approach treats the interaction between the singer and the instrumentalist as a question and an answer: the vocal sentence is the question and the instrumental response is the answer. Machine translation requires a parallel corpus composed of a source and a target language. The training process then associates each phrase of the source sentence with its corresponding phrase in the target language. To carry out this project, we propose a consortium of experts in music, in machine translation and, more generally, in machine learning. The project requires collecting data, which will be a considerable resource for researchers and will be provided freely to our research community. This bootstrapping project will probably help us to apply to H2020 in the near future.
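The analogy with machine translation described above can be illustrated with a small Python sketch. It is only a toy under stated assumptions, not the TRAM system: the phrases are assumed to be already segmented and aligned, they are encoded as tuples of note names (a hypothetical encoding), and the function names `build_phrase_table` and `best_response` are invented for this example. The table is estimated by relative frequency, as in a basic phrase table.

```python
from collections import Counter, defaultdict


def build_phrase_table(aligned_pairs):
    """Estimate P(instrumental response | vocal phrase) by relative frequency."""
    pair_counts = Counter(aligned_pairs)
    source_counts = Counter(src for src, _ in aligned_pairs)
    table = defaultdict(dict)
    for (src, tgt), n in pair_counts.items():
        table[src][tgt] = n / source_counts[src]
    return table


def best_response(table, vocal_phrase):
    """Return the most probable response for a vocal phrase, or None if it is unseen."""
    candidates = table.get(vocal_phrase)
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Toy parallel corpus: (vocal phrase, instrumental response) pairs.
# The note sequences are invented for illustration only.
corpus = [
    (("D", "E", "F"), ("F", "E", "D")),
    (("D", "E", "F"), ("F", "E", "D")),
    (("D", "E", "F"), ("F", "G", "A")),
    (("G", "A"), ("A", "G")),
]

table = build_phrase_table(corpus)
print(best_response(table, ("D", "E", "F")))
print(best_response(table, ("C", "D")))  # unseen vocal phrase
```

In a real translation pipeline the alignment between source and target phrases is itself learned from the parallel corpus; here it is taken as given to keep the example short.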

Torjman, a project funded by an Algerian research program (PNR). From January 2011 to December 2012

The consortium is composed of Loria, University of Annaba, University of Algiers and CRSTDLA.

Torjman is a research project funded by an Algerian research program. It is dedicated to automatic translation from standard Arabic to Arabic dialects. Indeed, in Arab countries the mother tongue is the dialect, which differs from one region to another. Standard Arabic is taught in school as a second language and is used in TV programs, newspapers and religious practice.
To ease communication between people across the Arab world, it is necessary to develop systems that translate into Arabic dialects. The first step was an analysis to understand the underlying structures of these pseudo-languages, which are inspired by the Arabic language but also by Berber, French, Turkish, … The second step was to collect, or more precisely to build, dialect corpora in order to develop translation systems from standard Arabic to colloquial Arabic. Torjman led to a machine translation system and to a parallel corpus initially composed of two Algerian corpora aligned with MSA (Modern Standard Arabic). After the project, we developed a larger parallel corpus, PADIC, containing two Algerian dialects and the Tunisian, Moroccan, Palestinian and Syrian dialects, all aligned with MSA. Torjman was administered by Badji Mokhtar University – Annaba, Algeria.