Assistant Professor · NLP Researcher

Oumaima Hourrane

Researching Natural Language Processing with a focus on low-resource languages, semantic textual similarity, and multilingual deep learning. Building language tools that work for everyone.

Low-resource NLP · Semantic Similarity · Arabic NLP · Plagiarism Detection · Multilingual LLMs · African Languages
Al Akhawayn University · School of Science & Engineering · Ifrane, Morocco
Hassan II University, Casablanca · PhD in Computer Science · 2022
20+ Publications
24 GitHub Repos
14 Languages (SemEval)
2022 PhD Awarded

Kambule Doctoral Award Runner-Up
Deep Learning Indaba 2023

01

About Me

I am an Assistant Professor of Computer Science at the School of Science and Engineering (SSE) at Al Akhawayn University in Ifrane, Morocco. My work sits at the intersection of deep learning, linguistics, and equity in AI.

My PhD research, completed at Hassan II University of Casablanca in 2022, focused on semantic textual similarity and cross-lingual plagiarism detection using deep learning, covering extrinsic, intrinsic, and cross-lingual forms that go beyond simple copy-paste.

I am passionate about NLP for low-resource and morphologically rich languages, including Arabic, Moroccan Darija, and African languages. I contribute to large multilingual benchmarks and shared tasks, including SemEval and AfriSenti, helping build the data infrastructure for fairer language technology.

I also served as a Research Assistant at UNDP and previously worked as a software engineer before fully committing to research. I care deeply about building language models and resources others can rely on.

2022 – Present
Assistant Professor, Al Akhawayn University · School of Science & Engineering · Ifrane, Morocco
2020 – Present
Research Assistant, UNDP · Natural Language Processing
2017 – 2022
PhD in Computer Science · Hassan II University of Casablanca
2018 – 2021
Teaching Assistant · Hassan II University of Casablanca
2011 – 2016
Engineering Degree · Cadi Ayyad University · National School of Applied Sciences
02

Selected Publications

ACL 2024
Findings of ACL · 2024
SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages
Nedjma Ousidhoum, Shamsuddeen Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Ahmad, Sanchit Ahuja, Alham Aji, Oumaima Hourrane, et al.
EMNLP 2023
Conference on Empirical Methods in NLP · 2023
AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Adelani, Oumaima Hourrane, et al.
Journal ยท 2022
Full-text Available · September 2022
Graph Transformer for Cross-Lingual Plagiarism Detection
Oumaima Hourrane, EL Habib Benlahmar
ACL Anthology
Multi-language Emotion Recognition · 2024
BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages
Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Oumaima Hourrane, et al.
Best Thesis
PhD Thesis · Hassan II University · 2022
Semantic Textual Similarity based on Deep Learning: Towards the Automatic Paraphrastic Detection of Monolingual and Cross-Lingual Plagiarism
Oumaima Hourrane · Runner-up, Kambule Doctoral Award, Deep Learning Indaba 2023
View All on Google Scholar
03

Research Interests

Low-resource NLP
Building language models and benchmarks for underrepresented languages, including Moroccan Darija, Modern Standard Arabic, and African languages.
Semantic Textual Similarity
Measuring and modeling semantic relatedness between sentences across languages, from shallow lexical overlap to deep conceptual equivalence.
Plagiarism Detection
Cross-lingual and paraphrastic plagiarism detection using deep learning, going beyond copy-paste to detect translated and idea-based plagiarism.
Graph Neural Networks for Text
Applying GNNs and graph transformers to NLP tasks like plagiarism detection and document structure modeling.
Dialogue & LLMs
Exploring large language models for low-resource settings, chatbot development, and educational question classification.
Dataset Construction
Creating high-quality annotated corpora for Arabic NLP, sentiment analysis, and multilingual benchmarks used by the global research community.
04

Get in Touch

I'm always happy to connect with researchers, collaborators, and students interested in NLP, low-resource languages, or AI for Africa and the Arab world. Feel free to reach out via any channel below.

Want to chat? Book a 15-minute meeting directly:
calendly.com/o-hourrane-aui/15min

Email
[email protected]
Google Scholar
Al Akhawayn University
GitHub
OumaimaHourrane · 24 repos
LinkedIn
oumaima-hourrane
ResearchGate
Hassan II University, Casablanca
ACL Anthology
Full publication list