For code, check out my GitHub page
Business Model Prevalence in Influencer Marketing on Instagram: This research project combines expertise in Natural Language Processing (NLP) and European Consumer Law. It aims to fill a gap by determining which influencer marketing business models can be identified on social media and how influencers use them.
Abusive Language on YouTube: A dataset of English YouTube comments extracted from videos on various controversial topics and labelled by law students. The comments were sampled directly from the collected data, without any artificial methods for increasing the proportion of abusive content.
Enelvo: A flexible normaliser for user-generated content in Portuguese. It is a tool (and Python module) for correcting non-standard words such as internet slang, spelling mistakes, and acronyms. I developed it during my master’s.
MultiMT: Multi-modal Context Modelling for Machine Translation. I worked on this project while I was a student at the University of Sheffield.