- Create Date October 4, 2021
- Last Updated October 4, 2021
YORAA: AN AUTHORSHIP ATTRIBUTION OF YORÙBÁ TEXTS
ABSTRACT:
Authorship attribution is the process of establishing the most likely author of a collection of texts or documents whose authorship must be verified. Several studies on the task have been reported in the literature, but work on Yorùbá language texts is rare. This paper reports the development of an automatic authorship attribution system for written Yorùbá texts (YorAA). The literary works of six Yorùbá authors were considered. Stylometric features were extracted from the texts using the bag-of-words (BoW) approach and a lexical/syntactic word-frequency approach. The Support Vector Machine, Multilayer Perceptron and Random Forest algorithms were used for classification. The experimental results showed that the developed YorAA system achieved accuracy, recall, precision and F1 values of 95%, 83%, 84% and 84% respectively, averaged over the six authors. The results demonstrate that, given a database of written Yorùbá texts large enough to extract the relevant stylometric features of an author, and with appropriate methods and tools applied to those features, the authorship of the texts can be identified or verified.
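The feature extraction and per-author evaluation summarised in the abstract can be illustrated with a minimal sketch. It is not the YorAA implementation: the function names (`tokenize`, `bow_vector`, `lexical_features`, `macro_metrics`) are hypothetical, the two lexical features shown are only common stylometric examples, and the one-vs-rest averaging is an assumption about how "on the average, for all the six authors" might be computed. The classifiers themselves (SVM, MLP, Random Forest) are left out.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Lowercase, NFC-normalise, and split text into word tokens.

    Combining marks (U+0300-U+036F) are kept inside tokens, because some
    tone-marked Yorùbá letters have no single precomposed code point.
    """
    normalized = unicodedata.normalize("NFC", text.lower())
    return re.findall(r"[\w\u0300-\u036f]+", normalized)


def bow_vector(tokens, vocabulary):
    """Bag-of-words: count how often each vocabulary word occurs."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def lexical_features(tokens):
    """Two illustrative lexical features (assumed, not taken from the paper)."""
    total = len(tokens)
    if total == 0:
        return {"type_token_ratio": 0.0, "hapax_ratio": 0.0}
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return {
        "type_token_ratio": len(counts) / total,  # distinct words / all words
        "hapax_ratio": hapaxes / total,           # words used exactly once
    }


def macro_metrics(y_true, y_pred):
    """Average one-vs-rest accuracy, precision, recall and F1 over authors."""
    authors = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    sums = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for author in authors:
        tp = sum(1 for t, p in pairs if t == author and p == author)
        fp = sum(1 for t, p in pairs if t != author and p == author)
        fn = sum(1 for t, p in pairs if t == author and p != author)
        tn = len(pairs) - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        sums["accuracy"] += (tp + tn) / len(pairs)
        sums["precision"] += precision
        sums["recall"] += recall
        sums["f1"] += f1
    return {name: value / len(authors) for name, value in sums.items()}
```

Under this assumed averaging, true negatives count toward each author's accuracy but not toward recall, so averaged accuracy can sit well above averaged recall. Whether that explains the gap between the reported 95% and 83% is not stated in the source.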
Keywords: Authorship attribution; Stylometry; Text classification; Yoruba language; Yorùbá written texts.