Arabic Linked Drug Dataset Consolidating and Publishing


Guma Lakshen, Valentina Janev, Sanja Vraneš




This paper examines the process of creating and publishing an Arabic Linked Drug Dataset based on open drug datasets from selected Arabic countries, and discusses the quality issues considered in the linked data lifecycle when establishing a semantic Data Lake in the pharmaceutical domain. By representing the data in an open, machine-readable format, the approach offers an effective way to disseminate drug information and to build specialized applications. The authors contribute to opening the drug datasets from Arabic countries, interlinking the data with diverse repositories such as DrugBank and DBpedia, and publishing it in a standard open manner that allows further integration and the building of different business services on top of the integrated data. The paper showcases how the drug industry can take advantage of emerging trends to build competitive advantages. However, as elaborated in the paper, a better understanding of the specifics of the Arabic language is needed in order to extend the use of linked data technologies in Arabic companies.