Clásicos Hispánicos
Alongside my work at the University, my most important project is the independent eBook collection Clásicos Hispánicos. The collection publishes Spanish classics in EPUB and MOBI formats, with texts prepared by specialists and reviewed by a second specialist. We have developed our
Corpus of the Spanish Novel from 1880-1940
As part of my work in the CLiGS research group, I have published a small corpus of Spanish novels in XML-TEI called Corpus of Spanish Novels from 1880-1940. Several versions are available in our GitHub repository: XML-TEI, plain text, linguistically annotated XML, and PDF.
This corpus is only a teaser of the real corpus I am currently working on, which will be published at the end of the project.
Toolbox
As part of my work at CLiGS, I have also contributed to our repository of Python scripts. My main contributions concern the conversion from HTML to XML-TEI, the processing and extraction of metadata, and the work with stylometric matrices.
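To illustrate what a stylometric matrix is, here is a minimal sketch that is not taken from the CLiGS scripts. It builds a table of relative frequencies for the most frequent words across a set of texts. The function name `relative_frequency_matrix`, the whitespace tokenisation, and the sample texts are all assumptions made for the example.

```python
from collections import Counter


def relative_frequency_matrix(texts, n_features=3):
    """Build a simple stylometric matrix (hypothetical helper).

    texts: dict mapping a text name to its content as a string.
    Returns the list of selected words and, per text, the relative
    frequency of each selected word.
    """
    # Naive tokenisation by whitespace; real corpora need a proper tokeniser.
    counts = {name: Counter(text.lower().split()) for name, text in texts.items()}

    # Word counts over the whole collection.
    total = Counter()
    for counter in counts.values():
        total.update(counter)

    # Most frequent words, ties broken alphabetically so the result is stable.
    ranked = sorted(total.items(), key=lambda item: (-item[1], item[0]))
    features = [word for word, _ in ranked[:n_features]]

    rows = {}
    for name, counter in counts.items():
        length = sum(counter.values())
        rows[name] = [counter[w] / length if length else 0.0 for w in features]
    return features, rows


# Invented sample texts, for illustration only.
sample_texts = {
    "novel_a": "la casa de la madre",
    "novel_b": "el hijo de la casa",
}
features, rows = relative_frequency_matrix(sample_texts)
print(features)
for name, row in rows.items():
    print(name, row)
```

Each row describes one text and each column one frequent word, which is the shape that distance measures and classifiers in stylometry usually expect.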
XML-TEI-Bible
I am currently editing the Bible in Spanish chapter by chapter, marking people, places, groups, and direct speech with identifiers (including who is talking to whom). After editing, I also extract this information and visualise it as graphs.
Everything about this project is published in the GitHub repository.
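The extraction step described above can be sketched as follows. This is a minimal illustration and not the project's actual code. It assumes that direct speech is encoded as TEI `<q>` elements carrying `who` and `toWhom` attributes. The sample fragment, the identifiers `per1` and `per2`, and the function name `speech_edges` are invented for the example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Invented sample fragment; the real encoding of the project may differ.
SAMPLE = """<div xmlns="http://www.tei-c.org/ns/1.0" type="chapter">
  <ab><q who="per1" toWhom="per2">Where are you?</q></ab>
  <ab><q who="per2" toWhom="per1">I heard you in the garden.</q></ab>
  <ab><q who="per1" toWhom="per2">Who told you?</q></ab>
</div>"""


def speech_edges(xml_text):
    """Count directed speaker -> addressee pairs (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    edges = Counter()
    for q in root.iter(TEI_NS + "q"):
        speaker = q.get("who")
        addressee = q.get("toWhom")
        # Skip speech acts where either side is not identified.
        if speaker and addressee:
            edges[(speaker, addressee)] += 1
    return edges


edges = speech_edges(SAMPLE)
for (speaker, addressee), weight in sorted(edges.items()):
    print(f"{speaker} -> {addressee}: {weight}")
```

The resulting counter is a weighted edge list: each key is a directed pair and each value the number of speech acts, which a graph library or visualisation tool can take as input.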
Stylometry on Political Text
I have been developing a corpus of Spanish political manifestos and studying it with machine learning techniques and stylometry. You can find some results here.
Casa de Citas
A database and website of quotes from Spanish literature that I find interesting while reading. It allows advanced searches, including semantic filters.
Der-die-das
A database of German morphological gender covering more than 5000 German words, intended to make this part of German grammar easier to learn for Spanish speakers.