The Dov8bot

By Tyler Gaydos and Owen Helm

Project Documentation

The goal of this project is to create a text-generating bot that can take input and respond in dovahzul, the constructed language (or conlang) of the dragons from The Elder Scrolls V: Skyrim.

The sources for this project can be found under the Sources tab in the top navigation bar. Our primary source for the conlang was Thu'um.org, a community-driven site dedicated to learning and using dovahzul. Using the 5th Edition Dragon-English dictionary, as well as some text files scraped from The Unofficial Elder Scrolls Pages wiki, we were able to assemble a small collection of files showing the language in action.

Once our source files were formatted, we turned to Huggingface.co to find a pretrained LLM (large language model) that we could train further on our dragon text files. We settled on BLOOM, an LLM built to continue text from a prompt, which can output text in 46 different languages.

Currently, we are experimenting with BLOOM and learning how to feed it our training data.

The Process

This project began with web scraping. Owen put together a web scraper that allowed us to pull text from The Unofficial Elder Scrolls Pages. He also found the dictionary on Thu'um.org and a downloadable text file of the script of Skyrim, which included the main questlines for the base game as well as the DLC. At the same time, Tyler (with the help of Dr. Bondar) was at work deciphering the special dragon font used in the dictionary, which is described further on our Challenges page.
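As a rough illustration of the scraping step, the Python sketch below pulls paragraph text out of an HTML page using only the standard library. It is not the scraper Owen wrote: the class name `ParagraphText`, the helper `extract_paragraphs`, and the sample markup are all hypothetical, and a real wiki page would need its own handling of navigation, tables, and templates.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collects the text found inside <p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraphs, in document order
        self._buffer = None    # text pieces of the <p> being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a <p>; text of nested tags such as
        # <i> or <a> is delivered here as well.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            # Collapse runs of whitespace and skip empty paragraphs.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def extract_paragraphs(html):
    """Return the paragraph texts of an HTML string as a list."""
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Made-up sample markup, standing in for a downloaded wiki page.
SAMPLE = """
<html><body>
  <div id="nav">Home | Lore | Skyrim</div>
  <p>First <i>placeholder</i> paragraph about a dragon.</p>
  <p>   </p>
  <p>Second placeholder paragraph.</p>
</body></html>
"""

if __name__ == "__main__":
    for line in extract_paragraphs(SAMPLE):
        print(line)
```

Restricting the output to paragraph text is one simple way to leave navigation menus out of the saved files; it is a design choice for this sketch, not a description of the project's scraper.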

Once the font was deciphered (the key we made can be found in our GitHub repo), we had to prepare our files for use. Tyler used regular expressions on the script to strip out all text except the dialogue that he and Owen deemed relevant to the project: lines spoken by dragons and the Greybeards. After the regex cleanup, XSLT was run to select only the relevant lines.


            <xsl:output method="text"/>
            <!-- Select only the dialogue of the speakers deemed relevant -->
            <xsl:template match="root">
                <xsl:apply-templates select="//dia[@speaker = 'Alduin' or @speaker = 'Paarthurnax' or @speaker = 'Odahviing' or @speaker = 'Mirmulnir' or @speaker = 'Greybeards' or @speaker = 'Miraak' or @speaker = 'Sahrotaar' or @speaker = 'Sahloknir' or @speaker = 'Durnehviir']"/>
            </xsl:template>
            <!-- Output each selected line followed by a single line break -->
            <xsl:template match="dia">
                <xsl:apply-templates/><xsl:text>&#10;</xsl:text>
            </xsl:template>
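
For readers who do not use XSLT, the same selection can be sketched in Python with the standard library. The element name `dia`, the `speaker` attribute, the `root` element, and the speaker names are taken from the stylesheet above; the function name `relevant_lines` and the sample XML are made up for illustration, and this sketch is not part of the project's pipeline.

```python
import xml.etree.ElementTree as ET

# Speaker names copied from the XPath predicate in the stylesheet.
SPEAKERS = {
    "Alduin", "Paarthurnax", "Odahviing", "Mirmulnir", "Greybeards",
    "Miraak", "Sahrotaar", "Sahloknir", "Durnehviir",
}


def relevant_lines(xml_text):
    """Return the text of every <dia> whose speaker is in SPEAKERS."""
    root = ET.fromstring(xml_text)
    lines = []
    for dia in root.iter("dia"):  # visits every <dia>, like //dia in XPath
        if dia.get("speaker") in SPEAKERS:
            # itertext() also gathers text from any nested elements.
            lines.append("".join(dia.itertext()).strip())
    return lines


# Hypothetical sample input; the dialogue text is placeholder content.
SAMPLE = """<root>
  <dia speaker="Paarthurnax">Placeholder dragon line.</dia>
  <dia speaker="Lydia">Placeholder line from another character.</dia>
  <dia speaker="Greybeards">Placeholder Greybeards line.</dia>
</root>"""

if __name__ == "__main__":
    # One selected line per row, as the text output method would give.
    print("\n".join(relevant_lines(SAMPLE)))
```

Keeping the speaker names in a set makes the list easier to extend than a long chain of `or` comparisons, at the cost of moving the logic out of the XSLT pipeline.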

Once we had all the necessary files, our next goal was to create a network visualization. After some discussion with Dr. Bondar on how best to use our available resources, we decided to compare several of the files scraped from The Unofficial Elder Scrolls Pages against each other. Specifically, the files we analyzed were those relating to some of the important dragons in the story of The Elder Scrolls V: Skyrim.

Dovahbot by Tyler Gaydos and Owen Helm is licensed under CC BY-NC-SA 4.0