Digital humanities: Analyzing the clause functions of a term's co-occurrences

Within the digital humanities project eXChange, historians and classical philologists work with a database containing a large number of digitized historical texts in Latin and ancient Greek. Usually, humanities scholars pose keyword-based search queries and often receive numerous results, which are hard to review individually. As a consequence, the generation of valuable hypotheses is a laborious, time-consuming process. TagSpheres are used in this project to facilitate the scholars' workflows by supporting one specific research interest: the analysis and classification of a term's co-occurrences according to their clause functions. For this purpose, the scholars required four-level TagSpheres displaying the following tags:

  • H1 : search term T,
  • H2 : co-occurrences of T with word distance 1,
  • H3 : co-occurrences of T with word distance 2, and
  • H4 : co-occurrences of T with word distance 3 up to word distance m.
The font size of T on level H1 encodes how frequently the search term occurs in the underlying text corpus; the font sizes of all other terms reflect their number of co-occurrences with T at the corresponding distance.
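
The mapping from word distances to hierarchy levels can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `cooccurrence_levels`, the choice to count neighbors on both sides of T, and the treatment of the corpus as one flat token list (no sentence boundaries, no lemmatization) are all assumptions made here. The sample tokens are placeholders, not corpus data.

```python
from collections import Counter


def cooccurrence_levels(tokens, term, m):
    """Count co-occurrences of `term` per hierarchy level.

    Returns (term_frequency, levels), where levels[2], levels[3] and
    levels[4] hold counts for word distance 1, 2 and 3..m respectively.
    Assumption: distance is measured in both directions.
    """
    levels = {2: Counter(), 3: Counter(), 4: Counter()}
    term_frequency = 0
    for i, token in enumerate(tokens):
        if token != term:
            continue
        term_frequency += 1  # drives the font size of T on H1
        for distance in range(1, m + 1):
            # Distance 1 -> H2, distance 2 -> H3, distance 3..m -> H4.
            level = 2 if distance == 1 else 3 if distance == 2 else 4
            for j in (i - distance, i + distance):
                if 0 <= j < len(tokens) and tokens[j] != term:
                    levels[level][tokens[j]] += 1
    return term_frequency, levels


# Placeholder token sequence with the search term "T" in the middle.
sample_tokens = ["a", "b", "T", "c", "d", "e"]
sample_frequency, sample_levels = cooccurrence_levels(sample_tokens, "T", m=3)
```

Each counter value would then be mapped to a font size on its level; how that scaling is done is left open here.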

Sports: Visualizing the performances of teams in championships

This scenario illustrates how TagSpheres can be used to comparatively visualize performances in championships. We processed a dataset containing the results of all national teams that have ever qualified for the FIFA World Cup. This yields the following six-level hierarchy:

  • H1 : FIFA World Champions,
  • H2 : second-placed national teams,
  • H3 : national teams knocked out in the semifinal,
  • H4 : national teams knocked out in the quarterfinal,
  • H5 : national teams knocked out in the second round (second group stage or last 16), and
  • H6 : national teams knocked out in the (first) group stage.
The nations' names are used as tags, and font size encodes how often a national team took part in a championship round without reaching the next level. Consequently, most nations occur on several hierarchy levels.
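
The aggregation behind this hierarchy can be sketched as a simple count per level and team. The record layout (one `(team, final_stage)` pair per tournament appearance), the stage labels in `STAGE_TO_LEVEL`, and the function name are assumptions for illustration; the team names below are placeholders, not actual results.

```python
from collections import Counter, defaultdict

# Hypothetical stage labels mapped to the six hierarchy levels H1..H6.
STAGE_TO_LEVEL = {
    "champion": 1,
    "runner_up": 2,
    "semifinal": 3,
    "quarterfinal": 4,
    "second_round": 5,
    "group_stage": 6,
}


def championship_levels(records):
    """Count, per hierarchy level, how often each team finished there.

    `records` holds one (team, final_stage) pair per tournament
    appearance; the resulting counts would drive the font sizes.
    """
    levels = defaultdict(Counter)
    for team, final_stage in records:
        levels[STAGE_TO_LEVEL[final_stage]][team] += 1
    return dict(levels)


# Placeholder appearances; "Team A" ends up on two different levels.
sample_records = [
    ("Team A", "champion"),
    ("Team A", "quarterfinal"),
    ("Team A", "quarterfinal"),
    ("Team B", "runner_up"),
    ("Team C", "group_stage"),
]
sample_levels = championship_levels(sample_records)
```

Because every appearance is counted on the level where the team's run ended, a team with a varied history appears as a tag on several levels at once.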

Aviation: Visualizing all non-stop flights of an airport

To analyze the national, continental and worldwide connectivity of airports, we derived a dataset from the OpenFlights database, which provides a list of direct flight connections between around 3,200 airports worldwide. With the selected departure airport (or city) d on H1, all other airports (or cities) reachable with a non-stop flight cluster into three further hierarchy levels:

  • H2 : airports/cities in the same country as d,
  • H3 : airports/cities on the same continent as d, and
  • H4 : all other reachable worldwide airports/cities.
As tags we chose either airport names, the provided IATA codes, or the corresponding city names. In this scenario, font size encodes the inverse geographical distance between the departure airport d and the arrival airport.
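
A minimal sketch of this classification and weighting is given below. The `Airport` record, the function names, the use of the haversine great-circle formula as the geographical distance, and the 1 km floor that avoids division by zero are assumptions made for illustration; the source does not specify how the distance is computed. The airport codes and coordinates are placeholders.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Airport:
    code: str
    country: str
    continent: str
    lat: float  # degrees
    lon: float  # degrees


def haversine_km(a, b):
    """Great-circle distance between two airports in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def connectivity_levels(departure, destinations):
    """Assign each destination to H2, H3 or H4 with an inverse-distance weight."""
    levels = {2: {}, 3: {}, 4: {}}
    for dest in destinations:
        if dest.country == departure.country:
            level = 2  # same country as d
        elif dest.continent == departure.continent:
            level = 3  # same continent as d
        else:
            level = 4  # rest of the world
        # Assumed floor of 1 km so the weight stays finite.
        distance = max(haversine_km(departure, dest), 1.0)
        levels[level][dest.code] = 1.0 / distance
    return levels


# Placeholder airports on the equator; not taken from OpenFlights.
d = Airport("AAA", "X", "C1", 0.0, 0.0)
sample_levels = connectivity_levels(d, [
    Airport("BBB", "X", "C1", 0.0, 1.0),
    Airport("CCC", "Y", "C1", 0.0, 10.0),
    Airport("DDD", "Z", "C2", 0.0, 90.0),
])
```

With an inverse weight, nearby destinations receive larger tags than distant ones, so the closest airports dominate their level.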