Cultural and heritage data resources

Data sources

Language data sources on Hugging Face (e.g. isiZulu) and Kaggle

Broadcast Research Council of South Africa data on audience trends for TV and radio are now available online.

National Archives and Records Service of South Africa (NARSSA) search of more than 8.3 million items on the 'new' database (partial) and the 'old' database (complete).

South African Heritage Resources Agency (SAHRA) manages the South African Heritage Resources Information System (SAHRIS)

South African History Archives (SAHA) collections

HSRC Press open access book collection (400+ titles)

Wits University Historical Papers Research Archive

Various digital collections at the University of Cape Town.

Zamani Project is posting GIS spatial data packages and 3D models of heritage sites across Africa

South African Centre for Digital Language Resources (SADiLaR) collection of language resources

Repository of the 500 Year Archive, an experimental digital research tool. It is designed to support historical enquiry into the five hundred years before colonialism in what is today KwaZulu-Natal and neighbouring regions. It brings together online a diverse range of materials, including texts, images, recordings, excavated items and botanical material, as well as early vernacular publications.

According to the National Library of South Africa (NLSA), "all newspapers published in South Africa are collected by the NLSA" and a digital archive is being implemented.

Example cultural data applications

See this exhibition of projects developed for #HackUrCulture, hosted by the Goethe-Institut Johannesburg and Credipple in October 2020.

Here are some examples of what can be done with open cultural data. Most are from the US, so please Tweet us if you have seen others.

Map a trip using the New York Public Library (NYPL) Green Book items

Southern Mosaic is a visual story using data from the US Library of Congress

Also by the New York Public Library, a visual grouping of 180,000+ public domain items

The Met has collaborated with Google to enable searching of archives using colour

A visual timeline of the Harvard Art Museums collection

Additional reading on open cultural data

Exploring Arts Engagement with (Open) Data by Tim Davies

Open cultural data: Curating GLAM in the digital age in the Jakarta Post

Data as Culture with ODI

A Nerd’s Guide To The 2,229 Paintings At MoMA and the data on GitHub

How We Learned to Stop Worrying and Love Open Data: A Case Study in the Harvard Art Museums’ API by Harvard Art Museums

A list of 'Cool stuff made with cultural heritage APIs'

120kMoMA - A data visualization study of The Museum of Modern Art collection dataset of 123,919 records

Using Public Domain Materials in the Classroom by New York Public Library

Blog on how people have used MoMA’s data so far

Tools to try

For visualisation, there are many tools to try, such as Flourish and Datawrapper. If you're more technical and using Python or R, have a look at this summary of libraries.

Have a look at these storytelling tools from Knightlab including Timeline, StoryMap, Soundcite and Juxtapose.

For mapping relationships or networks as a story, try GraphCommons; see this example of three musicians in a recording ecosystem. Kumu is also popular for network visualisation.

For mapping, a tool like Kepler is relatively easy to use. For more detail on working with spatial data, see this page.

If you want to get data tables out of PDFs you can try Tabula. OpenRefine is good for cleaning data.
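
If you prefer to script the kind of basic clean-up that OpenRefine handles interactively, the sketch below shows one possible approach using only Python's standard library. It is not OpenRefine's own method: the function name `clean_rows` and the sample table are hypothetical, and it assumes every row has the same number of columns as the header.

```python
import csv
import io

def clean_rows(raw_csv):
    """Tidy whitespace in a CSV table and drop near-duplicate rows."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    seen = set()
    cleaned = []
    for row in reader:
        # Strip leading/trailing spaces and collapse repeated inner spaces
        # in both the column names and the cell values.
        tidy = {" ".join(k.split()): " ".join((v or "").split())
                for k, v in row.items()}
        # Rows that differ only in spacing or letter case count as duplicates.
        key = tuple(v.casefold() for v in tidy.values())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(tidy)
    return cleaned

# Hypothetical sample table, for illustration only.
sample = (
    "title, year\n"
    "  Harbour  map ,1902\n"
    "harbour map,1902\n"
    "Market photograph,1931\n"
)

for row in clean_rows(sample):
    print(row)
```

The first occurrence of a row is kept, so the order and original capitalisation of the table are preserved.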

If you want to analyse text in books or articles (e.g. to identify people and places), there are lots of tools to try, such as TextRazor, Intellexer and Google's Natural Language API.
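
To get a feel for what identifying people and places involves, here is a deliberately crude sketch that counts runs of capitalised words as candidate names. It is a rough stand-in and not how the services above work; the names `CANDIDATE`, `COMMON_OPENERS` and `candidate_names`, and the sample sentence, are all hypothetical.

```python
import re
from collections import Counter

# One or more capitalised words in a row, e.g. "Cape Town".
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

# A small, hand-picked list of words that often start a sentence (assumption).
COMMON_OPENERS = {"The", "A", "An", "It", "This", "In", "On"}

def candidate_names(text):
    """Count capitalised phrases that might be names of people or places."""
    matches = (m.group(0) for m in CANDIDATE.finditer(text))
    return Counter(name for name in matches if name not in COMMON_OPENERS)

# Hypothetical sample text, for illustration only.
sample = ("The archive holds letters sent from Cape Town to Durban. "
          "Replies from Durban arrived later.")

print(candidate_names(sample).most_common())
```

In this sample the sentence-initial word "Replies" is also counted, which is exactly the kind of false positive that dedicated text analysis tools are built to avoid.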
