Working with spatial data in South Africa
Last updated
If you’ve done basic spreadsheet editing and used Google Maps, then you can create exciting spatial data visualisations on a variety of topics, including service delivery.
For any point on a map the location is defined by:
latitude - tells you the y-axis/ North-South position, and is always a negative number in South Africa.
longitude - tells you the x-axis/ East-West position, and is always a positive number in South Africa.
In the example image below, you will see that Zingisa No 1 Primary School has a longitude of 24.7334 and latitude of -28.7203.
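The sign rules above give a quick sanity check for any coordinate pair. The following is a minimal Python sketch, assuming rough bounding ranges of 15 to 35 degrees for longitude and -35 to -20 degrees for latitude; the function name and the exact limits are illustrative choices, not part of any mapping tool.

```python
# Rough bounding ranges for South Africa (assumed for illustration only).
LON_RANGE = (15.0, 35.0)    # longitude is positive (East)
LAT_RANGE = (-35.0, -20.0)  # latitude is negative (South)


def check_sa_coordinate(longitude, latitude):
    """Return a list of problems found with a (longitude, latitude) pair."""
    problems = []
    if not LON_RANGE[0] <= longitude <= LON_RANGE[1]:
        problems.append("longitude outside 15 to 35")
    if not LAT_RANGE[0] <= latitude <= LAT_RANGE[1]:
        problems.append("latitude outside -35 to -20")
    # A common mistake is putting the two values in the wrong columns.
    swapped_ok = (LON_RANGE[0] <= latitude <= LON_RANGE[1]
                  and LAT_RANGE[0] <= longitude <= LAT_RANGE[1])
    if problems and swapped_ok:
        problems.append("longitude and latitude may be swapped")
    return problems


# Zingisa No 1 Primary School, as given in the text.
print(check_sa_coordinate(24.7334, -28.7203))
# The same values entered in the wrong order.
print(check_sa_coordinate(-28.7203, 24.7334))
```

An empty list means the pair looks plausible for South Africa; it does not confirm that the point is in the right town.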
You can associate any general information with a point location in a spreadsheet row (such as school name/ size/ type or land use/ owner/ value) and this can be imported into a spatial mapping and visualisation tool as a new layer, as long as there is a longitude and latitude column and you have prepared the file for import. In the image of Sol Plaatje, a layer of school point information has been placed on top of other layers as yellow circles.
There are some great new free and open source online tools that make it easy to import new layers of information on top of a base map and to create visual stories, such as kepler.gl, which is built on Mapbox.
Point information (like the school name and location) can be imported from a spreadsheet, but you will need to save it as a comma-separated csv file (NB: not semi-colon-separated), otherwise the mapping tool will not import the file. See detailed instructions in the mini-guide.
Line or region information (like the suburb populations) needs to be imported in a format such as Shapefile, GeoJSON or KML. These files describe a line or a bounded shape rather than a single point.
Longitude and latitude: If you want to show any information (such as a school name) on a map, you need to have the location of the school in the form of longitude and latitude coordinates and tag the name onto that location pin. Latitude tells you the y-axis/ North-South position, longitude tells you the x-axis/ East-West position. In South Africa the longitude will always be a positive number between 15 and 35. The latitude will always be a negative number between -20 and -35. If you only have the address of the school, you need to convert it to longitude and latitude. In the map of Sol Plaatje below, you will see that Zingisa No 1 Primary School has a longitude of 24.7334 and latitude of -28.7203.
Layers of information: Every map is built up of layers of information starting with a group of ‘base’ layers that typically show a satellite image (or ‘raster’), roads, some key points and possibly municipality or ward boundaries. People often give you 'Shapefiles' (see below) for this information. You then have to insert additional layers of information on top of the base layers, e.g. the name and location of schools in Sol Plaatje or the Northern Cape. A layer is built from one of three types of information:
Point: single location on a map defined by longitude and latitude (e.g. location of a school)
Line: connections between points (e.g. sections of road)
Polygon: connection of multiple lines into a shape (e.g. boundary of a municipality or ward)
Spatial data file formats: Location information is usually combined with other non-spatial information into spatial data files that you can show on a map (e.g. the names of schools are linked to their longitude and latitude and shown as yellow points on the above map).
Shapefile: a widely used spatial data format, usually distributed as a folder or .zip containing a minimum of .shp, .shx and .dbf files. It can hold point information (like a csv) but also line and polygon information.
GeoJSON and other open formats: increasingly used as an alternative to Shapefiles, and hold similar information.
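To make the three layer types and the GeoJSON format more concrete, here is a small Python sketch that builds one feature of each type and prints the result as GeoJSON text. The school coordinates come from the text above; the road and ward coordinates and their names are made up purely for illustration.

```python
import json

# Point: a single [longitude, latitude] pair (GeoJSON puts longitude first).
school = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [24.7334, -28.7203]},
    "properties": {"name": "Zingisa No 1 Primary School"},
}

# Line: an ordered list of points (illustrative coordinates, not a real road).
road = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[24.73, -28.72], [24.74, -28.72], [24.75, -28.73]],
    },
    "properties": {"name": "Example road section"},
}

# Polygon: a closed ring, so the first and last points are identical
# (illustrative coordinates, not a real ward boundary).
ward = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [24.70, -28.75], [24.80, -28.75], [24.80, -28.70],
            [24.70, -28.70], [24.70, -28.75],
        ]],
    },
    "properties": {"name": "Example ward"},
}

# A layer is a collection of features.
layer = {"type": "FeatureCollection", "features": [school, road, ward]}
geojson_text = json.dumps(layer, indent=2)
print(geojson_text)
```

Saving the printed text to a file with a .geojson extension gives a file of the kind that mapping tools typically accept as a layer.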
To upload your data file into a mapping tool like kepler.gl, Mapbox, uMap or Carto you should prepare the file by addressing a few things, otherwise the tool may not load the file:
Location columns: Once you have your xls(x), ods or csv ensure that it includes two separate columns, one with longitude and one with latitude.
Location column format: Not necessarily required here but for future use, convert the longitude and latitude columns to number format. Also increase the number of decimal places to at least 5 otherwise you will affect the accuracy of the location information.
Non-number characters: Remove all non-number characters from the longitude and latitude columns. Even if there is only one row with non-number characters, the mapping tool may not load the entire dataset. You can find the problem rows by sorting the column Z → A and then manually editing or deleting problematic cell contents.
Delimiters: Ensure your operating system is set to save the csv using comma delimiters (,), not semi-colon delimiters (;). This is usually changed in the system region settings or similar.
File format: “Save As” to CSV UTF-8 file format. To be safe you can import your sheet into a Google Sheet and then download as csv from there.
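If changing the system region settings is not practical, the delimiter can also be fixed after the fact. The sketch below uses Python's standard csv module to rewrite semi-colon-delimited text as comma-delimited text; the function name and the sample row are illustrative assumptions.

```python
import csv
import io


def semicolons_to_commas(text):
    """Rewrite semi-colon-delimited csv text as comma-delimited csv text."""
    rows = csv.reader(io.StringIO(text), delimiter=";")
    out = io.StringIO()
    # The writer adds quotes around any value that itself contains a comma.
    writer = csv.writer(out, delimiter=",", lineterminator="\n")
    writer.writerows(rows)
    return out.getvalue()


sample = "name;longitude;latitude\nZingisa No 1 Primary School;24.7334;-28.7203\n"
print(semicolons_to_commas(sample))
```

For a real file you would read and write with `open(path, newline="", encoding="utf-8")` so that the output is also saved as UTF-8. Note that this sketch does not change decimal commas inside number values; those would still need to be fixed separately.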
Now open the csv file using a text editor like Notepad or TextEdit and check that:
There is a header row in the first line with the names of the columns
That the naming of longitude and latitude in the header row is correct
That names and values are separated by commas, not semi-colons
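The manual checks above can also be sketched as a short script. This is a minimal example, assuming the location columns are named longitude and latitude; your mapping tool may accept other names, and the function and sample rows here are hypothetical.

```python
import csv
import io


def check_points_csv(text, lon_col="longitude", lat_col="latitude"):
    """Return a list of problems that could stop a mapping tool loading the csv."""
    problems = []
    first_line = text.splitlines()[0] if text.strip() else ""
    if ";" in first_line and "," not in first_line:
        problems.append("header looks semi-colon-delimited")
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    for col in (lon_col, lat_col):
        if col not in header:
            problems.append(f"missing column: {col}")
    if problems:
        return problems
    # Line numbers assume one row per line (the header is line 1).
    for line_no, row in enumerate(reader, start=2):
        for col in (lon_col, lat_col):
            try:
                float(row[col])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {col} is not a number: {row[col]!r}")
    return problems


good = "name,longitude,latitude\nZingisa No 1 Primary School,24.7334,-28.7203\n"
# Hypothetical rows: the second school has a stray letter in its longitude.
bad = "name,longitude,latitude\nSchool A,24.7334,-28.7203\nSchool B,24.75 E,-28.71\n"
print(check_points_csv(good))
print(check_points_csv(bad))
```

An empty list means none of these particular problems were found; it is not a guarantee that the file will load.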
As noted above, your information on the school needs to be shown on top of a ‘base’ map of Sol Plaatje that shows roads and possibly municipality or ward boundaries. Luckily, most online mapping tools, including kepler.gl, already have a base layer of information for South Africa so you probably don’t need to do anything in this step.
However, if you do want to add specific base data you can source it from various locations, e.g.:
On kepler.gl click on “+ Add Data” and upload your csv file. This will automatically create a new layer on top of the base layers, and the map should zoom to the region where your points are located. If you hover over one of the points, it should pop up information about that school.
After creating your map you can now tweak the settings on the layer to enhance the visualisation. For example, you can make the ‘Radius’ of the school circle proportional to the number of learners in the school. By zooming into Sol Plaatje, you can quickly see where the big schools are in the municipality.
You can now add multiple additional layers as you did with the school information csv. If you are using csv files just make sure there are longitude and latitude columns, and click “+ Add Data” again.
If you would like to add more complex spatial information, such as average household income for sub-place/ suburb/ ward polygons in Sol Plaatje, then you will need to import GeoJSON or Shapefiles with this content.
So hopefully more/ larger schools (yellow bubbles) are located in the areas with higher population numbers (lighter colours), which seems to be the case for Sol Plaatje.
Table (xls(x), ods and csv): each row contains information about a point, such as the name of the school, number of teachers, number of learners and telephone number, as well as location information in the form of longitude and latitude or an address, which needs to be converted to longitude and latitude.
Mapping and visualisation: Uber has recently built an open source interface on top of Mapbox called kepler.gl (also see this ) which this mini-guide uses. This is probably the easiest tool to get started with, as it doesn’t require a sign-up or download. has a similar tool which you need to register to use. There are very lean cloud-based alternatives like or more advanced options like the base version of and which is a good proprietary starting tool with a 30 day trial (then $149/ month!). Also try uMap, which lets you create maps with OpenStreetMap layers. The most widely used open source tool by GIS practitioners is QGIS, which involves about a 700MB download over a few steps. QGIS also has a great web mapping plugin called QGIS2Web which makes it very easy to export an interactive map for users to explore. See this and an example output . See for supported open data projects including street coverage in OpenStreetMap, addresses in OpenAddresses, transit schedules in Transitland, or global names and places in the gazetteer Who's On First.
Conversion between various spatial data formats: You can import and export into different formats using QGIS. But a quick and easy online alternative is which converts between various formats, such as Shapefile to a GeoJSON. There are other open, closed and cloud-based tools such as and (only free up to 5MB) that can do other conversions if you need. Shapefiles (SHP) are increasingly being replaced by Geodatabase (GDB), so you may need to try .
Converting addresses to longitude and latitude: This is called geocoding. You can do this manually in Mapbox, Google Maps or OpenStreetMap by searching for the address and getting the longitude and latitude of the pin. You can do this automatically by using (uses OpenStreetMap), and , or through and OpenStreetMap. More broadly you can also geotag points and draw polygons on maps/ satellite images using the above mapping tools.
OpenStreetMap is a collaborative project to create an open map of the world.
See which is creating R building blocks and learning resources to make it easier to make data-driven maps in Africa.
Leaflet is used widely by app developers and is an open-source JavaScript library for mobile-friendly interactive maps.
Citizen-driven open mapping tools: Open Map Kit () and
Open spatial data formats: Geography Markup Language (GML) and GeoPackage (GPKG)
Private spatial mapping service providers in South Africa usually supply comprehensive spatial data information covering most of the above (and more): , , ,
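For simple point data, the kind of format conversion mentioned in the list above can also be sketched in a few lines of Python. The example below turns csv text into a GeoJSON FeatureCollection; it assumes columns named longitude and latitude, and the function name is hypothetical. It is an illustration of the idea, not a replacement for QGIS or the online converters.

```python
import csv
import io
import json


def csv_points_to_geojson(text, lon_col="longitude", lat_col="latitude"):
    """Convert csv text with longitude/latitude columns to a GeoJSON dict."""
    features = []
    for row in csv.DictReader(io.StringIO(text)):
        # Take the location out of the row; the rest becomes the properties.
        lon = float(row.pop(lon_col))
        lat = float(row.pop(lat_col))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": dict(row),
        })
    return {"type": "FeatureCollection", "features": features}


sample = "name,longitude,latitude\nZingisa No 1 Primary School,24.7334,-28.7203\n"
collection = csv_points_to_geojson(sample)
print(json.dumps(collection, indent=2))
```

Lines and polygons cannot be converted this way because a single csv row does not hold a whole shape; for those, a dedicated conversion tool is the better route.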
Sourcing spatial data: In some cases you don't have an xls(x), ods or csv file and need to scrape the data tables from a pdf or website. Have a look at this on using and crawlers to scrape data, and see the note in about using a mapping tool to geotag, draw polygons or geocode address data into latitude and longitude.
Location column names: Ensure that the longitude and latitude columns have names that the mapping tool recognises. If the column name is GIS_latitude or some other variation then the mapping tool may not identify the correct column.
If your csv is still not loading and you are using Mapbox or kepler.gl as your mapping tool, have a look at this .
For this example data story we wanted to check whether schools are situated near areas with higher populations (at level or ). In the image of Sol Plaatje above, the yellow circle shows where a school is located, and the size of the circle is proportional to the number of learners in the school. The colour of the suburb corresponds with the number of residents: lighter colours = larger population.
This example uses kepler.gl but you could use any of the other mapping tools and it would follow a similar approach. Note that kepler.gl runs as a client-side app and does not require a login, which means you can’t save to an online account. However, you can export your map to a JSON file on your desktop, which you can quickly import into kepler.gl at any time.
Province, municipality and ward boundaries: in open format KML from the or as proprietary format Shapefiles on
In this case we first download the updated school master list information from for schools under the section “Quarter 4 of 2016: March 2017”. You should first prepare the csv file before trying to import it into kepler.gl or another mapping tool.
The main source of population or demographic information is the StatsSA census and community surveys. You can access Census 2011 data down to ward level on , including for (click on Download and select GeoJSON). To do it yourself, location-based population information is accessible in two ways:
Municipality and higher levels: You can use QGIS to (inner) join the municipality boundary demarcation Shapefile (which has location information) mentioned above with municipality census information you can download from Nesstar or SuperWEB2.
Sub-municipality level: For sub-place and small area (smaller than ward level) census information you will need to contact StatsSA directly to request their 2011 Community Profile Database DVD set (e.g. for census) - which will then need to be processed into Shapefiles or similar. Or you can buy processed data from a commercial provider like .
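The (inner) join described for municipality-level data can be illustrated outside QGIS with plain Python. In this sketch the key name municipality_code, the codes, the names and the population figures are all made up for illustration; real boundary and census files will use their own column names and values.

```python
def inner_join(boundaries, census_rows, key="municipality_code"):
    """Attach census columns to boundary features that share the same key."""
    census_by_key = {row[key]: row for row in census_rows}
    joined = []
    for feature in boundaries:
        code = feature["properties"][key]
        if code in census_by_key:  # inner join: keep only matching records
            merged = dict(feature["properties"])
            merged.update(census_by_key[code])
            joined.append({**feature, "properties": merged})
    return joined


# Boundary features (geometry left empty to keep the example short).
boundaries = [
    {"type": "Feature", "geometry": None,
     "properties": {"municipality_code": "M1", "name": "Example A"}},
    {"type": "Feature", "geometry": None,
     "properties": {"municipality_code": "M2", "name": "Example B"}},
]
# Census table rows with made-up population values.
census_rows = [
    {"municipality_code": "M1", "population": 1000},
    {"municipality_code": "M3", "population": 2000},
]

result = inner_join(boundaries, census_rows)
print(result)
```

Only M1 appears in both inputs, so only that feature is kept, now carrying both its name and its population. This is why it matters that the code column is spelled and formatted identically in the boundary file and the census table.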
An extract of processed Census 2011 (extrapolated to 2014) Shapefiles for Sol Plaatje can be found , which you can import directly into kepler.gl in a similar way to the csv. An example of this imported into kepler.gl, as a new layer below the schools, is shown below. The colour of the polygons shows the population for the area: dark colours = lower population, light colours = higher population.