Then we want to boil this down to a binary image: 1 for yes-has-electricity, 0 for no-it-doesn’t. We do this by convolving and subtracting 2D filter, that basically concentrates sites with evidence of night time lights and zeros out the faded areas around them, resulting in the image below.
Then we need to connect these together, with our hypothetical grid lines. A minimum spanning tree is the natural starting point, with difference: instead of following the shortest line between points, we apply a cost function based on existing road networks, going on the assumption that grid lines are more likely to follow (or be followed by) these existing networks. So we extract the full Ugandan road network from OpenStreetMap (thanks to geofabrik) and assign a ‘cost’ based on this: lower cost for bigger roads, full cost in places with no road.
So now we have the two inputs needed to algorithmically ‘create’ a Ugandan grid network. A set of ‘target’ pixels that need to be connected, and a ‘cost’ array which can guide an algorithm in finding the cheapest way between them.
Minimum spanning trees have had a lot of interest over the years as a problem in network theory, with many different solutions proposed, each with its own uses and drawbacks in speed and complexity. For this I used Djikstra’s, because it’s one of the simplest to implement. Basically, it goes as follows:
1) Start at a connected point.
2) Branch outward (typically along network edges, but in this case between raster cells) keeping track of the cost as the total distance to a known connected location, adding each cell to the queue as we go.
3) If another connected target cell is found, record it, and all the points leading to it are assigned a distance of zero and re-added to the queue for analysis.
4) If we find a shorter route to an already connected cell, this replaces the former.
5) Continue until all cells have been visited!
In graphic form, this looks like this:
The results of this algorithm look as follows, where red is the algorithm’s guess and green is actual grid data for Uganda. Where they overlap in a murky brown colour is where the model was correct - quite a lot of it!