What Inka quipus teach us about data management
Chances are that your knowledge of ancient Peruvian culture is a bit rusty. Maybe you have some vague high-school memories of an extensive but backward empire that was conquered and then asset-stripped by a handful of Spanish conquistadores. Or maybe your best-preserved memory is the excitement of reading von Däniken’s speculations that the Nazca lines are extraterrestrial spaceports. But unless you happened at some point later in life to hear about the work of Prof. Urton or his collaborators, you most likely have no idea what a quipu is (see image above).
Ancient Peruvian civilizations never developed writing, but by the time of the Incas they did have an elaborate counting and recording system that used quipus (collections of knotted strings) as its physical data storage. While the precise encoding scheme is still not known, we know that the system was effective enough to allow the Incas to manage a highly organized agricultural empire of 20 million souls, which extended from Ecuador all the way to Chile.
Lesson 1: Without metadata, your records will not make it into history
While there are plenty of quipus around, there is no metadata device (e.g., something like a Rosetta stone) that would allow us to decipher the data. The Incas themselves apparently learned the ropes (forgive the pun) via oral traditions. There was a specialized class of accountants called quipucamayocs:
Quipucamayocs were a distinct class of people, males, fifty to sixty that knew how to encode and decode the accounting data
We can speculate that the quipucamayocs had very high job security and all sorts of perks, like residing in the few urban administrative centres of the Incas. While this was good for them, it was ultimately not very good for the preservation of the empire’s data. In modern terms, replace the conquistadores with mergers and acquisitions and you have the conditions for some serious legacy database problems.
Lesson 2: Human readable encoding is advantageous
While we don’t know what the numbers encoded, we know through the work of the Aschers that they were using a decimal system. For example, the number 42 would be represented by four simple (short) knots in the tens position and a long knot with two turns in the units position.
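To make the decimal scheme concrete, here is a minimal sketch in Python. It assumes the convention usually described for quipus: one cluster per decimal position, simple knots for tens and above, and a long knot whose number of turns gives the units digit. The function names and the tuple layout are purely illustrative, and the sketch ignores refinements such as the special knot used for a units digit of one.

```python
def encode(number):
    """Return (place_value, knot_kind, count) tuples, most significant first.

    Assumption: 'count' is the number of simple knots in a cluster, or the
    number of turns of the long knot in the units position. A count of 0
    stands for an empty position on the strand.
    """
    if number < 0:
        raise ValueError("only non-negative integers can be knotted")
    knots = []
    place = 1
    while True:
        number, digit = divmod(number, 10)
        kind = "long" if place == 1 else "simple"
        knots.append((place, kind, digit))
        place *= 10
        if number == 0:
            break
    return list(reversed(knots))


def decode(knots):
    """Read the clusters back into an integer."""
    return sum(place * count for place, _kind, count in knots)


print(encode(42))
print(decode(encode(105)))
```

Reading a strand is just a sum over positions, which is presumably part of why a trained quipucamayoc could decode one by touch and sight alone.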
While in our modern computing systems the choice of low-level binary encoding was made long ago to optimally match the physical devices, there is still considerable freedom in how to represent information at a higher level (e.g. in files or document databases). For example, one can use the verbose XML or the more terse JSON format. While both are equally readable from a machine perspective, the latter is more suitable for human consumption and hence preferred when human interaction with the data is required.
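As a rough illustration of the difference in verbosity, the sketch below serializes the same record both ways using only the Python standard library. The record itself (village, crop, amount) is a made-up example, and the flat XML layout is just one of many possible mappings.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical village record, invented for illustration.
record = {"village": "A", "crop": "maize", "amount": 42}


def to_json(rec):
    """Serialize the record as a JSON object."""
    return json.dumps(rec)


def to_xml(rec):
    """Serialize the record as a flat XML element, one child per field."""
    root = ET.Element("record")
    for key, value in rec.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(record))
print(to_xml(record))
```

Both outputs carry the same information; the XML version simply repeats every field name in a closing tag, which is what makes it the longer of the two to read.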
Lesson 3: A document store is more flexible than a classic relational database
If you look at a quipu closely you see that:
- all strands are tied on one side only
- the length of each strand may vary
Thinking in terms of a standard database you might be tempted to think of each quipu as one giant table and each of its strands as one row of data. But the flexibility of the untied end means that the schema can be modified very easily and the structure is more akin to a graph than a table. For example, if a new crop has been planted and I need more space per village record, I can start attaching longer strands.
The growing popularity of NoSQL databases rests on precisely the same flexibility!
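The contrast can be sketched in a few lines of Python. The column names, the crops and both helper functions are hypothetical; the point is only that a fixed table rejects a record with a new field until the schema is changed, while a document-style store simply accepts a longer "strand".

```python
# Hypothetical fixed schema of a relational-style table.
FIXED_COLUMNS = ("village", "maize", "potatoes")


def to_row(record):
    """Force a record into the fixed schema; unknown fields are rejected."""
    extra = set(record) - set(FIXED_COLUMNS)
    if extra:
        raise ValueError(f"schema change needed for: {sorted(extra)}")
    return tuple(record.get(column) for column in FIXED_COLUMNS)


def attach_strand(quipu, record):
    """Document-style store: each strand is a free-form dict of any length."""
    quipu.append(dict(record))
    return quipu


quipu = []
attach_strand(quipu, {"village": "A", "maize": 42, "potatoes": 7})
# A new crop is planted: the strand just gets longer, no migration needed.
attach_strand(quipu, {"village": "B", "maize": 10, "potatoes": 3, "quinoa": 5})
```

Calling to_row on the second record raises a ValueError, which is the tabular equivalent of having to re-tie the whole quipu before quinoa can be recorded.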
Lesson 4: There is ancient logic to append-only databases
It is thought that the quipus were write-once. It indeed sounds plausible that whatever the usage pattern of the quipus, it did not involve untying knots (except possibly when an egregious recording error was made). Hence, out of the CRUD operation set, the Incas had effectively dropped the Update! In turn, the Delete operation probably involved chucking the strand or perhaps tying a delete tag at the end. Hence they had a drastically simplified set of data operations, which reminds us of the rising popularity of append-only technologies.
People who grew up while disk and RAM sizes were still counted in megabytes might be horrified by the idea of a database that only grows and never updates. But we have since moved up a few letters on the logarithmic scale (Mega -> Giga -> Tera -> Peta -> Exa), and append-only might be the most robust solution for some data applications.
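A minimal sketch of such a store, in Python, might look as follows. The class and method names are invented for illustration: nothing is ever modified in place, a correction is simply a newer entry, and a delete is a "tag" (a tombstone) appended at the end.

```python
class AppendOnlyLog:
    """Toy append-only store: entries are only ever added, never changed."""

    def __init__(self):
        self._entries = []  # grows only

    def append(self, key, value):
        """Create a record, or supersede an older one, by adding an entry."""
        self._entries.append(("put", key, value))

    def delete(self, key):
        """Mark a key as deleted by appending a tombstone entry."""
        self._entries.append(("delete", key, None))

    def read(self, key):
        """Return the latest value for key, or None if absent or deleted."""
        for operation, entry_key, value in reversed(self._entries):
            if entry_key == key:
                return None if operation == "delete" else value
        return None

    def __len__(self):
        return len(self._entries)


log = AppendOnlyLog()
log.append("village_a", 42)
log.append("village_a", 45)  # a correction is a new knot, not an untied one
```

The price is exactly the one that worries the megabyte generation: the log keeps every superseded entry, so storage grows with the history rather than with the current state.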
There you have it, from the depths of time, timeless advice on how to design the database technologies of the future!