So, you’re building a new game. I’m sure you already thought about lots of things, e.g., game mechanics, in-app purchases, merch, physics engine. What about data storage? The choice isn’t automatically a relational database, you know. Some choose to store data in CMS or NoSQL databases, and then some choose to persist data in spreadsheets (yes, you read it right, spreadsheets, like Microsoft Excel or Google sheets). What’s the best approach? And when would you have to use a relational database instead of a spreadsheet?
What’s the best spreadsheet software?
Spreadsheets are a tool for users designed primarily for human interaction. The UI is very accommodating; that’s why users just type away. Calculated fields are magical and mesmerizing, and if you need to shape your data, pivot is heaven-sent.
I’m pretty sure you have a good idea of what a spreadsheet is already. You may have used a couple of them; below is a list of the more popular spreadsheets.
- Microsoft Excel
- Google Sheets
- LibreOffice Calc
- Calligra Sheets
- Lotus 1-2-3 (if you recognize this, your age is showing😉)
- VisiCalc (this too, will date you😉)
- Quattro Pro
What is a spreadsheet used for?
You can build lots of things with spreadsheets. Let’s list some of them:
- Calculator (of course).
- List management. It’s a natural on list management.
- Analysis tool. You can quickly write on the “pros” column and “cons” column. Just type away.
- Data visualization. You can graph your data.
- Database? Sure. Why not.
That last entry in our list above probably shouldn’t be there, but some people use Google sheet as a database because it’s low-hanging fruit.
Spreadsheets can store data. Lots of data. Google Sheets, by last count, can support approx. 5 million cells. You have plenty of space to keep a massive amount of data. Apart from the storage capacity, the spreadsheet interface is very intuitive and easy to use. Perhaps this is the reason why some people are using spreadsheets as databases.
If your application’s traffic is low and there aren’t many users, using Google Sheets as a database or even a CMS is probably okay. Still, as soon as you sense an uptick in volumetrics (users and transactions are growing), you should move to a proper database.
When you need a database or CMS to store your data
What can a database do? What are the advantages of using a database over a spreadsheet?
- Data security. You can control who has access to the database.
- Data integrity. Databases have schema. It’s very strict on what kind of data can you store on specific columns.
- Scale. Databases are designed for enterprise usage. It can handle large amounts of transactions. The world’s largest online stores run on databases.
- Backup and recovery. These are standard features of databases. Unlike using Google spreadsheets, you don’t have to create a manual backup for a database. It’s already taken care of.
- Data consistency. Database schemas are normalized (typically up the third normal form). Column information is not duplicated because the data is designed and stored relationally. Relational design also has the advantage of cascading edits and deletes across related tables.
Some examples of databases are as follows;
- MySQL – this DB is a powerhouse. It’s everywhere. For starters, all WordPress blogs are powered by this; it’s an opensource database.
- PostgreSQL – This is also an enterprise and opensource database.
- Oracle – a multimodel database management system produced and marketed by Oracle Corp. This is a commercial database.
- Microsoft SQL – An enterprise-grade DB server from Microsoft. This too is commercial software.
- MongoDB – a NoSQL database that’s been gaining popularity. It’s the anti-thesis of relational databases.
- DynamoDB – another NoSQL database available on AWS (Amazon Web Services).
- CosmosDB – A NoSQL DB on the Azure platform.
Spreadsheet as a database - one use case
When you’re prototyping an app and development is at an early stage, you probably don’t want to spend a significant amount of time setting up a database. The setup takes time; you have to think about the schema, the types of field, build indexes, etc. If you use a spreadsheet, you just type away, and in a couple of minutes, you already have a database.
It’s okay to use spreadsheets as a database when you’re prototyping. You want to move fast. It’s understandable, but as the database’s usage grows, in terms of volume and users, you should seriously consider using a proper database.
Pros of using a spreadsheet as CMS or database
- Free to use. When you have a Google account, you can already use Google Sheets.
- Ubiquitous. You can access Google sheets, even from your phone or tablet. If you’re not using Google sheets (Excel, perhaps), you can easily attach the file to an email or put it in Dropbox. You can access the sheet from any device.
- Easy to share with others. Spreadsheets are easy to move around; they’re just files. You can share it in many ways — via email, cloud storage, even Skype.
- Easy to get started. No need to think about schemas. Design as go. No need to set up users; anyone can access the spreadsheet.
- Easy to input data. No need to write front-end code to facilitate CRUD (create, read, update, delete) of records. Just type as you go.
Cons of using a spreadsheet as CMS or database
For all the good stuff that spreadsheets have going for, it has inherent limitations compared to a database.
- Prone to errors. A spreadsheet’s ease of use is also its weakness. Users can type willy-nilly on cells; there isn’t any validation. A database has the facility to restrict the type and format of data to be saved on the records. It’s challenging to enforce consistency of data in a spreadsheet.
- Slow loading speed. When your data is in the order of millions, spreadsheet begins to crack. They’re not designed to handle volumes of this scale.
- Limited data type. You can’t store BLOB data (audio, video, image, and other binary type data).
- Storage limitation. Google sheets have a capacity upwards of 5 million records, but an enterprise app can quickly outgrow this capacity.
- Relational design. You can’t avoid data redundancy in spreadsheets because you can’t normalize the worksheets the way you would normalize tables. The concept of primary keys and foreign keys are absent in a spreadsheet. A database is more structured than a spreadsheet.
- Queries. It’s challenging to search data across sheets. With databases, querying data is a cinch. You can assemble data points in a variety of configurations using JOINS. Searching data in a database also has a significant speed advantage over spreadsheets because of the database indexes.
Why is spreadsheet a lousy choice for localization use
Now that we know the cons and pros of spreadsheets vs. databases, we can make the argument why it’s a bad idea to use spreadsheets in place of a database or a proper CMS.
- No translation memory. You’d have to write a pretty sophisticated script to achieve this in a spreadsheet. If you were using a proper CMS, you could quickly get to texts or segments of texts that have already been translated.
- No word count. This is a standard feature for CMSs. If you want this feature on a spreadsheet, you need to whip up your custom script.
- No version control. It’s challenging to track versions and history of edits. For example, when two translators are working on the same file, and some mistakes were made.
- No audit facility. When two or more people (perhaps, translators) are working on the same file, and mistakes were made, you cannot track who made the error. If you’re using a proper CMS, audit logs are a pretty standard feature.
- Multi-step localization is challenging. If your localization process consists of more than one step such as direct translation from Source to Target language(s), you will find it hard to translate with pivot language(s) or use copywriting as an extra step.
As I’ve said in the introduction, using spreadsheets in place of a proper database or CMS has significant limitations. The lack of security alone is more than enough reason that you should consider a suitable database or CMS. In this day and age of data privacy, you really shouldn’t expose yourself to a security risk by skimping on data storage costs. I can cite more reasons like scale, volume, and audit trail, but the security reason should really drive it home for you.