Why spreadsheets suck at modeling
Every time I step onto an airplane I shiver at the thought that the engineers who designed it might have used Excel (or any other spreadsheeting tool for what matters) for their calculations.
It’s time for a good rant against spreadsheets, after having looked at all sort of mistakes made by engineers and finance specialists while modeling anything from a boiling mixture to the the cash flow for a project financing.
The modeling work typically starts with paper and pencil, then at some point a pocket calculator is picked up and finally someone sits in front of a computer and fires up her favorite spreadsheeting program. BTW, my favorite spreadsheeting program is quantrix, a multi-dimensional spreadsheet.
The apparent simplicity of the spreadsheet metaphor drags the user into the illusion of unlimited flexibility, immediate and transparent results and zero pain.
In reality there are several fundamental weaknesses in using spreadsheets which invariably result in very high costs in the quality control and maintenance:
- although it is possible to name cells, there is no support for naming vectors and accessing vector elements with an index
- there is no type checking: it’s fair to sum integers with dates, with decimals and floating points
- there is no unit or measurements checking so that it’s possible to make very silly mistakes (while this limitation is common to many other tools, today there are technologies available to avoid these troubles)
- spreadsheets have no simple procedural commands such as for, while, repeat: you have to repeat the statement (!), with the risk of mistake while copying or much more likely when somebody else at a later stage has to change that vector formula; they can be implemented with macros, but that turns your model into the ultimate basic + spreadsheet spaghetti code;
- it’s unnatural to establish automated tests; so nobody ever tests them !
- spreadsheets are binary files so it’s forget about proper, meaningful versioning and tracking of changes;
- implicit systems of equations cause circular references; the solution is attempted by the opaque, stupid built-in solver with disappointing results in most cases; often convergence failures are not properly reported, and debugging is a nightmare; even for direct substitution you need to write macro code – this is illustrated by the simplest possible case of the debt service affecting the period cash flow based on the average of beginning-of-period and end-of-period debt…
- built-in functions names are internationalized so if you learn to use the function SCARTO on the Italian version, you’ll have to re-learn OFFSET on the English version, BEREICH.VERSCHIEBEN in German, DECALER in French (was that with or without accent ?) , DESLOCAMENTO in Portuguese, DESREF in Spanish, PRZESUNIĘCIE in Polish, СМЕЩ in Russian and of course KAYDıR in Turkish
- not just the implementation, even the language is proprietary and not peer-reviewed but only subject to the profit-driven strategy of the owner
- as a consequence of the last point, the tool gets a new release every two years, with the associated hassle for the users to fix the weaknesses and discrepancies it exposes in the models; for comparison, the current C++ standard dates back from 1998 with some corrections in 2003, and the new one is due next year after about more than 5 years of intense peer review from the industry
- there is no separation between data and programs, an axiom from early days of information technology that there is no need to infringe except for very good reasons (object orientation); countless viruses hidden in what should have been raw data tables ensue.
The bottom line is: never use a spreadsheet for your models, if they are supposed to live more than a month or if they will end up in other people’s hands.