How Excel’s Row Limit Caused Loss of 16,000 COVID Test Results in England

Alex Hern, writing for The Guardian:

A million-row limit on Microsoft’s Excel spreadsheet software may have led to Public Health England misplacing nearly 16,000 Covid test results, it is understood. The data error, which led to 15,841 positive tests being left off the official daily figures, means that 50,000 potentially infectious people may have been missed by contact tracers and not told to self-isolate. […]

In this case, the Guardian understands, one lab had sent its daily test report to PHE in the form of a CSV file — the simplest possible database format, just a list of values separated by commas. That report was then loaded into Microsoft Excel, and the new tests at the bottom were added to the main database.

But while CSV files can be any size, Microsoft Excel files can only be 1,048,576 rows long — or, in older versions which PHE may have still been using, a mere 65,536. When a CSV file longer than that is opened, the bottom rows get cut off and are no longer displayed. That means that, once the lab had performed more than a million tests, it was only a matter of time before its reports failed to be read by PHE.

The primary problem here isn’t Excel’s million-row limit; it’s the fact that if you import a CSV file that exceeds that limit, Excel doesn’t report an error. It just silently cuts off the excess rows, which is inexcusable. [Update: This tweet from Leon Zandman indicates that Excel does present an error message when it attempts to import a CSV file with too many rows or columns. Update 2: BBC News, without citing an explicit source, fingers the use of the old XLS Excel file format, which has a limit of just 65,000 rows of data.]
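Whatever Excel itself does, an import pipeline can make this check explicit and refuse to proceed rather than truncate. Here is a minimal sketch in Python, not a description of PHE's actual process: the names `check_row_limit` and `RowLimitExceeded` are made up for illustration, and the only facts taken from the article are the two row limits.

```python
import csv

XLSX_MAX_ROWS = 1_048_576  # row limit cited for current Excel files
XLS_MAX_ROWS = 65_536      # row limit cited for older Excel versions


class RowLimitExceeded(Exception):
    """Hypothetical error: the CSV has more rows than the target can hold."""


def check_row_limit(csv_file, max_rows=XLSX_MAX_ROWS):
    """Count the rows in an open CSV file and fail loudly if over the limit.

    Returns the row count when the file fits; raises RowLimitExceeded
    instead of letting a later import step drop the overflow.
    """
    row_count = sum(1 for _ in csv.reader(csv_file))
    if row_count > max_rows:
        raise RowLimitExceeded(
            f"CSV has {row_count} rows but the limit is {max_rows}; "
            f"{row_count - max_rows} would be lost on import"
        )
    return row_count
```

Called as `check_row_limit(open("report.csv", newline=""), max_rows=XLS_MAX_ROWS)` (with `report.csv` standing in for whatever file the lab sent), this stops the import with an error that says how many rows would have gone missing.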

Everyone knows error messages are bad, but the reason they’re bad is the error part, not the message part. Not reporting errors just makes everything worse, by pretending that the errors aren’t even happening. (Apple, I’m looking in your direction.)

Also reminiscent of our cuckoo-in-chief’s unshakable belief that the solution to America’s COVID pandemic is to reduce testing, not reduce the number of infections.

Wednesday, 7 October 2020