This is a fun one: Researchers have used a database of Loch Ness Monster reports to show how anecdotal evidence can, contrary to the common view among scientists, be mined for usable data. In essence, the statistical analysis of anecdotes about the affectionately named Nessie may not tell us much about the mythical beast itself, but it can tell us a lot about reports of Nessie.
Anecdotes – stories about personal experiences – are usually the bane of scientific research as they are often deemed likely to be inaccurate, biased, and even untrue. As such, scientists (and especially statisticians) try to avoid using anecdotes where possible. But is anecdotal evidence always useless, or can it be translated into data?
According to the authors of a recent paper, it can. The paper documents how they used an unusual database of anecdotal reports in their university lessons to get students to think about what questions can be asked of data and what “data” actually means. They show that, when the population and the sampling unit are properly accounted for, even anecdotal evidence can offer valuable insights.
To demonstrate this, the team analyzed reports related to the Loch Ness Monster to identify patterns. The database had over 1,800 reports, and, as the team writes, these were “almost certainly biased with regard to the experienced phenomena”. This was because “presumably more vivid and exciting reports are more likely to be preserved.”
So the analyzed sample of reports was likely biased relative to the wider population of unrecorded and unpreserved reports, and so too were the individual variables. For instance, reports of longer-lasting encounters, or of a larger Nessie rather than supposed little monsters, may have been more likely to be preserved.
Many of the reports were also not considered to be independent. Reports about the Loch Ness Monster may involve several witnesses, or someone describing the experience multiple times, which can gradually lead to distortions. Witnesses may also confer among themselves, so multiple reports from the same event cannot be seen as independent – to do so would be an example of “pseudoreplication”.
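To make the idea concrete, here is a minimal Python sketch of screening out pseudoreplicated reports. It is not the authors' procedure: the records, the field layout, and the function name `independent_first_hand` are all invented for illustration, and the rule of keeping the first first-hand report per event is an assumption.

```python
# Hypothetical records: (event_id, source, first_hand).
# These are invented for illustration and are not taken from the real database.
reports = [
    ("event-1", "witness A", True),
    ("event-1", "witness B", True),     # same event as witness A, so not independent
    ("event-1", "friend of A", False),  # second-hand retelling
    ("event-2", "witness C", True),
    ("event-3", "neighbour of D", False),
]


def independent_first_hand(records):
    """Keep at most one first-hand report per event and drop second-hand ones."""
    by_event = {}
    for event_id, source, first_hand in records:
        # Assumed rule: the first first-hand report seen for an event is kept.
        if first_hand and event_id not in by_event:
            by_event[event_id] = (event_id, source, first_hand)
    return list(by_event.values())


kept = independent_first_hand(reports)
print(len(reports), "raw reports ->", len(kept), "nominally independent")
```

Treating all five invented records as separate observations would count the first event three times, which is the kind of inflation that pseudoreplication describes.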
However, once the researchers had screened the database for the most usable reports, they were left with 1,433 nominally independent first-hand reports collected since 1850. So what does this data tell us?
“We cannot reach conclusions about Loch Ness Monsters from these collected accounts, but we can draw insights about the wider population of Loch Ness Monster reports”, study co-author Dr Charles Paxton, from the University of St Andrews’ Centre for Research into Ecological and Environmental Modelling, explained in a statement.
Collaborating with Adrian Shine of the Loch Ness Project in Drumnadrochit and Dr Valentin Popov, also of the University of St Andrews, the team revealed entertaining and intriguing trends.
“Nessies are mainly reported in the summer months, during the day as opposed to the night – with a dip at lunchtimes – and under excellent weather conditions”, Paxton explained.
“Second-hand reports tended to be exaggerated relative to first-hand reports with the monster reported closer and larger. These patterns might be generated by the monsters themselves, but more likely reflect the availability of witnesses and the tendency for stories to be distorted in retelling.”
Obviously, the study doesn’t prove anything about the supposed monster itself, but it does show how statistical thinking can be applied to anecdotal data and to assessing what types of conclusions can be drawn.
The paper is published in the Journal of Statistics and Data Science Education.