• 1851 Census

    Big data & some (not so) little problems

    Historians now have large datasets at their fingertips, which let us ask questions that were previously impossible or impractical to pursue. We can text mine the 6 billion pages curated in the HathiTrust digital library. We can glimpse the lives of some 3.5 million individuals who lived in seventeenth- and eighteenth-century London. Such datasets are exciting and open up novel lines of enquiry. But what do we miss when dealing with such large amounts of digitised historical data? What compromises are we making in the hope that trends will be discernible from the noise? Or that errors will ‘average’…