The data was so noisy that it was impossible to create RegEx queries that would not pick up noise.
Letters were misinterpreted as other letters or as punctuation symbols.
Most of the headers were badly recognised in the .txt files.
Images did not match the content in the same file.
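The trade-off behind the first two problems can be illustrated with a minimal sketch. The sample text and all three patterns are invented for illustration and are not the queries used in the project. A strict pattern misses misread spellings, a pattern that allows one known misreading recovers them, and a looser pattern starts matching unrelated text.

```python
import re

# Hypothetical OCR output: "Anatomy" appears once cleanly and once with
# the letter "o" misread as the digit "0". "An atom" is unrelated text.
sample = "ANATOMY, the art of dissecting. See Anat0my and Botany. An atom is small."

# Strict pattern: only the correct spelling.
strict = re.compile(r"\banatomy\b", re.IGNORECASE)
# Tolerant pattern: also accepts "0" in place of "o".
tolerant = re.compile(r"\banat[o0]my\b", re.IGNORECASE)
# Loose pattern: also accepts a stray space and any word ending.
loose = re.compile(r"\ban\s?at[o0]m\w*", re.IGNORECASE)

strict_hits = strict.findall(sample)
tolerant_hits = tolerant.findall(sample)
loose_hits = loose.findall(sample)

print(strict_hits)    # misses the misread spelling
print(tolerant_hits)  # recovers it
print(loose_hits)     # also picks up "An atom", which is noise
```

Each extra misreading the pattern tolerates widens the set of unrelated strings it can match, which is why no single query was both complete and clean.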
We selected five commonly referenced topics in the Encyclopedia Britannica: Anatomy, Architecture, Agriculture, Botany and Chemistry.
We presented an image related to each of these fields. If you move the cursor over one of the pictures, the number of references to that topic in each edition pops up, so you can see how it changes between editions. In the lower-left corner, there is also an image showing the popularity of these topics. This was adjusted following our data holder's suggestion that the concepts of popularity and number of counts might be misleading to people, since they were initially on different pages.