Introducing ‘Big Data’
The web is home to an enormous amount of information – Facebook alone sees users share 30 billion pieces of information or ‘content’ every month. This content ranges from text posts to photos and videos, while in the wider online community it can include anything from scientific publications to records of online purchases and chemical structure information. We call this wealth of information ‘Big Data’.
A better name might be ‘Overwhelming Data’, as its variety, volume and velocity of growth are ever increasing. 90% of the data in the world today was created in the last two years, due to the advent (and increased processing ability) of information-sensing sources such as smartphones, software logs, cameras, microphones and wireless sensor networks.
In the year 2000 the Sloan Digital Sky Survey collected more data in its first few weeks than had been collected in the entire history of astronomy. The successor to this telescope – the Large Synoptic Survey Telescope – comes online in 2016 and will acquire that amount of information every five days.
Big Data should be about the ‘Data’, and the ability to build applications that find information efficiently, not the ‘Big’. After all, it is one thing to have a huge reservoir of content and quite another to do something with it.
Bringing the RSC’s data together
The RSC has data in the form of articles, information about authors, events at multiple locations, tweets and Facebook posts, local section producers and supplementary information. We recognise that this information isn't on the scale of Facebook or the Digital Sky Survey, but to be usefully delivered to our visitors it needs to be properly connected through keywords and joined up in such a way that the web (and the computers behind it) can understand both the content and its context. This is our vision for the RSC web.
Optimising our content
Our aim is for a chemist to find our highly ranked Chemistry World article when searching for a particular keyword like ‘caffeine’ on a search engine. Once they have landed on our page, we need to ensure that they have access to all our data by automatically presenting them with not only the article, but also a selection of other relevant resources from around the RSC.
We might pull in a three-dimensional image of a caffeine structure from ChemSpider, and then discussion about the article from Twitter, analytical discussion groups or the comments from readers of Chemistry World. We could highlight conference and event information or journals related to caffeine, and show the most commented-on and most popular articles about it. This focus on giving users what is relevant to them is an ambitious goal and one that is driving the changes to our online presence.
We have already made headway on this with our new Learn Chemistry site, which allows users to find resources for teachers and students more easily, and the Visual Elements Periodic Table, which correctly tags our vast amount of element data so that it can be used in other areas. All future projects at the RSC, including the Chemistry World site redesign, will incorporate the lessons learned from Big Data concepts, and as we progress through the year more and more content will be joined up.
We would be very interested to hear your thoughts on and experiences of ‘Big Data’, so please feel free to comment below.
Many thanks,
James Stevens - Web Manager at the RSC.