Social media poses three major computational challenges, dubbed by Gartner the 3Vs of big data: volume, velocity, and variety. Content analytics methods face additional difficulties arising from the short, noisy, and strongly contextualized nature of social media. To address the 3Vs of social media, new language technologies have emerged, e.g. using locality-sensitive hashing to detect breaking news stories from media streams (volume), predicting stock market movements from microblog sentiment (velocity), and recommending blogs and news articles based on user content (variety).
PHEME focuses on a fourth crucial, but hitherto largely unstudied, challenge: veracity. It will model, identify, and verify phemes (internet memes with added truthfulness or deception), as they spread across media, languages, and social networks.
PHEME achieves this by developing novel cross-disciplinary social semantic methods, combining document semantics, a priori large-scale world knowledge (e.g. Linked Open Data) and a posteriori knowledge and context from social networks, cross-media links and spatio-temporal metadata. Key novel contributions are dealing with multiple truths, reasoning about rumor and the temporal validity of facts, and building longitudinal models of users, influence, and trust.
In particular, PHEME delivers a veracity framework able to track rumors over time, providing a set of state-of-the-art components and algorithms for social media veracity and fact checking. Results will be validated in two high-profile case studies: healthcare and digital journalism.
The techniques developed in PHEME will be generic, with many business applications, e.g. brand and reputation management, customer relationship management, semantic search, and knowledge management. In addition to its high commercial relevance, PHEME will also benefit society and citizens by enabling government organizations to keep track of and react to rumors spreading online. Of particular interest is the potential impact on the detection and veracity checking of news for journalists. This has already attracted the attention of journalists around the globe: the project is now known informally in the media as the "Twitter lie detector".