Capturean

This asset is developed by a team led by Tomas Pariente Lobo, Head of Lab

In recent years the Web has become not only a place to consume and search for content, but an active environment where people and organizations create content and exchange data and knowledge. User-generated content, especially from social networks (SN), blogs and forums, is highly dynamic. The amount of content available, even for specific topics, is mind-boggling. There is a clear need to track, filter and analyze this content automatically in order to make sense of it and enable different uses of the data.

Capturean implements advanced data collection and information integration technologies to gather and harmonize data from multiple sources into a single coherent representation. The acquired data is then analysed to produce insights and metrics from social media. These metrics provide a view of what is happening on the web and can serve as input for multiple applications and business scenarios, such as brand management, product placement, media tracking, financial sentiment over time, reputation on the web, political debates, etc.

Business Challenge

In the age of the Internet, business decisions are increasingly dependent on the just-in-time delivery of relevant information and knowledge. While in the past this information used to be structured, in today's world there is increasing dependence on unstructured sources of information, such as the Internet, and subjective inputs, such as sentiments, assessments, opinions, rumors, beliefs, etc.

Internet texts such as weblog articles and forums provide, for example, a massive amount of potentially useful information. An analyst or decision maker would have to collect, filter, assess, and interpret all these texts with respect to a current object of interest. However, this task cannot be done manually, given the time constraints of decision making and the enormous number of documents.

Customers and R&D projects are asking for versatile tools that allow the acquisition of intelligence from Social Networks and apply it to the decision-making process.
Capture offers an open, innovative solution, adaptable to the needs of customers and organizations, to gather and extract facts and intelligence from Social Networks.

Solution

Capturean provides automated methods for knowledge and intelligence processing and management, from data acquisition all the way to the final application services that include decision support, visualization, etc.

This application layer can be developed in a fast and cost-effective way thanks to previous implementations of Capture and the reuse of previously developed services for a broad range of sectors and applications, such as reputational risk in finance, rumor detection, security in smart cities, etc.

Capture is based on state-of-the-art big data technologies. The solution uses open source frameworks and tools ranging from Apache Hadoop and Storm for distributed processing, to Apache HBase and Solr for storage and information retrieval. Capture extracts data from SN and RSS feeds using open APIs and tools, delivering a set of metrics for specific scenarios.

Capture resembles the water cycle:

  • it drinks from Data Sources (Twitter, RSS…), each delimited by queries to a Social Network;
  • the sources feed Data Channels, or data flows related to several sources, usually about related topics;
  • the data is stored in thematic Data Pools, or functional topic-based repositories of annotated data;
  • the pools are accessible via Solr queries;
  • and the data is processable in the cloud as-a-service using big data technologies.
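
The flow from sources to channels to pools can be illustrated with a minimal Python sketch. It is not Capture's actual data model or API: the class names, field names (`text`, `channel`), the canned sample data, and the Solr host and port are all assumptions made for illustration. Only the `/select` endpoint and the `q`, `fq`, `rows` and `wt` parameters are standard Solr.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass(frozen=True)
class DataSource:
    """One query against one network or feed (hypothetical model)."""
    network: str  # e.g. "twitter" or "rss"
    query: str    # the query that delimits this source


@dataclass
class DataChannel:
    """A data flow grouping several sources, usually on related topics."""
    name: str
    sources: list = field(default_factory=list)


@dataclass
class DataPool:
    """A topic-based repository of annotated items (in memory here)."""
    topic: str
    items: list = field(default_factory=list)

    def ingest(self, channel, fetch):
        # Annotate every fetched text with where it came from.
        for source in channel.sources:
            for text in fetch(source):
                self.items.append({
                    "channel": channel.name,
                    "network": source.network,
                    "text": text,
                })


def fake_fetch(source):
    """Stand-in for a real API call; returns invented sample texts."""
    samples = {
        ("twitter", "#acme"): ["Acme launch went well",
                               "Not impressed by Acme support"],
        ("rss", "acme news"): ["Acme reports quarterly results"],
    }
    return samples.get((source.network, source.query), [])


def solr_select_url(base_url, collection, term, channel, rows=10):
    """Build a Solr /select URL; the field names are assumptions."""
    params = {
        "q": f"text:{term}",          # main query on the text field
        "fq": f"channel:{channel}",   # filter to one Data Channel
        "rows": rows,
        "wt": "json",
    }
    return f"{base_url}/solr/{collection}/select?{urlencode(params)}"


channel = DataChannel("brand-acme", [
    DataSource("twitter", "#acme"),
    DataSource("rss", "acme news"),
])
pool = DataPool("brand-reputation")
pool.ingest(channel, fake_fetch)

# Assumed local Solr instance with one collection per pool.
url = solr_select_url("http://localhost:8983", pool.topic,
                      "support", channel.name)
print(len(pool.items), url)
```

In a real deployment the pool would live in a store such as HBase or Solr rather than in a Python list, and `fake_fetch` would be replaced by calls to the networks' open APIs.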

Benefits

Capturean is an Atos offering in Social Network analytics, providing several APIs and integration points in order to ease the process of delivering data and insights to people or external applications.

Capturean provides an innovative dashboard with advanced reporting tools, putting the insights at the fingertips of the users.