Teamwork to deliver the highest-quality Tweet Volume

Tweet Volume: find the most tweeted symbols

Gambiste provides a Tweet Volume on Stocks and Crypto Currencies designed to help investors gauge the market appeal of symbols. Social media is a new type of data that you must integrate into your trading strategy to grasp every opportunity. We provide an overview of our processes in this article.

Research shows that Twitter is useful:

Researchers at the University of Connecticut showed that Twitter data can improve your trading strategies. They published the study “Twitter volume spikes and stock options pricing” in the journal Computer Communications. It reveals how spikes in the number of tweets about a company can be used to design a profitable stock options trading strategy.

“Our results show that social media is a powerful tool to help understand the behavior of stock options, and further assist the trading of these valuable, but complex investment vehicles,” said Bing Wang, associate professor of computer science and engineering, one of three authors on the paper.

At Gambiste, we have worked since 2015 to deliver the highest-quality and most advanced Tweet Volume on securities.

Alpha Generation for Modern Traders

That’s just a new way to digest information.

Traditional financial institutions pay thousands of people to read and prepare investment notes and advice: that is the research business. McKinsey estimates that the top 10 sell-side banks currently spend $4 billion on research annually, for a cash equity research headcount that stood at 3,900 in 2011. Gambiste thinks that this industry will face big challenges, not only regulatory but also technological.

Social media provides a new source of data to support alpha generation.

The volume of social media information is enormous. More than 500 million tweets are posted on Twitter per day. That makes tapping into the potential value and deriving insight a challenge. It is also a huge opportunity, and it is why the processing is hidden under the hood.

Tweet Volume Filtered From Spam

So for traders and investors looking for an edge, the core challenge in mining social media remains: how can the truly valuable information, which represents a small percentage of the overall feed, be extracted and presented to the trader?

We developed our social indicators using data from Twitter. The data is retrieved automatically, and our application extracts the relevant tweets. We developed our own spam filters to remove abusive Tweets: as a first step, we apply machine learning classifiers to keep the ‘good’ tweets.
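As an illustration of what a machine learning spam filter can look like, here is a minimal sketch of a naive Bayes text classifier. It is not Gambiste's actual classifier, which is not described in this article: the class name `NaiveBayesSpamFilter`, the tokenizer, and the four training tweets are all invented for the example, and a production filter would be trained on many thousands of labeled tweets.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer: lowercase words, keeping $cashtags, @mentions, #hashtags.
TOKEN = re.compile(r"[a-z$@#']+")


def tokenize(text):
    return TOKEN.findall(text.lower())


class NaiveBayesSpamFilter:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "good": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        # label must be "spam" or "good"
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def _log_score(self, tokens, label, vocab_size):
        counts = self.word_counts[label]
        total = sum(counts.values())
        # Log prior of the class, then log likelihood of each token.
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for tok in tokens:
            score += math.log((counts[tok] + 1) / (total + vocab_size))
        return score

    def is_spam(self, text):
        # Assumes at least one training example per class has been seen.
        tokens = tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["good"])
        spam = self._log_score(tokens, "spam", len(vocab))
        good = self._log_score(tokens, "good", len(vocab))
        return spam > good


# Invented training examples, for illustration only.
spam_filter = NaiveBayesSpamFilter()
spam_filter.train("Sign up on my website now free coins $BTC $LBC $LKK", "spam")
spam_filter.train("Free signals join my website now $ETH $XRP $BTC", "spam")
spam_filter.train("$BTC breaks above resistance on strong volume", "good")
spam_filter.train("Earnings beat expectations, $AMZN up in after hours trading", "good")

kept = [
    t for t in [
        "Join now and sign up for free coins $BTC $LBC",
        "$AMZN earnings beat expectations on strong volume",
    ]
    if not spam_filter.is_spam(t)
]
```

With this toy training set, only the second tweet ends up in `kept`; the first one shares most of its vocabulary with the spam examples and is discarded.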

For example, we identified this tweet on the symbol $BTC as spam. It is a typical spam tweet: @techCrypto asks others to sign up on his website. To reach people, the spammer adds the Tickers $BTC, $LBC, $LKK. When a user searches for one of these Tickers in Twitter's search tool, this tweet with bad content is displayed.

https://twitter.com/tehCrypto/status/915314530972061696

We discard a lot of messages: 70% to 90% of the messages that we retrieve. We remove these ‘spam’ Tweets from the Gambiste Tweet Volume and we keep the good ones.

User reputation matters

We don’t stop there. Not every user carries the same weight. Depending on the author's reputation, we do not give the same score to a given tweet. The score depends on the engagement that a given user can generate, based on multiple factors such as the number of followers, tweets, and likes received.

In fact, Gambiste does not calculate the reputation score itself. That’s not our business. For this, we use a leading reputation indicator. Gambiste discounts tweets created by users with a low reputation.
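The discount on low-reputation users can be sketched as a simple multiplier applied to a tweet's base score. The article does not say which reputation indicator is used or how the discount is computed, so everything here is an assumption: the 0 to 100 scale, the `low_threshold` of 30, the linear ramp, and the function names are hypothetical.

```python
def reputation_weight(reputation, low_threshold=30.0, max_score=100.0):
    """Map a third-party reputation score to a multiplier in [0, 1].

    Assumed scale: 0 (no reputation) to max_score. Users at or above
    low_threshold count fully; below it, the weight shrinks linearly to 0.
    """
    clamped = min(max(reputation, 0.0), max_score)
    if clamped >= low_threshold:
        return 1.0
    return clamped / low_threshold


def weighted_tweet_score(base_score, reputation):
    """Discount a tweet's base score according to its author's reputation."""
    return base_score * reputation_weight(reputation)
```

For instance, under these assumptions a tweet from a user with a reputation of 15 would count for half of a tweet from a user with a reputation of 80.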

Tweets with many Tickers:

Similarly, we also discount tweets that contain several Cashtags (e.g. $AMZN, $GOOG, $HALO). Spammers create tweets with many symbols to attract views. If we counted these messages in full, they would totally disrupt our count. The application handles these tweets by splitting their points between the tickers: it divides the global score of the tweet by the number of symbols. For example, for a tweet with 7 symbols, we divide by 7. We then add the fractional score to the current Ticker score [$BTC actual score + 1/7 of the Gambiste tweet score].
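The splitting rule can be sketched in a few lines. This is a hedged illustration, not Gambiste's code: the cashtag regular expression, the function names, and the choice to count a symbol repeated within one tweet only once are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical cashtag pattern: a dollar sign followed by 1 to 10 letters.
CASHTAG = re.compile(r"\$([A-Za-z]{1,10})\b")


def split_score(text, tweet_score=1.0):
    """Divide a tweet's score equally between the distinct symbols it mentions."""
    symbols = sorted({m.upper() for m in CASHTAG.findall(text)})
    if not symbols:
        return {}
    share = tweet_score / len(symbols)
    return {symbol: share for symbol in symbols}


def add_tweet(totals, text, tweet_score=1.0):
    """Add each symbol's fractional share to the running per-ticker totals."""
    for symbol, share in split_score(text, tweet_score).items():
        totals[symbol] += share
    return totals


totals = defaultdict(float)
add_tweet(totals, "Join now $BTC $LBC $LKK $ETH $XRP $LTC $DASH")  # 7 symbols
add_tweet(totals, "$BTC breaking out")                              # 1 symbol
```

After these two tweets, `totals["BTC"]` holds 1 + 1/7, while each of the six other tickers holds only 1/7.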

To recapitulate, Gambiste has a three-step process:

  1. Spam filtering
  2. User reputation weighting
  3. Symbols management

This process provides an accurate picture of the tweet volume on a given stock symbol, and Gambiste runs it live, continuously.

Tweet Volume For Crypto Currencies and US Stocks:

We developed two distinct streams: one for the Alternative Currencies and one for American Securities. We recently published a research screen for browsing the tweet volumes calculated by Gambiste on the Alternative Currencies. Crypto currencies are discussed a lot on social media. Satoshi Nakamoto released Bitcoin as a decentralized digital currency in 2009, three years after the creation of Twitter. Crypto fans are native users of social media.

Hedge funds and wealth management brokerage firms implement social media stock data in their overall trading strategies. They have capitalized on this lucrative insight for years. And you know what? Their results have been staggering and they don’t want to share them. Individual investors also need access to this high-level structured information in order to compete with professionals. This is exactly what Gambiste provides today.

With Gambiste, you will be able to answer:

  1. Which stocks are trending on Twitter?
  2. Which stocks are the masses about to get into?
  3. Which stocks is everyone talking about?
  4. Which stocks have been tweeted the most?
  5. Which stocks are being tweeted about right now?

Always exploring new domains

To conclude, we improve our algorithms every day to provide our end users with actionable data sets that can be incorporated into existing algorithms. We want to dramatically improve your trading performance and your stock discovery process.
