
Top 20 Historical Twitter Datasets Available for Download

by Kate Finch social media enthusiast

 

Twitter has set itself apart as a platform for direct communication with a wide audience. Tweets are short and quick to write, which has made Twitter popular enough to shape global communities. In this post, we will cover credible sources for downloading free Twitter datasets.

 

Twitter datasets can be used for scientific studies, social ventures, and marketing research. In this post, I have collected repositories of free Twitter datasets from multiple sources that are available for download over the internet.

 

I've also listed a way to get specific historical Twitter datasets, but first, let's talk about all free sources of Twitter data sets.

 

1. TrackMyHashtag

Type- Coronavirus (Covid-19) Tweet Metadata

Link- www.trackmyhashtag.com/data/COVID-19.zip

The dataset contains 60K tweets from public Twitter accounts relating to the "Covid-19" search term. It was collected over 8 weeks (1 Dec 2019 to 28 Jan 2020) using the Twitter Streaming API. The Excel/CSV dataset is broken down into three separate fields: tweet metadata, images, and videos.

 

2. Archive.org

Type- Miscellaneous research data (2013-2018)

Link- www.archive.org/details/twitterstream

It is a series of free Twitter datasets that were amassed for research, history, testing, sentiment analysis, and data retention. The archive holds a large amount of data, and you can pick the stream you need and sort it as required. The datasets can be downloaded free of charge.

 

3. Data.world

Type- Twitter accounts of multinational companies (MNCs)

Link- www.data.world/datasets/twitter

Data.world is a repository for free Twitter datasets. Datasets ranging from businesses to prominent people are open to users. Simply go to the website and browse its collection of Twitter datasets.

 

4. GitHub

Type- Russian troll tweets to celebrity accounts

Link- www.github.com/shaypal5/awesome-twitter-data

This is a free list of datasets hosted on GitHub. The datasets range from Elon Musk tweets to Russian troll tweets. Simply go to the URL and browse the large collection of Twitter datasets.

 

5. Kaggle

Type- Scientific research data

Link- www.kaggle.com/datasets?search=twitter

Kaggle is also a freely available online platform for sharing code and scientific data. A large number of Twitter datasets are available there for free download. The data range from eco-studies to demonetization tweets in India.

 

6. ICWSM

Type- Academic research data

Link- www.icwsm.org/2015/datasets/datasets/

ICWSM runs a data-sharing project with a large collection of Twitter datasets. The collection can be downloaded free of charge; users only need to register on the website and sign a declaration that they will not share the data. These datasets can be extremely useful in academic research.

 

7. Figshare

Type- Dataset related to real-world events

Link- www.figshare.com/articles/Twitter_event_datasets_2012-2016_/5100460

This collection contains 30 separate datasets linked to real-world events, collected between 2012 and 2016 through the Streaming API using sets of keywords. Under the Twitter TOS, the data can only be used for non-commercial purposes.

 

8. ISI.edu

Type- Old Twitter data from 2010

Link- www.isi.edu/~lerman/downloads/twitter/twitter2010.html

This dataset includes tweets published on Twitter in October 2010. Although quite old, it may still be relevant to data miners and academics. To access the dataset, simply click on the link.

 

9. Trec.nist.gov

Type- Sample of 16 million unfiltered tweets

Link- www.trec.nist.gov/data/tweets/

This archive includes about 16 million tweets from 23 January to 8 February 2011. It is an unfiltered sample that contains both meaningful tweets and spam. You simply need to sign an agreement that prohibits commercial use of the data, and you can then access the archive.

 

10. KDnuggets

Type- Miscellaneous

Link- www.kdnuggets.com/datasets/index.html

KDnuggets offers information about jobs, related courses, and webinars, along with Twitter datasets that are free to access. You can open the given link directly and view its collection of datasets.

 

11. Github scraped public tweets

Type- Miscellaneous public tweets

Link- www.archive.org/details/twitter_cikm_2010

This dataset is a collection of public tweets gathered as part of an academic project to study the geolocation data associated with tweets.

 

12. Kaggle customer support datasets

Type- Customer support tweets

Link- www.kaggle.com/thoughtvector/customer-support-on-twitter

This dataset includes over 3 million customer support tweets from different global brands and firms. It can be used to study conversational models and to research current customer service practices and their impact.

 

13. Follow the Hashtag

Type- NASDAQ companies’ tweets

Link- www.followthehashtag.com/datasets/

The datasets range from tweets about the top 100 NASDAQ companies to UK geolocation tweet data. To download a dataset, simply click on the link.

 

14. Lionbridge

Type- Miscellaneous

Link- www.lionbridge.ai/datasets/top-20-twitter-datasets-for-natural-language-processing-and-machine-learning/

Lionbridge provides an extensive range of Twitter datasets, from local news to tweets with the #Avengersendgame hashtag. Just click the link and browse the list of datasets.

 

15. Academic torrents

Type- URLs posted on Twitter in October 2010

Link- www.academictorrents.com/details/d8b3a315172c8d804528762f37fa67db14577cdb

This dataset contains URLs shared on Twitter in October 2010. The link lets you download the data as a torrent file, which you can open with a torrent client.

 

16. Sentiment140

Type- Tweet sentiment analysis data

Link- www.help.sentiment140.com/for-students

With Sentiment140, you can discover the sentiment around a brand, product, or topic on Twitter. It classifies tweets as positive or negative, using the emoticons in a message as the signal for its sentiment.
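To make the emoticon idea concrete, here is a minimal Python sketch of labeling tweets by the emoticons they contain. The emoticon lists and the function name are my own assumptions for illustration; they are not Sentiment140's actual lists or code.

```python
from typing import Optional

# Hypothetical emoticon lists chosen for illustration only;
# these are NOT the lists Sentiment140 itself uses.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_EMOTICONS = (":(", ":-(", "=(")


def label_by_emoticon(text: str) -> Optional[str]:
    """Return 'positive', 'negative', or None if the signal is absent or mixed."""
    has_positive = any(e in text for e in POSITIVE_EMOTICONS)
    has_negative = any(e in text for e in NEGATIVE_EMOTICONS)
    if has_positive == has_negative:
        # Either no emoticon at all, or both kinds: too ambiguous to label.
        return None
    return "positive" if has_positive else "negative"


if __name__ == "__main__":
    samples = ("Loving the new phone :)", "My flight got cancelled :(", "Just landed")
    for tweet in samples:
        print(tweet, "->", label_by_emoticon(tweet))
```

Tweets without a clear emoticon signal are left unlabeled here rather than guessed, which keeps the labeled set cleaner at the cost of a smaller one.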

 

17. Docnow

Type- Miscellaneous

Link- www.docnow.io/catalog/

Docnow provides a catalog of publicly accessible tweet identifier datasets. You first have to download a dataset and then use the Hydrator application, or Twarc if you are comfortable on the command line, to restore the tweet identifiers to their original JSON format.
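Restoring tweet IDs to full JSON is commonly called hydration. As a rough Python sketch of what such a tool does, the code below reads IDs one per line, groups them into batches, and passes each batch to a lookup function. The batch size of 100 and the lookup callable are assumptions for illustration; in practice, Hydrator or Twarc handles the actual API calls and credentials for you.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def read_tweet_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield one tweet ID per non-blank line."""
    for line in lines:
        tweet_id = line.strip()
        if tweet_id:
            yield tweet_id


def batched(ids: Iterable[str], size: int = 100) -> Iterator[List[str]]:
    """Group IDs into lists of at most `size` items (100 is an assumed limit)."""
    batch: List[str] = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def hydrate(
    ids: Iterable[str],
    lookup: Callable[[List[str]], List[Dict]],
    size: int = 100,
) -> Iterator[Dict]:
    """Yield tweet objects returned by `lookup`, one batch of IDs at a time.

    `lookup` is a hypothetical stand-in for the real API request; it may
    return fewer tweets than IDs, since deleted tweets cannot be restored.
    """
    for batch in batched(ids, size):
        yield from lookup(batch)


if __name__ == "__main__":
    # Fake lookup used only for this demo: pretends tweet "3" was deleted.
    def fake_lookup(batch: List[str]) -> List[Dict]:
        return [{"id": i, "text": f"tweet {i}"} for i in batch if i != "3"]

    ids = read_tweet_ids(["1\n", "2\n", "\n", "3\n", "4\n"])
    for tweet in hydrate(ids, fake_lookup, size=2):
        print(tweet)
```

Keep in mind that a hydrated dataset is often smaller than the ID list, because tweets deleted or made private since collection are not returned.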

 

18. Harvard dataverse

Type- USA Presidential election tweets

Link- www.dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PDI7IN

The archive includes about 280 million tweets related to the 2016 U.S. presidential election. They were collected from the Twitter API with Social Feed Manager between 13 July 2016 and 10 November 2016.

19. Dfreelon

Type- Miscellaneous

Link- www.dfreelon.org/2017/01/03/beyond-the-hashtags-twitter-data/

This Twitter dataset contains all 40,815,975 tweets that matched at least one of 45 keywords, were posted between June 1, 2014, and May 31, 2015, and had not been deleted or protected as of July 2015. Please follow the link to find the list of 45 keywords.
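If you want to build a similar keyword-based subset from your own tweet collection, the matching rule ("at least one keyword") can be sketched in Python as below. The keywords shown are placeholders I made up, not the 45 keywords from this dataset, and simple case-insensitive substring matching is my assumption about how matching might be done.

```python
from typing import Iterable, List


def matches_any_keyword(text: str, keywords: Iterable[str]) -> bool:
    """True if the text contains at least one keyword, ignoring case."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_tweets(texts: Iterable[str], keywords: List[str]) -> List[str]:
    """Keep only the tweets that match at least one keyword."""
    return [text for text in texts if matches_any_keyword(text, keywords)]


if __name__ == "__main__":
    # Placeholder keywords for illustration; NOT the dataset's real list.
    keywords = ["#examplehashtag", "sample phrase"]
    tweets = [
        "Reading about #ExampleHashtag today",
        "Nothing relevant here",
        "A Sample Phrase shows up in this one",
    ]
    for tweet in filter_tweets(tweets, keywords):
        print(tweet)
```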

 

20. KAIST

Type- Miscellaneous

Link- www.twitter.mpi-sws.org/

The dataset contains Twitter user-to-user connections along with retweets for each day, identified by markers such as RT, Retweet, HT, R/T, and the recycling symbol. The data were collected to analyze the media landscape, discover topical authorities, gauge public attitudes, classify topical content, and characterize information exchange on Twitter.
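Spotting retweets by their markers can be sketched with a regular expression. The Python example below is only an illustration under my own assumptions: the marker list is taken from the description above, and the exact pattern (a marker followed by an @username) is hypothetical rather than the rule the dataset's authors used.

```python
import re
from typing import List

# Assumed pattern for illustration: a retweet marker (RT, Retweet, HT, R/T,
# or a recycling symbol) at the start of the text or after whitespace,
# followed by an @username. Not the dataset authors' actual rule.
RETWEET_MARKER = re.compile(
    r"(?:^|\s)(?:RT|Retweet|HT|R/T|[\u267a\u267b])\s*:?\s*@(\w+)",
    re.IGNORECASE,
)


def retweeted_users(text: str) -> List[str]:
    """Return the usernames credited through a retweet marker."""
    return RETWEET_MARKER.findall(text)


if __name__ == "__main__":
    samples = (
        "RT @alice: great news today",
        "Nice find, HT @bob",
        "\u267b @carol worth sharing",
        "START @dave is not a retweet",
    )
    for tweet in samples:
        print(tweet, "->", retweeted_users(tweet))
```

Requiring the marker to stand at the start of the text or after whitespace avoids false matches on ordinary words that merely end in "rt".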

 

Closing Thoughts

Analyzing Twitter datasets can provide a wide range of insights, be it for research purposes or for social media marketing efforts.

 

TrackMyHashtag is a paid Twitter analytics tool that can provide you with specific Twitter datasets as per your requirements.

 

With this, we come to the end of the post. Until next time!

 

 





Created on Mar 30th 2020 00:54. Viewed 2,222 times.
