Top 20 Historical Twitter Datasets Available for Download
by Kate Finch, social media enthusiast
Twitter has set itself apart for a wide audience as a platform for direct
communication. Tweets are short and snappy, which has made Twitter so popular
that it is shaping global communities. In this post, we will cover credible
sources for downloading free Twitter datasets.
Twitter datasets can be used for scientific studies, social ventures, and
marketing strategies. In this post, I have collected repositories of free
Twitter datasets from multiple sources that are available for download over
the internet.
I've also listed a way to get specific historical Twitter datasets, but first,
let's talk about the free sources of Twitter datasets.
1. TrackMyHashtag
Type- Corona Virus (Covid-19) Tweet Metadata
Link- www.trackmyhashtag.com/data/COVID-19.zip
This Twitter dataset contains about 60K tweets from public Twitter accounts
matching the "Covid-19" search term. The dataset was collected over 8 weeks
(Dec 1st, 2019 to Jan 28th, 2020) using the Twitter Streaming API. The Excel /
CSV dataset is broken down into three separate parts: tweet metadata, images,
and videos.
2. Archive.org
Type- Miscellaneous research data (2013-2018)
Link- www.archive.org/details/twitterstream
This is a series of free Twitter datasets that were amassed for research,
history, testing, sentiment analysis, and data retention. The archive holds a
large volume of data, so you can pick the stream you need and sort it as
required. The datasets can be downloaded free of charge.
3. Data.world
Type- MNC Twitter accounts
Link- www.data.world/datasets/twitter
Data.world is a repository for free Twitter datasets. Datasets covering
everything from businesses to prominent people are open to users. Simply go to
the website and search its collection of Twitter datasets.
4. GitHub
Type- Russian troll tweets to celebrity accounts
Link- www.github.com/shaypal5/awesome-twitter-data
This is a free list of Twitter datasets hosted on GitHub. The datasets range
from Elon Musk tweets to Russian troll tweets. Simply go to the URL and browse
the large collection of Twitter datasets.
5. Kaggle
Type- Scientific research data
Link- www.kaggle.com/datasets?search=twitter
Kaggle is also a freely available online platform for sharing code and
scientific data. A large number of Twitter datasets are available there for
free download. The data range from eco-studies to demonetization tweets in
India.
6. ICWSM
Type- Academic research data
Link- www.icwsm.org/2015/datasets/datasets/
ICWSM runs a data-sharing project and has a large collection of Twitter
datasets. The collection can be downloaded free of charge; users only need to
register on the website and sign a declaration that they will not share the
data. These datasets can be extremely useful in academic research.
7. Figshare
Type- Dataset related to real-world events
Link- www.figshare.com/articles/Twitter_event_datasets_2012-2016_/5100460
This collection contains 30 separate datasets linked to real-world events,
collected between 2012 and 2016 with the streaming API using a set of
keywords. Under the Twitter TOS, the data may only be used for non-commercial
purposes.
8. ISI.edu
Type- Old Twitter data from 2010
Link- www.isi.edu/~lerman/downloads/twitter/twitter2010.html
This dataset includes tweets published on Twitter in October 2010. Although
quite old, it may still be of interest to data miners and academics. To access
the dataset, simply click on the link.
9. Trec.nist.gov
Type- Sample of 16 million unfiltered tweets
Link- www.trec.nist.gov/data/tweets/
This archive includes about 16 million tweets from 23 January to 8 February.
It is an unfiltered sample, so it contains both relevant tweets and spam. You
simply need to sign an agreement not to use the data for commercial purposes,
after which you can access the archive.
10. KDnuggets
Type- Miscellaneous
Link- www.kdnuggets.com/datasets/index.html
KDnuggets offers information about jobs, related courses, and webinars, as
well as Twitter datasets with free access. You can open the link directly and
browse its collection of datasets.
11. Github scraped public tweets
Type- Miscellaneous public tweets
Link- www.archive.org/details/twitter_cikm_2010
This dataset is a collection of public tweets gathered as part of an academic
project to study the geolocation data associated with tweets.
12. Kaggle customer support datasets
Type- Customer support tweets
Link- www.kaggle.com/thoughtvector/customer-support-on-twitter
This dataset includes over 3 million customer support tweets from various
global brands and firms. It can be used to build and study conversational
models and to research current customer service practices and their impact.
13. Follow the Hashtag
Type- NASDAQ companies' tweets
Link- www.followthehashtag.com/datasets/
The datasets range from tweets about the top 100 NASDAQ companies to UK
geolocated tweet data. To download a dataset, simply click on the link.
14. Lionbridge
Type- Miscellaneous
Lionbridge provides an extensive range of Twitter datasets, from local news to
tweets with the #Avengersendgame hashtag. Just click the link and browse the
list of datasets.
15. Academic Torrents
Type- URLs posted on Twitter in October 2010
Link- www.academictorrents.com/details/d8b3a315172c8d804528762f37fa67db14577cdb
This dataset contains URLs shared on Twitter in October 2010. The link lets
you download the torrent file, which you can then open in a torrent client.
16. Sentiment140
Type- Tweet sentiment analysis data
Link- www.help.sentiment140.com/for-students
With Sentiment140 you can discover the sentiment around a brand, product, or
subject on Twitter. It sorts tweets into negative and positive, using the
emoticons in a message as an indication of its sentiment.
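
To make the emoticon idea concrete, here is a minimal Python sketch of labeling tweets by the emoticons they contain. It is only an illustration under stated assumptions: the emoticon lists and the function name label_by_emoticon are made up for this example and are not taken from Sentiment140 itself.

```python
# Illustrative emoticon lists; these are assumptions for the example,
# not the lists Sentiment140 actually uses.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_EMOTICONS = (":(", ":-(")


def label_by_emoticon(tweet):
    """Return 'positive', 'negative', or None if the signal is absent or mixed."""
    has_positive = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_negative = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_positive == has_negative:
        return None  # no emoticon at all, or conflicting emoticons
    return "positive" if has_positive else "negative"


sample_tweets = [
    "Loving the new phone :)",
    "My flight got cancelled :(",
    "Meeting moved to 3pm",
]
for tweet in sample_tweets:
    print(label_by_emoticon(tweet), "|", tweet)
```

Tweets without an emoticon, or with both kinds, are left unlabeled here rather than guessed at, which keeps the labels a little cleaner at the cost of coverage.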
17. Docnow
Type- Miscellaneous
Link- www.docnow.io/catalog/
Docnow provides a catalog of publicly accessible datasets on the internet.
These datasets contain tweet identifiers only, so you first have to download
them and then use the Hydrator software program, or Twarc if you are
comfortable working on the command line, to restore the tweet identifiers to
their original JSON format.
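
Before handing tweet identifiers to a tool like Hydrator or Twarc, it can help to clean the ID file and split it into batches. The Python sketch below shows one way to do that. It assumes the file holds one numeric ID per line and uses a batch size of 100, both of which are assumptions for illustration; the function names and the sample IDs are made up, and the actual hydration call is left out because it needs API credentials.

```python
def read_tweet_ids(lines):
    """Yield tweet IDs from an iterable of lines, skipping blank or non-numeric lines."""
    for line in lines:
        candidate = line.strip()
        if candidate.isdigit():
            yield candidate


def batched(ids, batch_size=100):
    """Group IDs into lists of at most batch_size items (batch size is an assumption)."""
    batch = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch


# Stand-in for the contents of a downloaded ID file (made-up IDs).
example_lines = ["1001\n", "1002\n", "\n", "not-an-id\n", "1003\n"]
for batch in batched(read_tweet_ids(example_lines), batch_size=2):
    print(batch)  # each batch would be passed on to the hydration tool
```

With a real file you would pass the open file object to read_tweet_ids instead of example_lines.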
18. Harvard Dataverse
Type- USA Presidential election tweets
Link- www.dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/PDI7IN
The archive includes about 280 million tweets related to the 2016 U.S.
Presidential election. They were collected from the Twitter API using Social
Feed Manager between 13 July 2016 and 10 November 2016.
19. Dfreelon
Type- Miscellaneous
Link- www.dfreelon.org/2017/01/03/beyond-the-hashtags-twitter-data/
This Twitter dataset contains all 40,815,975 tweets that matched at least one
of 45 keywords, were posted between June 1, 2014, and May 31, 2015, and had
not been deleted or protected as of July 2015. Follow the link to find the
list of 45 keywords and access the data.
20. KAIST
Type- Miscellaneous
Link- www.twitter.mpi-sws.org/
This dataset contains Twitter user-to-user connections and retweets in their
various forms (RT, Retweet, HT, R/T, and the recycling symbol). The data were
collected to analyze the media landscape, to discover topical authorities and
public attitudes, to classify topical content, and to characterize the
exchange of information on Twitter.
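
The retweet markers mentioned above can be spotted with a regular expression. The Python sketch below is a rough illustration, not the method used to build this dataset: it assumes each marker is followed by an @username, and the pattern and the function name retweeted_user are made up for this example.

```python
import re

# Assumed convention: a retweet marker (RT, Retweet, HT, R/T, or the recycling
# symbol U+267B) followed by the @username being credited.
RETWEET_MARKER = re.compile(
    r"(?:\bRT|\bretweet|\bHT|\bR\s?/\s?T|\u267B\uFE0F?)\s*:?\s*@(\w+)",
    re.IGNORECASE,
)


def retweeted_user(tweet):
    """Return the username credited by a retweet marker, or None if there is none."""
    match = RETWEET_MARKER.search(tweet)
    return match.group(1) if match else None


sample_tweets = [
    "RT @alice: big news today",
    "Great read (HT @bob)",
    "R/T @carol worth a look",
    "\u267B @dave sharing this again",
    "Just a normal tweet mentioning @erin",
]
for tweet in sample_tweets:
    print(retweeted_user(tweet), "|", tweet)
```

A plain mention such as the last sample is not counted as a retweet, because no marker comes before the username.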
Closing Thoughts
Analyzing Twitter datasets can provide a wide range of insights, be it for
research purposes or for social media marketing efforts.
TrackMyHashtag is a paid Twitter analytics tool that can provide you with
specific Twitter datasets as per your requirements.
With this, we come to the end of the post. Until next time!
Created on Mar 30th 2020 00:54.