irhallac/User2vec-Twitter

User Representation Learning for Social Networks: An Empirical Study

Cite

If you find this dataset and code useful in your research, please consider citing:

@article{hallac2021user,
title = {User Representation Learning for Social Networks: An Empirical Study},
author = {Hallac, Ibrahim Riza and Ay, Betul and Aydin, Galip},
journal = {Applied Sciences},
volume = {11},
number = {12},
pages = {5489},
year = {2021},
publisher = {Multidisciplinary Digital Publishing Institute}
}

https://www.mdpi.com/2076-3417/11/12/5489

TwitterUserDataset

500K Tweets collected from 500 Twitter users' timelines.

Each user belongs to one of the 5 predefined categories: Economy, Crypto Economy, Technology, Fashion, and Politics.

The tweets are not shared publicly because of the Twitter API's Terms of Service: "Twitter does not allow for making large amounts of raw Twitter data available on the Web" [1].

To obtain the actual text of the tweets, you need to rehydrate them from their IDs. You can use a tool such as twarc [2] for this. Please contact the corresponding author for the list of IDs used in the paper.
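The rehydration step can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the paper: the one-ID-per-line input format, the helper names (`read_tweet_ids`, `batched`, `rehydrate`), and the `fetch_batch` callable are all assumptions made for the example. `fetch_batch` stands in for whatever client you use (for instance twarc's hydrate functionality) and is expected to return the text of those tweets that are still available. The batch size of 100 reflects the usual per-request limit of Twitter's tweet lookup endpoints; check it against the API version you have access to.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Assumed per-request limit of the tweet lookup endpoint; verify for your API tier.
BATCH_SIZE = 100


def read_tweet_ids(lines: Iterable[str]) -> List[str]:
    """Collect tweet IDs (assumed one per line), dropping blanks and duplicates."""
    seen = set()
    ids: List[str] = []
    for line in lines:
        tweet_id = line.strip()
        if tweet_id and tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


def batched(ids: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def rehydrate(
    ids: List[str],
    fetch_batch: Callable[[List[str]], Dict[str, str]],
) -> Dict[str, str]:
    """Map tweet ID -> text using a caller-supplied (hypothetical) fetch function.

    IDs that `fetch_batch` does not return (e.g. deleted or protected tweets)
    are simply absent from the result.
    """
    texts: Dict[str, str] = {}
    for batch in batched(ids):
        texts.update(fetch_batch(batch))
    return texts
```

Because deleted or protected tweets cannot be recovered, a rehydrated copy of the dataset will usually be somewhat smaller than the original ID list.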

[1] Digital Collecting Toolkit, http://digitalcollecting.lib.virginia.edu/toolkit/
[2] "twarc is a command line tool and Python library for archiving Twitter", https://github.com/DocNow/twarc

Contributors for building this dataset

  • Mutlu Halil ERİŞEN
  • Zeynep KOYUN
  • Semiha MAKİNİST
  • İbrahim Rıza HALLAÇ
