Twitter dataset to train word embeddings

Open Data Asked by Mugdha Pandya on September 29, 2021

I’m working on a project related to manipulating word embeddings. To do this, I need to train them myself on Twitter data. Given Twitter’s policy, I have been unable to find a suitable dataset. Does anyone have one or know where I can find one?

The dataset should:

  • contain public tweets
  • have no specific topic; I just need a lot of tweets
  • be pre-processed

One Answer

Tweet Sentiment Extraction

Kaggle supports a variety of dataset publication formats.

Kaggle Twitter Datasets

You need a Kaggle account to download the datasets. You can find preprocessing code in the kernels tab.
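If the dataset you pick is not already pre-processed, a minimal sketch of one common cleaning approach might look like the following. It is an illustration only, not the code from any particular Kaggle kernel: the function name `preprocess_tweet` and the placeholder tokens `<url>` and `<user>` are assumptions chosen for this example, and it uses only the Python standard library.

```python
import re

# Hypothetical helper patterns (assumptions for this sketch).
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
# A token is either a placeholder such as <url>, or a word with an optional
# apostrophe part such as "don't".
TOKEN_RE = re.compile(r"<\w+>|\w+(?:'\w+)?")


def preprocess_tweet(text):
    """Return a list of lowercase tokens for one raw tweet."""
    # Replace URLs first so that '@' or '#' inside a link is not misread.
    text = URL_RE.sub(" <url> ", text)
    # Replace user mentions with a single placeholder token.
    text = MENTION_RE.sub(" <user> ", text)
    # Keep the hashtag word but drop the '#' sign.
    text = HASHTAG_RE.sub(r" \1 ", text)
    # Lowercase and split into tokens; punctuation is discarded.
    return TOKEN_RE.findall(text.lower())


if __name__ == "__main__":
    print(preprocess_tweet("@bob Loving the #NLP course! https://t.co/abc"))
```

The resulting token lists are in the shape that most embedding trainers expect as input (for example, gensim's `Word2Vec` accepts an iterable of token lists), though whether to keep hashtags, emoji, or casing depends on your project.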

Answered by Pluviophile on September 29, 2021
