Description
I've tried to download the dataset, but it seems impossible to download.
I went from your recent article: https://ahmedbesbes.com/overview-and-benchmark-of-traditional-and-deep-learning-models-in-text-classification.html
To this: http://thinknook.com/twitter-sentiment-analysis-training-corpus-dataset-2012-09-22/
To then this: http://www.sananalytics.com/lab/twitter-sentiment/
However, the last link (sananalytics.com) doesn't load at all.
Alternatively, I tried to download the data from your previous blog post:
https://ahmedbesbes.com/sentiment-analysis-on-twitter-using-word2vec-and-keras.html
I've tried to download the dataset from the Google Drive link, but the file seems to be broken. First, I copied your def ingest(): method and ran it. At first the file didn't load at all, and I had to change the encoding to latin-1. After that, I realized the dataset had no column names, because I got this error: ValueError: labels ['ItemID' 'SentimentSource'] not contained in axis. It was raised on this line: data.drop(['ItemID', 'SentimentSource'], axis=1, inplace=True).
I wonder how I can reproduce your experiments, or at least use the same data for a quick comparison. I didn't try anything beyond what I've described above. I guess adding names to the columns manually might do it, but I suspect that other things further down the pipeline wouldn't work as expected either. It'd be very cool if you could provide an easy data loading pipeline.
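For reference, here is a minimal sketch of the workaround I have in mind: read the file with latin-1 encoding and supply the column names by hand. Only 'ItemID' and 'SentimentSource' come from the error above; the other two names ('Sentiment', 'SentimentText'), the four-column layout, and the tiny in-memory sample are my assumptions, not something I have confirmed against the actual file.

```python
import io

import pandas as pd

# Assumed column layout. Only 'ItemID' and 'SentimentSource' are known from
# the error message; 'Sentiment' and 'SentimentText' are guesses.
ASSUMED_COLUMNS = ["ItemID", "Sentiment", "SentimentSource", "SentimentText"]


def ingest(source, columns=ASSUMED_COLUMNS):
    """Load a header-less CSV, name its columns, and drop the unused ones."""
    # header=None tells pandas the first row is data, not column names.
    data = pd.read_csv(source, header=None, names=columns, encoding="latin-1")
    data.drop(["ItemID", "SentimentSource"], axis=1, inplace=True)
    return data


# Hypothetical two-row sample standing in for the downloaded file.
sample = (
    "1,0,Sentiment140,caf\xe9 was closed\n"
    "2,1,Sentiment140,great day\n"
).encode("latin-1")

demo = ingest(io.BytesIO(sample))
print(demo.columns.tolist())
```

With a real download, ingest would take the path to the CSV instead of the BytesIO buffer. If the file turns out to have a header row after all, header=None would load that row as data, so this only applies to the header-less case I ran into.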
Thanks!