Commit 2a04fef

improve readme for seed
1 parent 3978609 commit 2a04fef

1 file changed: +10 -7 lines changed

VinePlus.Seed/README.md

Lines changed: 10 additions & 7 deletions
@@ -1,17 +1,20 @@
 # VinePlus.Seed
 
 Scrapes all data from comicvine servers for use in the postgres database
-As of the time of writing, there are about 840,000 threads and 22,000,000 posts on comicvine, all of which would take around 300MB, and 22GB of space respectively
-After it is done, a bunch of csv files are created: `threads_full.csv` containing the data for all threads, and `posts_<x>.csv` for all the <x> posts
+
+As of the time of writing, there are about 800+ thousand threads and 20+ million posts on comicvine, which would take around 300MB, and 22GB of space respectively when downloaded
+
+After it is completely downloaded, a bunch of csv files are created: `threads_full.csv` containing the data for all threads, and `posts_<x>.csv` for all the <x> posts
 
 ## Prerequisites
-- Make sure you have [installed dotnet 6.0](https://dotnet.microsoft.com/en-us/download/dotnet/6.0)
-- Make sure you have [installed redis](https://redis.io/docs/getting-started/installation/)
-- Bash shell, jq
+- [dotnet 6.0](https://dotnet.microsoft.com/en-us/download/dotnet/6.0)
+- [redis](https://redis.io/docs/getting-started/installation/)
+- bash shell
+- jq
 
 ## Getting Started
-- Make sure redis is installed and star the redis server with `redis-server`
-- Build and run the project `dotnet run`
+- Make sure redis is installed and start the redis server with `redis-server`
+- Build and run this project `dotnet run`
 - This scrapes comicvine, and might take a while depending on your internet speed.
 - After this is done, a `threads_full.json` file, and a bunch of files in the format `posts_*.json` ('*' is a number from 0+) should be created.
 - Run `thread.sh threads_full` to convert the threads json file to csv
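
The scrape is long-running, so it may help to confirm the tools from the Prerequisites list are on `PATH` before starting. The following is a minimal sketch, not part of the repository: the function names `missing_tools` and `check_seed_prerequisites` are hypothetical, and the check only confirms that each command exists. It does not verify that the installed dotnet is version 6.0.

```shell
#!/usr/bin/env bash
# Sketch only: reports which of the tools named in the README's
# Prerequisites list are not on PATH. Function names are hypothetical.

# Print the name of every argument that is not an available command.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names taken from the README: dotnet, redis-server, bash, jq.
check_seed_prerequisites() {
  local missing
  missing="$(missing_tools dotnet redis-server bash jq)"
  if [ -n "$missing" ]; then
    printf 'Missing prerequisites:\n%s\n' "$missing" >&2
    return 1
  fi
  printf 'All prerequisites found.\n'
}
```

After sourcing the file, one way to use it is `check_seed_prerequisites && dotnet run`, so the scrape starts only when every tool was found.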
