Add need to know

onesuper · onesuper · commit 6d4039db6e3f · 2023-06-16T23:58:23.000+08:00
diff --git a/README.md b/README.md
@@ -9,10 +9,10 @@ The purpose of this repository is to let people evaluate the quality of datasets
 
 * [Google Colab Notebook](https://colab.research.google.com/drive/1c8rWB2gtUrBHQcmmvA_NAXxc7Cexn1vM?usp=sharing)
 
+## Running the app
+### Instructions
 
-## Instructions
-
-1.Prerequisites
+1. Prerequisites
 Note that the code only works `Python >= 3.9` and `streamlit >= 1.23.1`
 
 ```
@@ -26,11 +26,16 @@ $ cd HuggingFace-Datasets-Text-Quality-Analysis
 $ pip install -r requirements.txt
 ```
 
-3.Run Streamlit application
+3. Run Streamlit application
 ```
 python -m streamlit run app.py
 ```
 
+### Need to know
+
+When the dataset you download from Hugging Face is too large, running the application may exceed your machine's memory and cause errors. In that case, sample the data or use a library that can run pandas workloads on a cluster, such as Xorbits or Dask.
+
+
 ## Todos
 
 - [ ] Introduce more dimensions to evaluate the dataset quality
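
The "Need to know" note suggests sampling when a dataset does not fit in memory. A minimal sketch of one way to do that, assuming the rows can be read as a stream (for example, an iterable dataset), is reservoir sampling, which keeps at most `k` rows in memory at any time. The helper name `reservoir_sample` and the generated rows are hypothetical and are not part of this repository.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return up to k rows drawn uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


# Hypothetical stand-in for a large dataset: a generator that never
# materializes all rows at once.
rows = ({"text": f"document {i}"} for i in range(100_000))
subset = reservoir_sample(rows, k=1_000)
print(len(subset))
```

The resulting `subset` is a plain list of rows, so it could be turned into a small pandas DataFrame before analysis. Whether a uniform sample is representative enough for quality evaluation depends on the dataset.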