
Commit be2f9ab

replace nltk punkt with punkt_tab and update documentation and tests accordingly
1 parent 6042eeb commit be2f9ab

4 files changed: +5 −5 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -342,7 +342,7 @@ print(f'Run time: {time:.2f}s')
 rules.to_csv('output.csv')
 ```
 
-**Note:** You may need to download stopwords and the punkt tokenizer from nltk by running `import nltk; nltk.download('stopwords'); nltk.download('punkt')`.
+**Note:** You may need to download stopwords and the punkt_tab tokenizer from nltk by running `import nltk; nltk.download('stopwords'); nltk.download('punkt_tab')`.
 
 For a full list of examples see the [examples folder](https://github.com/firefly-cpp/NiaARM/tree/main/examples)
 in the GitHub repository.

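The updated note boils down to a one-time setup step. Below is a minimal sketch of that step, assuming a recent NLTK release in which `word_tokenize` loads the `punkt_tab` models rather than the old `punkt` pickles:

```python
# One-time NLTK setup for NiaARM's text mining, as described in the README note.
import nltk

nltk.download("punkt_tab")  # tokenizer models (successor to the old "punkt" resource)
nltk.download("stopwords")  # stopword lists used when building the corpus

# Quick sanity check that both resources resolve.
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

print(word_tokenize("NiaARM mines association rules from text."))
print(len(stopwords.words("english")), "English stopwords available")
```
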
docs/getting_started.rst

Lines changed: 1 addition & 1 deletion
@@ -285,7 +285,7 @@ added to the :mod:`niaarm.mine` module.
 print('No rules generated')
 print(f'Run time: {time:.2f}s')
 
-**Note:** You may need to download stopwords and the punkt tokenizer from nltk by running `import nltk; nltk.download('stopwords'); nltk.download('punkt')`.
+**Note:** You may need to download stopwords and the punkt_tab tokenizer from nltk by running `import nltk; nltk.download('stopwords'); nltk.download('punkt_tab')`.
 
 **Output:**
 

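Since the getting-started example may be run repeatedly, a guarded download is one possible refinement of the note. A sketch, assuming the resources live at the paths `tokenizers/punkt_tab` and `corpora/stopwords` (these paths are an assumption, not taken from the commit):

```python
# Download punkt_tab and stopwords only when they are missing.
# The resource paths below are assumed, not taken from the NiaARM docs.
import nltk

RESOURCES = [("punkt_tab", "tokenizers/punkt_tab"), ("stopwords", "corpora/stopwords")]

for name, path in RESOURCES:
    try:
        nltk.data.find(path)  # raises LookupError if the resource is not installed
    except LookupError:
        nltk.download(name)
```
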
examples/text_mining.py

Lines changed: 2 additions & 2 deletions
@@ -7,13 +7,13 @@
 df = pd.read_json("datasets/text/artm_test_dataset.json", orient="records")
 documents = df["text"].tolist()
 
-# create a Corpus object from the documents (requires nltk's punkt tokenizer and the stopwords list)
+# create a Corpus object from the documents (requires nltk's punkt_tab tokenizer and the stopwords list)
 try:
     corpus = Corpus.from_list(documents)
 except LookupError:
     import nltk
 
-    nltk.download("punkt")
+    nltk.download("punkt_tab")
     nltk.download("stopwords")
     corpus = Corpus.from_list(documents)

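For context on the `LookupError` handling above: `Corpus.from_list` needs both resources, and when either is missing NLTK raises `LookupError`, which the example catches and resolves by downloading. A rough illustration of the preprocessing those resources enable, using plain NLTK calls rather than NiaARM's actual `Corpus` internals (the sample documents are made up):

```python
# Plain-NLTK sketch of tokenization plus stopword removal; not NiaARM's Corpus code.
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

docs = ["NiaARM mines association rules.", "Text mining needs a tokenizer."]
stop_words = set(stopwords.words("english"))

tokens_per_doc = [
    [tok.lower() for tok in word_tokenize(doc) if tok.isalpha() and tok.lower() not in stop_words]
    for doc in docs
]
print(tokens_per_doc)
```
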
tests/test_text_mining.py

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
 
 class TestTextMining(TestCase):
     def setUp(self):
-        nltk.download("punkt")
+        nltk.download("punkt_tab")
         nltk.download("stopwords")
         ds_path = os.path.join(
             os.path.dirname(__file__), "test_data", "artm_test_dataset.json"

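The test change is a straight swap of the resource name inside `setUp`. One possible variant, not what the commit does, is to download once per test class and silence NLTK's progress output with `quiet=True`; a sketch with a hypothetical class name:

```python
# Hypothetical variant of the test setup: class-level, quiet downloads.
import os
import unittest

import nltk


class TestTextMiningQuiet(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        nltk.download("punkt_tab", quiet=True)
        nltk.download("stopwords", quiet=True)
        cls.ds_path = os.path.join(
            os.path.dirname(__file__), "test_data", "artm_test_dataset.json"
        )
```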