Commit 00185c0

Author: Keiron Pizzey
Modified get_text
Modified it to make a new soup, rather than using deepcopy.
1 parent 8ce601e commit 00185c0

1 file changed, +4 -7 lines changed

nidaba/core/objects.py

Lines changed: 4 additions & 7 deletions
@@ -1,7 +1,4 @@
-from copy import deepcopy
-
 from bs4 import BeautifulSoup
-from bs4.element import NavigableString
 from .util import Text
 
 class SEObject(object):
@@ -59,13 +56,13 @@ def _get_text(self):
 
         # Hacky. But the official docs say that to remove tags (such as <code></code>) you should use
         # the LC method below. Unfortunately that ruins self.soup for any other methods. Making a
-        # deepcopy seemed the best choice.
-        soup = deepcopy(self.soup)
+        # new soup seemed the best choice.
+        soup = BeautifulSoup(self.body)
 
         [s.extract() for s in soup('code')]
 
-        text = [Text(text.strip()) for text in soup.recursiveChildGenerator()
-                if isinstance(text, NavigableString) and text != '\n']
+        # text = [Text(text.strip()) for text in soup.recursiveChildGenerator()
+        # if isinstance(text, NavigableString) and text != '\n']
 
         text = Text(soup.get_text().strip())
 
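The idea behind this change can be sketched in isolation: re-parse the raw HTML into a throwaway soup, strip the <code> tags from that copy, and leave any previously parsed tree untouched. This is a minimal sketch, not the project's code. The function name text_without_code and the parameter body_html are hypothetical, and the explicit "html.parser" argument is an assumption added here; the commit calls BeautifulSoup(self.body) without naming a parser, which leaves the choice to bs4.

```python
from bs4 import BeautifulSoup


def text_without_code(body_html):
    """Return the text of body_html with all <code> elements removed.

    Hypothetical helper for illustration only; it mirrors the approach in
    the diff above but is not part of nidaba.
    """
    # Parse a fresh tree so that extract() cannot modify any soup the
    # caller already holds. The parser name is an assumption.
    soup = BeautifulSoup(body_html, "html.parser")
    for tag in soup("code"):
        tag.extract()  # detach the <code> element from the throwaway tree
    return soup.get_text().strip()


body = "<p>Use <code>deepcopy</code> sparingly.</p>"
original = BeautifulSoup(body, "html.parser")

print(text_without_code(body))   # text with the <code> content removed
print(len(original("code")))     # the separately parsed tree still has its tag
```

Re-parsing costs one extra parse per call, but it avoids copying the whole parsed tree and keeps the stored soup usable by other methods.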