Commit 258d92e

Merge branch 'main' into classifier
2 parents c2a0e08 + 0b7bcb4 commit 258d92e

17 files changed

+500
-119
lines changed


.gitattributes

Lines changed: 3 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,3 @@
1+
# .gitattributes
2+
3+
*.ipynb linguist-vendored

README.md

Lines changed: 76 additions & 4 deletions
@@ -1,7 +1,79 @@
1-
# LLM Detector
2-
3-
Synthetic text detection API in Python using Flask, Celery, Redis, Gunicorn, Nginx and HuggingFace.
1+
# Malone
42

53
## News
64

7-
**2024-07-08**: llm_detector is officially part of the Backdrop Build V5 cohort under the tentative name 'Malone' starting today. Check out the [build page](https://backdropbuild.com/builds/v5/cadmus) for updates.
5+
**2024-07-08**: llm_detector is officially part of the Backdrop Build V5 cohort under the tentative name 'malone' starting today. Check out the backdrop [build page](https://backdropbuild.com/builds/v5/cadmus) for updates.
6+
7+
**2024-07-30**: Malone is live in beta on Telegram; give it a try [here](https://t.me/the_malone_bot). Note: some Firefox users have reported issues with the bot link. You can also find malone by messaging '*/start*' to @the_malone_bot anywhere you use Telegram.
8+
9+
**2024-08-01**: [Launch video](https://youtu.be/6zdLcsC9I_I?si=R6knOnxMySDIRKDQ) is up on YouTube. Congrats to all of the other Backdrop Build finishers.
10+
11+
![malone](https://github.com/gperdrizet/llm_detector/blob/main/telegram_bot/assets/malone_A.jpg?raw=true)
12+
13+
Malone is a synthetic text detection service available on [Telegram Messenger](https://telegram.org/), written in Python using [HuggingFace](https://huggingface.co), [scikit-learn](https://scikit-learn.org/stable/), [XGBoost](https://github.com/dmlc/xgboost), [Luigi](https://github.com/spotify/luigi) and [python-telegram-bot](https://github.com/python-telegram-bot/python-telegram-bot), supported by [Flask](https://flask.palletsprojects.com/en/3.0.x), [Celery](https://docs.celeryq.dev/en/stable/index.html), [Redis](https://redis.io/) & [Docker](https://www.docker.com/) and served via [Gunicorn](https://gunicorn.org/) and [Nginx](https://nginx.org/). Malone uses an in-house trained gradient boosting classifier to estimate the probability that a given text was generated by an LLM. The classifier uses a set of engineered features derived from the input text. For more details, see the [feature engineering notebooks](https://github.com/gperdrizet/llm_detector/tree/main/classifier/notebooks).
14+
15+
## Table of Contents
16+
17+
1. Features
18+
2. Where to find malone
19+
3. Usage
20+
4. Performance
21+
5. Demonstration/experimentation notebooks
22+
6. About the author
23+
7. Disclaimer
24+
25+
## 1. Features
26+
27+
- **Easily accessible** - use it anywhere you can access Telegram: iOS or Android apps and any web browser.
28+
- **Simple interface** - no frills, just send the bot text and it will send back the probability that the text was machine generated.
29+
- **Useful and accurate** - provides a probability that text is synthetic, allowing users to make their own decisions when evaluating content. Maximum likelihood classification accuracy ~90% on held-out test data.
30+
- **Model agnostic** - malone is not trained to detect the output of a specific LLM. Instead, it uses a gradient boosting classifier and a set of numerical features derived from, and calibrated on, a large corpus of human and synthetic text samples from multiple LLMs.
31+
- **No logs** - no user data or message contents are ever persisted to disk.
32+
- **Open source codebase** - malone is an open source project. Clone it, fork it, extend it, modify it, host it yourself and use it the way you want to use it.
33+
- **Free**
34+
35+
## 2. Where to find malone
36+
37+
Malone is publicly available on Telegram. You can find malone on the [Telegram bot page](https://t.me/the_malone_bot), or just message @the_malone_bot with '/*start*' to start using it.
38+
39+
There are also plans in the works to offer the bare API to interested parties. If that's you, see section 6 below.
40+
41+
## 3. Usage
42+
43+
To use malone you will need a Telegram account. Telegram is free to use and available as an app for iOS and Android. There is also a web version for desktop use.
44+
45+
Once you have a Telegram account, malone is simple to use. Send the bot any 'suspect' text and it will reply with the probability that the text in question was written by a human or generated by an LLM. For smartphone use, a good trick is to long-press on 'suspect' text and then share it to malone on Telegram via the context menu. Malone is never more than 2 taps away!
46+
47+
![telegram app screenshot](https://github.com/gperdrizet/llm_detector/blob/main/telegram_bot/assets/telegram_screenshot.jpg?raw=true)
48+
49+
Malone can run in two response modes: 'default' and 'verbose'. Default mode returns the probability associated with the most likely class as a percent (e.g. 75% chance a human wrote this). Verbose mode gives a little more detail about the feature values and prediction metrics. Set the mode by messaging '*/set_mode verbose*' or '*/set_mode default*'.
50+
51+
For best results, submitted text must be between 50 and 500 words.
52+
53+
## 4. Performance
54+
55+
Malone is ~90% accurate, with a binary log loss of ~0.25 on held-out test data, depending on the model and feature engineering hyperparameters and the specific train/test split (see example confusion matrix below). The misclassified examples are more or less evenly split between false negatives and false positives.
56+
57+
![XGBoost confusion matrix](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/figures/XGBoost_confusion_matrix.png?raw=true)
58+
59+
For more details on the classifier training and performance see: [XGBoost experimentation](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/04.1-XGBoost_classifier_experimentation.ipynb) and [XGBoost finalized](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/04.2-XGBoost_classifier_finalized.ipynb).
60+
61+
## 5. Demonstration/experimentation notebooks
62+
63+
Most of the testing and benchmarking during the design phase of the project was trialed in Jupyter notebooks before refactoring into modules. These notebooks are the best way to understand the approach and the engineered features used to train the classifier.
64+
65+
1. [Human and synthetic text training data](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/01-hans_2024_data.ipynb)
66+
2. [Perplexity ratio score](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/02.2-perplexity_ratio_score_finalized.ipynb)
67+
3. [TF-IDF score](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/03.2-TF-IDF_finalized.ipynb)
68+
4. [XGBoost classifier](https://github.com/gperdrizet/llm_detector/blob/main/classifier/notebooks/04.2-XGBoost_classifier_finalized.ipynb)
69+
70+
## 6. About the author
71+
72+
My name is Dr. George Perdrizet, I am a biochemistry & molecular biology PhD seeking a career step from academia to professional data science and/or machine learning engineering. This project was conceived from the scientific literature and built solo over the course of a few weeks - I strongly believe that I have a ton to offer the right organization. If you or anyone you know is interested in an ex-researcher from University of Chicago turned builder and data scientist, please reach out, I'd love to learn from and contribute to your project.
73+
74+
- **Email**: <hire.me@perdrizet.org>
75+
- **LinkedIn**: [linkedin.com/gperdrizet](https://www.linkedin.com/in/gperdrizet/)
76+
77+
## 7. Disclaimer
78+
79+
Malone is an experimental research project meant for educational, informational and entertainment purposes only. Any predictions made are inherently probabilistic in nature and subject to stochastic errors. Text classifications, no matter how high or low the reported probability, should never be interpreted as proof of authorship or the lack thereof in regard to any text submitted for analysis. Decisions about the source or value of any text are made by the user, who considers all factors relevant to themselves and their purpose and takes full responsibility for their own judgment and any actions they may take as a result.
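
The default response mode described in the Usage section (report the probability of the most likely class as a percent) can be sketched as below. This is an illustration only: `default_reply` is a hypothetical helper name, and sending ties to the 'human' branch is an assumption, not something the README specifies.

```python
def default_reply(human_probability: float, machine_probability: float) -> str:
    '''Formats the most likely class as a percentage.
    Assumption: an exact tie is reported as human.'''

    if human_probability >= machine_probability:
        return (f'{human_probability * 100:.1f}% chance that this text '
                'was written by a human.')

    return (f'{machine_probability * 100:.1f}% chance that this text '
            'was written by a machine.')


print(default_reply(0.75, 0.25))
```

Using `>=` in the first branch means every pair of probabilities produces a reply, including the case where both classes are equally likely.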

api/__main__.py

Lines changed: 28 additions & 2 deletions
@@ -1,13 +1,13 @@
11
'''Main module to initialize LLMs, set-up and launch Celery & Flask apps
22
using either Gunicorn or the Flask development server'''
33

4+
import pickle
45
import api.functions.flask_app as app_funcs
56
import api.functions.helper as helper_funcs
67
import api.configuration as config
78

89
# Start the logger
910
logger = helper_funcs.start_logger()
10-
1111
logger.info('Running in %s mode', config.MODE)
1212

1313
if config.MODE == 'testing':
@@ -22,8 +22,34 @@
2222
reader_model, writer_model = helper_funcs.start_models(logger)
2323
logger.info('Models started')
2424

25+
# Load the other scoring assets
26+
27+
# Load the perplexity ratio Kullback-Leibler kernel density estimate
28+
with open(config.PERPLEXITY_RATIO_KLD_KDE, 'rb') as input_file:
29+
perplexity_ratio_kld_kde = pickle.load(input_file)
30+
31+
# Load the TF-IDF luts
32+
with open(config.TFIDF_LUT, 'rb') as input_file:
33+
tfidf_luts = pickle.load(input_file)
34+
35+
# Load the TF_IDF Kullback-Leibler kernel density estimate
36+
with open(config.TFIDF_SCORE_KLD_KDE, 'rb') as input_file:
37+
tfidf_kld_kde = pickle.load(input_file)
38+
39+
# Load the model
40+
with open(config.XGBOOST_CLASSIFIER, 'rb') as input_file:
41+
model = pickle.load(input_file)
42+
2543
# Initialize Flask app
26-
flask_app = app_funcs.create_flask_celery_app(reader_model, writer_model)
44+
flask_app = app_funcs.create_flask_celery_app(
45+
reader_model,
46+
writer_model,
47+
perplexity_ratio_kld_kde,
48+
tfidf_luts,
49+
tfidf_kld_kde,
50+
model
51+
)
52+
2753
logger.info('Flask app initialized')
2854

2955
# Start the celery app
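
The four `with open(...)` blocks added above follow the same pattern, so they could be folded into one helper. The sketch below is one possible shape, not the project's code: `load_asset` and `load_assets` are hypothetical names, and the mapping of asset names to paths is assumed to come from values such as `config.TFIDF_LUT`. Note that `pickle.load` can execute arbitrary code, so this should only be pointed at trusted files.

```python
import pickle
from pathlib import Path
from typing import Any


def load_asset(path: str) -> Any:
    '''Unpickles one scoring asset. Only use on trusted files, since
    pickle.load can run arbitrary code during deserialization.'''

    with open(path, 'rb') as input_file:
        return pickle.load(input_file)


def load_assets(paths: dict) -> dict:
    '''Loads every asset in a name -> path mapping, failing early
    with a clear message if any file is missing.'''

    missing = [name for name, path in paths.items() if not Path(path).is_file()]

    if missing:
        raise FileNotFoundError(f'Missing scoring assets: {missing}')

    return {name: load_asset(path) for name, path in paths.items()}
```

Checking for missing files before unpickling anything surfaces a misconfigured path at start-up rather than partway through loading.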

api/classes/llm.py

Lines changed: 2 additions & 2 deletions
@@ -46,7 +46,7 @@ def __init__(
4646
self.cpu_cores = cpu_cores
4747
self.max_new_tokens = max_new_tokens
4848

49-
# Reserve loading the tokenizer and model for the load method to
49+
# Reserve loading the tokenizer and model for the load method to
5050
# give the user a chance to override default parameter values
5151
self.model = None
5252
self.tokenizer = None
@@ -77,7 +77,7 @@ def load(self) -> None:
7777
)
7878

7979
# Set the model to evaluation mode to deactivate any dropout
80-
# modules; this is done to ensure reproducibility of results
80+
# modules; this is done to ensure reproducibility of results
8181
# during evaluation
8282
self.model.eval()
8383

api/configuration.py

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
2222
DATA_PATH=f'{PROJECT_ROOT_PATH}/data'
2323

2424
# Logging stuff
25-
LOG_LEVEL='DEBUG'
25+
LOG_LEVEL='INFO'
2626
LOG_PREFIX='%(levelname)s - %(message)s'
2727
CLEAR_LOGS=True
2828
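
To illustrate what moving `LOG_LEVEL` from `'DEBUG'` to `'INFO'` does, here is a small standard-library sketch. It is not the project's `start_logger` (which is not shown in this diff); `make_logger` is a hypothetical stand-in that reuses the two configuration values above and writes to an in-memory stream so the effect is easy to inspect.

```python
import io
import logging

# Values copied from the configuration diff above
LOG_LEVEL = 'INFO'
LOG_PREFIX = '%(levelname)s - %(message)s'


def make_logger(name: str, stream) -> logging.Logger:
    '''Builds a logger that writes to the given stream using
    the configured level and prefix.'''

    logger = logging.getLogger(name)
    logger.setLevel(LOG_LEVEL)  # setLevel accepts level names as strings
    logger.propagate = False

    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_PREFIX))
    logger.handlers = [handler]

    return logger


buffer = io.StringIO()
demo_logger = make_logger('malone_demo', buffer)
demo_logger.debug('Suppressed at INFO level')
demo_logger.info('Running in %s mode', 'testing')
print(buffer.getvalue(), end='')
```

Only the INFO record reaches the stream. Passing `'testing'` as an argument rather than formatting it into the string defers interpolation until the record is actually emitted, which is the same style the diff adopts in `flask_app.py`.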

api/functions/flask_app.py

Lines changed: 86 additions & 42 deletions
@@ -2,14 +2,24 @@
22

33
from typing import Callable
44
import random
5-
from flask import Flask, request # type: ignore
6-
from celery import Celery, Task, shared_task # type: ignore
5+
from flask import Flask, request
6+
from celery import Celery, Task, shared_task
7+
from celery.app import trace
78
from celery.result import AsyncResult
89
from celery.utils.log import get_task_logger
910
import api.configuration as config
1011
import api.functions.scoring as scoring_funcs
1112
# pylint: disable=W0223
1213

14+
# Comment ##############################################################
15+
# Code ########################################################################
16+
17+
# Disable return portion task success message log so that
18+
# user messages don't get logged.
19+
trace.LOG_SUCCESS = '''\
20+
Task %(name)s[%(id)s] succeeded in %(runtime)ss\
21+
'''
22+
1323
def create_celery_app(app: Flask) -> Celery:
1424
'''Sets up Celery app object'''
1525

@@ -41,7 +51,11 @@ def __call__(self, *args: object, **kwargs: object) -> object:
4151

4252
def create_flask_celery_app(
4353
reader_model: Callable = None,
44-
writer_model: Callable = None
54+
writer_model: Callable = None,
55+
perplexity_ratio_kld_kde: Callable = None,
56+
tfidf_luts: Callable = None,
57+
tfidf_kld_kde: Callable = None,
58+
model: Callable = None
4559
) -> Flask:
4660

4761
'''Creates Flask app for use with Celery'''
@@ -67,64 +81,94 @@ def create_flask_celery_app(
6781
# Get task logger
6882
logger = get_task_logger(__name__)
6983

84+
7085
@shared_task(ignore_result = False)
71-
def score_text(suspect_string: str = None, response_mode: str = 'default') -> str:
86+
def score_text(
87+
suspect_string: str = None,
88+
response_mode: str = 'default'
89+
) -> str:
90+
7291
'''Takes a string and scores it, returns a dict.
7392
containing the author call and the original string'''
7493

75-
logger.info(f'Submitting for score: {suspect_string}')
76-
logger.info(f'Response mode is: {response_mode}')
94+
logger.info('Submitting string for score.')
95+
logger.info('Response mode is: %s', response_mode)
96+
97+
# Check to make sure that text is of sane length
98+
text_length = len(suspect_string.split(' '))
99+
100+
if text_length < 50 or text_length > 400:
101+
102+
reply = '''For best results text should be longer than 50 words and\
103+
shorter than 400 words.'''
104+
105+
else:
77106

78-
# Call the real scoring function or mock based on mode
79-
if config.MODE == 'testing':
107+
# Call the real scoring function or mock based on mode
108+
if config.MODE == 'testing':
80109

81-
# Mock the score with a random float
82-
score = [random.uniform(0, 1)]
110+
# Mock the score with a random float
111+
score = [random.uniform(0, 1)]
83112

84-
# Threshold the score
85-
if score[0] >= 0.5:
86-
call = 'human'
113+
# Threshold the score
114+
if score[0] >= 0.5:
115+
reply = 'Text is human'
87116

88-
elif score[0] < 0.5:
89-
call = 'synthetic'
117+
elif score[0] < 0.5:
118+
reply = 'Text is synthetic'
90119

91-
elif config.MODE == 'production':
120+
elif config.MODE == 'production':
92121

93-
# Call the scoring function
94-
response = scoring_funcs.score_string(
95-
reader_model,
96-
writer_model,
97-
suspect_string,
98-
response_mode
99-
)
122+
# Call the scoring function
123+
response = scoring_funcs.score_string(
124+
reader_model,
125+
writer_model,
126+
perplexity_ratio_kld_kde,
127+
tfidf_luts,
128+
tfidf_kld_kde,
129+
model,
130+
suspect_string,
131+
response_mode
132+
)
100133

101-
if response_mode == 'default':
134+
if response_mode == 'default':
102135

103-
human_probability = response[0] * 100
104-
machine_probability = response[1] * 100
136+
human_probability = response[0] * 100
137+
machine_probability = response[1] * 100
105138

106-
if human_probability > machine_probability:
107-
reply = f'{human_probability:.1f}% chance that this text was written by a human.'
139+
if human_probability > machine_probability:
140+
reply = f'''{human_probability:.1f}% chance that this text was written by\
141+
a human.'''
108142

109-
elif human_probability < machine_probability:
110-
reply = f'{machine_probability:.1f}% chance that this text was written by a machine.'
143+
elif human_probability < machine_probability:
144+
reply = f'{machine_probability:.1f}% chance that this text was written by a machine.'
111145

112-
elif response_mode == 'verbose':
146+
elif response_mode == 'verbose':
113147

114-
features = (f"Fragment length (tokens): {response[2]['Fragment length (tokens)']:.0f}\n"
115-
f"Perplexity: {response[2]['Perplexity']:.2f}\n"
116-
f"Cross-perplexity: {response[2]['Cross-perplexity']:.2f}\n"
117-
f"Perplexity ratio score: {response[2]['Perplexity ratio score']:.3f}\n"
118-
f"Perplexity ratio Kullback-Leibler score: {response[2]['Perplexity ratio Kullback-Leibler score']:.3f}\n"
119-
f"Human TF-IDF: {response[2]['Human TF-IDF']:.2f}\n"
120-
f"Synthetic TF-IDF: {response[2]['Synthetic TF-IDF']:.2f}\n"
121-
f"TF-IDF score: {response[2]['TF-IDF score']:.3f}\n"
122-
f"TF-IDF Kullback-Leibler score: {response[2]['TF-IDF Kullback-Leibler score']:.3f}")
148+
features = ('Fragment length (tokens): '
149+
f"{response[2]['Fragment length (tokens)']:.0f}\n"
150+
'Perplexity: '
151+
f"{response[2]['Perplexity']:.2f}\n"
152+
'Cross-perplexity: '
153+
f"{response[2]['Cross-perplexity']:.2f}\n"
154+
'Perplexity ratio score: '
155+
f"{response[2]['Perplexity ratio score']:.3f}\n"
156+
'Perplexity ratio Kullback-Leibler score: '
157+
f"{response[2]['Perplexity ratio Kullback-Leibler score']:.3f}\n"
158+
'Human TF-IDF: '
159+
f"{response[2]['Human TF-IDF']:.2f}\n"
160+
'Synthetic TF-IDF: '
161+
f"{response[2]['Synthetic TF-IDF']:.2f}\n"
162+
'TF-IDF score: '
163+
f"{response[2]['TF-IDF score']:.3f}\n"
164+
'TF-IDF Kullback-Leibler score: '
165+
f"{response[2]['TF-IDF Kullback-Leibler score']:.3f}")
123166

124-
reply = f'Class probabilities: human = {response[0]:.3f}, machine = {response[1]:.3f}\n\nFeature values:\n{features}.'
167+
reply = f'''Class probabilities: human = {response[0]:.3f},\
168+
machine = {response[1]:.3f}\n\nFeature values:\n{features}.'''
125169

126170
# Return the result from the output queue
127-
return {'author_call': reply, 'text': suspect_string}
171+
return {'reply': reply, 'text': suspect_string}
128172

129173
# Set listener for text strings via POST
130174
@app.post('/submit_text')
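
The word-count gate added to `score_text` can be sketched in isolation as below. This is a hedged variant, not the committed code: `length_reply` is a hypothetical name, the 50 and 400 bounds are taken from the check in the diff, and it swaps in `str.split()` with no argument (so tabs, newlines and repeated spaces do not inflate the count) where the diff uses `split(' ')`. The message is built from adjacent string literals, because a backslash continuation inside a triple-quoted string keeps any indentation on the continued line as part of the text.

```python
from typing import Optional

# Bounds taken from the check in the diff above
MIN_WORDS = 50
MAX_WORDS = 400


def length_reply(text: str) -> Optional[str]:
    '''Returns a user-facing message if the text is outside the
    accepted word range, otherwise None.'''

    # split() with no argument splits on any run of whitespace
    word_count = len(text.split())

    if word_count < MIN_WORDS or word_count > MAX_WORDS:

        # Adjacent string literals are joined at compile time, so
        # source indentation never leaks into the message
        return (
            f'For best results text should be longer than {MIN_WORDS} '
            f'words and shorter than {MAX_WORDS} words.'
        )

    return None


print(length_reply('too short'))
```

Returning `None` for acceptable input lets the caller fall through to scoring, for example `reply = length_reply(text) or score(text)`, where `score` stands for whatever scoring call the caller uses.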
