Commit 3893583

Merge pull request #20 from WhatTheFuzz/feature/rename-variables
Feature/rename variables
2 parents aacb73c + d39de0a commit 3893583

7 files changed

+170
-46
lines changed

README.md

Lines changed: 37 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -2,10 +2,13 @@
22

33
# BinaryNinja-OpenAI
44

5-
Integrates OpenAI's GPT3 with Binary Ninja via a plugin. Creates a query asking
6-
"What does this function do?" followed by the instructions in the High Level IL
7-
function or the decompiled pseudo-C. Returns the response to the user in Binary
8-
Ninja's console.
5+
Integrates OpenAI's GPT3 with Binary Ninja via a plugin and currently supports
6+
two actions:
7+
8+
- Queries OpenAI to determine what a given function does (in Pseudo-C and HLIL).
9+
- The results are logged to Binary Ninja's log to assist with RE.
10+
- Allows users to rename variables in HLIL using OpenAI.
11+
- Variables are renamed immediately and the decompiler is reloaded.
912

1013
## Installation
1114

@@ -60,20 +63,42 @@ You can find your API key at https://beta.openai.com.
6063

6164
## Usage
6265

66+
### What Does this Function Do?
67+
6368
After installation, you can right-click on any function in Binary Ninja and
64-
select `Plugins > OpenAI > What Does this Function Do (HLIL)?`. Alternatively,
65-
select a function in Binary Ninja (by clicking on any instruction in the
66-
function) and use the menu bar options
67-
`Plugins > OpenAI > What Does this Function Do (HLIL)?`. If your cursor has
68-
anything else selected other than an instruction inside a function, `OpenAI`
69-
will not appear as a selection inside the `Plugins` menu. This can happen if
70-
you've selected data or instructions that Binary Ninja determined did not belong
71-
inside of the function.
69+
select `Plugins > OpenAI > What Does this Function Do (HLIL/Pseudo-C)?`.
70+
Alternatively, select a function in Binary Ninja (by clicking on any instruction
71+
in the function) and use the menu bar options `Plugins > OpenAI > ...`. If your
72+
cursor has anything else selected other than an instruction inside a function,
73+
`OpenAI` will not appear as a selection inside the `Plugins` menu. This can
74+
happen if you've selected data or instructions that Binary Ninja determined did
75+
not belong inside of the function. Additionally, the HLIL options are context
76+
sensitive; if you're looking at the decompiled results in LLIL, you will not see
77+
the HLIL options; this is easily fixed by changing the user view to HLIL
78+
(Pseudo-C should always be visible).
7279

7380
The output will appear in Binary Ninja's Log like so:
7481

7582
![The output of running the plugin.](https://github.com/WhatTheFuzz/binaryninja-openai/blob/main/resources/output.png?raw=true)
7683

84+
### Renaming Variables
85+
86+
I feel like half of reverse engineering is figuring out variable names (which
87+
in turn assist with program understanding). This plugin is an experimental look
88+
to see if OpenAI can assist with that. Right-click on an instruction where a
89+
variable is initialized and select `OpenAI > Rename Variable (HLIL)`. Watch the
90+
magic happen. Here's a quick before-and-after.
91+
92+
![Before renaming](https://github.com/WhatTheFuzz/binaryninja-openai/blob/main/resources/rename-before.png?raw=true)
93+
94+
![After renaming](https://github.com/WhatTheFuzz/binaryninja-openai/blob/main/resources/rename-after.png?raw=true)
95+
96+
Renaming variables only works on HLIL instructions that are initializations (i.e.,
97+
`HighLevelILVarInit`). You might also want this to support assignments
98+
(`HighLevelILAssign`), but I did not get great results with this. Most of the
99+
responses were just `result`. If your experience is different, please submit a
100+
pull request.
101+
77102
## OpenAI Model
78103

79104
By default, the plugin uses the `text-davinci-003` model; you can tweak this

__init__.py

Lines changed: 11 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,28 @@
11
from binaryninja import PluginCommand
22
from . src.settings import OpenAISettings
3-
from . src.entry import check_function
3+
from . src.entry import check_function, rename_variable
44

55
# Register the settings group in Binary Ninja to store the API key and model.
66
OpenAISettings()
77

8-
PluginCommand.register_for_high_level_il_function("OpenAI\What Does this Function Do (HLIL)?",
8+
PluginCommand.register_for_high_level_il_function(r"OpenAI\What Does this Function Do (HLIL)?",
99
"Checks OpenAI to see what this HLIL function does. " \
1010
"Requires an internet connection and an API key "
1111
"saved under the environment variable "
1212
"OPENAI_API_KEY or modify the path in entry.py.",
1313
check_function)
1414

15-
PluginCommand.register_for_function("OpenAI\What Does this Function Do (Pseudo-C)?",
15+
PluginCommand.register_for_function(r"OpenAI\What Does this Function Do (Pseudo-C)?",
1616
"Checks OpenAI to see what this pseudo-C function does. " \
1717
"Requires an internet connection and an API key "
1818
"saved under the environment variable "
1919
"OPENAI_API_KEY or modify the path in entry.py.",
2020
check_function)
21+
22+
PluginCommand.register_for_high_level_il_instruction(r"OpenAI\Rename Variable (HLIL)",
23+
"If the current expression is a HLIL Initialization " \
24+
"(HighLevelILVarInit), then query OpenAI to rename the " \
25+
"variable to what it believes is correct. If the expression " \
26+
"is not a HighLevelILVarInit, then do nothing. Requires " \
27+
"an internet connection and an API key. ",
28+
rename_variable)
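
A note on the string literals in this file, as a minimal sketch that runs without Binary Ninja. The descriptions are built from adjacent string literals, which Python joins with nothing in between, so each fragment needs its own trailing space. The `r"..."` prefix keeps the backslash in the command name literal; the sketch assumes, as the `Plugins > OpenAI > ...` menu paths in the README suggest, that the backslash separates menu levels. The names `fused`, `spaced`, and `menu_parts` are illustrative only.

```python
# Adjacent string literals are concatenated as-is, so a fragment that
# ends without a space fuses with the first word of the next fragment.
fused = ("If the expression"
         "is not a HighLevelILVarInit, then do nothing.")
spaced = ("If the expression "
          "is not a HighLevelILVarInit, then do nothing.")

# A raw string keeps the backslash without relying on "\R" happening to
# be an unrecognized escape sequence (which newer Pythons warn about).
menu_path = r"OpenAI\Rename Variable (HLIL)"
menu_parts = menu_path.split("\\")

print(fused)       # shows the fused word "expressionis"
print(spaced)
print(menu_parts)
```

The raw-string form and the escaped form `"OpenAI\\Rename Variable (HLIL)"` produce the same value; the raw form is simply easier to read in a menu path.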

resources/rename-after.png

7.79 KB

resources/rename-before.png

7.66 KB

src/agent.py

Lines changed: 73 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,5 @@
1+
from __future__ import annotations
2+
from collections.abc import Callable
13
import os
24
from typing import Optional, Union
35
from pathlib import Path
@@ -9,7 +11,8 @@
911
from binaryninja.function import Function
1012
from binaryninja.lowlevelil import LowLevelILFunction
1113
from binaryninja.mediumlevelil import MediumLevelILFunction
12-
from binaryninja.highlevelil import HighLevelILFunction
14+
from binaryninja.highlevelil import HighLevelILFunction, HighLevelILInstruction, \
15+
HighLevelILVarInit
1316
from binaryninja.settings import Settings
1417
from binaryninja import log, BinaryView
1518

@@ -19,11 +22,16 @@
1922

2023
class Agent:
2124

22-
question: str = '''
25+
function_question: str = '''
2326
This is a function that was decompiled with Binary Ninja.
2427
It is in IL_FORM. What does this function do?
2528
'''
2629

30+
rename_variable_question: str = "In one word, what should the variable name " \
31+
"be for the variable that is assigned to the result of the C " \
32+
"expression:\n"
33+
34+
2735
# A mapping of IL forms to their names.
2836
il_name: dict[type, str] = {
2937
LowLevelILFunction: 'Low Level Intermediate Language',
@@ -34,28 +42,17 @@ class Agent:
3442

3543
def __init__(self,
3644
bv: BinaryView,
37-
function: Union[Function, LowLevelILFunction,
38-
MediumLevelILFunction, HighLevelILFunction],
3945
path_to_api_key: Optional[Path]=None) -> None:
4046

4147
# Read the API key from the environment variable.
4248
openai.api_key = self.read_api_key(path_to_api_key)
4349

44-
# Ensure that a function type was passed in.
45-
if not isinstance(
46-
function,
47-
(Function, LowLevelILFunction, MediumLevelILFunction,
48-
HighLevelILFunction)):
49-
raise TypeError(f'Expected a BNIL function of type '
50-
f'Function, LowLevelILFunction, '
51-
f'MediumLevelILFunction, or HighLevelILFunction, '
52-
f'got {type(function)}.')
53-
5450
assert bv is not None, 'BinaryView is None. Check how you called this function.'
5551
# Set instance attributes.
5652
self.bv = bv
57-
self.function = function
5853
self.model = self.get_model()
54+
# Used for the callback function.
55+
self.instruction = None
5956

6057
def read_api_key(self, filename: Optional[Path]=None) -> str:
6158
'''Checks for the API key in three locations.
@@ -72,7 +69,7 @@ def read_api_key(self, filename: Optional[Path]=None) -> str:
7269
settings: Settings = Settings()
7370
if settings.contains('openai.api_key'):
7471
if key := settings.get_string('openai.api_key'):
75-
return key
72+
return str(key)
7673

7774
# If the settings don't exist, contain the key, or the key is empty,
7875
# check the environment variable.
@@ -111,7 +108,7 @@ def get_model(self) -> str:
111108
if model := settings.get_string('openai.model'):
112109
# Check that is a valid model by querying the OpenAI API.
113110
if self.is_valid_model(model):
114-
return model
111+
return str(model)
115112
# Return a valid, default model.
116113
assert self.is_valid_model('text-davinci-003')
117114
return 'text-davinci-003'
@@ -124,7 +121,7 @@ def get_token_count(self) -> int:
124121
if settings.contains('openai.max_tokens'):
125122
# Check that the value is not None.
126123
if (max_tokens := settings.get_integer('openai.max_tokens')) is not None:
127-
return max_tokens
124+
return int(max_tokens)
128125
return 1_024
129126

130127
def instruction_list(self, function: Union[LowLevelILFunction,
@@ -133,6 +130,15 @@ def instruction_list(self, function: Union[LowLevelILFunction,
133130
'''Generates a list of instructions in string representation given a
134131
BNIL function.
135132
'''
133+
134+
# Ensure that a function type was passed in.
135+
if not isinstance(function, (Function, LowLevelILFunction,
136+
MediumLevelILFunction, HighLevelILFunction)):
137+
raise TypeError(f'Expected a BNIL function of type '
138+
f'Function, LowLevelILFunction, '
139+
f'MediumLevelILFunction, or HighLevelILFunction, '
140+
f'got {type(function)}.')
141+
136142
if isinstance(function, Function):
137143
return Pseudo_C(self.bv, function).get_c_source()
138144
instructions: list[str] = []
@@ -144,21 +150,64 @@ def generate_query(self, function: Union[Function,
144150
LowLevelILFunction,
145151
MediumLevelILFunction,
146152
HighLevelILFunction]) -> str:
147-
'''Generates a query string given a BNIL function. Reads the file
148-
prompt.txt and replaces the IL form with the name of the IL form.
153+
'''Generates a query string given a BNIL function. Returns the query as
154+
a string.
149155
'''
150-
prompt: str = self.question
156+
prompt: str = self.function_question
151157
# Substitute the name of the IL form into the prompt template.
152-
prompt = self.question.replace('IL_FORM', self.il_name[type(function)])
158+
prompt = prompt.replace('IL_FORM', self.il_name[type(function)])
153159
# Add some new lines. Maybe not necessary.
154160
prompt += '\n\n'
155161
# Add the instructions to the prompt.
156162
prompt += '\n'.join(self.instruction_list(function))
157163
return prompt
158164

159-
def send_query(self, query: str) -> None:
165+
def generate_rename_variable_query(self,
166+
instruction: HighLevelILInstruction) -> str:
167+
'''Generates a query string given a BNIL instruction. Returns the query
168+
as a string.
169+
'''
170+
if not isinstance(instruction, HighLevelILVarInit):
171+
raise TypeError(f'Expected a BNIL instruction of type '
172+
f'HighLevelILVarInit, got {type(instruction)}.')
173+
# Assign the instruction to the Agent instance. This is used for the
174+
# callback function so we don't need to pass in the instruction to the
175+
# Query instance. This is kind of janky and should be examined in future
176+
# versions.
177+
self.instruction = instruction
178+
179+
prompt: str = self.rename_variable_question
180+
# Get the disassembly lines and add them to the prompt.
181+
for line in instruction.instruction_operands:
182+
prompt += str(line)
183+
184+
return prompt
185+
186+
def rename_variable(self, response: str) -> None:
187+
'''Renames the variable of the instruction saved in the Agent instance
188+
to the response passed in as an argument.
189+
'''
190+
if self.instruction is None:
191+
raise TypeError('No instruction was saved in the Agent instance.')
192+
if response is None or response == '':
193+
raise TypeError(f'No response was returned from OpenAI; got type {type(response)}.')
194+
# Get just one word from the response. Remove spaces and quotes.
195+
try:
196+
response = response.split()[0]
197+
response = response.replace(' ', '')
198+
response = response.replace('"', '')
199+
response = response.replace('\'', '')
200+
except IndexError as error:
201+
raise IndexError(f'Could not split the response: `{response}`.') from error
202+
# Assign the variable name to the response.
203+
log.log_debug(f'Renaming variable in expression {self.instruction} to {response}.')
204+
self.instruction.dest.name = response
205+
206+
207+
def send_query(self, query: str, callback: Optional[Callable]=None) -> None:
160208
'''Sends a query to the engine. The response is passed to the callback if one is given; otherwise it is logged.'''
161209
query = Query(query_string=query,
162210
model=self.model,
163-
max_token_count=self.get_token_count())
211+
max_token_count=self.get_token_count(),
212+
callback_function=callback)
164213
query.start()
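
The response cleanup in `Agent.rename_variable` can be tried outside Binary Ninja. The sketch below is a hypothetical standalone helper (`first_word_as_name` is not part of the plugin) that mirrors the same steps: take the first whitespace-delimited word and strip quotes. It assumes completions may arrive with leading newlines or surrounding quotes, which is why it splits before stripping.

```python
def first_word_as_name(response: str) -> str:
    """Hypothetical helper mirroring the cleanup in Agent.rename_variable."""
    # str.split() with no arguments splits on any run of whitespace and
    # drops empty strings, so leading newlines are ignored.
    words = response.split()
    if not words:
        raise ValueError(f'Could not split the response: {response!r}')
    # The first word cannot contain spaces after split(), so only the
    # quote characters need to be removed.
    return words[0].replace('"', '').replace("'", '')


example = first_word_as_name('\n\n"buffer_size" would be a good name.')
print(example)
```

Since `split()` never returns words containing spaces, the `replace(' ', '')` step in the diff is effectively a no-op. A whitespace-only response passes the empty-string check but still yields no words, which is the case the `IndexError` handler in the diff covers.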

src/entry.py

Lines changed: 32 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,44 @@
11
from pathlib import Path
22
from binaryninja import BinaryView, Function
3+
from binaryninja.highlevelil import HighLevelILInstruction, HighLevelILVarInit
4+
from binaryninja.log import log_error
35
from . agent import Agent
46

57
API_KEY_PATH = Path.home() / Path('.openai/api_key.txt')
68

7-
def check_function(bv: BinaryView, func: Function) -> bool:
9+
def check_function(bv: BinaryView, func: Function) -> None:
810
agent: Agent = Agent(
911
bv=bv,
10-
function=func,
1112
path_to_api_key=API_KEY_PATH
1213
)
1314
query: str = agent.generate_query(func)
1415
agent.send_query(query)
16+
17+
def rename_variable(bv: BinaryView, instruction: HighLevelILInstruction) -> None:
18+
19+
if not isinstance(instruction, HighLevelILVarInit):
20+
log_error(f'Instruction must be of type HighLevelILVarInit, got type: ' \
21+
f'{type(instruction)}')
22+
return
23+
24+
agent: Agent = Agent(
25+
bv=bv,
26+
path_to_api_key=API_KEY_PATH
27+
)
28+
query: str = agent.generate_rename_variable_query(instruction)
29+
agent.send_query(query=query, callback=agent.rename_variable)
30+
31+
# Difficult to test without a payment method added, given that the rate limits
32+
# are so low. This should also probably take place in a background task of its
33+
# own.
34+
# def rename_all_variables_in_function(bv: BinaryView, func: HighLevelILFunction) -> None:
35+
# # Get each instruction in the High Level IL Function.
36+
# for instruction in func.instructions:
37+
# match instruction:
38+
# # Rename the variable if it is a HighLevelILVarInit.
39+
# case HighLevelILVarInit():
40+
# rename_variable(bv, instruction)
41+
# # Explicit pass for all other cases.
42+
# case _ :
43+
# pass
44+

src/query.py

Lines changed: 17 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,25 +1,37 @@
1+
from __future__ import annotations
2+
from collections.abc import Callable
3+
from typing import Optional
14
import openai
25
from binaryninja.plugin import BackgroundTaskThread
3-
6+
from binaryninja.log import log_debug, log_info
47

58
class Query(BackgroundTaskThread):
69

710
def __init__(self, query_string: str, model: str,
8-
max_token_count: int) -> None:
11+
max_token_count: int, callback_function: Optional[Callable]=None) -> None:
912
BackgroundTaskThread.__init__(self,
1013
initial_progress_text="",
1114
can_cancel=False)
1215
self.query_string: str = query_string
1316
self.model: str = model
1417
self.max_token_count: int = max_token_count
18+
self.callback = callback_function
1519

1620
def run(self) -> None:
1721
self.progress = "Submitting query to OpenAI."
1822

19-
response: str = openai.Completion.create(
23+
log_debug(f'Sending query: {self.query_string}')
24+
25+
response = openai.Completion.create(
2026
model=self.model,
2127
prompt=self.query_string,
2228
max_tokens=self.max_token_count,
2329
)
24-
# Notify the user.
25-
print(response.choices[0].text)
30+
# Get the response text.
31+
result: str = response.choices[0].text
32+
# If there is a callback, do something with it.
33+
if self.callback:
34+
self.callback(result)
35+
# Otherwise, assume we just want to log it.
36+
else:
37+
log_info(result)
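
The callback hand-off in `Query.run` can be sketched without Binary Ninja or the OpenAI client. The version below is an assumption-laden stand-in: `threading.Thread` takes the place of `BackgroundTaskThread`, a hypothetical `fetch` callable takes the place of `openai.Completion.create`, and `print` takes the place of `log_info`.

```python
from __future__ import annotations

import threading
from collections.abc import Callable
from typing import Optional


class SketchQuery(threading.Thread):
    """Stand-in for Query: runs a request off the main thread."""

    def __init__(self, query_string: str,
                 fetch: Callable[[str], str],
                 callback_function: Optional[Callable[[str], None]] = None) -> None:
        super().__init__()
        self.query_string = query_string
        self.fetch = fetch  # Hypothetical replacement for the API call.
        self.callback = callback_function

    def run(self) -> None:
        result = self.fetch(self.query_string)
        # If there is a callback, hand the result to it.
        if self.callback:
            self.callback(result)
        # Otherwise, just report the result.
        else:
            print(result)


received: list[str] = []
query = SketchQuery('In one word, ...',
                    fetch=lambda prompt: ' length',
                    callback_function=received.append)
query.start()
query.join()  # Wait for the background thread so the result is available.
print(received)
```

The callback runs on the background thread, not the caller's, so anything it touches (here a list, in the plugin the saved instruction) is modified from that thread.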
