Environment: Python 3 Jupyter Notebook.
Part I - Web crawler
Primary task: Write a script that parses the HTML files in the HTML data directory, extracts the artist, works, currency, and price amount, and writes the result to stdout.
Output format: A JSON array of objects
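A minimal sketch of the Part I flow (parse, extract, print a JSON array). The real layout of the HTML files is not described here, so the markup is an assumption: the artist in an `<h2>`, the work title in an `<h3>`, and the price in a `<span>` such as `USD 1,000`. The JSON key names and the `LotParser` and `parse_lot` helpers are likewise hypothetical.

```python
import json
import re
from html.parser import HTMLParser


class LotParser(HTMLParser):
    """Collects the text inside <h2>, <h3>, and <span> tags (assumed markup)."""

    def __init__(self):
        super().__init__()
        self._tag = None  # tag currently being read, if it is one we track
        self.fields = {"h2": "", "h3": "", "span": ""}

    def handle_starttag(self, tag, attrs):
        if tag in self.fields:
            self._tag = tag

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self.fields[self._tag] += data


def parse_lot(html):
    """Return one record for one HTML document (hypothetical field names)."""
    parser = LotParser()
    parser.feed(html)
    parser.close()
    # Assumed price format: three-letter currency code, then digits with commas.
    match = re.match(r"\s*([A-Z]{3})\s*([\d,]+)", parser.fields["span"])
    return {
        "artist": parser.fields["h2"].strip(),
        "works": parser.fields["h3"].strip(),
        "currency": match.group(1) if match else None,
        "amount": int(match.group(2).replace(",", "")) if match else None,
    }


# Placeholder document, not real auction data.
sample = (
    "<html><body><h2>Example Artist</h2><h3>Untitled</h3>"
    "<span>USD 1,000</span></body></html>"
)
records = [parse_lot(sample)]
print(json.dumps(records, indent=2))  # JSON array of objects on stdout
```

In the actual script, `parse_lot` would be called once per file in the data directory and the combined list serialized in a single `json.dumps` call, so that stdout holds exactly one JSON array.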
Part II - Predictive Model
Primary task: Train a machine learning model that predicts the price of a work of art from its 19 variables, including artist_name, auction_date, location, and size (depth, height, width).
Target variable: hammer_price
Metric: Root mean squared error (RMSE)
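For reference, RMSE is the square root of the mean of the squared differences between the actual and predicted hammer_price values. A small standard-library sketch follows; the function name `rmse` and its argument names are illustrative, not part of the task statement.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because the errors are squared before averaging, a few large misses on expensive works can dominate the score.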
Final file: "model.py", containing an importable predict function.
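One possible shape for an importable predict function is sketched below. It is a placeholder baseline (median hammer_price per artist, with a global median as fallback), not the trained model, and the `fit`/`predict` signatures, the `state` dictionary, and the list-of-dicts row format are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def fit(rows):
    """Learn per-artist and global median prices from training rows.

    rows: list of dicts with at least "artist_name" and "hammer_price"
    (assumed row format).
    """
    prices_by_artist = defaultdict(list)
    for row in rows:
        prices_by_artist[row["artist_name"]].append(row["hammer_price"])
    return {
        "artist_median": {a: median(p) for a, p in prices_by_artist.items()},
        "global_median": median([row["hammer_price"] for row in rows]),
    }


def predict(rows, state):
    """Return one predicted hammer_price per input row.

    Artists not seen during fit fall back to the global median.
    """
    return [
        state["artist_median"].get(row["artist_name"], state["global_median"])
        for row in rows
    ]
```

With this layout, a caller would run `from model import predict` and pass in rows plus a fitted state; a real submission would more likely load a serialized model inside `model.py` so that `predict` needs only the input rows.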