Commit e13cb90

Merge pull request #93 from holgerteichgraeber/documentation
Documentation enhancements + draft of JOSS paper.
2 parents 9f4a326 + 24051e9 commit e13cb90

28 files changed, with 18272 additions and 17969 deletions

CONTRIBUTING.md

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+## How to contribute to ClustForOpt.jl
+Welcome! Thank you for considering contributing to `ClustForOpt.jl`. If you have a comment, question, feature request, or bug report, please open a new [issue](https://github.com/holgerteichgraeber/ClustForOpt.jl/issues).
+
+If you would like to file a bug report or contribute to the documentation or the code (always welcome!), the [JuMP.jl Contributing.md](https://github.com/JuliaOpt/JuMP.jl/blob/master/CONTRIBUTING.md) has some great tips on how to get started.

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ keywords = ["clustering", "JuMP", "optimization"]
 license = "MIT"
 desc = "julia implementation of using different clustering methods for finding representative periods for the optimization of energy systems"
 author = ["Holger Teichgraeber"]
-version = "0.4.0"
+version = "0.4.1"
 
 [deps]
 CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"

README.md

Lines changed: 58 additions & 73 deletions
@@ -5,15 +5,39 @@
 [![License](http://img.shields.io/badge/license-MIT-brightgreen.svg?style=flat)](LICENSE)
 [![Build Status](https://travis-ci.com/holgerteichgraeber/ClustForOpt.jl.svg?token=HRFemjSxM1NBCsbHGNDG&branch=master)](https://travis-ci.com/holgerteichgraeber/ClustForOpt.jl)
 
-ClustForOpt is a [julia](www.juliaopt.com) implementation of clustering methods for finding representative periods for the optimization of energy systems. The package can be used in conjunction with the multi-node capacity expansion model [CapacityExpansion](https://github.com/YoungFaithful/CapacityExpansion.jl).
 
-The package has two main purposes: 1) Provide a simple process of clustering time-series input data, with clustered data output in a generalized type system 2) provide an interface between clustered data and optimization problem.
+[ClustForOpt](https://github.com/holgerteichgraeber/ClustForOpt.jl) is a [julia](https://www.juliaopt.com) implementation of unsupervised machine learning methods for finding representative periods for energy systems optimization problems.
+By reducing the number of time steps used in the optimization model, using representative periods leads to significant reductions in computational complexity.
 
-The package follows the clustering framework presented in [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012).
-The package is actively developed, and new features are continuously added. For a reproducible version of the methods and data of the original paper by [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012), please refer to release [v0.1](https://github.com/holgerteichgraeber/ClustForOpt.jl/tree/v0.1).
+The package has three main purposes:
+1) Provide a simple process of finding representative periods for time-series input data, with implementations of the most commonly used clustering methods and extreme value selection methods.
+2) Provide an interface between representative period data and the optimization problem by having representative period data stored in a generalized type system.
+3) Provide a generalized import feature for time series, where variable names, attributes, and node names are automatically stored and can then be used in the definition of sets of the optimization problem later.
+
+An example energy systems optimization problem that uses ClustForOpt for its input data is the package [CapacityExpansion](https://github.com/YoungFaithful/CapacityExpansion.jl), which implements a scalable generation and transmission capacity expansion problem.
+
+The ClustForOpt package follows the clustering framework presented in [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012).
+The package is actively developed, and new features are continuously added. For a reproducible version of the methods and data of the original paper by [Teichgraeber and Brandt, 2019](https://doi.org/10.1016/j.apenergy.2019.02.012), please refer to [v0.1](https://github.com/holgerteichgraeber/ClustForOpt.jl/tree/v0.1).
 
 This package is developed by Holger Teichgraeber [@holgerteichgraeber](https://github.com/holgerteichgraeber) and Elias Kuepper [@YoungFaithful](https://github.com/youngfaithful).
 
+## Installation
+This package runs under julia v1.0 and higher.
+Install using:
+
+```julia
+import Pkg
+Pkg.add("ClustForOpt")
+```
+
+## Documentation
+[Documentation (Stable)](https://holgerteichgraeber.github.io/ClustForOpt.jl/stable): Please refer to this documentation for details on how to use the current version of ClustForOpt. This is the documentation of the default version of the package.
+
+[Documentation (Development)](https://holgerteichgraeber.github.io/ClustForOpt.jl/dev): If you would like to try the development version of ClustForOpt, please refer to this documentation.
+
+**See [NEWS](NEWS.md) for significant breaking changes when updating from one version of ClustForOpt to another.**
+
+## Citing ClustForOpt
 If you find ClustForOpt useful in your work, we kindly request that you cite the following paper ([link](https://doi.org/10.1016/j.apenergy.2019.02.012)):
 
 ```
@@ -28,82 +52,43 @@ If you find ClustForOpt useful in your work, we kindly request that you cite the
 }
 ```
 
-## Installation
-This package runs under julia v1.0 and higher.
-Install using:
+## Quick Start Guide
 
-```julia
-]
-add ClustForOpt
-```
-where `]` opens the julia package manager.
-
-**See [NEWS](NEWS.md) for significant breaking changes when updating from one version of ClustForOpt to another.**
-
-## Documentation
-[Stable](https://holgerteichgraeber.github.io/ClustForOpt.jl/stable)
+This quick start guide introduces the main concepts of using ClustForOpt. For more detail on the different functionalities that ClustForOpt provides, please refer to the subsequent chapters of the documentation or the examples in the [examples](https://github.com/holgerteichgraeber/ClustForOpt.jl/tree/master/examples) folder.
 
-[Development](https://holgerteichgraeber.github.io/ClustForOpt.jl/dev)
-
-## Workflow
-
-Generally, the workflow requires three steps:
+Generally, the workflow consists of three steps:
 - load data
-- clustering
+- find representative periods (clustering + extreme period selection)
 - optimization
 
-An example workflow with examples on how to use the different functions can be found in [`examples/workflow_introduction.jl`](examples/workflow_introduction.jl)
-
-```julia
+## Example Workflow
+After ClustForOpt is installed, you can use it by saying:
+```@repl workflow
 using ClustForOpt
-
-# load data (electricity price day ahead market)
-ts_input_data, = load_timeseries_data("DAM", "GER";K=365, T=24) #DAM
-
-# run standard kmeans clustering algorithm to cluster into 5 representative periods, with 1000 initial starting points
-clust_res = run_clust(ts_input_data;method="kmeans",representation="centroid",n_clust=5,n_init=1000)
-
-# battery operations optimization on the clustered data
-opt_res = run_opt(clust_res)
 ```
 
-### Load data
-`load_timeseries_data()` loads the data for a given `application` and `region`.
-Possible applications are
-- `DAM`: Day ahead market price data
-- `CEP`: Capacity Expansion Problem data
-
-Possible regions are:
-- `GER`: Germany
-- `CA`: California
-- `TX`: Texas
-
-The optional input parameters to `load_timeseries_data()` are the number of periods `K` and the number of time steps per period `T`. By default, they are chosen such that they result in daily time slices.
-
-
-### Clustering
-`run_clust()` takes the full `data` and gives a struct with the clustered data as the output.
-
-The input parameter `n_clust` determines the number of clusters,i.e., representative periods.
-
-#### Supported clustering methods
-
-The following combinations of clustering method and representations are supported by [run\_clust()](src/clustering/run_clust.jl):
-
-Name | method | representation
----- | --------------- | -----------------------
-k-means clustering | `<kmeans>` | `<centroid>`
-k-means clustering with medoid representation | `<kmeans>` | `<medoid>`
-k-medoids clustering (partitional) | `<kmedoids>` | `<medoid>`
-k-medoids clustering (exact) [requires Gurobi] | `<kmedoids_exact>` | `<medoid>`
-hierarchical clustering with centroid representation | `<hierarchical>` | `<centroid>`
-hierarchical clustering with medoid representation | `<hierarchical>` | `<medoid>`
-
-For use of DTW barycenter averaging (DBA) and k-shape clustering on single-attribute data (e.g. electricity prices), please use branch `v0.1-appl_energy-framework-comp`.
-
-
+The first step is to load the data. The following example loads hourly wind, solar, and demand data for Germany (1 region) for one year.
+```@repl workflow
+ts_input_data = load_timeseries_data(:CEP_GER1)
+```
+The output `ts_input_data` is a `ClustData` data struct that contains the data and additional information about the data.
+```@repl workflow
+ts_input_data.data # a dictionary with the data.
+ts_input_data.data["wind-germany"] # the wind data (choose solar, el_demand as other options in this example)
+ts_input_data.K # number of periods
+```
 
-### Optimization
-The function `run_opt()` runs the optimization problem and gives as an output a struct that contains optimal objective function value, decision variables, and additional info. The `run_opt()` function infers the optimization problem type from the input data. See the examples folder for further details.
+The second step is to cluster the data into representative periods. Here, we use k-means clustering and get 5 representative periods.
+```@repl workflow
+clust_res = run_clust(ts_input_data;method="kmeans",n_clust=5)
+ts_clust_data = clust_res.clust_data
+```
+The `ts_clust_data` is a `ClustData` data struct, this time with clustered data (i.e. fewer representative periods).
+```@repl workflow
+ts_clust_data.data # the clustered data
+ts_clust_data.data["wind-germany"] # the wind data. Note the dimensions compared to ts_input_data
+ts_clust_data.K # number of periods
+```
 
-A Capacity Expansion Optimization Problem that utilizes `ClustForOpt` can be found in the package [CapacityExpansion](https://github.com/YoungFaithful/CapacityExpansion.jl).
+The clustered input data can be used as input to an optimization problem.
+The optimization problem formulated in the package [CapacityExpansion](https://github.com/YoungFaithful/CapacityExpansion.jl) can be used with the data clustered in this example.
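
Reviewer sketch (not part of this commit): the README change above states that representative periods reduce the number of time steps in the optimization model. As a rough illustration, the following plain-Julia snippet compares one year of hourly data with the 5 representative periods used in the quick start example. The synthetic values and the one-column-per-period layout are assumptions made for illustration only; the diff does not confirm how `ClustData` stores its arrays, and no ClustForOpt function is called here.

```julia
# Assumed dimensions: daily periods of hourly data for one year.
T = 24          # time steps per period (hours per day)
K_full = 365    # periods in the full year
K_clust = 5     # representative periods, as in n_clust=5 above

# Synthetic stand-in for a single time series (hypothetical data).
hourly = collect(1.0:(T * K_full))

# Assumed layout: one column per period, one row per time step.
periods = reshape(hourly, T, K_full)

full_steps = length(periods)      # time steps in the full problem
reduced_steps = T * K_clust       # time steps with representative periods
reduction = 1 - reduced_steps / full_steps

println("time steps: ", full_steps, " -> ", reduced_steps)
println("reduction: ", round(100 * reduction; digits=1), " %")
```

The sketch only counts time steps; how much the solve time of a given optimization problem drops depends on the model and is not shown here.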
