| 1 | +\documentclass[a4paper,10pt, notitlepage]{report} |
| 2 | +\usepackage[utf8]{inputenc} |
| 3 | +\usepackage{natbib} |
| 4 | +\usepackage{amssymb} |
| 5 | +\usepackage{amsmath} |
| 6 | +\usepackage[shortlabels]{enumitem} |
| 7 | +% \usepackage[portuguese]{babel} |
| 8 | + |
| 9 | + |
| 10 | +% Title Page |
| 11 | +\title{O Brother, How Far Art Thou?} |
| 12 | +\author{Computational Statistics \\ Instructor: Luiz Max de Carvalho} |
| 13 | + |
| 14 | +\begin{document} |
| 15 | +\maketitle |
| 16 | + |
| 17 | +\textbf{Hand-in date: 06/10/2021.} |
| 18 | + |
| 19 | +\section*{General guidance} |
| 20 | +\begin{itemize} |
| 21 | + \item State and prove all non-trivial mathematical results necessary to substantiate your arguments; |
| 22 | + \item Do not forget to add appropriate scholarly references~\textit{at the end} of the document; |
| 23 | + \item Mathematical expressions also receive punctuation; |
| 24 | + \item Please hand in a single PDF file as your final main document. |
| 25 | + |
| 26 | + Code appendices are welcome,~\textit{in addition} to the main PDF document. |
| 27 | + \end{itemize} |
| 28 | + |
| 29 | +\newpage |
| 30 | + |
| 31 | +\section*{Background} |
| 32 | + |
| 33 | +A large portion of the content of this course is concerned with computing high-dimensional integrals~\textit{via} simulation. |
| 34 | +Today you will be introduced to a simple-looking problem whose closed-form solution is complicated to derive, but which we can also approach using simulation. |
| 35 | + |
| 36 | +Suppose you have a disc $C_R$ of radius $R$. |
| 37 | +Let $p = (p_x, p_y)$ and $q = (q_x, q_y)$ be two points in $C_R$. |
| 38 | +Consider the Euclidean distance between $p$ and $q$, $\lVert p-q \rVert = \sqrt{(p_x-q_x)^2 + (p_y-q_y)^2}$, which we also write as $|p-q|$. |
| 39 | +\paragraph{Problem A:} What is the \textit{average} distance between pairs of points in $C_R$ if they are picked uniformly at random? |
| 40 | + |
| 41 | +\section*{Part I: nuts and bolts} |
| 42 | + |
| 43 | +\begin{enumerate} |
| 44 | + \item To start building intuition, let's solve a related but much simpler problem. |
| 45 | + Consider an interval $[0, s]$, with $s>0$, and take $x_1, x_2 \in [0, s]$~\textit{uniformly at random}. |
| 46 | + Show that the average distance between $x_1$ and $x_2$ is $s/3$. |
| 47 | + \item Show that Problem A is equivalent to computing |
| 48 | + \begin{equation*} |
| 49 | + I = \frac{1}{\pi^2 R^4}\int_{0}^{R}\int_{0}^{R}\int_{0}^{2\pi}\int_{0}^{2\pi}\sqrt{r_1^2 + r_2^2 - 2r_1r_2\cos\phi(\theta_1, \theta_2)}r_1r_2\,d\theta_1\,d\theta_2\,dr_1\,dr_2, |
| 50 | + \end{equation*} |
| 51 | + where $\phi(\theta_1, \theta_2)$ is the central angle between the two points, whose polar coordinates are $(r_1, \theta_1)$ and $(r_2, \theta_2)$. |
| 52 | + |
| 53 | + \textit{Hint:} Draw a picture. |
| 54 | + \item Compute $I$ in closed form. |
| 55 | + |
| 56 | + \textit{Hint:} Look up \textit{Crofton's mean value theorem} or \textit{Crofton's formula}. |
| 57 | +\end{enumerate} |
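As a starting point for item 1, and assuming (as is standard in this setting) that $x_1$ and $x_2$ are drawn independently, the average distance is by definition the expectation
\begin{equation*}
 E\left[|x_1 - x_2|\right] = \frac{1}{s^2}\int_{0}^{s}\int_{0}^{s}|x_1 - x_2|\,dx_1\,dx_2,
\end{equation*}
since the joint density of $(x_1, x_2)$ is $1/s^2$ on $[0, s]^2$.
This is only the setup, not the solution: the integral $I$ in item 2 has the same structure, with the uniform density on the disc playing the role of $1/s^2$.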
| 58 | + |
| 59 | +\section*{Part II: getting your hands dirty} |
| 60 | + |
| 61 | +Now we will move on to implementation. |
| 62 | + |
| 63 | +\paragraph{Problem B:} Employ a simulation algorithm to approximate $I$. |
| 64 | +Provide point and interval estimates and give theoretical guarantees about them (consistency, coverage, etc.). |
| 65 | + |
| 66 | +\begin{enumerate} |
| 67 | + \item You have been (randomly) assigned a simulation method -- see list at the end. |
| 68 | + Represent $I$ as $\int_{\mathcal{X}} \phi(x)\pi(x)\,dx$ and justify your choice of $\phi$, $\pi$ and $\mathcal{X}$. |
| 69 | + Recall that these choices are arbitrary up to a point, but they might lead to wildly different empirical performances~\textbf{and} theoretical properties for estimators of $I$. |
| 70 | + \textbf{Justify} your choices in light of the method you have been given to work with. |
| 71 | + Choose wisely and be rigorous in your justifications. |
| 72 | + \item Again, starting from the eventual samples you will obtain with your method, construct a non-empty\footnote{This is a joke. |
| 73 | + It means you should come up with at least one estimator. But you might, and are even encouraged to, entertain more than one estimator.} family of estimators of $I$ and discuss whether it is (strongly) consistent and whether a central limit theorem can be established. |
| 74 | + \item Detail a suite of diagnostics that might be employed in your application to detect convergence or performance problems. |
| 75 | + Extra points for those who design algorithms that exploit the structure of this particular integration problem. |
| 76 | + \item For each $R \in \{0.01, 0.1, 1, 10, 100, 1000, 10000\}$, perform $M=500$ runs of your simulation method and compute: (i) the variance; (ii) the bias; (iii) the standard deviation of the mean, i.e., the Monte Carlo standard error (MCSE). |
| 77 | + \item Can you identify one key quantity missing from the previous item? |
| 78 | + \textit{Hint:} it bears relevance to the real-world application of any computational method. |
| 79 | +\end{enumerate} |
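As a sketch of how the summaries over the $M$ runs might be defined (one possible reading, not the only one), write $\hat{I}_m$ for the point estimate from run $m = 1, \ldots, M$ and $I$ for the closed-form value from Part I.
The symbols $\hat{I}_m$, $\bar{I}_M$ and $S_M^2$ are notation assumed here for illustration and are not fixed by the assignment.
Then one may take
\begin{align*}
 \bar{I}_M &= \frac{1}{M}\sum_{m=1}^{M}\hat{I}_m, \\
 \widehat{\text{bias}} &= \bar{I}_M - I, \\
 S_M^2 &= \frac{1}{M-1}\sum_{m=1}^{M}\left(\hat{I}_m - \bar{I}_M\right)^2, \\
 \widehat{\text{MCSE}} &= \sqrt{S_M^2/M}.
\end{align*}
Under this reading the MCSE is the standard error of $\bar{I}_M$ across runs; a within-run standard error computed from the samples of a single run is another defensible choice, so state clearly which one you report.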
| 80 | + |
| 81 | +Here we list the methods that are randomly assigned to students, along with the questions that must be answered for each method. |
| 82 | + |
| 83 | +\begin{itemize} |
| 84 | + \item \textbf{Rejection sampling} |
| 85 | + \begin{itemize} |
| 86 | + \item Justify your choice of proposal distribution and show that it conforms to the necessary conditions for the algorithm to work; in particular, try to find a proposal that gives the highest acceptance probability. |
| 87 | + \end{itemize} |
| 88 | + \item \textbf{Importance sampling} |
| 89 | + \begin{itemize} |
| 90 | + \item Justify your choice of proposal based on the variance of the resulting estimator. |
| 91 | + \end{itemize} |
| 92 | + \item \textbf{Gibbs sampling} |
| 93 | + \begin{itemize} |
| 94 | + \item Write your full conditionals out and show that they adhere to the Hammersley-Clifford condition. |
| 95 | + \end{itemize} |
| 96 | + \item \textbf{Metropolis-Hastings} |
| 97 | + \begin{itemize} |
| 98 | + \item Justify your choice of proposal; test different ones if you need to. |
| 99 | + \end{itemize} |
| 100 | +\end{itemize} |
| 101 | +% |
| 102 | +% \bibliographystyle{apalike} |
| 103 | +% \bibliography{refs} |
| 104 | + |
| 105 | +\end{document} |
| 106 | + |