---
title: "Managed Service for Apache Flink"
linkTitle: "Managed Service for Apache Flink"
description: >
  Get started with Managed Service for Apache Flink on LocalStack
tags: ["Pro image"]
---

{{< callout >}}
This service was formerly known as 'Kinesis Data Analytics for Apache Flink'.
{{< /callout >}}

## Introduction

[Apache Flink](https://flink.apache.org/) is a framework for building applications that process and analyze streaming data.
[Managed Service for Apache Flink (MSAF)](https://docs.aws.amazon.com/managed-flink/latest/java/what-is.html) is an AWS service that provides the underlying infrastructure and a hosted Apache Flink cluster that can run Apache Flink applications.

LocalStack lets you run Flink applications locally and implements several [AWS-compatible API operations](https://docs.localstack.cloud/references/coverage/coverage_kinesisanalyticsv2/).

{{< callout "note" >}}
The emulated MSAF provider was introduced and made the default in LocalStack v4.1.

If you wish to use the older mock provider, you can set `PROVIDER_OVERRIDE_KINESISANALYTICSV2=legacy`.
{{< /callout >}}
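
For example, if you start LocalStack via the CLI, the override can be set as an environment variable (a sketch of one possible setup; the same variable can be passed with `docker run -e` or a Docker Compose `environment:` entry):

{{< command >}}
$ PROVIDER_OVERRIDE_KINESISANALYTICSV2=legacy localstack start
{{< /command >}}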

## Getting Started

This guide builds a demo Flink application and deploys it to LocalStack.
The application generates synthetic records, processes them and sends the output to an S3 bucket.

Start the LocalStack container using your preferred method.
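
For example, using the LocalStack CLI (one option among several; the Pro image requires an Auth Token, shown here as a placeholder):

{{< command >}}
$ LOCALSTACK_AUTH_TOKEN=<your-auth-token> localstack start
{{< /command >}}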

### Build Application Code

Begin by cloning the AWS sample repository.
We will use the [S3 Sink](https://github.com/localstack-samples/amazon-managed-service-for-apache-flink-examples/tree/main/java/S3Sink) application in this example.

{{< command >}}
$ git clone https://github.com/localstack-samples/amazon-managed-service-for-apache-flink-examples.git
$ cd amazon-managed-service-for-apache-flink-examples/java/S3Sink
{{< /command >}}

Next, use [Maven](https://maven.apache.org/) to compile and package the Flink application into a jar.

{{< command >}}
$ mvn package
{{< /command >}}
The Flink application jar file will be placed at `./target/flink-kds-s3.jar`.
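
As a quick sanity check, you can confirm the jar was built before uploading it:

{{< command >}}
$ ls -lh ./target/flink-kds-s3.jar
{{< /command >}}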

### Upload Application Code

MSAF requires that all application code resides in S3.

Create an S3 bucket and upload the compiled Flink application jar.

{{< command >}}
$ awslocal s3api create-bucket --bucket flink-bucket
$ awslocal s3api put-object --bucket flink-bucket --key job.jar --body ./target/flink-kds-s3.jar
{{< /command >}}
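
To confirm the upload succeeded, you can fetch the object's metadata:

{{< command >}}
$ awslocal s3api head-object --bucket flink-bucket --key job.jar
{{< /command >}}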

### Output Sink

As mentioned earlier, this Flink application writes the output to an S3 bucket.

Create the S3 bucket that will serve as the sink.

{{< command >}}
$ awslocal s3api create-bucket --bucket sink-bucket
{{< /command >}}

### Permissions

MSAF requires a service execution role that allows it to connect to other services.
Without the proper permissions policy and role, this example application will not be able to connect to the S3 sink bucket to output the result.

Create an IAM role for the running MSAF application to assume.
Save the following trust policy as `role.json`:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "kinesisanalytics.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
```

{{< command >}}
$ awslocal iam create-role --role-name msaf-role --assume-role-policy-document file://role.json
{{< /command >}}
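
You can verify that the role and its trust policy were created as expected:

{{< command >}}
$ awslocal iam get-role --role-name msaf-role
{{< /command >}}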

Next, attach a permissions policy to this role that permits read and write access to S3.
Save the following as `policy.json`:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject"],
      "Resource": "*"
    }
  ]
}
```

{{< command >}}
$ awslocal iam put-role-policy --role-name msaf-role --policy-name msaf-policy --policy-document file://policy.json
{{< /command >}}
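
To double-check that the inline policy is attached:

{{< command >}}
$ awslocal iam get-role-policy --role-name msaf-role --policy-name msaf-policy
{{< /command >}}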

Now, when the running MSAF application assumes this role, it will have the necessary permissions to write to the S3 sink.

### Deploy Application

With all prerequisite resources in place, the Flink application can now be created and started.

{{< command >}}
$ awslocal kinesisanalyticsv2 create-application \
  --application-name msaf-app \
  --runtime-environment FLINK-1_20 \
  --application-mode STREAMING \
  --service-execution-role arn:aws:iam::000000000000:role/msaf-role \
  --application-configuration '{
    "ApplicationCodeConfiguration": {
      "CodeContent": {
        "S3ContentLocation": {
          "BucketARN": "arn:aws:s3:::flink-bucket",
          "FileKey": "job.jar"
        }
      },
      "CodeContentType": "ZIPFILE"
    },
    "EnvironmentProperties": {
      "PropertyGroups": [{
        "PropertyGroupId": "bucket", "PropertyMap": {"name": "sink-bucket"}
      }]
    }
  }'

$ awslocal kinesisanalyticsv2 start-application --application-name msaf-app
{{< /command >}}
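
Starting the Flink cluster can take a short while. You can poll the application status until it reaches `RUNNING`:

{{< command >}}
$ awslocal kinesisanalyticsv2 describe-application \
  --application-name msaf-app \
  --query 'ApplicationDetail.ApplicationStatus'
{{< /command >}}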

Once the Flink cluster is up and running, the application will stream the results to the sink S3 bucket.
You can verify this with:

{{< command >}}
$ awslocal s3api list-objects --bucket sink-bucket
{{< /command >}}
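
To inspect the records themselves, you can copy the bucket contents to a local directory (the object keys are generated by the Flink file sink, so they will vary between runs):

{{< command >}}
$ awslocal s3 cp s3://sink-bucket/ ./sink-output --recursive
{{< /command >}}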

## Supported Flink Versions

| Flink version | Supported by LocalStack | Supported by Apache |
|:---:|:---:|:---:|
| 1.20.0 | yes | yes |
| 1.19.1 | yes | yes |
| 1.18.1 | yes | yes |
| 1.15.2 | yes | yes |
| 1.13.1 | yes | no |

## Limitations

- Application versions are not maintained
- Only S3 zipfile code is supported
- Values of 20,000 ms for `execution.checkpointing.interval` and 5,000 ms for `execution.checkpointing.min-pause` are used for checkpointing; they cannot be overridden
- [Tagging](https://docs.aws.amazon.com/managed-flink/latest/java/how-tagging.html) is not supported
- In-place [version upgrades](https://docs.aws.amazon.com/managed-flink/latest/java/how-in-place-version-upgrades.html) and [roll-backs](https://docs.aws.amazon.com/managed-flink/latest/java/how-system-rollbacks.html) are not supported
- [Snapshot/savepoint management](https://docs.aws.amazon.com/managed-flink/latest/java/how-snapshots.html) is not implemented
- CloudWatch and CloudTrail integration is not implemented