This repository was archived by the owner on Aug 7, 2025. It is now read-only.

Commit 8bae102

Docs for Managed Service for Apache Flink (#1530)

1 parent 7f7ca81

File tree: 3 files changed, +197 −23 lines

content/en/user-guide/aws/firehose/index.md

Lines changed: 10 additions & 10 deletions

````diff
@@ -1,27 +1,27 @@
 ---
-title: "Kinesis Data Firehose"
-linkTitle: "Kinesis Data Firehose"
+title: "Data Firehose"
+linkTitle: "Data Firehose"
 description: >
-  Get started with Kinesis Data Firehose on LocalStack
+  Get started with Data Firehose on LocalStack
 aliases:
   - /user-guide/aws/kinesis-firehose/
 ---

 {{< callout >}}
-Amazon recently renamed Kinesis Data Firehose to Data Firehose.
+This service was formerly called 'Kinesis Data Firehose'.
 {{< /callout >}}

 ## Introduction

-Kinesis Data Firehose is a service provided by AWS that allows you to extract, transform and load streaming data into various destinations, such as Amazon S3, Amazon Redshift, and Elasticsearch.
-With Kinesis Data Firehose, you can ingest and deliver real-time data from different sources as it automates data delivery, handles buffering and compression, and scales according to the data volume.
+Data Firehose is a service provided by AWS that allows you to extract, transform and load streaming data into various destinations, such as Amazon S3, Amazon Redshift, and Elasticsearch.
+With Data Firehose, you can ingest and deliver real-time data from different sources as it automates data delivery, handles buffering and compression, and scales according to the data volume.

-LocalStack allows you to use the Kinesis Data Firehose APIs in your local environment to load and transform real-time data.
-The supported APIs are available on our [API coverage page](https://docs.localstack.cloud/references/coverage/coverage_firehose/), which provides information on the extent of Kinesis Data Firehose's integration with LocalStack.
+LocalStack allows you to use the Data Firehose APIs in your local environment to load and transform real-time data.
+The supported APIs are available on our [API coverage page](https://docs.localstack.cloud/references/coverage/coverage_firehose/), which provides information on the extent of Data Firehose's integration with LocalStack.

 ## Getting started

-This guide is designed for users new to Kinesis Data Firehose and assumes basic knowledge of the AWS CLI and our [`awslocal`](https://github.com/localstack/awscli-local) wrapper script.
+This guide is designed for users new to Data Firehose and assumes basic knowledge of the AWS CLI and our [`awslocal`](https://github.com/localstack/awscli-local) wrapper script.

 Start your LocalStack container using your preferred method.
 We will demonstrate how to use Firehose to load Kinesis data into Elasticsearch with S3 Backup with the AWS CLI.
@@ -162,7 +162,7 @@ Additionally, take a look at the designated S3 bucket to ensure the backup proce

 ## Examples

-The following code snippets and sample applications provide practical examples of how to use Kinesis Data Firehose in LocalStack for various use cases:
+The following code snippets and sample applications provide practical examples of how to use Data Firehose in LocalStack for various use cases:

 - [Search application with Lambda, Kinesis, Firehose, ElasticSearch, S3](https://github.com/localstack/sample-fuzzy-movie-search-lambda-kinesis-elasticsearch)
 - [Streaming Data Pipeline with Kinesis, Tinybird, CloudWatch, Lambda](https://github.com/localstack/serverless-streaming-data-pipeline)
````

content/en/user-guide/aws/kinesisanalytics/index.md

Lines changed: 12 additions & 13 deletions

````diff
@@ -1,24 +1,24 @@
 ---
-title: "Kinesis Data Analytics"
-linkTitle: "Kinesis Data Analytics"
+title: "Kinesis Data Analytics for SQL Applications"
+linkTitle: "Kinesis Data Analytics for SQL Applications"
 description: >
-  Get started with Kinesis Data Analytics on LocalStack
+  Get started with Kinesis Data Analytics for SQL Applications on LocalStack
 tags: ["Pro image"]
 ---

+{{< callout "warning" >}}
+This service is deprecated and marked for removal.
+{{< /callout >}}
+
 ## Introduction

-Kinesis Data Analytics is a service offered by Amazon Web Services (AWS) that enables you to process and analyze streaming data in real-time.
-Kinesis Data Analytics allows you to apply transformations, filtering, and enrichment to streaming data using standard SQL syntax.
-You can also run Java or Scala programs against streaming sources to perform various operations on the data using Apache Flink.
+Kinesis Data Analytics for SQL Applications is a service offered by Amazon Web Services (AWS) that enables you to process and analyze streaming data in real-time.
+It allows you to apply transformations, filtering, and enrichment to streaming data using standard SQL syntax.

 LocalStack allows you to use the Kinesis Data Analytics APIs in your local environment.
 The API coverage is available on:

-* [Kinesis Data Analytics V1](https://docs.localstack.cloud/references/coverage/coverage_kinesisanalytics/)
-* [Kinesis Data Analytics V2](https://docs.localstack.cloud/references/coverage/coverage_kinesisanalyticsv2/)
-
-This provides information on the extent of Kinesis Data Analytics integration with LocalStack.
+* [Kinesis Data Analytics](https://docs.localstack.cloud/references/coverage/coverage_kinesisanalytics/)

 ## Getting started

@@ -105,8 +105,7 @@ The following output would be retrieved:
 }
 ```

-## Current Limitations
+## Limitations

-* LocalStack supports basic emulation for the version 1 of the Kinesis Data Analytics API.
+* LocalStack supports basic emulation for Kinesis Data Analytics for SQL Applications.
   However, the queries are not fully supported and lack parity with AWS.
-* LocalStack supports CRUD mocking for the version 2 of the Kinesis Data Analytics API.
````
Lines changed: 175 additions & 0 deletions (new file)

---
title: "Managed Service for Apache Flink"
linkTitle: "Managed Service for Apache Flink"
description: >
  Get started with Managed Service for Apache Flink on LocalStack
tags: ["Pro image"]
---

{{< callout >}}
This service was formerly known as 'Kinesis Data Analytics for Apache Flink'.
{{< /callout >}}

## Introduction

[Apache Flink](https://flink.apache.org/) is a framework for building applications that process and analyze streaming data.
[Managed Service for Apache Flink (MSAF)](https://docs.aws.amazon.com/managed-flink/latest/java/what-is.html) is an AWS service that provides the underlying infrastructure and a hosted Apache Flink cluster that can run Apache Flink applications.

LocalStack lets you run Flink applications locally and implements several [AWS-compatible API operations](https://docs.localstack.cloud/references/coverage/coverage_kinesisanalyticsv2/).

{{< callout "note" >}}
The emulated MSAF provider was introduced and made the default in LocalStack v4.1.

If you wish to use the older mock provider, you can set `PROVIDER_OVERRIDE_KINESISANALYTICSV2=legacy`.
{{< /callout >}}

## Getting Started

This guide builds a demo Flink application and deploys it to LocalStack.
The application generates synthetic records, processes them, and sends the output to an S3 bucket.

Start the LocalStack container using your preferred method.

### Build Application Code

Begin by cloning the AWS sample repository.
We will use the [S3 Sink](https://github.com/localstack-samples/amazon-managed-service-for-apache-flink-examples/tree/main/java/S3Sink) application in this example.

{{< command >}}
$ git clone https://github.com/localstack-samples/amazon-managed-service-for-apache-flink-examples.git
$ cd java/S3Sink
{{< /command >}}

Next, use [Maven](https://maven.apache.org/) to compile and package the Flink application into a jar.

{{< command >}}
$ mvn package
{{< /command >}}

The Flink application jar will be placed at `./target/flink-kds-s3.jar`.

### Upload Application Code

MSAF requires that all application code resides in S3.

Create an S3 bucket and upload the compiled Flink application jar.

{{< command >}}
$ awslocal s3api create-bucket --bucket flink-bucket
$ awslocal s3api put-object --bucket flink-bucket --key job.jar --body ./target/flink-kds-s3.jar
{{< /command >}}

### Output Sink

As mentioned earlier, this Flink application writes the output to an S3 bucket.

Create the S3 bucket that will serve as the sink.

{{< command >}}
$ awslocal s3api create-bucket --bucket sink-bucket
{{< /command >}}

### Permissions

MSAF requires a service execution role that allows it to connect to other services.
Without the proper permissions policy and role, this example application will not be able to connect to the S3 sink bucket to write its output.

Create an IAM role for the running MSAF application to assume.

```json
# role.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "kinesisanalytics.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
```

{{< command >}}
$ awslocal iam create-role --role-name msaf-role --assume-role-policy-document file://role.json
{{< /command >}}
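If you script your local setup, the same trust policy can be built in code rather than kept in a separate file. The sketch below is a minimal, hypothetical helper (standard library only) that produces the document passed to `create-role` above and sanity-checks the service principal:

```python
import json

def build_trust_policy(service: str = "kinesisanalytics.amazonaws.com") -> str:
    """Build the IAM trust policy that lets the MSAF service assume the role."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    return json.dumps(policy, indent=2)

# Sanity check: the Flink service principal must be allowed to assume the role.
doc = json.loads(build_trust_policy())
assert doc["Statement"][0]["Principal"]["Service"] == "kinesisanalytics.amazonaws.com"
```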
Next, attach a permissions policy to this role that permits read and write access to S3.

```json
# policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:GetObjectVersion", "s3:PutObject"],
      "Resource": "*"
    }
  ]
}
```

{{< command >}}
$ awslocal iam put-role-policy --role-name msaf-role --policy-name msaf-policy --policy-document file://policy.json
{{< /command >}}

Now, when the running MSAF application assumes this role, it will have the necessary permissions to write to the S3 sink.

### Deploy Application

With all prerequisite resources in place, the Flink application can now be created and started.

{{< command >}}
$ awslocal kinesisanalyticsv2 create-application \
    --application-name msaf-app \
    --runtime-environment FLINK-1_20 \
    --application-mode STREAMING \
    --service-execution-role arn:aws:iam::000000000000:role/msaf-role \
    --application-configuration '{
        "ApplicationCodeConfiguration": {
            "CodeContent": {
                "S3ContentLocation": {
                    "BucketARN": "arn:aws:s3:::flink-bucket",
                    "FileKey": "job.jar"
                }
            },
            "CodeContentType": "ZIPFILE"
        },
        "EnvironmentProperties": {
            "PropertyGroups": [{
                "PropertyGroupId": "bucket", "PropertyMap": {"name": "sink-bucket"}
            }]
        }
    }'

$ awslocal kinesisanalyticsv2 start-application --application-name msaf-app
{{< /command >}}
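The inline JSON passed to `--application-configuration` above is easier to maintain when built programmatically. Below is a minimal sketch (standard library only; the bucket and key names are the ones created earlier in this guide) that constructs the same payload and serializes it for the CLI flag; with boto3 pointed at LocalStack, a dict of this shape could likewise be passed as the `ApplicationConfiguration` parameter:

```python
import json

# ApplicationConfiguration payload mirroring the create-application call above.
app_config = {
    "ApplicationCodeConfiguration": {
        "CodeContent": {
            "S3ContentLocation": {
                "BucketARN": "arn:aws:s3:::flink-bucket",  # bucket holding job.jar
                "FileKey": "job.jar",
            }
        },
        "CodeContentType": "ZIPFILE",
    },
    "EnvironmentProperties": {
        # The sample application reads the sink bucket name from this property group.
        "PropertyGroups": [
            {"PropertyGroupId": "bucket", "PropertyMap": {"name": "sink-bucket"}}
        ]
    },
}

# Serialize for use as the --application-configuration argument.
config_json = json.dumps(app_config)
```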
Once the Flink cluster is up and running, the application will stream the results to the sink S3 bucket.
You can verify this with:

{{< command >}}
$ awslocal s3api list-objects --bucket sink-bucket
{{< /command >}}

## Supported Flink Versions

| Flink version | Supported by LocalStack | Supported by Apache |
|:---:|:---:|:---:|
| 1.20.0 | yes | yes |
| 1.19.1 | yes | yes |
| 1.18.1 | yes | yes |
| 1.15.2 | yes | yes |
| 1.13.1 | yes | no |
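The `--runtime-environment` flag used in the deploy step encodes the Flink major and minor version, e.g. `FLINK-1_20` for 1.20.x. A small hypothetical helper to derive that string from a version in the table above (the naming scheme is assumed from the `FLINK-1_20` value used in this guide):

```python
def runtime_environment(version: str) -> str:
    """Derive the RuntimeEnvironment string for create-application,
    e.g. "1.20.0" -> "FLINK-1_20" (scheme assumed from this guide)."""
    major, minor = version.split(".")[:2]
    return f"FLINK-{major}_{minor}"

assert runtime_environment("1.20.0") == "FLINK-1_20"
```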

## Limitations

- Application versions are not maintained
- Only S3 zipfile code is supported
- Values of 20,000 ms for `execution.checkpointing.interval` and 5,000 ms for `execution.checkpointing.min-pause` are used for checkpointing.
  They cannot be overridden.
- [Tagging](https://docs.aws.amazon.com/managed-flink/latest/java/how-tagging.html) is not supported
- In-place [version upgrades](https://docs.aws.amazon.com/managed-flink/latest/java/how-in-place-version-upgrades.html) and [roll-backs](https://docs.aws.amazon.com/managed-flink/latest/java/how-system-rollbacks.html) are not supported
- [Snapshot/savepoint management](https://docs.aws.amazon.com/managed-flink/latest/java/how-snapshots.html) is not implemented
- CloudWatch and CloudTrail integration is not implemented
