Containerized Elastic Stack (Elasticsearch, Logstash, and Kibana) with Docker Compose.
This repository contains the source for building immutable Docker images for the Elastic Stack. The images are published on Docker Hub and can be used as base images for running the Elastic Stack in a containerized environment.
The Elastic Stack services can be configured using environment variables in the `.env` file. The following variables are available:
- `STACK_VERSION`: The version of the Elastic Stack to use. The default value is `8.14.3`.
- `ELASTICSEARCH_HTTP_PORT`: The port on which Elasticsearch listens for incoming connections. The default port is `9200`.
- `ELASTIC_PASSWORD`: The password for the `elastic` user. This password is used to authenticate with Elasticsearch and Kibana.
- `KIBANA_PORT`: The port on which Kibana listens for incoming connections. The default port is `5601`.
- `LOGSTASH_PORT`: The port on which Logstash listens for incoming connections. The default port is `5044`.
- `ELASTICSEARCH_JAVA_OPTS`: The Java options for Elasticsearch. The default value is `-Xmx1g -Xms1g`.
- `LS_JAVA_OPTS`: The Java options for Logstash. The default value is `-Xmx1g -Xms1g`.
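For reference, a minimal `.env` could look like the sketch below, using the defaults listed above (the `changeme` password is only a placeholder; set your own value):

```
# Example .env (values are the documented defaults)
STACK_VERSION=8.14.3
ELASTICSEARCH_HTTP_PORT=9200
# Placeholder - replace with your own password
ELASTIC_PASSWORD=changeme
KIBANA_PORT=5601
LOGSTASH_PORT=5044
ELASTICSEARCH_JAVA_OPTS=-Xmx1g -Xms1g
LS_JAVA_OPTS=-Xmx1g -Xms1g
```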
To get started, clone this repository to your local machine:
```
git clone git@github.com:tanhongit/docker-elasticsearch-logstash-kibana.git
```

Change the directory to the cloned repository:

```
cd docker-elasticsearch-logstash-kibana
```

Create a `.env` file from the `.env.example` file:

```
cp .env.example .env
```

Change the value of `ELASTIC_PASSWORD` and any other variables in the `.env` file as needed.
Start the Elastic Stack services:
```
docker-compose up -d
```

Access the Kibana web interface by navigating to http://localhost:5601 in a web browser. (Use your custom `KIBANA_PORT` if you have changed it in the `.env` file.)
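To quickly verify that Elasticsearch itself is up, you can query it directly. This is a sketch assuming the default port, so adjust it if you changed `ELASTICSEARCH_HTTP_PORT`, and add `-u elastic:<your password>` if your setup requires authentication:

```
# Returns a small JSON document with node, cluster, and version info
curl "http://localhost:9200"
```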
To run as amd64, you need to set the default platform to linux/amd64:
```
export DOCKER_DEFAULT_PLATFORM=linux/amd64
```

The default configuration is set for linux/amd64, so no further changes are needed.
If you encounter `exec /bin/tini: exec format error`, you need to change the platform in `docker-compose.yml`:

```
platform: ${PLATFORM:-linux/arm64}
```
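If the compose file reads the platform from a `PLATFORM` variable as in the line above, one way to switch it without editing the file is to set the variable in `.env` and recreate the containers (a sketch):

```
# Override the platform via .env; adjust the value for your host architecture
echo "PLATFORM=linux/arm64" >> .env
docker-compose up -d --force-recreate
```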
This is an example of how to import data (a CSV file) into Elasticsearch using Logstash:
- Create a `logstash.conf` file with the following content:

```
input {
file {
path => "/usr/share/logstash/data/employees.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
csv {
separator => ","
columns => ["id", "name", "code", "salary"]
}
mutate {
convert => {
"id" => "integer"
"name" => "string"
"code" => "integer"
"salary" => "float"
}
}
}
output {
elasticsearch {
hosts => "${ELASTIC_HOSTS}"
user => "elastic"
password => "${ELASTIC_PASSWORD}"
index => "employees"
}
stdout { codec => rubydebug }
}
```
Note:
- The `logstash.conf` file reads data from the `employees.csv` file and imports it into Elasticsearch.
- The `employees.csv` file should be placed in the `logstash/data` directory.
- The `ELASTIC_HOSTS` and `ELASTIC_PASSWORD` environment variables are used to connect to Elasticsearch.
- The `employees` index is created in Elasticsearch.
- The `rubydebug` codec is used to output the data to the console.
- The `sincedb_path` is set to `/dev/null` to avoid saving the state of the file.
- Create an `employees.csv` file with the following content:

```
id,name,code,salary
1,Alice,1001,50000
2,Bob,1002,60000
3,Charlie,1003,70000
4,Dave,1004,80000
5,Eve,1005,90000
```
- Update `docker-compose.yml` to include the Logstash service:

```
logstash:
build:
context: logstash
args:
STACK_VERSION: ${STACK_VERSION:-8.14.3}
container_name: "${COMPOSE_PROJECT_NAME}-logstash"
environment:
NODE_NAME: "logstash"
LS_JAVA_OPTS: "${LS_JAVA_OPTS}"
ELASTIC_USERNAME: "elastic"
ELASTIC_PASSWORD: "${ELASTIC_PASSWORD}"
ELASTIC_HOSTS: "http://elasticsearch:9200"
volumes:
- ./logstash/logstash.conf:/usr/share/logstash/pipeline/logstash.conf
- ./logstash/logstash.yml:/usr/share/logstash/config/logstash.yml
- ./logstash/data/employees.csv:/usr/share/logstash/data/employees.csv # Add this line
    ...
```

- Start the Logstash service:

```
docker-compose up -d logstash
```
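To confirm that the CSV rows were picked up, you can follow the Logstash container logs; with the `rubydebug` codec from the pipeline above, each imported row is also printed to stdout (a sketch using the service name from the compose snippet):

```
# Follow Logstash output while the file is being imported
docker-compose logs -f logstash
```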
- Accessing the Elasticsearch API

You can access the Elasticsearch API using curl or tools like Postman. Here are some examples:
```
curl -X GET "localhost:9200/employees/_search?pretty"
```

Note:
- Replace `employees` with the name of the index you want to query.
- Change the port number if you have modified `ELASTICSEARCH_HTTP_PORT` in the `.env` file.
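A couple of additional query sketches against the same index, using standard Elasticsearch APIs (adjust host and port as noted above):

```
# Count the documents in the employees index
curl -X GET "localhost:9200/employees/_count?pretty"

# Search for a specific employee by name
curl -X GET "localhost:9200/employees/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "name": "Alice" } }
}'
```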
❤️‍🔥 Check this branch to see the full example: [feat/import-csv-with-logstash/docker-compose] ❤️‍🔥
This Makefile helps you quickly run Python Spark test scripts inside Docker containers (`spark-master` or `spark-worker`) without remembering the long `docker exec` and `spark-submit` command syntax.
```
make run [CONTAINER=<spark-master|spark-worker>] [TEST_FILE=<test_file.py>]
```

- `CONTAINER` (optional): The name of the Docker container to run the Spark job in. Default is `spark-master`.
- `TEST_FILE` (optional): The name of the Python test file inside `/opt/bitnami/spark/tests` in the container. Default is `test_duplicate_people_names.py`.
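Under the hood, `make run` wraps the `docker exec` + `spark-submit` invocation it saves you from typing, roughly like this (a sketch using the documented defaults; the actual Makefile may differ slightly):

```
# Equivalent raw command for the default container and test file
docker exec -it spark-master spark-submit /opt/bitnami/spark/tests/test_duplicate_people_names.py
```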
- Run the default test file on the `spark-master` container:

```
make run
```

- Run `test_employees.py` on the `spark-master` container:

```
make run TEST_FILE=test_employees.py
```

- Run the default test on the `spark-worker` container:

```
make run CONTAINER=spark-worker
```

- Run `test_connections.py` on the `spark-worker` container:

```
make run CONTAINER=spark-worker TEST_FILE=test_connections.py
```

More examples (files in the `spark/tests` folder):
```
make run CONTAINER=spark-worker TEST_FILE=connections_analysis.py
make run CONTAINER=spark-worker TEST_FILE=test_duplicate_people_names.py
make run CONTAINER=spark-worker TEST_FILE=test_employees.py
make run CONTAINER=spark-worker TEST_FILE=test_name_start_with_text.py
make run CONTAINER=spark-worker TEST_FILE=test_people.py
make run CONTAINER=spark-worker TEST_FILE=test_relationship_count_per_user.py
make run CONTAINER=spark-worker TEST_FILE=test_spark.py
make run CONTAINER=spark-worker TEST_FILE=test_top_relationships.py
make run CONTAINER=spark-worker TEST_FILE=case/test_employees_case.py
make run CONTAINER=spark-worker TEST_FILE=case/test_connections_case.py
make run CONTAINER=spark-worker TEST_FILE=case/test_people_case.py
make run CONTAINER=spark-worker TEST_FILE=case/test_employees_rdd_case.py
make run CONTAINER=spark-worker TEST_FILE=case/test_connections_rdd_case.py
```

- Identify running containers using `docker ps`.
- Write or update Python test files in the `spark/tests` folder.
- Use the `make run` command with appropriate parameters.
- Check the test results printed in the terminal.