Init script for Cassandra with docker-compose
I solved this problem by patching cassandra's docker-entrypoint.sh
so it will execute sh
and cql
files located in /docker-entrypoint-initdb.d
on startup. This is similar to how MySQL docker containers work.
Basically, I add a small script at the end of the docker-entrypoint.sh
(right before the last line, exec "$@"
), that will run the cql scripts once cassandra is up. A simplified version is:
INIT_DIR=docker-entrypoint-initdb.d
# this whole block will execute in the background
(
cd $INIT_DIR
# wait for cassandra to be ready
while ! cqlsh -e 'describe cluster' > /dev/null 2>&1; do sleep 6; done
echo "$0: Cassandra cluster ready: executing cql scripts found in $INIT_DIR"
# find and execute cql scripts, in name order
for f in $(find . -type f -name "*.cql" -print | sort); do
echo "$0: running $f"
cqlsh -f "$f"
echo "$0: $f executed"
done
) &
This solution works for all cassandra versions (at least until 3.11, as the time of writing).
Hence, you only have to build and use this cassandra image version, and then add proper initializations scripts to the container using docker-compose volumes.
A complete gist with a more robust entrypoint patch (and example) is available here.
We recently tried to solve a similar problem in KillrVideo, a reference application for Cassandra. We are using Docker Compose to spin up the environment needed by the application which includes a DataStax Enterprise (i.e. Cassandra) node. We wanted that node to do some bootstrapping the first time it was started to install the CQL schema (using cqlsh
to run the statements in a .cql
file just like you're trying to do). Basically the approach we took was to write a shell script for our Docker entrypoint that:
- Starts the node normally but in the background.
- Waits until port 9042 is available (this is where clients connect to run CQL statements).
- Uses
cqlsh -f
to run some CQL statements and init the schema. - Stops the node that's running in the background.
- Continues on to the usual entrypoint for our Docker image that starts up the node normally (in the foreground like Docker expects).
We just use the existence of a file to indicate whether the node has already been bootstrapped and check that on startup to determine whether we need to do that logic above or can just start it normally. You can see the results in the killrvideo-dse-docker repository on GitHub.
There is one caveat to this approach. This worked great for us because in our reference application, we're only spinning up a single node (i.e. we aren't creating a cluster with more than one node). If you're running multiple nodes, you'll probably want to make sure that only one of the nodes does the bootstrapping to create the schema because multiple clients modifying the schema simultaneously can cause some issues with your cluster. (This is a known issue and will hopefully be fixed at some point.)
I was also searching for the solution to this question, and here is the way how I accomplished it.
Here the second instance of Cassandra has a volume with the schema.cql and runs CQLSH command
My Version with healthcheck so we can get rid of sleep command
version: '2.2'
services:
cassandra:
image: cassandra:3.11.2
container_name: cassandra
ports:
- "9042:9042"
environment:
- "MAX_HEAP_SIZE=256M"
- "HEAP_NEWSIZE=128M"
restart: always
volumes:
- ./out/cassandra_data:/var/lib/cassandra
healthcheck:
test: ["CMD", "cqlsh", "-u cassandra", "-p cassandra" ,"-e describe keyspaces"]
interval: 15s
timeout: 10s
retries: 10
cassandra-load-keyspace:
container_name: cassandra-load-keyspace
image: cassandra:3.11.2
depends_on:
cassandra:
condition: service_healthy
volumes:
- ./src/main/resources/cassandra_schema.cql:/schema.cql
command: /bin/bash -c "echo loading cassandra keyspace && cqlsh cassandra -f /schema.cql"
NetFlix Version using sleep
version: '3.5'
services:
cassandra:
image: cassandra:latest
container_name: cassandra
ports:
- "9042:9042"
environment:
- "MAX_HEAP_SIZE=256M"
- "HEAP_NEWSIZE=128M"
restart: always
volumes:
- ./out/cassandra_data:/var/lib/cassandra
cassandra-load-keyspace:
container_name: cassandra-load-keyspace
image: cassandra:latest
depends_on:
- cassandra
volumes:
- ./src/main/resources/cassandra_schema.cql:/schema.cql
command: /bin/bash -c "sleep 60 && echo loading cassandra keyspace && cqlsh cassandra -f /schema.cql"
P.S I found this way at one of the Netflix Repos