How to install Apache Cassandra on Ubuntu 20.04

Apache Cassandra is an open-source, distributed NoSQL database that delivers scalability and high availability without compromising performance, and it is used by thousands of companies. Its linear scalability and proven fault tolerance on commodity hardware and cloud infrastructure make it a good fit for mission-critical data. This tutorial describes how to install Apache Cassandra on an Ubuntu 20.04 server.

Prerequisites

  • An Ubuntu 20.04 server.
  • A non-root user with sudo access.
  • To use cqlsh: the latest version of Python 2.7, or Python 3.6 or later. To check which version of Python is installed, run python --version.
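
The Python requirement above can also be checked in a script. The following is a minimal sketch, not part of the official installation steps; the function name python_version_ok is hypothetical, and it only encodes the rule stated above (2.7.x, or 3.6 and later).

```shell
# Hypothetical helper: succeed if the given version string is
# Python 2.7.x or Python 3.6 and later, fail otherwise.
python_version_ok() {
    ver=$1
    major=${ver%%.*}      # text before the first dot, e.g. "3"
    rest=${ver#*.}        # text after the first dot, e.g. "8.10"
    minor=${rest%%.*}     # second component, e.g. "8"
    case "$major" in
        2) [ "$minor" -eq 7 ] ;;
        3) [ "$minor" -ge 6 ] ;;
        *) return 1 ;;
    esac
}

# Example use (Python 2 prints its version to stderr, hence 2>&1):
#   python_version_ok "$(python --version 2>&1 | awk '{print $2}')" \
#       && echo "Python is suitable for cqlsh"
```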

Install Apache Cassandra

  1. Install Java 8

    Install the latest version of Java 8, either the Oracle Java Standard Edition 8 or OpenJDK 8.

    $ sudo apt install openjdk-8-jdk -y

    To verify that the correct version of Java is installed, run:

    $ java -version

    The output should look similar to this:

    openjdk version "1.8.0_222"
    OpenJDK Runtime Environment (build 1.8.0_222-8u222-b10-1ubuntu1~16.04.1-b10)
    OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode)
  2. Install the required dependencies.

    $ sudo apt install apt-transport-https gnupg2 -y
  3. Download and add the Apache Cassandra GPG key.

    $ sudo wget -q -O - | sudo apt-key add -

    You may see the output:

    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
    100  266k  100  266k    0     0   320k      0 --:--:-- --:--:-- --:--:--  320k
  4. Add the Apache Cassandra repository to your system.

    $ echo "deb 40x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
  5. Update the package index.

    $ sudo apt-get update
  6. Install Apache Cassandra.

    $ sudo apt-get install cassandra -y
  7. Verify that Apache Cassandra is installed.

    $ dpkg -l | grep cassandra
  8. Verify that Apache Cassandra is running.

    $ sudo systemctl status cassandra
  9. Check the status of your node.

    $ sudo nodetool status

    The status column in the output should report UN, which stands for "Up/Normal".

    Alternatively, connect to the database with:

    $ cqlsh

    The output should look something like this:

    Connected to Test Cluster at localhost:9042.
    [cqlsh 5.0.1 | Cassandra 3.8 | CQL spec 3.4.2 | Native protocol v4]
    Use HELP for help.
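
The nodetool status check can also be scripted, for example in a provisioning script that should stop if the node is not healthy. The following is a minimal sketch; it assumes the usual nodetool status layout, in which each node line starts with a two-letter status/state code such as UN or DN, and the function name all_nodes_up_normal is hypothetical.

```shell
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# only if at least one node line is present and every node line is UN.
all_nodes_up_normal() {
    awk '
        # Node lines start with status (U/D) and state (N/L/J/M) letters.
        /^[UD][NLJM][ \t]/ {
            nodes++
            if ($1 != "UN") bad++
        }
        END { exit ((nodes > 0 && bad == 0) ? 0 : 1) }
    '
}

# Example use on a running node (not executed here):
#   nodetool status | all_nodes_up_normal && echo "All nodes are Up/Normal"
```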

Configuring Apache Cassandra

The location of the Cassandra configuration files depends on the type of installation:

  • tarball: conf directory within the tarball install location
  • package: /etc/cassandra directory

Because we installed Cassandra from a package, the configuration files are in the /etc/cassandra directory.
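
Before changing anything, it can be useful to see what a setting is currently set to. The following is a minimal sketch for reading a top-level setting from /etc/cassandra/cassandra.yaml; the function name yaml_top_level_value is hypothetical, and it assumes simple one-line "key: value" entries rather than being a full YAML parser.

```shell
# Hypothetical helper: print the value of a top-level "key: value" line.
# $1 is the key, $2 is the path to the YAML file.
yaml_top_level_value() {
    key=$1
    file=$2
    # Keep only lines that start with "key:", drop the key, take the first
    # match, then trim trailing whitespace and surrounding quotes.
    sed -n "s/^${key}:[[:space:]]*//p" "$file" \
        | head -n 1 \
        | sed -e 's/[[:space:]]*$//' \
              -e "s/^'\(.*\)'\$/\1/" \
              -e 's/^"\(.*\)"$/\1/'
}

# Example use (not executed here):
#   yaml_top_level_value cluster_name /etc/cassandra/cassandra.yaml
```

Commented-out lines such as "# cluster_name: ..." are ignored because the match is anchored at the start of the line.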

  1. Edit the cassandra.yaml file.

    $ sudo nano /etc/cassandra/cassandra.yaml

    cassandra.yaml is the main configuration file for Apache Cassandra.

    Lines that start with a # character are comments. Most settings are preceded by a comment block that describes them.

    Let's update the cluster_name setting.

    # cluster_name: The name of the cluster.
    cluster_name: My First Cluster

    The cluster_name setting identifies the cluster. It is mainly used to prevent nodes in one logical cluster from joining another.

  2. Flush the system keyspace.

    $ nodetool flush system

    The flush command writes data that Cassandra holds in memory (memtables) to disk. Here it is limited to the system keyspace, where Cassandra keeps metadata about the node and the cluster, including the cluster name. Flushing before the restart ensures that this metadata is persisted on disk when the node comes back up.

  3. Restart the Cassandra service.

    $ sudo systemctl restart cassandra

    The systemctl command is used to start, stop, restart, and enable/disable the service.

  4. Verify the change.

    $ cqlsh

    The first line of the output should now show the new cluster name instead of Test Cluster.
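
One caveat with the steps above: a node that has already been started keeps its cluster name in the system.local table and compares it with cluster_name in cassandra.yaml at startup. If the restart fails with an error about the saved cluster name not matching the configured name, a common approach is to update the saved name with cqlsh while the node is still running under the old name, then flush and restart. The sketch below only builds the CQL statement; the function name rename_cluster_cql is hypothetical, and the statement assumes the local row uses the key 'local'.

```shell
# Hypothetical helper: print the CQL statement that updates the cluster
# name saved in system.local. Single quotes in the name are doubled,
# which is how CQL escapes them inside string literals.
rename_cluster_cql() {
    escaped=$(printf '%s' "$1" | sed "s/'/''/g")
    printf "UPDATE system.local SET cluster_name = '%s' WHERE key = 'local';\n" "$escaped"
}

# Example use on a running node (not executed here):
#   cqlsh -e "$(rename_cluster_cql 'My First Cluster')"
#   nodetool flush system
#   sudo systemctl restart cassandra
#   cqlsh -e "SELECT cluster_name FROM system.local;"
```

The final query is an alternative to reading the cqlsh banner: it prints the cluster name that the node has stored.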