How to plot a confusion matrix using heatmaps in R?
You can achieve a nice result using ggplot2, but for that you need a data.frame with 3 columns: x, y and the value to plot. Using gather from the tidyr package, it is very easy to reformat your data:
library("dplyr")
library("tidyr")
# Loading your example. Row names should get their own column (here `y`).
hm <- readr::read_delim("y a b c d e f g h i j
a 5 4 0 0 0 0 0 0 0 0
b 0 0 0 0 0 0 0 0 0 0
c 0 0 4 0 0 0 0 0 0 0
d 0 0 0 0 0 0 0 0 0 0
e 2 0 0 0 2 0 0 0 0 0
f 1 0 0 0 0 2 0 0 0 0
g 0 0 0 0 0 0 0 0 0 0
h 0 0 0 0 0 0 0 0 0 0
i 0 0 0 0 0 0 0 0 0 0
j 0 0 0 0 0 0 0 0 0 0", delim=" ")
# Gathering columns a to j into long format (tidyr::pivot_longer() is the newer equivalent)
hm <- hm %>% gather(x, value, a:j)
# hm now looks like:
# # A tibble: 100 x 3
# y x value
# <chr> <chr> <dbl>
# 1 a a 5
# 2 b a 0
# 3 c a 0
# 4 d a 0
# 5 e a 2
# # ... with 95 more rows
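If the confusion matrix already lives in R as a table or matrix rather than as text, base R can produce the same three-column layout without tidyr. A minimal sketch, assuming two hypothetical label vectors `actual` and `predicted` (made up for illustration, not taken from the question):

```r
# Hypothetical labels, for illustration only
actual    <- c("a", "a", "b", "c", "c", "c")
predicted <- c("a", "b", "b", "c", "c", "a")
lv <- c("a", "b", "c")

# Cross-tabulate; fixing the factor levels keeps empty classes in the table
cm <- table(y = factor(actual, levels = lv),
            x = factor(predicted, levels = lv))

# One row per (y, x) cell, with the count stored in `value`
hm_long <- as.data.frame(cm, responseName = "value")
print(hm_long)
```

The resulting `hm_long` has the columns y, x and value, so it can be passed to ggplot in the same way as the gathered data.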
Perfect! Let's get plotting. The basic geom for a heatmap with ggplot2 is geom_tile, to which we'll provide the aesthetics x, y and fill.
library("ggplot2")
ggplot(hm, aes(x=x, y=y, fill=value)) + geom_tile()
OK, not too bad, but we can do much better. First, we probably want to reverse the y axis so that the first row ends up at the top. The trick is to provide x and y as factors with the levels ordered as we want them.
hm <- hm %>%
  mutate(x = factor(x), # alphabetical order by default
         y = factor(y, levels = rev(unique(y)))) # reverse order of appearance (reverse alphabetical here)
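To see why the level order matters, here is a small sketch with a hypothetical, deliberately unsorted label vector. Note that rev(unique(y)) reverses the order of appearance, which only equals reverse alphabetical order when the labels are already sorted, as they are in the example data:

```r
y <- c("b", "a", "c", "a")  # hypothetical labels, deliberately unsorted

default_lv    <- levels(factor(y))                                 # sorted alphabetically
appearance_lv <- levels(factor(y, levels = rev(unique(y))))        # reverse order of appearance
alpha_rev_lv  <- levels(factor(y, levels = rev(sort(unique(y)))))  # reverse alphabetical

print(default_lv)     # "a" "b" "c"
print(appearance_lv)  # "c" "a" "b"
print(alpha_rev_lv)   # "c" "b" "a"
```

If your class labels are not sorted in the data, rev(sort(unique(y))) is probably the safer choice.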
Then I like the black & white theme, theme_bw(), which gets rid of the grey background. I also like to use a palette from RColorBrewer (with direction = 1 to get the darker colors for higher values).
Since you're plotting the same thing on the x and y axes, you probably want equal axis scales: coord_equal() will give you a square plot.
ggplot(hm, aes(x=x, y=y, fill=value)) +
  geom_tile() + theme_bw() + coord_equal() +
  scale_fill_distiller(palette="Greens", direction=1)
# Other valid palettes: Reds, Blues, Spectral, RdYlBu (red-yellow-blue), ...
The finishing touch: printing the values on top of the tiles and removing the legend, since it is no longer useful. Obviously this is all optional, but it gives you material to build from. Note that geom_text inherits the x and y aesthetics since they were passed to ggplot.
ggplot(hm, aes(x=x, y=y, fill=value)) +
  geom_tile() + theme_bw() + coord_equal() +
  scale_fill_distiller(palette="Greens", direction=1) +
  guides(fill="none") + # removing legend for `fill` (fill=FALSE is deprecated in recent ggplot2)
  labs(title = "Value distribution") + # using a title instead
  geom_text(aes(label=value), color="black") # printing values
You could also pass color="black" to geom_tile to draw (black) lines around the tiles. A final plot with the RdYlBu color scheme (see RColorBrewer::display.brewer.all() for a list of available palettes).
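The code for that final plot is not shown, so here is a self-contained sketch of what it might look like. It rebuilds the example counts with base R instead of readr and tidyr, and it keeps the default direction of scale_fill_distiller; both choices are assumptions, and the exact settings behind the original figure are unknown.

```r
library(ggplot2)

# Rebuild the example counts as a 10 x 10 matrix (rows = y, columns = x)
lv <- letters[1:10]
m <- matrix(0, nrow = 10, ncol = 10, dimnames = list(y = lv, x = lv))
m["a", c("a", "b")] <- c(5, 4)
m["c", "c"] <- 4
m["e", c("a", "e")] <- c(2, 2)
m["f", c("a", "f")] <- c(1, 2)

# Long format with columns y, x, value; reverse y so that row "a" is on top
hm <- as.data.frame(as.table(m), responseName = "value")
hm$y <- factor(hm$y, levels = rev(lv))

p <- ggplot(hm, aes(x = x, y = y, fill = value)) +
  geom_tile(color = "black") +                      # black lines around the tiles
  geom_text(aes(label = value), color = "black") +  # counts printed on the tiles
  scale_fill_distiller(palette = "RdYlBu") +        # default direction (assumption)
  coord_equal() +
  theme_bw() +
  guides(fill = "none") +
  labs(title = "Value distribution")
print(p)
```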
As Greg mentioned, image is probably the way to go:
z = c(5,4,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,4,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
2,0,0,0,2,0,0,0,0,0,
1,0,0,0,0,2,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0)
z = matrix(z, ncol=10)
colnames(z) = c("a","b","c","d","e","f","g","h","i", "j")
rownames(z) = c("a","b","c","d","e","f","g","h","i", "j")
## image() draws matrix rows along x and columns along y, bottom to top,
## so we flip the columns to get the expected orientation
image(z[,ncol(z):1], axes=FALSE)
## Add the y-axis labels, reversed to match the flipped columns. Similar idea for the x-axis.
axis(2, at = seq(0, 1, length.out=ncol(z)), labels=rev(colnames(z)))
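For labelling both axes, one way to avoid mismatches is to take the labels from the flipped matrix itself. A minimal sketch on a small hypothetical 3 x 3 matrix (rows as they read in a printed table):

```r
# Hypothetical 3 x 3 matrix, entered row by row
m <- matrix(c(5, 4, 0,
              0, 0, 0,
              2, 0, 4), nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# image() puts matrix rows on x and columns on y (first column at the bottom),
# so transpose and reverse to make the picture read like the printed matrix
img <- t(m)[, nrow(m):1]

image(img, axes = FALSE)
# Labels come from img, so they follow whatever flipping was applied
axis(1, at = seq(0, 1, length.out = nrow(img)), labels = rownames(img))
axis(2, at = seq(0, 1, length.out = ncol(img)), labels = colnames(img))
box()
```

Here the y axis reads c, b, a from bottom to top, so row a of the matrix sits at the top of the image.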
You may also want to look at the heatmap function:
heatmap(t(z)[ncol(z):1,], Rowv=NA,
Colv=NA, col = heat.colors(256))
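One thing to keep in mind is that heatmap rescales each row by default (scale = "row"), which changes what the colors mean for raw counts and cannot standardise rows that are constant, such as all-zero rows. A sketch with scale = "none", again on a small hypothetical matrix; using scale = "none" here is a suggestion, not part of the original answer:

```r
# Hypothetical 3 x 3 matrix, entered row by row
m <- matrix(c(5, 4, 0,
              0, 0, 0,
              2, 0, 4), nrow = 3, byrow = TRUE,
            dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Reverse the rows so that row "a" ends up at the top of the plot.
# Rowv = NA and Colv = NA switch off clustering and the dendrograms;
# scale = "none" colors the raw counts instead of row-standardised values.
h <- heatmap(m[nrow(m):1, ], Rowv = NA, Colv = NA,
             scale = "none", col = heat.colors(256))
```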