SI prefixes in ggplot2 axis labels
I used library("sos"); findFn("{SI prefix}") to find the sitools package.
Construct data:
bytes <- 2^seq(0,20) + rnorm(21, 4, 2)
time <- bytes/(1e4 + rnorm(21, 100, 3)) + 8
my_data <- data.frame(time, bytes)
Load packages:
library("sitools")
library("ggplot2")
Create the plot:
(p <- ggplot(data=my_data, aes(x=bytes, y=time)) +
    geom_point() +
    geom_line() +
    scale_x_log10("Message Size [Byte]", labels=f2si) +
    scale_y_continuous("Round-Trip-Time [us]"))
I'm not sure how this compares to your function, but at least someone else went to the trouble of writing it ...
I modified your code style a little bit -- semicolons at the ends of lines are harmless but are generally the sign of a MATLAB or C coder ...
edit: I initially defined a generic formatting function
si_format <- function(...) {
    function(x) f2si(x, ...)
}
following the format of (e.g.) scales::comma_format, but that seems unnecessary in this case -- just part of the deeper ggplot2 magic that I don't fully understand.
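For readers unfamiliar with the pattern, here is a minimal base-R sketch of such a label-function factory. The name label_factory and the use of formatC are assumptions for illustration only; this is not how sitools or scales implement their formatters.

```r
# Hypothetical factory: fixes formatting arguments once and returns a
# one-argument function, which is the shape the labels= argument expects.
label_factory <- function(...) {
    function(x) formatC(x, ...)
}

one_decimal <- label_factory(digits = 1, format = "f")
one_decimal(c(1, 2.5))  # "1.0" "2.5"
```

The factory only pays off when the formatter needs extra arguments; a function that already takes a single vector, like f2si, can be passed to labels= directly.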
The OP's code gives what seems to me to be not quite the right answer: the rightmost axis tick is "1000K" rather than "1M" -- this can be fixed by changing the >1e6 test to >=1e6. On the other hand, f2si uses lower-case k -- I don't know whether K is wanted (wrapping the results in toupper() could fix this).
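To make the boundary issue concrete, here is a rough base-R sketch. si_label is a hypothetical stand-in written for this illustration; it is neither the OP's si_vec nor sitools::f2si, and it only handles the k and M prefixes.

```r
# Hypothetical SI labeller (illustration only).
# inclusive = TRUE compares with >=, inclusive = FALSE with >.
si_label <- function(x, inclusive = TRUE) {
    above <- if (inclusive) `>=` else `>`
    vapply(x, function(val) {
        if (is.na(val)) return("")
        if (above(val, 1e6)) {
            paste0(format(val / 1e6, digits = 3), "M")
        } else if (above(val, 1e3)) {
            paste0(format(val / 1e3, digits = 3), "k")
        } else {
            format(val, digits = 3)
        }
    }, character(1))
}

si_label(c(1, 1e3, 1e6))                     # "1" "1k" "1M"
si_label(c(1, 1e3, 1e6), inclusive = FALSE)  # "1" "1000" "1000k"
toupper(si_label(1e3))                       # "1K"
```

Note that toupper() upper-cases every letter in the label, so it is only safe when no lower-case prefix (such as m for milli) can occur.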
OP results (si_vec):
My results (f2si):
Update: Recent versions of the scales package include functionality to print readable labels. In this case, label_bytes can be used:
library(ggplot2)
library(scales)
bytes <- 2^seq(0,20) + rnorm(21, 4, 2)
my_data <- data.frame(
    bytes=as.integer(bytes),
    time=bytes / (1e4 + rnorm(21, 100, 3)) + 8
)
ggplot(data=my_data, aes(x=bytes, y=time)) +
    geom_point() +
    geom_line() +
    scale_x_log10("Message Size [Byte]", labels=label_bytes()) +
    scale_y_continuous("Round-Trip-Time [us]")
Or, if you prefer IEC units (KiB = 2^10, MiB = 2^20, ...), specify labels=label_bytes(units = "auto_binary"). The result is very similar to the second plot in the original answer below.
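The difference between the two unit systems is just the base of the divisor. A small base-R sketch of the arithmetic follows; bytes_in_units is a hypothetical helper for illustration and not part of scales or gdata.

```r
# Hypothetical helper: express a byte count in units of base^exponent bytes.
bytes_in_units <- function(bytes, base, exponent) bytes / base^exponent

n <- 2^20                    # 1048576 bytes
bytes_in_units(n, 1000, 2)   # in MB  (SI,  base 1000): 1.048576
bytes_in_units(n, 1024, 2)   # in MiB (IEC, base 1024): 1
```

Since the example data are powers of two (plus noise), binary units tend to give rounder tick labels than SI units here.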
Original answer
For bytes there is gdata::humanReadable. humanReadable supports both SI prefixes (1000 Byte = 1 KB) and the binary prefixes defined by the IEC (1024 Byte = 1 KiB).
The function humanReadableLabs below allows the parameters to be customised and takes care of NA values:
humanReadableLabs <- function(...) {
    function(x) {
        sapply(x, function(val) {
            if (is.na(val)) {
                return("")
            } else {
                return(
                    humanReadable(val, ...)
                )
            }
        })
    }
}
Now it is straightforward to change the labels to use SI prefixes and "byte" as the unit:
library(ggplot2)
library(gdata)
bytes <- 2^seq(0,20) + rnorm(21, 4, 2)
my_data <- data.frame(
    bytes=as.integer(bytes),
    time=bytes / (1e4 + rnorm(21, 100, 3)) + 8
)
# humanReadableLabs <- function(...) {...}  (definition as given above)
ggplot(data=my_data, aes(x=bytes, y=time)) +
    geom_point() +
    geom_line() +
    scale_x_log10("Message Size [Byte]", labels=humanReadableLabs(standard="SI")) +
    scale_y_continuous("Round-Trip-Time [us]")
IEC prefixes are plotted by omitting standard="SI". Note that the breaks would also have to be specified to get easily readable values.
ggplot(data=my_data, aes(x=bytes, y=time)) +
    geom_point() +
    geom_line() +
    scale_x_log10("Message Size [Byte]", labels=humanReadableLabs()) +
    scale_y_continuous("Round-Trip-Time [us]")
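As for the note about breaks: one possible way to get round IEC labels is to place the breaks at powers of two. This is only a sketch; the name iec_breaks and the spacing of 2^5 are my own choices, not something prescribed by gdata or ggplot2.

```r
# Breaks at powers of two, so each tick is a whole number of B, KiB or MiB.
iec_breaks <- 2^seq(0, 20, by = 5)
iec_breaks  # 1 32 1024 32768 1048576

# Assumed usage with the plot above (requires ggplot2 and gdata):
# scale_x_log10("Message Size [Byte]", breaks=iec_breaks,
#               labels=humanReadableLabs())
```

Without explicit breaks, scale_x_log10 picks powers of ten, which do not fall on whole binary units.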