Preserving large numbers
Try working with colClasses = "character":
read.csv("file.csv", colClasses = "character")
See the read.table documentation for details:
http://stat.ethz.ch/R-manual/R-devel/library/utils/html/read.table.html
Picking up on what you said in the comments, you can import the text directly as character by specifying colClasses in read.table(). For example:
num <- "1665535004661"
dat.char <- read.table(text = num, colClasses="character")
str(dat.char)
#------
'data.frame': 1 obs. of 1 variable:
$ V1: chr "1665535004661"
dat.char
#------
V1
1 1665535004661
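If only one column holds long identifiers, the same idea can be applied per column. The following is a minimal sketch, assuming a hypothetical file with columns named id and value (the names and the inline data are illustrative, not taken from the question). colClasses accepts a named vector, and columns that are not named in it are still type-converted as usual:

```r
# Hypothetical CSV content standing in for a real file
csv_text <- "id,value\n1665535004661,3.5\n1665535004662,4.25"

# Read only the 'id' column as character; 'value' is converted as usual
dat <- read.csv(text = csv_text, colClasses = c(id = "character"))
str(dat)
```

Keeping identifiers as character avoids any display or precision questions, at the cost of not being able to do arithmetic on them.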
Alternatively (and for other uses), you can set the digits option via options(). The default is 7 digits and the acceptable range is 1-22. To be clear, setting this option does not alter the underlying data in any way; it only controls how numbers are displayed when printed. From the help page for ?options:
controls the number of digits to print when printing numeric values. It is a suggestion only. Valid values are 1...22 with default 7. See the note in print.default about values greater than 15.
Example illustrating this:
options(digits = 7)
dat <- read.table(text = num)
dat
#------
V1
1 1.665535e+12
options(digits = 22)
dat
#------
V1
1 1665535004661
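To check that the digits option only affects display, the formatted strings can be compared against the stored value. This is a small sketch that assumes the default scipen setting; the variable names (old, short, long) are illustrative:

```r
x <- 1665535004661

old <- options(digits = 7)   # set digits and remember the previous value
short <- format(x)           # formatted with 7 significant digits
options(digits = 15)
long <- format(x)            # formatted with up to 15 significant digits
options(old)                 # restore the previous setting

short
long
x == 1665535004661           # the stored value is the same throughout
```

Saving the return value of options() and restoring it afterwards keeps the change from leaking into the rest of the session.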
To flesh this out completely, and to account for cases where changing a global setting is not preferable, you can pass digits directly as an argument, as in print(foo, digits = bar). You can read more about this under ?print.default. This is what John describes in his answer, so credit should go to him for illuminating that nuance.
It's not in a "1.67E+12 format"; it just won't print in full using the defaults. R reads it in just fine and the whole number is there.
> x <- 1665535004661
> x
[1] 1.665535e+12
> print(x, digits = 16)
[1] 1665535004661
See, the digits were there all along. They don't get lost unless the number has more than about 15 or 16 significant digits, which is the limit of double precision. Sorting on what you brought in will work fine, and you can call print() explicitly with the digits argument to view your data.frame instead of printing it implicitly by typing its name.
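As a rough sketch of that workflow, with an illustrative data frame (the column name id and the values are hypothetical):

```r
# Illustrative data frame; the column name 'id' is hypothetical
df <- data.frame(id = c(1665535004663, 1665535004661, 1665535004662))

# Sorting uses the full stored values, not the abbreviated display
sorted <- df[order(df$id), , drop = FALSE]

# Show all digits for this call only, without touching options()
print(sorted, digits = 16)

# format() can also return fixed-notation character strings
ids <- format(sorted$id, scientific = FALSE)
ids
```

Passing digits to print() affects only that call, so it is a reasonable choice when a global options() change is not wanted.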
From the ?is.integer page:
"Note that current implementations of R use 32-bit integers for integer vectors, so the range of representable integers is restricted to about +/-2*10^9."
1665535004661L > 2*10^9
[1] TRUE
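The practical consequence is that a value this size is stored as a double rather than an integer. The sketch below illustrates the two limits involved, assuming standard IEEE 754 doubles (which hold whole numbers exactly up to 2^53); the variable names are illustrative:

```r
x <- 1665535004661

.Machine$integer.max   # largest value an integer vector can hold
is.double(x)           # plain numeric literals are stored as doubles
x < 2^53               # inside the range where doubles hold whole numbers exactly

# Converting to integer is out of range and yields NA (with a warning)
as_int <- suppressWarnings(as.integer(x))
is.na(as_int)

# Beyond 2^53, neighbouring whole numbers can no longer be told apart
2^53 == 2^53 + 1
```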
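If you need more precision than a double provides, you want the Rmpfr package, which offers arbitrary-precision floating-point numbers.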
library(Rmpfr)
x <- mpfr(15, precBits = 1024)