how do I parse an iso 8601 date (with optional milliseconds) to a struct tm in C++?
You can use C
's sscanf
(http://www.cplusplus.com/reference/cstdio/sscanf/) to parse it:
const char *dateStr = "2014-11-12T19:12:14.505Z";
int y,M,d,h,m;
float s;
sscanf(dateStr, "%d-%d-%dT%d:%d:%fZ", &y, &M, &d, &h, &m, &s);
If you have std::string
it can be called like this (http://www.cplusplus.com/reference/string/string/c_str/):
std::string dateStr = "2014-11-12T19:12:14.505Z";
sscanf(dateStr.c_str(), "%d-%d-%dT%d:%d:%fZ", &y, &M, &d, &h, &m, &s);
If it should handle different timezones you need to use sscanf
return value - number of parsed arguments:
int tzh = 0, tzm = 0;
if (6 < sscanf(dateStr.c_str(), "%d-%d-%dT%d:%d:%f%d:%dZ", &y, &M, &d, &h, &m, &s, &tzh, &tzm)) {
if (tzh < 0) {
tzm = -tzm; // Fix the sign on minutes.
}
}
And then you can fill tm
(http://www.cplusplus.com/reference/ctime/tm/) struct:
tm time = { 0 };
time.tm_year = y - 1900; // Year since 1900
time.tm_mon = M - 1; // 0-11
time.tm_mday = d; // 1-31
time.tm_hour = h; // 0-23
time.tm_min = m; // 0-59
time.tm_sec = (int)s; // 0-61 (0-60 in C++11)
It also can be done with std::get_time
(http://en.cppreference.com/w/cpp/io/manip/get_time) since C++11
as @Barry mentioned in comment how do I parse an iso 8601 date (with optional milliseconds) to a struct tm in C++?
Old question, and I have some old code to contribute ;). I was using the date library mentioned here. While it works great, it comes at a performance cost. For most common cases this would be not really relevant. However, if you have for example a service parsing data like I do, it really does matter.
I was profiling my server application for performance optimization, and found that parsing an ISO timestamp using the date library was 3 times slower compared to parsing the whole (roughly 500 bytes) json document. In total parsing the timestamp accounted for about 4.8% of total CPU time.
On my quest to optimize this part, I did not find much with C++ that I would consider for a living product. And the code which I did consider further mostly had some dependencies (e.g. the ISO parser in CEPH looks ok and seems well tested).
In the end, I turned to good old C and stripped out some code from the SQLite date.c to make it work standalone. The difference:
date: 872ms
SQLite date.c: 54ms
(Profiled function weight of real life service application)
Here it is (all credits to SQLite):
The header file date_util.h
#include <stdint.h>
#include <stdbool.h>
#ifdef __cplusplus
extern "C" {
#endif
// Calculates time since epoch including milliseconds
uint64_t ParseTimeToEpochMillis(const char *str, bool *error);
// Creates an ISO timestamp with milliseconds from epoch with millis.
// The buffer size (resultLen) for result must be at least 100 bytes.
void TimeFromEpochMillis(uint64_t epochMillis, char *result, int resultLen, bool *error);
#ifdef __cplusplus
}
#endif
This is the C file date_util.c:
#include "_date.h"
#include <ctype.h>
#include <stdio.h>
#include <stdarg.h>
#include <stdarg.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>
/*
** A structure for holding a single date and time.
*/
typedef struct DateTime DateTime;
struct DateTime {
int64_t iJD; /* The julian day number times 86400000 */
int Y, M, D; /* Year, month, and day */
int h, m; /* Hour and minutes */
int tz; /* Timezone offset in minutes */
double s; /* Seconds */
char validJD; /* True (1) if iJD is valid */
char rawS; /* Raw numeric value stored in s */
char validYMD; /* True (1) if Y,M,D are valid */
char validHMS; /* True (1) if h,m,s are valid */
char validTZ; /* True (1) if tz is valid */
char tzSet; /* Timezone was set explicitly */
char isError; /* An overflow has occurred */
};
/*
** Convert zDate into one or more integers according to the conversion
** specifier zFormat.
**
** zFormat[] contains 4 characters for each integer converted, except for
** the last integer which is specified by three characters. The meaning
** of a four-character format specifiers ABCD is:
**
** A: number of digits to convert. Always "2" or "4".
** B: minimum value. Always "0" or "1".
** C: maximum value, decoded as:
** a: 12
** b: 14
** c: 24
** d: 31
** e: 59
** f: 9999
** D: the separator character, or \000 to indicate this is the
** last number to convert.
**
** Example: To translate an ISO-8601 date YYYY-MM-DD, the format would
** be "40f-21a-20c". The "40f-" indicates the 4-digit year followed by "-".
** The "21a-" indicates the 2-digit month followed by "-". The "20c" indicates
** the 2-digit day which is the last integer in the set.
**
** The function returns the number of successful conversions.
*/
static int GetDigits(const char *zDate, const char *zFormat, ...){
/* The aMx[] array translates the 3rd character of each format
** spec into a max size: a b c d e f */
static const uint16_t aMx[] = { 12, 14, 24, 31, 59, 9999 };
va_list ap;
int cnt = 0;
char nextC;
va_start(ap, zFormat);
do{
char N = zFormat[0] - '0';
char min = zFormat[1] - '0';
int val = 0;
uint16_t max;
assert( zFormat[2]>='a' && zFormat[2]<='f' );
max = aMx[zFormat[2] - 'a'];
nextC = zFormat[3];
val = 0;
while( N-- ){
if( !isdigit(*zDate) ){
goto end_getDigits;
}
val = val*10 + *zDate - '0';
zDate++;
}
if( val<(int)min || val>(int)max || (nextC!=0 && nextC!=*zDate) ){
goto end_getDigits;
}
*va_arg(ap,int*) = val;
zDate++;
cnt++;
zFormat += 4;
}while( nextC );
end_getDigits:
va_end(ap);
return cnt;
}
/*
** Parse a timezone extension on the end of a date-time.
** The extension is of the form:
**
** (+/-)HH:MM
**
** Or the "zulu" notation:
**
** Z
**
** If the parse is successful, write the number of minutes
** of change in p->tz and return 0. If a parser error occurs,
** return non-zero.
**
** A missing specifier is not considered an error.
*/
static int ParseTimezone(const char *zDate, DateTime *p){
int sgn = 0;
int nHr, nMn;
int c;
while( isspace(*zDate) ){ zDate++; }
p->tz = 0;
c = *zDate;
if( c=='-' ){
sgn = -1;
}else if( c=='+' ){
sgn = +1;
}else if( c=='Z' || c=='z' ){
zDate++;
goto zulu_time;
}else{
return c!=0;
}
zDate++;
if( GetDigits(zDate, "20b:20e", &nHr, &nMn)!=2 ){
return 1;
}
zDate += 5;
p->tz = sgn*(nMn + nHr*60);
zulu_time:
while( isspace(*zDate) ){ zDate++; }
p->tzSet = 1;
return *zDate!=0;
}
/*
** Parse times of the form HH:MM or HH:MM:SS or HH:MM:SS.FFFF.
** The HH, MM, and SS must each be exactly 2 digits. The
** fractional seconds FFFF can be one or more digits.
**
** Return 1 if there is a parsing error and 0 on success.
*/
static int ParseHhMmSs(const char *zDate, DateTime *p){
int h, m, s;
double ms = 0.0;
if( GetDigits(zDate, "20c:20e", &h, &m)!=2 ){
return 1;
}
zDate += 5;
if( *zDate==':' ){
zDate++;
if( GetDigits(zDate, "20e", &s)!=1 ){
return 1;
}
zDate += 2;
if( *zDate=='.' && isdigit(zDate[1]) ){
double rScale = 1.0;
zDate++;
while( isdigit(*zDate) ){
ms = ms*10.0 + *zDate - '0';
rScale *= 10.0;
zDate++;
}
ms /= rScale;
}
}else{
s = 0;
}
p->validJD = 0;
p->rawS = 0;
p->validHMS = 1;
p->h = h;
p->m = m;
p->s = s + ms;
if( ParseTimezone(zDate, p) ) return 1;
p->validTZ = (p->tz!=0)?1:0;
return 0;
}
/*
** Put the DateTime object into its error state.
*/
static void DatetimeError(DateTime *p){
memset(p, 0, sizeof(*p));
p->isError = 1;
}
/*
** Convert from YYYY-MM-DD HH:MM:SS to julian day. We always assume
** that the YYYY-MM-DD is according to the Gregorian calendar.
**
** Reference: Meeus page 61
*/
static void ComputeJD(DateTime *p){
int Y, M, D, A, B, X1, X2;
if( p->validJD ) return;
if( p->validYMD ){
Y = p->Y;
M = p->M;
D = p->D;
}else{
Y = 2000; /* If no YMD specified, assume 2000-Jan-01 */
M = 1;
D = 1;
}
if( Y<-4713 || Y>9999 || p->rawS ){
DatetimeError(p);
return;
}
if( M<=2 ){
Y--;
M += 12;
}
A = Y/100;
B = 2 - A + (A/4);
X1 = 36525*(Y+4716)/100;
X2 = 306001*(M+1)/10000;
p->iJD = (int64_t)((X1 + X2 + D + B - 1524.5 ) * 86400000);
p->validJD = 1;
if( p->validHMS ){
p->iJD += p->h*3600000 + p->m*60000 + (int64_t)(p->s*1000);
if( p->validTZ ){
p->iJD -= p->tz*60000;
p->validYMD = 0;
p->validHMS = 0;
p->validTZ = 0;
}
}
}
/*
** Parse dates of the form
**
** YYYY-MM-DD HH:MM:SS.FFF
** YYYY-MM-DD HH:MM:SS
** YYYY-MM-DD HH:MM
** YYYY-MM-DD
**
** Write the result into the DateTime structure and return 0
** on success and 1 if the input string is not a well-formed
** date.
*/
static int ParseYyyyMmDd(const char *zDate, DateTime *p){
int Y, M, D, neg;
if( zDate[0]=='-' ){
zDate++;
neg = 1;
}else{
neg = 0;
}
if( GetDigits(zDate, "40f-21a-21d", &Y, &M, &D)!=3 ){
return 1;
}
zDate += 10;
while( isspace(*zDate) || 'T'==*(uint8_t*)zDate ){ zDate++; }
if( ParseHhMmSs(zDate, p)==0 ){
/* We got the time */
}else if( *zDate==0 ){
p->validHMS = 0;
}else{
return 1;
}
p->validJD = 0;
p->validYMD = 1;
p->Y = neg ? -Y : Y;
p->M = M;
p->D = D;
if( p->validTZ ){
ComputeJD(p);
}
return 0;
}
/* The julian day number for 9999-12-31 23:59:59.999 is 5373484.4999999.
** Multiplying this by 86400000 gives 464269060799999 as the maximum value
** for DateTime.iJD.
**
** But some older compilers (ex: gcc 4.2.1 on older Macs) cannot deal with
** such a large integer literal, so we have to encode it.
*/
#define INT_464269060799999 ((((int64_t)0x1a640)<<32)|0x1072fdff)
/*
** Return TRUE if the given julian day number is within range.
**
** The input is the JulianDay times 86400000.
*/
static int ValidJulianDay(int64_t iJD){
return iJD>=0 && iJD<=INT_464269060799999;
}
/*
** Compute the Year, Month, and Day from the julian day number.
*/
static void ComputeYMD(DateTime *p){
int Z, A, B, C, D, E, X1;
if( p->validYMD ) return;
if( !p->validJD ){
p->Y = 2000;
p->M = 1;
p->D = 1;
}else if( !ValidJulianDay(p->iJD) ){
DatetimeError(p);
return;
}else{
Z = (int)((p->iJD + 43200000)/86400000);
A = (int)((Z - 1867216.25)/36524.25);
A = Z + 1 + A - (A/4);
B = A + 1524;
C = (int)((B - 122.1)/365.25);
D = (36525*(C&32767))/100;
E = (int)((B-D)/30.6001);
X1 = (int)(30.6001*E);
p->D = B - D - X1;
p->M = E<14 ? E-1 : E-13;
p->Y = p->M>2 ? C - 4716 : C - 4715;
}
p->validYMD = 1;
}
/*
** Compute the Hour, Minute, and Seconds from the julian day number.
*/
static void ComputeHMS(DateTime *p){
int s;
if( p->validHMS ) return;
ComputeJD(p);
s = (int)((p->iJD + 43200000) % 86400000);
p->s = s/1000.0;
s = (int)p->s;
p->s -= s;
p->h = s/3600;
s -= p->h*3600;
p->m = s/60;
p->s += s - p->m*60;
p->rawS = 0;
p->validHMS = 1;
}
/*
** Compute both YMD and HMS
*/
static void ComputeYMD_HMS(DateTime *p){
ComputeYMD(p);
ComputeHMS(p);
}
/*
** Input "r" is a numeric quantity which might be a julian day number,
** or the number of seconds since 1970. If the value if r is within
** range of a julian day number, install it as such and set validJD.
** If the value is a valid unix timestamp, put it in p->s and set p->rawS.
*/
static void SetRawDateNumber(DateTime *p, double r){
p->s = r;
p->rawS = 1;
if( r>=0.0 && r<5373484.5 ){
p->iJD = (int64_t)(r*86400000.0 + 0.5);
p->validJD = 1;
}
}
/*
** Clear the YMD and HMS and the TZ
*/
static void ClearYMD_HMS_TZ(DateTime *p){
p->validYMD = 0;
p->validHMS = 0;
p->validTZ = 0;
}
// modified methods to only calculate for and back between epoch and iso timestamp with millis
uint64_t ParseTimeToEpochMillis(const char *str, bool *error) {
assert(str);
assert(error);
*error = false;
DateTime dateTime;
int res = ParseYyyyMmDd(str, &dateTime);
if (res) {
*error = true;
return 0;
}
ComputeJD(&dateTime);
ComputeYMD_HMS(&dateTime);
// get fraction (millis of a full second): 24.355 => 355
int millis = (dateTime.s - (int)(dateTime.s)) * 1000;
uint64_t epoch = (int64_t)(dateTime.iJD/1000 - 21086676*(int64_t)10000) * 1000 + millis;
return epoch;
}
void TimeFromEpochMillis(uint64_t epochMillis, char *result, int resultLen, bool *error) {
assert(resultLen >= 100);
assert(result);
assert(error);
int64_t seconds = epochMillis / 1000;
int millis = epochMillis - seconds * 1000;
DateTime x;
*error = false;
memset(&x, 0, sizeof(x));
SetRawDateNumber(&x, seconds);
/*
** unixepoch
**
** Treat the current value of p->s as the number of
** seconds since 1970. Convert to a real julian day number.
*/
{
double r = x.s*1000.0 + 210866760000000.0;
if( r>=0.0 && r<464269060800000.0 ){
ClearYMD_HMS_TZ(&x);
x.iJD = (int64_t)r;
x.validJD = 1;
x.rawS = 0;
}
ComputeJD(&x);
if( x.isError || !ValidJulianDay(x.iJD) ) {
*error = true;
}
}
ComputeYMD_HMS(&x);
snprintf(result, resultLen, "%04d-%02d-%02dT%02d:%02d:%02d.%03dZ",
x.Y, x.M, x.D, x.h, x.m, (int)(x.s), millis);
}
These two helper methods simply convert to and from a timestamp in with milliseconds. Setting a tm struct from the DateTime should be obvious.
Example usage:
// Calculate milliseconds since epoch
std::string timeStamp = "2019-09-02T22:02:24.355Z";
bool error;
uint64_t time = ParseTimeToEpochMillis(timeStamp.c_str(), &error);
// Get ISO timestamp with milliseconds component from epoch in milliseconds.
// Multiple by 1000 in case you have a standard epoch in seconds)
uint64_t epochMillis = 1567461744355; // == "2019-09-02T22:02:24.355Z"
char result[100] = {0};
TimeFromEpochMillis(epochMillis, result, sizeof(result), &error);
std::string resultStr(result); // == "2019-09-02T22:02:24.355Z"
New answer for old question. Rationale: updated tools.
Using this free, open source library, one can parse into a std::chrono::time_point<system_clock, milliseconds>
, which has the advantage over a tm
of being able to hold millisecond precision. And if you really need to, you can continue on to the C API via system_clock::to_time_t
(losing the milliseconds along the way).
#include "date.h"
#include <iostream>
#include <sstream>
date::sys_time<std::chrono::milliseconds>
parse8601(std::istream&& is)
{
std::string save;
is >> save;
std::istringstream in{save};
date::sys_time<std::chrono::milliseconds> tp;
in >> date::parse("%FT%TZ", tp);
if (in.fail())
{
in.clear();
in.exceptions(std::ios::failbit);
in.str(save);
in >> date::parse("%FT%T%Ez", tp);
}
return tp;
}
int
main()
{
using namespace date;
using namespace std;
cout << parse8601(istringstream{"2014-11-12T19:12:14.505Z"}) << '\n';
cout << parse8601(istringstream{"2014-11-12T12:12:14.505-5:00"}) << '\n';
}
This outputs:
2014-11-12 19:12:14.505
2014-11-12 17:12:14.505
Note that both outputs are UTC. The parse
converted the local time to UTC using the -5:00
offset. If you actually want local time, there is also a way to parse into a type called date::local_time<milliseconds>
which would then parse but ignore the offset. One can even parse the offset into a chrono::minutes
if desired (using a parse
overload taking minutes&
).
The precision of the parse is controlled by the precision of the chrono::time_point
you pass in, instead of by flags in the format string. And the offset can either be of the style +/-hhmm
with %z
, or +/-[h]h:mm
with %Ez
.
Modern C++ version of parse ISO 8601* function
* - this code supports only subset of ISO 8601. The only supported forms are "2020-09-19T05:12:32Z" and "2020-09-19T05:12:32.123Z". Milliseconds can be 3 digit length or no milliseconds part at all, no timezone except Z
, no other more rare features.
#include <cstdlib>
#include <ctime>
#include <string>
#ifdef _WIN32
#define timegm _mkgmtime
#endif
inline int ParseInt(const char* value)
{
return std::strtol(value, nullptr, 10);
}
// ParseISO8601 returns milliseconds since 1970
std::time_t ParseISO8601(const std::string& input)
{
constexpr const size_t expectedLength = sizeof("1234-12-12T12:12:12Z") - 1;
static_assert(expectedLength == 20, "Unexpected ISO 8601 date/time length");
if (input.length() < expectedLength)
{
return 0;
}
std::tm time = { 0 };
time.tm_year = ParseInt(&input[0]) - 1900;
time.tm_mon = ParseInt(&input[5]) - 1;
time.tm_mday = ParseInt(&input[8]);
time.tm_hour = ParseInt(&input[11]);
time.tm_min = ParseInt(&input[14]);
time.tm_sec = ParseInt(&input[17]);
time.tm_isdst = 0;
const int millis = input.length() > 20 ? ParseInt(&input[20]) : 0;
return timegm(&time) * 1000 + millis;
}