how do I parse an iso 8601 date (with optional milliseconds) to a struct tm in C++?

You can use C's sscanf (http://www.cplusplus.com/reference/cstdio/sscanf/) to parse it:

const char *dateStr = "2014-11-12T19:12:14.505Z";
int y,M,d,h,m;
float s;
sscanf(dateStr, "%d-%d-%dT%d:%d:%fZ", &y, &M, &d, &h, &m, &s);

If you have std::string it can be called like this (http://www.cplusplus.com/reference/string/string/c_str/):

std::string dateStr = "2014-11-12T19:12:14.505Z";
sscanf(dateStr.c_str(), "%d-%d-%dT%d:%d:%fZ", &y, &M, &d, &h, &m, &s);

If it should handle different timezones you need to use sscanf return value - number of parsed arguments:

int tzh = 0, tzm = 0;
if (6 < sscanf(dateStr.c_str(), "%d-%d-%dT%d:%d:%f%d:%dZ", &y, &M, &d, &h, &m, &s, &tzh, &tzm)) {
    if (tzh < 0) {
       tzm = -tzm;    // Fix the sign on minutes.
    }
}

And then you can fill tm (http://www.cplusplus.com/reference/ctime/tm/) struct:

tm time = { 0 };
time.tm_year = y - 1900; // Year since 1900
time.tm_mon = M - 1;     // 0-11
time.tm_mday = d;        // 1-31
time.tm_hour = h;        // 0-23
time.tm_min = m;         // 0-59
time.tm_sec = (int)s;    // 0-61 (0-60 in C++11)

It also can be done with std::get_time (http://en.cppreference.com/w/cpp/io/manip/get_time) since C++11 as @Barry mentioned in comment how do I parse an iso 8601 date (with optional milliseconds) to a struct tm in C++?


Old question, and I have some old code to contribute ;). I was using the date library mentioned here. While it works great, it comes at a performance cost. For most common cases this would be not really relevant. However, if you have for example a service parsing data like I do, it really does matter.

I was profiling my server application for performance optimization, and found that parsing an ISO timestamp using the date library was 3 times slower compared to parsing the whole (roughly 500 bytes) json document. In total parsing the timestamp accounted for about 4.8% of total CPU time.

On my quest to optimize this part, I did not find much with C++ that I would consider for a living product. And the code which I did consider further mostly had some dependencies (e.g. the ISO parser in CEPH looks ok and seems well tested).

In the end, I turned to good old C and stripped out some code from the SQLite date.c to make it work standalone. The difference:

date: 872ms

SQLite date.c: 54ms

(Profiled function weight of real life service application)

Here it is (all credits to SQLite):

The header file date_util.h

#include <stdint.h>
#include <stdbool.h>

#ifdef __cplusplus
extern "C" {
#endif

    // Calculates time since epoch including milliseconds
    uint64_t ParseTimeToEpochMillis(const char *str, bool *error);

    // Creates an ISO timestamp with milliseconds from epoch with millis.
    // The buffer size (resultLen) for result must be at least 100 bytes.
    void TimeFromEpochMillis(uint64_t epochMillis, char *result, int resultLen, bool *error);

#ifdef __cplusplus
}
#endif

This is the C file date_util.c:

#include "_date.h"
#include <ctype.h>
#include <stdio.h>
#include <stdarg.h>
#include <stdarg.h>
#include <assert.h>
#include <stdio.h>
#include <string.h>


/*
 ** A structure for holding a single date and time.
 */
typedef struct DateTime DateTime;
struct DateTime {
    int64_t iJD;        /* The julian day number times 86400000 */
    int Y, M, D;        /* Year, month, and day */
    int h, m;           /* Hour and minutes */
    int tz;             /* Timezone offset in minutes */
    double s;           /* Seconds */
    char validJD;       /* True (1) if iJD is valid */
    char rawS;          /* Raw numeric value stored in s */
    char validYMD;      /* True (1) if Y,M,D are valid */
    char validHMS;      /* True (1) if h,m,s are valid */
    char validTZ;       /* True (1) if tz is valid */
    char tzSet;         /* Timezone was set explicitly */
    char isError;       /* An overflow has occurred */
};

/*
 ** Convert zDate into one or more integers according to the conversion
 ** specifier zFormat.
 **
 ** zFormat[] contains 4 characters for each integer converted, except for
 ** the last integer which is specified by three characters.  The meaning
 ** of a four-character format specifiers ABCD is:
 **
 **    A:   number of digits to convert.  Always "2" or "4".
 **    B:   minimum value.  Always "0" or "1".
 **    C:   maximum value, decoded as:
 **           a:  12
 **           b:  14
 **           c:  24
 **           d:  31
 **           e:  59
 **           f:  9999
 **    D:   the separator character, or \000 to indicate this is the
 **         last number to convert.
 **
 ** Example:  To translate an ISO-8601 date YYYY-MM-DD, the format would
 ** be "40f-21a-20c".  The "40f-" indicates the 4-digit year followed by "-".
 ** The "21a-" indicates the 2-digit month followed by "-".  The "20c" indicates
 ** the 2-digit day which is the last integer in the set.
 **
 ** The function returns the number of successful conversions.
 */
static int GetDigits(const char *zDate, const char *zFormat, ...){
    /* The aMx[] array translates the 3rd character of each format
     ** spec into a max size:    a   b   c   d   e     f */
    static const uint16_t aMx[] = { 12, 14, 24, 31, 59, 9999 };
    va_list ap;
    int cnt = 0;
    char nextC;
    va_start(ap, zFormat);
    do{
        char N = zFormat[0] - '0';
        char min = zFormat[1] - '0';
        int val = 0;
        uint16_t max;

        assert( zFormat[2]>='a' && zFormat[2]<='f' );
        max = aMx[zFormat[2] - 'a'];
        nextC = zFormat[3];
        val = 0;
        while( N-- ){
            if( !isdigit(*zDate) ){
                goto end_getDigits;
            }
            val = val*10 + *zDate - '0';
            zDate++;
        }
        if( val<(int)min || val>(int)max || (nextC!=0 && nextC!=*zDate) ){
            goto end_getDigits;
        }
        *va_arg(ap,int*) = val;
        zDate++;
        cnt++;
        zFormat += 4;
    }while( nextC );
end_getDigits:
    va_end(ap);
    return cnt;
}

/*
 ** Parse a timezone extension on the end of a date-time.
 ** The extension is of the form:
 **
 **        (+/-)HH:MM
 **
 ** Or the "zulu" notation:
 **
 **        Z
 **
 ** If the parse is successful, write the number of minutes
 ** of change in p->tz and return 0.  If a parser error occurs,
 ** return non-zero.
 **
 ** A missing specifier is not considered an error.
 */
static int ParseTimezone(const char *zDate, DateTime *p){
    int sgn = 0;
    int nHr, nMn;
    int c;
    while( isspace(*zDate) ){ zDate++; }
    p->tz = 0;
    c = *zDate;
    if( c=='-' ){
        sgn = -1;
    }else if( c=='+' ){
        sgn = +1;
    }else if( c=='Z' || c=='z' ){
        zDate++;
        goto zulu_time;
    }else{
        return c!=0;
    }
    zDate++;
    if( GetDigits(zDate, "20b:20e", &nHr, &nMn)!=2 ){
        return 1;
    }
    zDate += 5;
    p->tz = sgn*(nMn + nHr*60);
zulu_time:
    while( isspace(*zDate) ){ zDate++; }
    p->tzSet = 1;
    return *zDate!=0;
}

/*
 ** Parse times of the form HH:MM or HH:MM:SS or HH:MM:SS.FFFF.
 ** The HH, MM, and SS must each be exactly 2 digits.  The
 ** fractional seconds FFFF can be one or more digits.
 **
 ** Return 1 if there is a parsing error and 0 on success.
 */
static int ParseHhMmSs(const char *zDate, DateTime *p){
    int h, m, s;
    double ms = 0.0;
    if( GetDigits(zDate, "20c:20e", &h, &m)!=2 ){
        return 1;
    }
    zDate += 5;
    if( *zDate==':' ){
        zDate++;
        if( GetDigits(zDate, "20e", &s)!=1 ){
            return 1;
        }
        zDate += 2;
        if( *zDate=='.' && isdigit(zDate[1]) ){
            double rScale = 1.0;
            zDate++;
            while( isdigit(*zDate) ){
                ms = ms*10.0 + *zDate - '0';
                rScale *= 10.0;
                zDate++;
            }
            ms /= rScale;
        }
    }else{
        s = 0;
    }
    p->validJD = 0;
    p->rawS = 0;
    p->validHMS = 1;
    p->h = h;
    p->m = m;
    p->s = s + ms;
    if( ParseTimezone(zDate, p) ) return 1;
    p->validTZ = (p->tz!=0)?1:0;
    return 0;
}

/*
 ** Put the DateTime object into its error state.
 */
static void DatetimeError(DateTime *p){
    memset(p, 0, sizeof(*p));
    p->isError = 1;
}

/*
 ** Convert from YYYY-MM-DD HH:MM:SS to julian day.  We always assume
 ** that the YYYY-MM-DD is according to the Gregorian calendar.
 **
 ** Reference:  Meeus page 61
 */
static void ComputeJD(DateTime *p){
    int Y, M, D, A, B, X1, X2;

    if( p->validJD ) return;
    if( p->validYMD ){
        Y = p->Y;
        M = p->M;
        D = p->D;
    }else{
        Y = 2000;  /* If no YMD specified, assume 2000-Jan-01 */
        M = 1;
        D = 1;
    }
    if( Y<-4713 || Y>9999 || p->rawS ){
        DatetimeError(p);
        return;
    }
    if( M<=2 ){
        Y--;
        M += 12;
    }
    A = Y/100;
    B = 2 - A + (A/4);
    X1 = 36525*(Y+4716)/100;
    X2 = 306001*(M+1)/10000;
    p->iJD = (int64_t)((X1 + X2 + D + B - 1524.5 ) * 86400000);
    p->validJD = 1;
    if( p->validHMS ){
        p->iJD += p->h*3600000 + p->m*60000 + (int64_t)(p->s*1000);
        if( p->validTZ ){
            p->iJD -= p->tz*60000;
            p->validYMD = 0;
            p->validHMS = 0;
            p->validTZ = 0;
        }
    }
}

/*
 ** Parse dates of the form
 **
 **     YYYY-MM-DD HH:MM:SS.FFF
 **     YYYY-MM-DD HH:MM:SS
 **     YYYY-MM-DD HH:MM
 **     YYYY-MM-DD
 **
 ** Write the result into the DateTime structure and return 0
 ** on success and 1 if the input string is not a well-formed
 ** date.
 */
static int ParseYyyyMmDd(const char *zDate, DateTime *p){
    int Y, M, D, neg;

    if( zDate[0]=='-' ){
        zDate++;
        neg = 1;
    }else{
        neg = 0;
    }
    if( GetDigits(zDate, "40f-21a-21d", &Y, &M, &D)!=3 ){
        return 1;
    }
    zDate += 10;
    while( isspace(*zDate) || 'T'==*(uint8_t*)zDate ){ zDate++; }
    if( ParseHhMmSs(zDate, p)==0 ){
        /* We got the time */
    }else if( *zDate==0 ){
        p->validHMS = 0;
    }else{
        return 1;
    }
    p->validJD = 0;
    p->validYMD = 1;
    p->Y = neg ? -Y : Y;
    p->M = M;
    p->D = D;
    if( p->validTZ ){
        ComputeJD(p);
    }
    return 0;
}

/* The julian day number for 9999-12-31 23:59:59.999 is 5373484.4999999.
 ** Multiplying this by 86400000 gives 464269060799999 as the maximum value
 ** for DateTime.iJD.
 **
 ** But some older compilers (ex: gcc 4.2.1 on older Macs) cannot deal with
 ** such a large integer literal, so we have to encode it.
 */
#define INT_464269060799999  ((((int64_t)0x1a640)<<32)|0x1072fdff)

/*
 ** Return TRUE if the given julian day number is within range.
 **
 ** The input is the JulianDay times 86400000.
 */
static int ValidJulianDay(int64_t iJD){
    return iJD>=0 && iJD<=INT_464269060799999;
}

/*
 ** Compute the Year, Month, and Day from the julian day number.
 */
static void ComputeYMD(DateTime *p){
    int Z, A, B, C, D, E, X1;
    if( p->validYMD ) return;
    if( !p->validJD ){
        p->Y = 2000;
        p->M = 1;
        p->D = 1;
    }else if( !ValidJulianDay(p->iJD) ){
        DatetimeError(p);
        return;
    }else{
        Z = (int)((p->iJD + 43200000)/86400000);
        A = (int)((Z - 1867216.25)/36524.25);
        A = Z + 1 + A - (A/4);
        B = A + 1524;
        C = (int)((B - 122.1)/365.25);
        D = (36525*(C&32767))/100;
        E = (int)((B-D)/30.6001);
        X1 = (int)(30.6001*E);
        p->D = B - D - X1;
        p->M = E<14 ? E-1 : E-13;
        p->Y = p->M>2 ? C - 4716 : C - 4715;
    }
    p->validYMD = 1;
}

/*
 ** Compute the Hour, Minute, and Seconds from the julian day number.
 */
static void ComputeHMS(DateTime *p){
    int s;
    if( p->validHMS ) return;
    ComputeJD(p);
    s = (int)((p->iJD + 43200000) % 86400000);
    p->s = s/1000.0;
    s = (int)p->s;
    p->s -= s;
    p->h = s/3600;
    s -= p->h*3600;
    p->m = s/60;
    p->s += s - p->m*60;
    p->rawS = 0;
    p->validHMS = 1;
}

/*
 ** Compute both YMD and HMS
 */
static void ComputeYMD_HMS(DateTime *p){
    ComputeYMD(p);
    ComputeHMS(p);
}

/*
 ** Input "r" is a numeric quantity which might be a julian day number,
 ** or the number of seconds since 1970.  If the value if r is within
 ** range of a julian day number, install it as such and set validJD.
 ** If the value is a valid unix timestamp, put it in p->s and set p->rawS.
 */
static void SetRawDateNumber(DateTime *p, double r){
    p->s = r;
    p->rawS = 1;
    if( r>=0.0 && r<5373484.5 ){
        p->iJD = (int64_t)(r*86400000.0 + 0.5);
        p->validJD = 1;
    }
}

/*
 ** Clear the YMD and HMS and the TZ
 */
static void ClearYMD_HMS_TZ(DateTime *p){
    p->validYMD = 0;
    p->validHMS = 0;
    p->validTZ = 0;
}

// modified methods to only calculate for and back between epoch and iso timestamp with millis

uint64_t ParseTimeToEpochMillis(const char *str, bool *error) {
    assert(str);
    assert(error);
    *error = false;
    DateTime dateTime;

    int res = ParseYyyyMmDd(str, &dateTime);
    if (res) {
        *error = true;
        return 0;
    }

    ComputeJD(&dateTime);
    ComputeYMD_HMS(&dateTime);

    // get fraction (millis of a full second): 24.355 => 355
    int millis = (dateTime.s - (int)(dateTime.s)) * 1000;
    uint64_t epoch = (int64_t)(dateTime.iJD/1000 - 21086676*(int64_t)10000) * 1000 + millis;

    return epoch;
}

void TimeFromEpochMillis(uint64_t epochMillis, char *result, int resultLen, bool *error) {
    assert(resultLen >= 100);
    assert(result);
    assert(error);

    int64_t seconds = epochMillis / 1000;
    int millis = epochMillis - seconds * 1000;
    DateTime x;

    *error = false;
    memset(&x, 0, sizeof(x));
    SetRawDateNumber(&x, seconds);

    /*
     **    unixepoch
     **
     ** Treat the current value of p->s as the number of
     ** seconds since 1970.  Convert to a real julian day number.
     */
    {
        double r = x.s*1000.0 + 210866760000000.0;
        if( r>=0.0 && r<464269060800000.0 ){
            ClearYMD_HMS_TZ(&x);
            x.iJD = (int64_t)r;
            x.validJD = 1;
            x.rawS = 0;
        }

        ComputeJD(&x);
        if( x.isError || !ValidJulianDay(x.iJD) ) {
            *error = true;
        }
    }

    ComputeYMD_HMS(&x);
    snprintf(result, resultLen, "%04d-%02d-%02dT%02d:%02d:%02d.%03dZ",
             x.Y, x.M, x.D, x.h, x.m, (int)(x.s), millis);
}

These two helper methods simply convert to and from a timestamp in with milliseconds. Setting a tm struct from the DateTime should be obvious.

Example usage:

// Calculate milliseconds since epoch
std::string timeStamp = "2019-09-02T22:02:24.355Z";
bool error;
uint64_t time = ParseTimeToEpochMillis(timeStamp.c_str(), &error);

// Get ISO timestamp with milliseconds component from epoch in milliseconds.
// Multiple by 1000 in case you have a standard epoch in seconds)
uint64_t epochMillis = 1567461744355; // == "2019-09-02T22:02:24.355Z"
char result[100] = {0};
TimeFromEpochMillis(epochMillis, result, sizeof(result), &error);
std::string resultStr(result); // == "2019-09-02T22:02:24.355Z"

New answer for old question. Rationale: updated tools.

Using this free, open source library, one can parse into a std::chrono::time_point<system_clock, milliseconds>, which has the advantage over a tm of being able to hold millisecond precision. And if you really need to, you can continue on to the C API via system_clock::to_time_t (losing the milliseconds along the way).

#include "date.h"
#include <iostream>
#include <sstream>

date::sys_time<std::chrono::milliseconds>
parse8601(std::istream&& is)
{
    std::string save;
    is >> save;
    std::istringstream in{save};
    date::sys_time<std::chrono::milliseconds> tp;
    in >> date::parse("%FT%TZ", tp);
    if (in.fail())
    {
        in.clear();
        in.exceptions(std::ios::failbit);
        in.str(save);
        in >> date::parse("%FT%T%Ez", tp);
    }
    return tp;
}

int
main()
{
    using namespace date;
    using namespace std;
    cout << parse8601(istringstream{"2014-11-12T19:12:14.505Z"}) << '\n';
    cout << parse8601(istringstream{"2014-11-12T12:12:14.505-5:00"}) << '\n';
}

This outputs:

2014-11-12 19:12:14.505
2014-11-12 17:12:14.505

Note that both outputs are UTC. The parse converted the local time to UTC using the -5:00 offset. If you actually want local time, there is also a way to parse into a type called date::local_time<milliseconds> which would then parse but ignore the offset. One can even parse the offset into a chrono::minutes if desired (using a parse overload taking minutes&).

The precision of the parse is controlled by the precision of the chrono::time_point you pass in, instead of by flags in the format string. And the offset can either be of the style +/-hhmm with %z, or +/-[h]h:mm with %Ez.


Modern C++ version of parse ISO 8601* function

* - this code supports only subset of ISO 8601. The only supported forms are "2020-09-19T05:12:32Z" and "2020-09-19T05:12:32.123Z". Milliseconds can be 3 digit length or no milliseconds part at all, no timezone except Z, no other more rare features.

#include <cstdlib>
#include <ctime>
#include <string>

#ifdef _WIN32
#define timegm _mkgmtime
#endif

inline int ParseInt(const char* value)
{
    return std::strtol(value, nullptr, 10);
}

// ParseISO8601 returns milliseconds since 1970

std::time_t ParseISO8601(const std::string& input)
{
    constexpr const size_t expectedLength = sizeof("1234-12-12T12:12:12Z") - 1;
    static_assert(expectedLength == 20, "Unexpected ISO 8601 date/time length");

    if (input.length() < expectedLength)
    {
        return 0;
    }

    std::tm time = { 0 };
    time.tm_year = ParseInt(&input[0]) - 1900;
    time.tm_mon = ParseInt(&input[5]) - 1;
    time.tm_mday = ParseInt(&input[8]);
    time.tm_hour = ParseInt(&input[11]);
    time.tm_min = ParseInt(&input[14]);
    time.tm_sec = ParseInt(&input[17]);
    time.tm_isdst = 0;
    const int millis = input.length() > 20 ? ParseInt(&input[20]) : 0;
    return timegm(&time) * 1000 + millis;
}