Convert JavaScript string in dot notation into an object reference

recent note: While I'm flattered that this answer has gotten many upvotes, I am also somewhat horrified. If one needs to convert dot-notation strings like "x.a.b.c" into references, it could (maybe) be a sign that there is something very wrong going on (unless maybe you're performing some strange deserialization).

That is to say, novices who find their way to this answer must ask themselves the question "why am I doing this?"

It is of course generally fine to do this if your use case is small and you will not run into performance issues, AND you won't need to build upon your abstraction to make it more complicated later. In fact, if this will reduce code complexity and keep things simple, you should probably go ahead and do what OP is asking for. However, if that's not the case, consider if any of these apply:

case 1: As the primary method of working with your data (e.g. as your app's default form of passing objects around and dereferencing them). Like asking "how can I look up a function or variable name from a string".

  • This is bad programming practice (unnecessary metaprogramming specifically, and kind of violates function side-effect-free coding style, and will have performance hits). Novices who find themselves in this case, should instead consider working with array representations, e.g. ['x','a','b','c'], or even something more direct/simple/straightforward if possible: like not losing track of the references themselves in the first place (most ideal if it's only client-side or only server-side), etc. (A pre-existing unique id would be inelegant to add, but could be used if the spec otherwise requires its existence regardless.)

case 2: Working with serialized data, or data that will be displayed to the user. Like using a date as a string "1999-12-30" rather than a Date object (which can cause timezone bugs or added serialization complexity if not careful). Or you know what you're doing.

  • This is maybe fine. Be careful that there are no dot strings "." in your sanitized input fragments.

If you find yourself using this answer all the time and converting back and forth between string and array, you may be in the bad case, and should consider an alternative.

Here's an elegant one-liner that's 10x shorter than the other solutions:

function index(obj,i) {return obj[i]}
'a.b.etc'.split('.').reduce(index, obj)

[edit] Or in ECMAScript 6:

'a.b.etc'.split('.').reduce((o,i)=>o[i], obj)

(Not that I think eval always bad like others suggest it is (though it usually is), nevertheless those people will be pleased that this method doesn't use eval. The above will find obj.a.b.etc given obj and the string "a.b.etc".)

In response to those who still are afraid of using reduce despite it being in the ECMA-262 standard (5th edition), here is a two-line recursive implementation:

function multiIndex(obj,is) {  // obj,['1','2','3'] -> ((obj['1'])['2'])['3']
    return is.length ? multiIndex(obj[is[0]],is.slice(1)) : obj
}
function pathIndex(obj,is) {   // obj,'1.2.3' -> multiIndex(obj,['1','2','3'])
    return multiIndex(obj,is.split('.'))
}
pathIndex('a.b.etc')

Depending on the optimizations the JS compiler is doing, you may want to make sure any nested functions are not re-defined on every call via the usual methods (placing them in a closure, object, or global namespace).

edit:

To answer an interesting question in the comments:

how would you turn this into a setter as well? Not only returning the values by path, but also setting them if a new value is sent into the function? – Swader Jun 28 at 21:42

(sidenote: sadly can't return an object with a Setter, as that would violate the calling convention; commenter seems to instead be referring to a general setter-style function with side-effects like index(obj,"a.b.etc", value) doing obj.a.b.etc = value.)

The reduce style is not really suitable to that, but we can modify the recursive implementation:

function index(obj,is, value) {
    if (typeof is == 'string')
        return index(obj,is.split('.'), value);
    else if (is.length==1 && value!==undefined)
        return obj[is[0]] = value;
    else if (is.length==0)
        return obj;
    else
        return index(obj[is[0]],is.slice(1), value);
}

Demo:

> obj = {a:{b:{etc:5}}}

> index(obj,'a.b.etc')
5
> index(obj,['a','b','etc'])   #works with both strings and lists
5

> index(obj,'a.b.etc', 123)    #setter-mode - third argument (possibly poor form)
123

> index(obj,'a.b.etc')
123

...though personally I'd recommend making a separate function setIndex(...). I would like to end on a side-note that the original poser of the question could (should?) be working with arrays of indices (which they can get from .split), rather than strings; though there's usually nothing wrong with a convenience function.


A commenter asked:

what about arrays? something like "a.b[4].c.d[1][2][3]" ? –AlexS

Javascript is a very weird language; in general objects can only have strings as their property keys, so for example if x was a generic object like x={}, then x[1] would become x["1"]... you read that right... yup...

Javascript Arrays (which are themselves instances of Object) specifically encourage integer keys, even though you could do something like x=[]; x["puppy"]=5;.

But in general (and there are exceptions), x["somestring"]===x.somestring (when it's allowed; you can't do x.123).

(Keep in mind that whatever JS compiler you're using might choose, maybe, to compile these down to saner representations if it can prove it would not violate the spec.)

So the answer to your question would depend on whether you're assuming those objects only accept integers (due to a restriction in your problem domain), or not. Let's assume not. Then a valid expression is a concatenation of a base identifier plus some .identifiers plus some ["stringindex"]s

This would then be equivalent to a["b"][4]["c"]["d"][1][2][3], though we should probably also support a.b["c\"validjsstringliteral"][3]. You'd have to check the ecmascript grammar section on string literals to see how to parse a valid string literal. Technically you'd also want to check (unlike in my first answer) that a is a valid javascript identifier.

A simple answer to your question though, if your strings don't contain commas or brackets, would be just be to match length 1+ sequences of characters not in the set , or [ or ]:

> "abc[4].c.def[1][2][\"gh\"]".match(/[^\]\[.]+/g)
// ^^^ ^  ^ ^^^ ^  ^   ^^^^^
["abc", "4", "c", "def", "1", "2", ""gh""]

If your strings don't contain escape characters or " characters, and because IdentifierNames are a sublanguage of StringLiterals (I think???) you could first convert your dots to []:

> var R=[], demoString="abc[4].c.def[1][2][\"gh\"]";
> for(var match,matcher=/^([^\.\[]+)|\.([^\.\[]+)|\["([^"]+)"\]|\[(\d+)\]/g; 
      match=matcher.exec(demoString); ) {
  R.push(Array.from(match).slice(1).filter(x=>x!==undefined)[0]);
  // extremely bad code because js regexes are weird, don't use this
}
> R

["abc", "4", "c", "def", "1", "2", "gh"]

Of course, always be careful and never trust your data. Some bad ways to do this that might work for some use cases also include:

// hackish/wrongish; preprocess your string into "a.b.4.c.d.1.2.3", e.g.: 
> yourstring.replace(/]/g,"").replace(/\[/g,".").split(".")
"a.b.4.c.d.1.2.3"  //use code from before

Special 2018 edit:

Let's go full-circle and do the most inefficient, horribly-overmetaprogrammed solution we can come up with... in the interest of syntactical purityhamfistery. With ES6 Proxy objects!... Let's also define some properties which (imho are fine and wonderful but) may break improperly-written libraries. You should perhaps be wary of using this if you care about performance, sanity (yours or others'), your job, etc.

// [1,2,3][-1]==3 (or just use .slice(-1)[0])
if (![1][-1])
    Object.defineProperty(Array.prototype, -1, {get() {return this[this.length-1]}}); //credit to caub

// WARNING: THIS XTREME™ RADICAL METHOD IS VERY INEFFICIENT,
// ESPECIALLY IF INDEXING INTO MULTIPLE OBJECTS,
// because you are constantly creating wrapper objects on-the-fly and,
// even worse, going through Proxy i.e. runtime ~reflection, which prevents
// compiler optimization

// Proxy handler to override obj[*]/obj.* and obj[*]=...
var hyperIndexProxyHandler = {
    get: function(obj,key, proxy) {
        return key.split('.').reduce((o,i)=>o[i], obj);
    },
    set: function(obj,key,value, proxy) {
        var keys = key.split('.');
        var beforeLast = keys.slice(0,-1).reduce((o,i)=>o[i], obj);
        beforeLast[keys[-1]] = value;
    },
    has: function(obj,key) {
        //etc
    }
};
function hyperIndexOf(target) {
    return new Proxy(target, hyperIndexProxyHandler);
}

Demo:

var obj = {a:{b:{c:1, d:2}}};
console.log("obj is:", JSON.stringify(obj));

var objHyper = hyperIndexOf(obj);
console.log("(proxy override get) objHyper['a.b.c'] is:", objHyper['a.b.c']);
objHyper['a.b.c'] = 3;
console.log("(proxy override set) objHyper['a.b.c']=3, now obj is:", JSON.stringify(obj));

console.log("(behind the scenes) objHyper is:", objHyper);

if (!({}).H)
    Object.defineProperties(Object.prototype, {
        H: {
            get: function() {
                return hyperIndexOf(this); // TODO:cache as a non-enumerable property for efficiency?
            }
        }
    });

console.log("(shortcut) obj.H['a.b.c']=4");
obj.H['a.b.c'] = 4;
console.log("(shortcut) obj.H['a.b.c'] is obj['a']['b']['c'] is", obj.H['a.b.c']);

Output:

obj is: {"a":{"b":{"c":1,"d":2}}}

(proxy override get) objHyper['a.b.c'] is: 1

(proxy override set) objHyper['a.b.c']=3, now obj is: {"a":{"b":{"c":3,"d":2}}}

(behind the scenes) objHyper is: Proxy {a: {…}}

(shortcut) obj.H['a.b.c']=4

(shortcut) obj.H['a.b.c'] is obj['a']['b']['c'] is: 4

inefficient idea: You can modify the above to dispatch based on the input argument; either use the .match(/[^\]\[.]+/g) method to support obj['keys'].like[3]['this'], or if instanceof Array, then just accept an Array as input like keys = ['a','b','c']; obj.H[keys].


Per suggestion that maybe you want to handle undefined indices in a 'softer' NaN-style manner (e.g. index({a:{b:{c:...}}}, 'a.x.c') return undefined rather than uncaught TypeError)...:

1) This makes sense from the perspective of "we should return undefined rather than throw an error" in the 1-dimensional index situation ({})['e.g.']==undefined, so "we should return undefined rather than throw an error" in the N-dimensional situation.

2) This does not make sense from the perspective that we are doing x['a']['x']['c'], which would fail with a TypeError in the above example.

That said, you'd make this work by replacing your reducing function with either:

(o,i)=>o===undefined?undefined:o[i], or (o,i)=>(o||{})[i].

(You can make this more efficient by using a for loop and breaking/returning whenever the subresult you'd next index into is undefined, or using a try-catch if you expect such failures to be sufficiently rare.)


If you can use lodash, there is a function, which does exactly that:

_.get(object, path, [defaultValue])

var val = _.get(obj, "a.b");

you could also use lodash.get

You just install this package (npm i --save lodash.get) and then use it like this:

const get = require('lodash.get');

const myObj = { user: { firstName: 'Stacky', lastName: 'Overflowy' }, id: 123 };

console.log(get(myObj, 'user.firstName')); // prints Stacky
console.log(get(myObj, 'id')); //prints  123

//You can also update values
get(myObj, 'user').firstName = John;

Tags:

Javascript