How are paper authors uniquely identified?

Aside from ORCID (which far from every paper or person has), there really is no sure-fire way to uniquely identify an author. Using the name becomes problematic with common names (not unusual anywhere in the world, but a particularly common issue in Asia) or with name changes (for instance after marriage). Combining the name with affiliation and e-mail address will also only get you so far: most academics change universities at least once or twice in their career, and both affiliation and e-mail address usually change when they do.

For bibliographic research, the most promising approach is probably to combine all of the above with field information (e.g., a Markus Huber publishing in medicine is not particularly likely to be the same person as a Markus Huber publishing in philosophy) and train some sort of heuristic classifier. Clearly, false positives and false negatives will happen, but if your goal is to holistically assess a larger field of research, a few misclassifications are unlikely to change the overall picture much.
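To make the idea of combining these signals concrete, here is a minimal sketch of a hand-weighted matching score. Everything in it is an assumption for illustration: the `AuthorRecord` fields, the integer weights, and the threshold are made up, and a real classifier would learn such weights from labeled pairs and use fuzzy rather than exact comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorRecord:
    """One author mention as it appears on a single paper (hypothetical schema)."""
    name: str
    affiliation: str
    email: str
    field: str


# Hypothetical hand-picked weights; a trained classifier would learn these.
WEIGHTS = {"name": 4, "field": 3, "affiliation": 2, "email": 1}


def _same(a: str, b: str) -> bool:
    """Case-insensitive exact comparison; real systems would match fuzzily."""
    return a.strip().lower() == b.strip().lower()


def match_score(a: AuthorRecord, b: AuthorRecord) -> int:
    """Sum the weights of all attributes on which the two records agree."""
    score = 0
    for attribute, weight in WEIGHTS.items():
        if _same(getattr(a, attribute), getattr(b, attribute)):
            score += weight
    return score


def likely_same_author(a: AuthorRecord, b: AuthorRecord, threshold: int = 6) -> bool:
    """Treat two records as the same person if enough evidence agrees."""
    return match_score(a, b) >= threshold


# Same name and field, but the author moved universities in between.
huber_2015 = AuthorRecord("Markus Huber", "University A", "huber@a.example", "medicine")
huber_2021 = AuthorRecord("Markus Huber", "University B", "huber@b.example", "medicine")
# Same name, different field and institution.
huber_phil = AuthorRecord("Markus Huber", "University C", "mh@c.example", "philosophy")

print(likely_same_author(huber_2015, huber_2021))
print(likely_same_author(huber_2015, huber_phil))
```

With these assumed weights, a matching name plus a matching field is enough to link two records even after a change of university, while a matching name alone is not. The threshold is exactly where the false positives and false negatives mentioned above get traded off against each other.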

If your goal is to assess an individual researcher, the most accurate approach is usually to trust the information the researchers themselves maintain (e.g., a CV or a publicly available publication list).


This is exactly what ORCID tries to achieve:

ORCID is a nonprofit organization helping create a world in which all who participate in research, scholarship and innovation are uniquely identified and connected to their contributions and affiliations, across disciplines, borders, and time. (from their website)

However, not everybody is aware of this initiative or cares enough to set up an ORCID for themselves. Some journals request ORCIDs upon submission; for Nature Methods, for example, each corresponding author needs to have an ORCID. The problem with using other information to identify researchers is that this information can change, whereas a uniquely assigned number does not.
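One practical consequence of the identifier being a number: an ORCID iD is 16 characters long, and the last one is a check character computed with the ISO 7064 MOD 11-2 algorithm, so mistyped iDs can often be caught before any lookup. The following is a minimal sketch of that check; the function names are my own, and it only validates the format and checksum, not whether the iD is actually registered.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Check the length, characters, and checksum of an iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


print(is_well_formed_orcid("0000-0002-1825-0097"))  # sample iD used in ORCID's documentation
print(is_well_formed_orcid("0000-0002-1825-0098"))  # last character altered
```

A passing checksum only says the iD is well formed; whether it belongs to the person you think it does still has to be confirmed against the ORCID record itself.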