When does a statistical consultant become a co-author or collaborator?

This is not an easy question, like many other questions about coauthorship. To an extent, all answers will be opinion-based.

I am in a similar position. I work in industry but advise many of my wife's clinical & biological psychology students on statistical matters.

My personal cutoff point is somewhat later than yours seems to be. I'll happily invest three or four hours to help students clean up and understand their data, do some exploratory data analysis and plots and some simple models and release them into the wild with an initial R script. If that is the full extent of my engagement, I would be uncomfortable with coauthorship (but an acknowledgement would be appropriate).

Often, things go further than this. After this initial session, I'll code up more complex analyses, more sophisticated graphics, maybe research special models or approaches and/or read up on stuff. This usually involves some work on my own, multiple email exchanges and more personal meetings. At this point, we usually agree that I get coauthorship, and as you write, this means that I'll be a lot more involved in the rest of the manuscript's lifecycle.

I'll usually not write up the statistical methods section as you appear to do. My take is that the first author should have full ownership of the manuscript and should essentially understand everything in sufficient depth to describe even the statistics himself. After all, he will be the one to present it at conferences and/or thesis defenses. (It helps that psychologists get a lot of statistical training.) Of course, I'll go over and correct the methods sections, and happily, I rarely find that the student fundamentally misunderstood the statistics.

If things have progressed far enough for me to be a coauthor, I'll usually be quite involved with the rest of the manuscript, too. I am rather anal-retentive and will happily nitpick the entire structure of the paper, the internal logic, grammar and punctuation. I don't know whether nitpickery is more generally a specific skill of statisticians, though... Of course, in maybe half of all submissions, reviewers ask about the statistics, and I often have to revisit the statistical analysis and/or at least its description.


Coauthorships arrived at in this way are actually pretty low-effort for me. Other people on the author list have spent weeks interviewing people in African refugee camps, juggling noxious chemicals in a lab and/or digging through prior publications. I, on the other hand, spend maybe one full work week altogether (sometimes more), sitting in a comfortable chair at my computer with a cup of coffee within reach. It's comparisons like these, and I'd guess that most statistical engagements are similar, that make me uncomfortable starting to discuss coauthorship after only one hour.


Laying down ground rules early on, as you suggest, is an excellent idea. Part of these should be how much time you can afford before thinking about coauthorship, whether this is 20 minutes or four hours.

One other thing, which the blog post you link to discusses, is that you as the statistician will need to understand the scientific context, so the researcher will have to spend some time explaining the situation to you. This time will need to come out of the initial "budget", since time does not grow on trees. Your client will usually try to handwave this away and insist that he only has "a little statistical question". This will usually not make much sense. If a client insists, you can always point him to CrossValidated for his "little question". If he gets a good answer there, great.


I went and asked this question on the American Statistical Association's mailing list. If you have a login to the ASA's website, you can view the thread here. I'll paste the answers in here without names (and cleaning up a bit):


Unless the consulting was very minor (in 10 minutes), the statistical consultant should in my opinion be one of the coauthors. It is a matter of ethics and not an issue of being paid as a consultant. When I do the modeling and the analysis and the interpretation, I expect to be co-author. Sometimes I am the lead author. It seems to be a fight with the claim that "it is your job to do the consulting". I simply refuse such attitudes from faculty or clients. The way I see it is that without our services, there would not exist a paper anyways. Therefore, we must be co-authors.


I usually leave that decision up to the research leader and have fared well over the years. But then again, I consult in the area of agriculture, food, and natural resources, where the competition for attention is not quite as cut throat.

Personally, my benchmark is based on the answer to the question 'would this publication have seen the light of day without my involvement?' If the answer is no, then I should be a coauthor. If the answer is maybe or yes, then co-authorship is not really warranted.

An example: recently I spent quite a bit of time helping an author revise a manuscript that had been rejected, only to have the 'statistical expert' of the journal assume the 'my way or the highway' attitude. I suggested that the author not fight the 'expert' and get the manuscript published. The author had included me as a coauthor but I asked that my name be removed because of my personal criterion.


I generally agree with everyone so far but would like to add two little bits that are apropos of co-authoring in general.

The initial question is very easy to answer — if you made a contribution you should be included as a co-author.

The next question is where on the author list your name should appear. When I was young I thought it should be based on the size of the contribution. I once argued with a colleague about who should be first. He felt that he should be since he provided the data. I claimed that the artistry was mine, and had he ever seen a painting signed Sherwin-Williams & Pablo Picasso? He prevailed.

As I got older I used a more Marxian approach - from each according to their ability, to each according to their need. What this meant was that I tended to put student or more junior authors first and put myself at the end. I have never regretted this choice.


There is, I think, no algorithmic answer to this question. The general principle is clear enough: anyone who has made an important intellectual contribution to the work should be listed as co-author. Ghost authorship is, in principle, as problematic as honorary authorship. The listed authors get artificially high credit if there are ghost authors. But then, one also has a duty to take part in the whole process with the manuscript. What constitutes an important intellectual contribution depends on context; the same contribution may be sufficient in a short article and too small in a larger and more complex work. Usually the first author and/or the project supervisor should make that judgement.

There are also cases where an author has made an important contribution, but cannot agree with the conclusion and/or important methodological decisions. Then one cannot be listed as author. For instance, I have on some occasions made it clear that I could not be listed as author if the article included procedures such as stepwise regression, last observation carried forward or repeated measures anova.


Very interesting thread! I proposed a roundtable lunch on this topic for JSM Vancouver but it didn't enroll enough to take place. I'm glad to know that I was not wrong that there was some interest!

My own idea, which I got from several sources, is:

  • I'm happy to meet for an hour with any colleague who has a stat question, just to be a good colleague and as a community service.
  • If we meet a second time, I would like some acknowledgment ... a mention in a footnote, or a note to the Dean, or some other professional marker.
  • Before we meet a third time, I say that to continue I would like to be a co-author, because it's rarely exactly three times ... the count seems to go once, twice, many times. I am also then willing to work on the project in ways other than just meetings with the primary authors.

If I barely know the client and have to assume that his understanding of the non-statistical subject matter is correct, I often prefer an acknowledgement to co-authorship. The same applies if I am presented with an experiment that was already run and don't want people to blame the design of the experiment on me.


There was also a recommendation of Parker & Berman, "Criteria for authorship for statisticians in medical papers", Statistics in Medicine, 1998, 17, 2289-2299, which I don't have access to at the moment.

In addition, many societies and professional bodies have general guidelines or criteria for authorship, although these may not discuss the role of statisticians explicitly. For instance, those of the International Committee of Medical Journal Editors were recommended twice (and do not discuss statisticians).


EDIT 2018-08-08: there were no more new replies to that ASA mailing list thread.


I've recently dealt with this, and agree that this is largely subjective. Your proposal, to implement clear rules at the start, is highly recommended. I tend to do an "initial consultation" for free: an hour where we just sit and have coffee. If I'm interested, and have time, I'll suggest that I'd love to participate, but have to make sure that participation is respectful to myself and my other responsibilities. For example (as a professor), I'll fall back on something like "I'd love to contribute, but I need to justify the time expenditure to my department chair/dean/tenure committee/spouse/dog. Would you be comfortable treating my help as a collaboration that leads to authorship?" I tend to expound on a "minimum" contribution needed for that, write an e-mail summing up our conversation, and go from there.

One other thing to consider in these conversations is the ethics standards put forth by different professional organizations. I often work with psychologists, so I tend to lean on the APA standards (which I'll base the rest of my response on). A quote from the main page:

Authorship credit should reflect the individual's contribution to the study. An author is considered anyone involved with initial research design, data collection and analysis, manuscript drafting, and final approval.

On the website, this is directly contrasted with: funding, mentorship, and not participating in the actual publication. The last one is tricky, and how I interpret it is: if you aren't using analysis that I ran/interpreted, my statistical tables, any graphics I made, or any of my writing (obviously), then I'm not contributing. From my perspective, though, if you use even one of those things in the manuscript/presentation, I have contributed to the manuscript in a tangible way, and should be included. I feel obligated to mention (as this has happened) that, from my perspective, if you take my code and change the color of the plot and include it, you're still presenting a product of someone else (and need to provide credit for that).

I believe the need to provide credit is the primary consideration. If you have published software, you shouldn't be given authorship as credit for its use (as a citation to the software is sufficient). If you have a paper on a unique method, you shouldn't be given authorship as credit for its use. Now, if you designed a program or statistic specifically for the project, you probably should be given authorship, as there isn't another appropriate way to provide credit (an acknowledgement doesn't count for that, in my opinion).

Speaking of that, I believe acknowledgements should come in for a small contribution that doesn't result in authorship (maybe data cleaning, data collection, etc). Notice that these involve no writing, and nothing tangible from them will be used directly in the manuscript. If someone does something "monotonous" and writes, though (say, a lit review), they should absolutely be included as an author.

All said and done, having the conversation up front should indicate the type of compensation you get (and if you don't feel comfortable building a custom database from scratch for an acknowledgement, it is better to know that up front). Establishing a minimum, tangible contribution for that compensation establishes clarity for all researchers.