Why hasn't functional programming taken over yet?
Because all those advantages are also disadvantages.
Stateless programs; No side effects
Real-world programs are all about side effects and mutation. When the user presses a button it's because they want something to happen. When they type in something, they want that state to replace whatever state used to be there. When Jane Smith in accounting gets married and changes her name to Jane Jones, the database backing the business process that prints her paycheque had better be all about handling that sort of mutation. When you fire the machine gun at the alien, most people do not mentally model that as the construction of a new alien with fewer hit points; they model that as a mutation of an existing alien's properties.
When the programming language concepts fundamentally work against the domain being modelled, it's hard to justify using that language.
Concurrency; Plays extremely nice with the rising multi-core technology
The problem is just pushed around. With immutable data structures you have cheap thread safety at the cost of possibly working with stale data. With mutable data structures you have the benefit of always working on fresh data at the cost of having to write complicated logic to keep the data consistent. It's not like one of those is obviously better than the other.
Programs are usually shorter and in some cases easier to read
Except in the cases where they are longer and harder to read. Learning how to read programs written in a functional style is a difficult skill; people seem to be much better at conceiving of programs as a series of steps to be followed, like a recipe, rather than as a series of calculations to be carried out.
Productivity goes up (example: Erlang)
Productivity has to go up a lot in order to justify the massive expense of hiring programmers who know how to program in a functional style.
And remember, you don't want to throw away a working system; most programmers are not building new systems from scratch, but rather maintaining existing systems, most of which were built in non-functional languages. Imagine trying to justify that to shareholders. Why did you scrap your existing working payroll system to build a new one at the cost of millions of dollars? "Because functional programming is awesome" is unlikely to delight the shareholders.
Imperative programming is a very old paradigm (as far as I know) and possibly not suitable for the 21th century
Functional programming is very old too. I don't see how the age of the concept is relevant.
Don't get me wrong. I love functional programming, I joined this team because I wanted to help bring concepts from functional programming into C#, and I think that programming in an immutable style is the way of the future. But there are enormous costs to programming in a functional style that can't simply be wished away. The shift towards a more functional style is going to happen slowly and gradually over a period of decades. And that's what it will be: a shift towards a more functional style, not a wholesale embracing of the purity and beauty of Haskell and the abandoning of C++.
I build compilers for a living and we are definitely embracing a functional style for the next generation of compiler tools. That's because functional programming is fundamentally a good match for the sorts of problems we face. Our problems are all about taking in raw information -- strings and metadata -- and transforming them into different strings and metadata. In situations where mutations occur, like someone is typing in the IDE, the problem space inherently lends itself to functional techniques such as incrementally rebuilding only the portions of the tree that changed. Many domains do not have these nice properties that make them obviously amenable to a functional style.
Masterminds of Programming: Conversations with the Creators of Major Programming Languages
[Haskell]
Why do you think no functional programming language has entered the mainstream?
John Hughes: Poor marketing! I don't mean propaganda; we've had plenty of that. I mean a careful choice of a target market niche to dominate, followed by a determined effort to make functional programming by far the most effective way to address that niche. In the happy days of the 80s, we thought functional programming was good for everything - but calling new technology "good for everything" is the same as calling it "particularly good at nothing". What's the brand supposed to be? This is a problem that John Launchbury described very clearly in his invited talk at ICFP. Galois Connections nearly went under when their brand was "software in functional languages," but they've gone from strength to strength since focusing on "high-assurance software."
Many people have no idea how technological innovation happens, and expect that better technology will simply become dominant all by itself (the "better mousetrap" effect), but the world's just not like that.
The stock answer is that neither will or should replace the other - they are different tools that have different sets of pros and cons, and which approach has the edge will differ depending on the project and other "soft" issues like the available talent pool.
I think you're right that the growth of concurrency due to multi-core will increase the percentage (of the global set of development projects) when functional programming is chosen over other styles.
I think it's rare today because the majority of today's professional talent pool is most comfortable with imperative and object-oriented technologies. For example, I have more than once chosen Java as the language for a commercial project because it was good enough, non-controversial, and I know that I will never run out of people that can program (well enough) in it.