Why is the application of probability in QM fundamentally different from the application of probability in other areas?
The theory of probability used in QM is intrinsically different from the one commonly used, for the following reason: the space of events is non-commutative (more properly, non-Boolean), and this fact deeply affects the theory of conditional probability. The probability that A happens if B happened is computed differently in classical probability theory and in quantum theory when A and B are quantum-incompatible events. In both cases probability is a measure on a lattice but, in the classical case, the lattice is a Boolean one (a $\sigma$-algebra), while in the quantum case it is not.
To be clearer, classical probability is a map $\mu: \Sigma(X) \to [0,1]$ such that $\Sigma(X)$ is a class of subsets of the set $X$ including $\emptyset$, closed with respect to the complement and the countable union, and such that $\mu(X)=1$ and: $$\mu(\cup_{n\in \mathbb N}E_n) = \sum_n \mu(E_n)\quad \mbox{if $E_k \in \Sigma(X)$ with $E_p\cap E_q= \emptyset$ for $p\neq q$.}$$ The elements of $\Sigma(X)$ are the events whose probability is $\mu$. In this view, for instance, if $E,F \in \Sigma(X)$, $E\cap F$ is logically interpreted as the event "$E$ AND $F$". Similarly $E\cup F$ corresponds to "$E$ OR $F$" and $X\setminus F$ has the meaning of "NOT $F$" and so on. The probability of $P$ when $Q$ is given verifies $$\mu(P|Q) = \frac{\mu(P \cap Q)}{\mu(Q)}\:.\tag{1}$$
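To make (1) concrete, here is a minimal sketch in Python; the fair-die sample space and the particular events are just illustrative choices of mine:

```python
from fractions import Fraction

# Classical probability on a finite sample space: events are plain sets,
# and "AND" is set intersection, as in formula (1).
X = {1, 2, 3, 4, 5, 6}                    # a fair die
mu = lambda E: Fraction(len(E), len(X))   # uniform probability measure

P = {2, 4, 6}   # "the outcome is even"
Q = {4, 5, 6}   # "the outcome is greater than 3"

print(mu(P & Q) / mu(Q))   # mu(P|Q) = mu(P AND Q)/mu(Q) = 2/3
```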
If you instead consider a quantum system, there are "events", i.e., elementary "yes/no" propositions that are experimentally testable, that cannot be joined by the logical operators AND and OR.
An example is $P=$"the $x$ component of the spin of this electron is $1/2$" and $Q=$"the $y$ component is $1/2$". There is no experimental device able to assign a truth value to $P$ and $Q$ simultaneously, so that elementary propositions as "$P$ and $Q$" make no sense. Pairs of propositions like $P$ and $Q$ above are physically incompatible.
In quantum theories (the most elementary version due to von Neumann), the events of a physical system are represented by the orthogonal projectors of a separable Hilbert space $H$. The set ${\cal P}(H)$ of those operators replaces the classical $\Sigma(X)$.
In general, the meaning of $P\in {\cal P}(H)$ is something like "the value of the observable $Z$ belongs to the subset $I \subset \mathbb R$" for some observable $Z$ and some set $I$. There is a procedure to integrate such a class of projectors, labelled by real subsets, so as to construct a self-adjoint operator $\hat{Z}$ associated to the observable $Z$; this is nothing but the physical meaning of the spectral decomposition theorem.
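In finite dimension the procedure is elementary. As an illustrative sketch of mine (Python with NumPy; the matrix $Z$ and the interval $I$ are arbitrary choices), one can rebuild a self-adjoint matrix from its spectral projectors and form the projector for "the value of $Z$ belongs to $I$":

```python
import numpy as np

# A self-adjoint "observable" and its spectral resolution Z = sum_z z P_z.
Z = np.array([[2.0, 1.0], [1.0, 2.0]])       # eigenvalues 1 and 3
vals, vecs = np.linalg.eigh(Z)

rebuilt = sum(z * np.outer(v, v) for z, v in zip(vals, vecs.T))
print(np.allclose(rebuilt, Z))               # True

# The event "the value of Z belongs to I": sum the P_z with z in I.
I = (0.0, 1.5)                               # contains the eigenvalue 1
P_I = sum(np.outer(v, v) for z, v in zip(vals, vecs.T) if I[0] <= z <= I[1])
```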
If $P, Q \in {\cal P}(H)$, there are two possibilities: $P$ and $Q$ commute or they do not.
Von Neumann's fundamental axiom states that commutativity is the mathematical counterpart of physical compatibility.
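For the spin example above, this can be checked directly; the following sketch of mine (Python with NumPy, not part of the original argument) builds the spectral projectors of $S_x$ and $S_y$ onto the eigenvalue $+1/2$ and verifies that they do not commute:

```python
import numpy as np

# Spin-1/2 operators along x and y (in units of hbar).
sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2

def proj_plus(S):
    """Spectral projector of S onto the eigenvalue +1/2."""
    vals, vecs = np.linalg.eigh(S)
    v = vecs[:, np.isclose(vals, 0.5)]
    return v @ v.conj().T

P, Q = proj_plus(sx), proj_plus(sy)
print(np.allclose(P @ Q, Q @ P))   # False: P and Q are incompatible
```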
When $P$ and $Q$ commute, $PQ$ and $P+Q-PQ$ are still orthogonal projectors, that is, elements of ${\cal P}(H)$.
In this situation, $PQ$ corresponds to "$P$ AND $Q$", whereas $P+Q-PQ$ corresponds to "$P$ OR $Q$" and so on; in particular, "NOT $P$" is always interpreted as the orthogonal projector onto $P(H)^\perp$ (the orthogonal subspace of $P(H)$), and all of the classical formalism holds true this way. As a matter of fact, a maximal set of pairwise commuting projectors has formal properties identical to those of classical logic: it is a Boolean $\sigma$-algebra.
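For instance (a sketch of mine in Python/NumPy, with two commuting projectors in $\mathbb C^3$ built from the standard basis), one can verify directly that $PQ$ and $P+Q-PQ$ are projectors onto the intersection and the sum of the ranges:

```python
import numpy as np

e = np.eye(3)
P = np.outer(e[0], e[0]) + np.outer(e[1], e[1])   # projects onto span{e0, e1}
Q = np.outer(e[1], e[1]) + np.outer(e[2], e[2])   # projects onto span{e1, e2}

AND = P @ Q          # projector onto span{e1}, the intersection of the ranges
OR  = P + Q - P @ Q  # projector onto span{e0, e1, e2}

for R in (AND, OR):                    # both are orthogonal projectors
    assert np.allclose(R, R.T) and np.allclose(R @ R, R)
print(np.allclose(AND, np.outer(e[1], e[1])))     # True
```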
In this picture, a quantum state is a map assigning the probability $\mu(P)$ that $P$ is experimentally verified to every $P\in {\cal P}(H)$. It has to satisfy: $\mu(I)=1$ and $$\mu\left(\sum_{n\in \mathbb N}P_n\right) = \sum_n \mu(P_n)\quad \mbox{if $P_k \in {\cal P}(H)$ with $P_p P_q= P_qP_p =0$ for $p\neq q$.}$$
Gleason's celebrated theorem establishes that, if $\text{dim}(H)\neq 2$, the measures $\mu$ are all of the form $\mu(P)= \text{tr}(\rho_\mu P)$ for some state $\rho_\mu$ (a positive trace-class operator with unit trace), determined by $\mu$ in a one-to-one fashion. In the convex set of states, the extremal elements are the standard pure states. They are determined, up to a phase, by unit vectors $\psi_\mu \in H$, so that, with some trivial computation (completing $\psi_\mu$ to an orthonormal basis of $H$ and using that basis to compute the trace), $$\mu(P) = \langle \psi_\mu | P \psi_\mu \rangle = ||P \psi_\mu||^2\:.$$
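The pure-state formula is easy to check numerically. In this sketch of mine (Python/NumPy; the dimension and the random vectors are arbitrary), a pure state $\rho = |\psi\rangle\langle\psi|$ and a rank-one projector satisfy $\text{tr}(\rho P) = ||P\psi||^2$:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit(n):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

psi = random_unit(4)
rho = np.outer(psi, psi.conj())   # pure state: positive, unit trace

v = random_unit(4)
P = np.outer(v, v.conj())         # a rank-one orthogonal projector

print(np.isclose(np.trace(rho @ P).real, np.linalg.norm(P @ psi) ** 2))  # True
```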
(Nowadays, there is a generalized version of this picture, where the set ${\cal P}(H)$ is replaced by the class of positive operators in $H$ bounded above by the identity (the so-called "effects"), and Gleason's theorem is replaced by Busch's theorem, with a very similar statement.)
Quantum probability is therefore given, for a given (generally mixed) state $\rho$, by the map $${\cal P}(H) \ni P \mapsto \mu(P) =\text{tr}(\rho P)\:. $$
It is clear that, as soon as one deals with physically incompatible propositions, (1) cannot hold just because there is nothing like $P \cap Q$ in the set of physically sensible quantum propositions. All that is due to the fact that the space of events ${\cal P}(H)$ is now a non-commutative set of projectors, giving rise to a non-Boolean lattice.
The formula replacing (1) is now:
$$\mu(P|Q) = \frac{\text{tr}(\rho_\mu QPQ)}{\text{tr}(\rho_\mu Q)}\tag{2}\:.$$
Therein, when $P$ and $Q$ are compatible, $QPQ = PQ$ is an orthogonal projector which can be interpreted as "$P$ AND $Q$" (i.e., $P\cap Q$), and in this case (1) holds true again; in general, $QPQ$ is only a positive operator. (2) gives rise to all of the "strange things" showing up in quantum experiments (as in the double-slit one). In particular, the fact that, in QM, probabilities are computed by combining complex probability amplitudes arises from (2).
(2) just relies upon the von Neumann-Lüders reduction postulate, stating that, if the outcome of the measurement of $P\in {\cal P}(H)$ is YES when the state was $\mu$ (i.e., $\rho_\mu$), then the state immediately after the measurement is $\mu'$, associated to $\rho_{\mu'}$ with
$$\rho_{\mu'} := \frac{P\rho_\mu P}{\text{tr}(\rho_\mu P)}\:.$$
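The derivation of (2) from this postulate is a one-line computation: $\text{tr}(\rho_{\mu'}P) = \text{tr}(P\rho_\mu P)/\text{tr}(\rho_\mu P)$ equals, by cyclicity of the trace and $P^2=P$, the right-hand side of (2) with the roles of the projectors relabelled. Here is a numerical check of mine (Python/NumPy), with $Q$ the projector for "$S_z = +1/2$", $P$ the projector for "$S_x = +1/2$", and an arbitrary mixed state:

```python
import numpy as np

Q = np.array([[1, 0], [0, 0]], dtype=complex)        # "S_z = +1/2"
P = np.array([[1, 1], [1, 1]], dtype=complex) / 2    # "S_x = +1/2"

rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)  # a mixed state

rho_after = Q @ rho @ Q / np.trace(rho @ Q)   # Lüders post-measurement state
lhs = np.trace(rho_after @ P).real            # probability of P after Q
rhs = (np.trace(rho @ Q @ P @ Q) / np.trace(rho @ Q)).real   # formula (2)
print(np.isclose(lhs, rhs), lhs)              # True 0.5
```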
ADDENDUM. Actually, it is possible to extend the notion of the logical operators AND and OR to all pairs of elements in ${\cal P}(H)$, and that was the program of von Neumann and Birkhoff (quantum logic). In fact, the lattice structure of ${\cal P}(H)$ makes it possible, or better, embodies it. With this extended notion, "$P$ AND $Q$" is the orthogonal projector onto $P(H)\cap Q(H)$, whereas "$P$ OR $Q$" is the orthogonal projector onto the closure of the space $P(H)+Q(H)$. When $P$ and $Q$ commute, these notions of AND and OR reduce to the standard ones. With the extended definitions, ${\cal P}(H)$ becomes a lattice in the proper mathematical sense, where the partial order relation is given by the standard inclusion of closed subspaces ($P \geq Q$ means $P(H) \supset Q(H)$). The point is that the physical interpretation of this extension of AND and OR is not clear.

The resulting lattice is, in any case, non-Boolean: for instance, the extended AND and OR are not distributive, as the standard AND and OR are (this reveals their quantum nature). Keeping also the definition of "NOT $P$" as the orthogonal projector onto $P(H)^\perp$, the resulting structure of ${\cal P}(H)$ is well known: a $\sigma$-complete, bounded, orthomodular, separable, atomic, irreducible lattice satisfying the covering property. Around 1995, Solèr definitively proved a conjecture due to von Neumann stating that there are only three possibilities for concretely realizing such lattices: the lattice of orthogonal projectors on a separable complex Hilbert space, the lattice of orthogonal projectors on a separable real Hilbert space, and the lattice of orthogonal projectors on a separable quaternionic Hilbert space.
Gleason's theorem is valid in all three cases. The extension to the quaternionic case was obtained by Varadarajan in his famous book 1 on the geometry of quantum theory; however, a gap in his proof was fixed in this published paper I co-authored 2.
Assuming Poincaré symmetry, at least for elementary systems (elementary particles), the cases of real and quaternionic Hilbert spaces can be ruled out (here is a pair of published works I co-authored on the subject: 3 and 4).
ADDENDUM2. After a discussion with Harry Johnston, I think an interpretative remark is worth making about the probabilistic content of the state $\mu$ within the picture I illustrated above. In QM, $\mu(P)$ is the probability that, if I performed a certain experiment (in order to check $P$), $P$ would turn out to be true. There seems to be a difference here with respect to the classical notion of probability applied to classical systems. There, probability mainly refers to something already existent (and to our incomplete knowledge of it). In the formulation of QM I presented above, probability instead refers to that which will happen if...
ADDENDUM3. For $\dim(H)=1$, Gleason's theorem is valid and trivial. For $\dim(H)=2$ there is a well-known counterexample: $$\mu_v(P)= \frac{1}{2}\left(1+ (v \cdot n_P)^3\right)\:,$$ where $v$ is a unit vector in $\mathbb R^3$ and $n_P$ is the unit vector in $\mathbb R^3$ associated to the orthogonal projector $P: \mathbb C^2 \to \mathbb C^2$ via the Bloch sphere: $P= \frac{1}{2}\left(I+\sum_{j=1}^3 (n_P)_j \sigma_j \right)$.
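It is instructive to check (a sketch of mine, assuming NumPy) that $\mu_v$ is indeed additive on orthogonal projectors, since $n_{I-P} = -n_P$ and cubing preserves the sign flip, while it cannot be of the trace form $\text{tr}(\rho P) = \frac{1}{2}(1+w\cdot n_P)$, which is affine in $n_P$:

```python
import numpy as np

def mu(v, n):
    """The dim-2 counterexample: mu_v(P) = (1 + (v . n_P)^3) / 2."""
    return 0.5 * (1.0 + np.dot(v, n) ** 3)

v = np.array([1.0, 0.0, 0.0])                 # a fixed unit vector
rng = np.random.default_rng(1)
n = rng.normal(size=3); n /= np.linalg.norm(n)

# Additivity on the orthogonal pair P, I - P (Bloch vectors n and -n):
print(np.isclose(mu(v, n) + mu(v, -n), 1.0))  # True

# Cubic, hence not affine, in n_P: no trace form reproduces these values.
print(mu(v, np.array([1.0, 0.0, 0.0])), mu(v, np.array([0.6, 0.8, 0.0])))
# 1.0 0.608
```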
ADDENDUM4. From the perspective of quantum probability, the von Neumann-Lüders reduction postulate has a very natural interpretation. Suppose that $\mu$ is a probability measure over the quantum lattice ${\cal P}(H)$ representing a quantum state, and assume that the measurement of $P \in {\cal P}(H)$, on that state, has outcome $1$. The post-measurement state is therefore represented by $\mu_P(\cdot) = \mu(P \cdot P)/\mu(P)$, just in view of the aforementioned postulate.
It is easy to prove that $\mu_P : {\cal P}(H) \to [0,1]$ is the only probability measure such that $$\mu_P(Q) = \frac{\mu(Q)}{\mu(P)} \quad \mbox{if $Q \leq P$}\:.$$
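A quick numerical illustration of this characterization (mine, assuming NumPy), with $Q \leq P$ in $\mathbb C^3$ and a pure state:

```python
import numpy as np

e = np.eye(3)
P = np.outer(e[0], e[0]) + np.outer(e[1], e[1])   # range span{e0, e1}
Q = np.outer(e[0], e[0])                          # range span{e0}, so Q <= P

psi = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)      # a pure state
mu = lambda R: np.linalg.norm(R @ psi) ** 2       # mu on projectors R

mu_P = lambda R: mu(P @ R @ P) / mu(P)            # post-measurement measure
print(np.isclose(mu_P(Q), mu(Q) / mu(P)))         # True: both equal 1/2
```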
Having given it some more thought, I think there is an unambiguous philosophical difference, with practical implications. The two-slit experiment provides a good example of this.
In a classical universe, any particular photon that hits the screen either went through slit A or through slit B. Even if we didn't bother to measure this, one or the other still happened, and we can meaningfully define $P(A)$ and $P(B)$.
In a quantum universe, if we didn't bother to measure which slit a photon went through, then it isn't true that it went through one slit or the other. You might say it went through both, though even that isn't entirely true; all we can really say is that it "went through the slits".
(Asking which slit a photon went through in the two-slit experiment is like asking what the photon's religion is. It simply isn't a meaningful question.)
That means that $P(A)$ and $P(B)$ just don't exist. Here's where one of the practical implications comes in: if you don't understand QM properly [I'm lying a bit here; I'll come back to it] then you can still calculate a probability that the particle went through slit A and a probability that it went through slit B. And then when you try to apply the usual mathematics to those probabilities, it doesn't work, and then you start saying that quantum probability doesn't follow the same rules as classical probability.
(Actually what you're really doing is calculating what the probabilities for those events would have been if you had chosen to measure them. Since you didn't, they're meaningless, and the mathematics doesn't apply.)
So: the philosophical difference is that when studying quantum systems, unlike classical systems, the probability that something would have happened if you had measured it is not in general meaningful unless you actually did; the practical implication is that you have to keep track of what you did or did not measure in order to avoid doing an invalid calculation.
(In classical systems most syntactically valid questions are meaningful; it took me some time to come up with the counter-example given above. In quantum mechanics most questions are not meaningful and you have to know what you're doing to find the ones that are.)
Note that keeping track of whether you've measured something or not is not an abstract exercise restricted to cases where you are trying to apply probability theory. It has a direct and concrete impact on the experiment: in the case of the two-slit experiment, if you measure which slit each photon went through, the interference pattern disappears.
(Trickier still: if you measure which slit each photon went through, and then properly erase the results of that measurement before looking at the film, the interference pattern comes back again.)
PS: it may be unfair to say that calculating a "would-have" probability means that you don't understand QM properly. It may simply mean that you're consciously choosing to use a different interpretation of it, and prefer to modify or generalize your conception of probability as necessary. V. Moretti's answer goes into some detail about how you might go about doing this. However, while this sort of thing is interesting, it does not appear to me to be of any obvious use. (It isn't clear that it gives any insight into the disappearance and reappearance of the interference pattern as described above, for example.)
Addendum: that has become clearer following the discussion in the comments. It seems that it is thought that the alternative formulation may have advantages when dealing with more complicated scenarios (QFT on curved spacetime was mentioned as one example). That is entirely plausible, and I certainly don't mean to imply that the work lacks value; however, it is still not clear to me that it is pedagogically useful as an alternative to the conventional approach when learning basic QM.
PPS: depending on interpretation, there may be other philosophical differences related to the nature or origin of randomness. Bayesian statistics is broad enough, I believe, that these differences are not of any great importance, and even from a frequentist viewpoint I don't think they have any practical implications.
The probabilities in QM are given by the square amplitudes of the relevant terms in the wavefunction, or by the expectation value of the relevant projector or POVM. However, it is not the case that those numbers always act in a way that is consistent with the calculus of probability.
For example, if there are two mutually exclusive ways for an event to happen, the calculus of probability would say that the probability of that event is the sum of the probabilities of it happening in each of those ways. But in single-photon interference experiments this doesn't seem to work. There are two routes through the interferometer, and the photon cannot be detected on both routes at once, so they are mutually exclusive, right? So then, to get the probability of the photon emerging from a particular port on the other end, you should just add the probability of it going along each route. But that calculation gives the wrong answer: you can get any probability you like by changing the path lengths; see:
http://arxiv.org/abs/math/9911150.
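A toy version of that calculation (my sketch in Python, with a phase $\phi$ standing in for the path-length difference in a Mach-Zehnder-type setup): summing the single-route probabilities always gives $1/2$, while the actual detection probability $\cos^2(\phi/2)$ ranges over all of $[0,1]$.

```python
import numpy as np

def detection_prob(phi):
    # Amplitudes, not probabilities, add: each route contributes 1/2,
    # the second with an extra phase from the path-length difference.
    amp1 = 0.5
    amp2 = 0.5 * np.exp(1j * phi)
    return abs(amp1 + amp2) ** 2   # = cos^2(phi/2)

for phi in (0.0, np.pi / 2, np.pi):
    naive = abs(0.5) ** 2 + abs(0.5 * np.exp(1j * phi)) ** 2   # always 0.5
    print(phi, naive, detection_prob(phi))
# phi = 0:    0.5 vs 1.0;  phi = pi/2: 0.5 vs 0.5;  phi = pi: 0.5 vs ~0.0
```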
So then you have the problem of explaining under what circumstances the calculus of probability applies.
You ask about frequentist approaches to quantum probability. There are some such approaches, e.g., Hugh Everett's 1957 paper and his PhD thesis:
http://www-tc.pbs.org/wgbh/nova/manyworlds/pdf/dissertation.pdf.
I think these arguments don't work because the frequency approach itself doesn't work. Why would the relative frequency over an infinite number of samples have anything to do with what is observed in a laboratory? And if there is some explanation, then why are we bothering with this relative frequency stuff rather than using the actual explanation? The best explanation of why it is applicable is the decision-theoretic approach:
http://arxiv.org/abs/quant-ph/9906015
http://arxiv.org/abs/0906.2718.
The best attempt at explaining the circumstances under which it holds is given by the requirements that quantum mechanics imposes on the circumstances under which information can be copied:
http://arxiv.org/abs/1212.3245.