Isn't the detector always measuring, and thus always collapsing the state?
Good question. The textbook formalism of quantum mechanics and QFT simply doesn't deal with this problem (along with a few others). It deals with cases where there is a well-defined moment of measurement, at which a variable with a corresponding Hermitian operator ($x$, $p$, $H$, etc.) is measured. However, there are questions one can ask, like this one, which stray outside of that structure.
Here is a physical answer to your question within the framework of QM: look at the position wave function of the decayed particle, $\psi(x)$ (*if it exists; see the bottom of the post if you care). When this wave function "reaches the detector" (though it probably has some nonzero amplitude in the detector the entire time), the Geiger counter registers a decay, and from this you get a characteristic decay time. This picture is good intuition, but it is also an inexact and insufficient answer, because the notion of "reaching the detector" is only heuristic and classical. A full quantum treatment of this problem should give us more: a probability distribution in time, $\rho(t)$, for when the particle is detected. I will come back to this.
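To make the heuristic concrete, here is a minimal numerical sketch of the picture I mean (in units with $\hbar = m = 1$; the grid, packet parameters, and detector position are all made-up illustrative choices): a free Gaussian packet drifts toward a "detector region", and we simply watch how much probability sits inside that region over time.

```python
import numpy as np

# Free Gaussian packet drifting toward a "detector region" (hbar = m = 1).
# All numbers are illustrative, not taken from any real experiment.
x = np.linspace(-100, 300, 8192)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)   # momentum grid for the FFT

x0, sigma, k0 = 0.0, 5.0, 1.0                  # initial center, width, momentum
psi = np.exp(-(x - x0)**2 / (4 * sigma**2) + 1j * k0 * x)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)    # normalize to unit probability

detector = x > 150.0                           # hypothetical detector region
dt, steps = 0.5, 400                           # evolve up to t = 200

for n in range(1, steps + 1):
    # One step of free evolution, done exactly in momentum space.
    psi = np.fft.ifft(np.exp(-1j * k**2 / 2 * dt) * np.fft.fft(psi))
    if n % 50 == 0:
        p_in = np.sum(np.abs(psi[detector])**2) * dx
        print(f"t = {n * dt:6.1f}   P(inside detector) = {p_in:.4f}")
```

The probability inside the region is nonzero at every time, but it only becomes appreciable once the packet "arrives", which is exactly why "reaches the detector" is heuristic rather than sharp.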
So what about the Zeno effect? Based on the reasoning you gave, the chance of decaying is always zero, which is obviously a problem! Translating your question into the language of the position-space wave function $\psi(x)$, your reasoning says the wave function should be projected to $0$ in the region of the detector at every moment in time that the particle hasn't been found. And in fact you're right: doing this causes the wave function to never arrive at the detector! (I actually just modeled this as part of my thesis.) This result is inconsistent with experiment, so we can conclude: continuous observation cannot be modeled as a straightforward projection inside the detector at every instant in time.
A note, in response to the comments of Mark Mitchison and JPattarini: this "constant projection" model of continuous measurement can be rescued by choosing a nonzero time between measurements, $\Delta t \neq 0$. Such models can give reasonable results, with $\Delta t$ chosen to match a characteristic detector timescale, but in my view they are still heuristic, and a deeper, more precise explanation should be aspired to. Mark Mitchison gave helpful replies and linked sources in the comments for anyone who wants to read more on this. Another way to rescue the model is to make the projections "softer", as in the sources linked by JPattarini.
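Here is a sketch of that finite-$\Delta t$ idea (same made-up packet and detector as above; the function name and all parameters are mine, for illustration only, and this is a toy rather than a serious detector model): evolve freely for $\Delta t$, count the norm inside the detector region as the click probability for that interval, then project it out as a null result. The uncollapsed state is deliberately left unnormalized, so the running total is the absolute probability of a click.

```python
import numpy as np

# Toy "measure every Delta t" model (hbar = m = 1, illustrative numbers).
x = np.linspace(-100, 300, 8192)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)
detector = x > 150.0

def total_detection_probability(dt_meas, t_max=250.0):
    x0, sigma, k0 = 0.0, 5.0, 1.0
    psi = np.exp(-(x - x0)**2 / (4 * sigma**2) + 1j * k0 * x)
    psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)
    total = 0.0
    for _ in range(int(t_max / dt_meas)):
        # Exact free evolution between measurements.
        psi = np.fft.ifft(np.exp(-1j * k**2 / 2 * dt_meas) * np.fft.fft(psi))
        total += np.sum(np.abs(psi[detector])**2) * dx  # click probability now
        psi[detector] = 0.0                             # null result: project out
    return total

for dt_meas in (10.0, 1.0, 0.1):
    print(f"Delta t = {dt_meas:5.1f}   P(ever detected) = "
          f"{total_detection_probability(dt_meas):.4f}")
```

For moderate $\Delta t$ the packet is eventually detected with probability near one, but pushing $\Delta t \to 0$ drives the total detection probability toward zero: the Zeno freezing from the previous paragraph reappears, which is why $\Delta t$ must be tied to a physical detector timescale rather than taken to zero.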
Anyway, despite the above discussion, there is still a glaring question: if continuous projection of the wave function is wrong, what is the correct way to model this experiment? As a reminder, we want to find a probability density function in time, $\rho(t)$, such that $\int_{t_a}^{t_b}\rho(t)\,dt$ is the probability that the particle was detected in the time interval $(t_a, t_b)$. The textbook way to find a probability distribution for an observable is to use the eigenstates of the corresponding operator ($|x\rangle$ for position, $|p\rangle$ for momentum, etc.) to form probability densities like $|\langle x | \psi \rangle|^2$. But there is no clear self-adjoint "time operator", so textbook quantum mechanics doesn't give an answer.
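To see how delicate this is, consider the most natural guess (my example, not something the textbook formalism endorses): take $\rho(t)$ to be the probability flux through the detector position $x_d$,

$$\rho(t) \overset{?}{=} J(x_d, t) = \frac{\hbar}{m}\,\mathrm{Im}\!\left[\psi^*(x,t)\,\frac{\partial \psi(x,t)}{\partial x}\right]_{x = x_d}.$$

For a packet that simply passes the point once, this integrates to the right total crossing probability, but $J$ can be negative even for states built entirely from right-moving momenta (quantum backflow), so it cannot serve as a probability density in general.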
One non-textbook way to derive such a $\rho(t)$ is the "finite $\Delta t$" approach mentioned in the note above, but there are a variety of other methods that give reasonable results. The issue is, they don't all agree (at least not in all regimes)! The theory doesn't have a definitive answer on how to find such a $\rho(t)$ in general; this is actually an open question. Predicting "when" something happens in quantum mechanics, or the probability density for when it happens, is a weak point of the theory, and it needs work. If you don't want to take my word for it, have a look at Gonzalo Muga's textbook *Time in Quantum Mechanics*, which is a good summary of different approaches to time problems in QM that have yet to be resolved in a completely satisfactory way.

I am still learning about these approaches, but the cleanest one I have found so far uses trajectories in Bohmian mechanics to define when the particle arrives at the detector. That said, the measurement framework of QM is simply imprecise in general, and I would be very happy if a new way of understanding measurement were found that gives a deeper understanding of questions like this one. (Yes, I am aware of decoherence arguments, but even they leave questions like this unanswered, and even Wojciech Zurek, the pioneer of decoherence, does not argue that decoherence fully solves the measurement problem.)
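For the curious, here is a toy sketch of that Bohmian idea for the same free Gaussian packet as above (again with illustrative parameters; this is my one-dimensional cartoon, not the general theory): sample initial positions from $|\psi(x,0)|^2$, integrate the guidance equation $\dot{x} = \frac{\hbar}{m}\,\mathrm{Im}\!\left[\partial_x\psi / \psi\right]$ along the evolving wave function, and record each trajectory's first crossing of the detector plane $x_d$; the histogram of crossing times is the candidate $\rho(t)$.

```python
import numpy as np

# Toy Bohmian arrival-time calculation for a free Gaussian packet
# (hbar = m = 1; packet parameters and detector plane x_d are illustrative).
rng = np.random.default_rng(0)
x = np.linspace(-100, 300, 8192)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)

x0, sigma, k0, x_d = 0.0, 5.0, 1.0, 150.0
psi = np.exp(-(x - x0)**2 / (4 * sigma**2) + 1j * k0 * x)
psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)

traj = rng.normal(x0, sigma, size=2000)       # initial positions ~ |psi(x,0)|^2
arrival = np.full(traj.size, np.nan)          # first-crossing times

dt, steps = 0.2, 1250                         # evolve up to t = 250
for n in range(1, steps + 1):
    grad = np.fft.ifft(1j * k * np.fft.fft(psi))        # spectral d_x psi
    with np.errstate(divide='ignore', invalid='ignore'):
        v = np.imag(grad / psi)                         # guidance velocity field
    active = np.isnan(arrival)                          # not yet arrived
    traj[active] += np.interp(traj[active], x, v) * dt  # Euler step
    psi = np.fft.ifft(np.exp(-1j * k**2 / 2 * dt) * np.fft.fft(psi))
    arrival[active & (traj >= x_d)] = n * dt            # record first crossing

t_arr = arrival[~np.isnan(arrival)]
print(f"{t_arr.size}/{traj.size} trajectories arrived, "
      f"mean arrival time = {t_arr.mean():.1f}")
hist, _ = np.histogram(t_arr, bins=20)        # crude picture of rho(t)
print(hist)
```

For this packet the mean arrival time comes out near $x_d/k_0$, as you'd expect; the appeal of the approach is that "when did the particle arrive?" has a well-defined answer along each trajectory.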
(*note from the 2nd paragraph): You can in principle hope to use the position representation to extract a characteristic decay time like this, but it may not be as easy as it sounds, because QFT has well-known issues with position-space wave functions, and you would need QFT to describe the annihilation/creation of particles. So even this intuition doesn't always have mathematical backing.
No, the detector is not always collapsing the state.
When the particle is in an undecayed state its wave function is physically localised with a vanishingly small amplitude in the region of the detector, so the detector doesn't interact with it and isn't 'always' measuring it. It is only when the particle's state evolves to the point at which it has a significant amplitude in the vicinity of the detector that the counter clicks.
My take on this is that in the original thought experiment, you don't get to monitor the detector. When the detector detects, it kills the cat. But it doesn't tell you then. You only find out when you open the box.
If it tells you immediately, then you know immediately. And then there's the question of whether the detector detects with 100% efficiency.
If the Geiger counter detects with 100% efficiency, then you could have 100 Geiger counters, or 10,000, and they would all detect the particle decaying. If they were all at the same distance, they should all detect it at the same time (assuming the particle was not moving relative to them; otherwise relativity might give them different detection times, which would still be 100% predictable).
I think it's more plausible that each detector detects a different photon, and that any single detector might easily miss a particular gamma-ray photon.
So if there is only one radioactive particle, then if the Geiger counter does detect it, you know it's been detected and you know pretty much when. But if it hasn't detected it yet, there is an increasing chance with time that the particle has decayed and the Geiger counter missed it and will never detect it.
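(To put a rough number on that intuition, with made-up symbols: if the nucleus has mean lifetime $\tau$ and the counter registers a given decay with efficiency $\epsilon < 1$, then the probability that the particle has decayed undetected by time $t$ is $(1-\epsilon)\left(1 - e^{-t/\tau}\right)$, which grows with time toward $1-\epsilon$.)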