Why can't virtual images form on a screen?

There seems to be some fundamental confusion here. An image is formed on a screen when light rays emanating from an object converge there. If there is no convergence of rays, then there is no image on a screen.

Think about a portrait positioned on the left side of a converging lens. The light emanating from a point on the tip of the nose is focused to a single corresponding point on the image. The same is true of all the neighboring points, so there is a one-to-one correspondence between points on the image and points on the object, and the image is clear.

On the other hand, if you position a screen at a different location, then the light emanating from the tip of the portrait's nose will be spread over a whole region of the screen. The light from the neighboring points on the object will overlap, and the result will be a blurred image.
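
To put a rough number on that spreading, here is a minimal sketch based on similar triangles. It assumes an ideal thin lens with a circular aperture; the function name and the numerical values (a 20 mm aperture and a sharp image 150 mm behind the lens) are made up for illustration.

```python
def blur_spot_diameter(aperture, image_distance, screen_distance):
    """Diameter of the patch of light that ONE object point makes on the screen.

    The cone of rays leaving the lens has diameter `aperture` at the lens and
    shrinks to a point at `image_distance`, so by similar triangles its width
    at `screen_distance` is aperture * |screen - image| / image.
    Assumes a real image (image_distance > 0) and a screen behind the lens.
    """
    return aperture * abs(screen_distance - image_distance) / image_distance


# Illustrative numbers in millimetres (assumed, not from the question).
print(blur_spot_diameter(20.0, 150.0, 150.0))  # 0.0 -> each point maps to a point
print(blur_spot_diameter(20.0, 150.0, 120.0))  # 4.0 -> each point smears into a 4 mm disc
```

With the screen 30 mm short of the focus, every point of the portrait becomes a 4 mm disc, and the discs from neighboring points overlap.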

The conclusion is that the calculated image distance is where you will get a clear image; if you put your screen anywhere else, a sharp image will not form. Now consider what you'd get with a diverging lens.
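
For reference, the "calculated image distance" is presumably the one given by the thin-lens equation, 1/f = 1/d_o + 1/d_i. The sketch below assumes that equation with the usual sign convention (positive d_i means a real image on the far side of the lens); the function name and the numbers are illustrative only.

```python
def image_distance(focal_length, object_distance):
    """Solve the thin-lens equation 1/f = 1/d_o + 1/d_i for d_i.

    Positive result: real image on the far side of the lens.
    Negative result: virtual image on the same side as the object.
    """
    if object_distance == focal_length:
        raise ValueError("object at the focal point: rays emerge parallel, no finite image")
    return focal_length * object_distance / (object_distance - focal_length)


# Converging lens, f = 100 mm, portrait 300 mm away (assumed values).
print(image_distance(100.0, 300.0))  # 150.0 -> put the screen 150 mm behind the lens
```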

[Figure: ray diagram for the diverging lens; the blue dotted lines are the backward-traced rays]

The blue dotted lines are obtained by tracing the rays on the right-hand side backward and pretending the lens weren't there. The virtual image is the location from which the rays appear to be emanating, from the perspective of somebody on the right-hand side of the lens. However, no actual light rays converge there. If you place a screen at the location of the virtual image, can you see why you don't get a nice picture?
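
The same bookkeeping can be sketched for the diverging lens. This assumes the thin-lens equation with a negative focal length and reuses the similar-triangles argument; the names and numbers are illustrative, not taken from the question. The image distance comes out negative, and behind the lens the bundle of rays from a single object point only ever gets wider.

```python
def image_distance(focal_length, object_distance):
    """Thin-lens equation solved for d_i; a negative value means a virtual image."""
    return focal_length * object_distance / (object_distance - focal_length)


def spot_diameter_behind_lens(aperture, virtual_image_distance, screen_distance):
    """Width of the ray bundle from one object point at a screen behind the lens.

    The refracted rays appear to come from a point |d_i| in FRONT of the lens,
    so the bundle is already `aperture` wide at the lens and keeps growing.
    """
    d = abs(virtual_image_distance)
    return aperture * (screen_distance + d) / d


d_i = image_distance(-100.0, 300.0)  # diverging lens, f = -100 mm (assumed)
print(d_i)  # -75.0 -> virtual image 75 mm in front of the lens

for s in (0.0, 75.0, 150.0):
    print(s, spot_diameter_behind_lens(20.0, d_i, s))
# 0.0 20.0
# 75.0 40.0
# 150.0 60.0
```

The spot never shrinks to a point on the outgoing side, and the virtual image location itself lies on the incoming side, where none of the refracted rays actually pass.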

From your statement

"would create an image on the screen, just blurry."

I suppose you think of an image as any distribution of light that can be registered by a photosensor. That is an intuitive definition, but it is not what the technical term "image" means in optics. Quoting Wikipedia:

"In optics, an image is defined as the collection of focus points of light rays coming from an object."

In optics, "image" is the counterpart of "object": when an image is formed, we have, from the point of view of ray propagation, an optical equivalent of the set of points that make up the object. That is, if you take the rays emerging from an image and pass them into a well-focused optical instrument, then as seen by that instrument the image is indistinguishable from an object placed at the image's position.