Real images and their formation
Whether or not the screen is there, the image is there; the difficulty is focussing your eye on the region of air where the image is formed when there is no screen.
Try the following set up:
Illuminate a 35 mm slide with a light bulb and adjust the lens so that a sharp real inverted image is formed on a screen with the eye to the left of the screen.
Now replace the screen with a sheet of tissue paper (something which allows light through and will at the same time produce a visible image) and form a sharp image on the tissue paper.
Now observe the image from the other side of the tissue paper (to the right of it as in the diagram).
Your eye will have to be at least 25 cm from the screen if you have normal eyesight.
Keep focussing on the image and slowly move the tissue paper to one side so that some of the image is on the tissue paper and some is in "mid air".
With a little practice you should be able to see the image of the 35 mm slide without the tissue paper being there at all.
The tissue paper was used to enable you to focus on the correct area of space to view the sharp image.
Update as a result of some comments
Set up with a $3.5\,\rm cm$ focal length hand magnifier as the converging lens.
The object is a pin (white) illuminated by a torch, which was switched off for the photograph to avoid contrast problems.
The other pin (red) will be used to locate the image of the white pin.
A white screen was placed next to the image pin to show the real inverted image formed by the lens.
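As a rough check on where the screen has to go, the thin lens equation $\frac1u+\frac1v=\frac1f$ (real-is-positive convention) can be evaluated for this lens. This is only a sketch: the $3.5\,\rm cm$ focal length is from the set up above, but the object distance of $7\,\rm cm$ is an assumed value chosen for illustration, not a measurement from the photographs, and the function names are hypothetical.

```python
def image_distance(focal_length_cm, object_distance_cm):
    """Thin lens equation 1/u + 1/v = 1/f solved for v (real-is-positive)."""
    if object_distance_cm <= focal_length_cm:
        # Object at or inside the focal point: no real image is formed.
        raise ValueError("object must be further from the lens than f")
    return (focal_length_cm * object_distance_cm
            / (object_distance_cm - focal_length_cm))


def magnification(object_distance_cm, image_dist_cm):
    """Linear magnification; the negative sign means the image is inverted."""
    return -image_dist_cm / object_distance_cm


if __name__ == "__main__":
    f = 3.5   # cm, the hand magnifier used above
    u = 7.0   # cm, ASSUMED object distance (illustrative only)
    v = image_distance(f, u)
    print("image distance (cm):", v)
    print("magnification:", magnification(u, v))
```

Moving the object pin closer to the focal point makes `image_distance` and the size of the magnification grow rapidly, which pushes the real image further from the lens.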
The viewing point is now from the top of the first picture, i.e. on the other side of the lens from the object pin.
The image is distorted by the various defects of a cheap lens.
The position of the real image can be confirmed by moving one's eye up and down and seeing that the tip of the image and the tip of the image pin do not move relative to one another: a position of no parallax.
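The geometry behind the method of no parallax can be sketched numerically. The model below is a simplification with assumed numbers: the tip of the image and the tip of the pin are both taken to lie on the axis, the eye is $25\,\rm cm$ beyond the image, and all positions are illustrative rather than taken from the photographs. The function name is hypothetical.

```python
import math


def apparent_separation(eye_offset_cm, eye_z_cm, image_z_cm, pin_z_cm):
    """Angle (radians) between the lines of sight from the eye to the
    image tip and to the pin tip, both tips assumed to lie on the axis.

    z is measured along the axis from the lens; eye_offset_cm is how far
    the eye has been moved sideways off the axis.
    """
    angle_to_image = math.atan2(-eye_offset_cm, eye_z_cm - image_z_cm)
    angle_to_pin = math.atan2(-eye_offset_cm, eye_z_cm - pin_z_cm)
    return angle_to_image - angle_to_pin


if __name__ == "__main__":
    image_z = 7.0            # cm, ASSUMED image position
    eye_z = image_z + 25.0   # cm, eye 25 cm beyond the image
    for pin_z in (7.0, 9.0):  # pin at the image, then 2 cm too far
        shifts = [apparent_separation(y, eye_z, image_z, pin_z)
                  for y in (-1.0, 0.0, 1.0)]
        print("pin at", pin_z, "cm:", shifts)
```

When the pin is at the image position the separation stays at zero however the eye is moved; when the pin is misplaced the separation changes sign as the eye crosses the axis, which is the relative movement one looks for. The sketch ignores the fact that the real image can only be seen from within the cone of light leaving the lens.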
In the end, if you know what you are looking for and approximately where to look, just looking through the lens from the side remote from the object will enable you to see the real image in mid air.
This is more difficult if the image is highly magnified.
Further update
Note that in the third photograph the image is in focus, so the camera "knew" where the image was.
The image pin did help with location, but by telling the camera to focus at a certain distance, and with no image location pin, I would still have been able to get a sharp image in the photograph.
Consider the diagram below which shows the formation of an image of an object $ABC$ on the retina of your eye.
A sharp image is formed if all the light which leaves point $B$ on the object arrives at the same point on the retina $B'$.
So all the rays in the cone of light with apex $B$ shown in the diagram arrive at the same point on the retina $B'$.
The same is true of all other points on the object, e.g. light from $A$ and $C$ arrives at $A'$ and $C'$ respectively.
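The statement that every ray in the cone from one object point ends up at one image point can be checked with a small paraxial ray trace through a thin lens. This is a sketch under stated assumptions: a single thin lens stands in for the optics of the eye, the small-angle approximation is used, and the numbers are illustrative values, not those of a real eye. The function name is hypothetical.

```python
def height_at_image_plane(object_height, ray_angle, u, v, f):
    """Trace one paraxial ray from an object point to the plane at v.

    object_height : height of the object point above the axis
    ray_angle     : angle of the ray to the axis (radians, small)
    u, v, f       : object distance, image-plane distance, focal length
    """
    y_at_lens = object_height + ray_angle * u   # free propagation to lens
    angle_after = ray_angle - y_at_lens / f     # thin lens bends the ray
    return y_at_lens + angle_after * v          # free propagation to plane


if __name__ == "__main__":
    f, u = 3.5, 7.0           # ASSUMED illustrative values
    v = f * u / (u - f)       # image plane from the thin lens equation
    for angle in (-0.05, 0.0, 0.05):
        print(angle, height_at_image_plane(1.0, angle, u, v, f))
```

With the plane placed at the distance given by the thin lens equation, the arrival height does not depend on the ray angle, so the whole cone meets at one point. Put the plane anywhere else and the heights spread out, which is what a blurred image is.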
The light from object $ABC$ originates either from the object itself or as a result of light which has been reflected off it.
So you "see" object $ABC$ with your eye.
Now how is that different from the arrangement below?
What was the object $ABC$ in the first diagram is now a real image, formed by the converging lens, of the object $A''B''C''$.
That intermediate image $ABC$ forms an image $A'B'C'$ on the retina which is no different to that in the first diagram.
You "see" intermediate image $ABC$.
Because there are no reference points around it (just air), it is difficult to decide exactly where that image is, and you do indeed see it as though it is "in the lens".
By using an image pin (or your finger) you can easily show by the method of no parallax that you are actually looking at an image in mid air.
When you use an optical instrument you are also looking at an image in the air. The advantages are that you can move the eyepiece to form an image of that image in the air, and that there may be cross hairs in the image plane to act as a reference.