Comparing cold-start to warm start
How did you profile your code? Not all profiling methods are equal and some find hotspots better than others. Are you loading lots of files? If so, disk fragmentation and seek time might come into play.
Maybe even sticking basic timing information into the code, writing out to a log file and examining the files on cold/warm start will help identify where the app is spending time.
Without more information, I would lean towards filesystem/disk cache as the likely difference between the two environments. If that's the case, then you either need to spend less time loading files upfront, or find faster ways to load files.
Example: if you are loading lots of binary data files, speed up loading by combining them into a single file, then do a slerp of the whole file into memory in one read and parse their contents. Less disk seeks and time spend reading off of disk. Again, maybe that doesn't apply.
I don't know offhand of any tools to clear the disk/filesystem cache, but you could write a quick application to read a bunch of unrelated files off of disk to cause the filesystem/disk cache to be loaded with different info.
It would be hard to truly simulate a reboot in software. When you reboot, all devices in your machine get their reset bit asserted, which should cause all memory system-wide to be lost.
In a modern machine you've got memory and caches everywhere: there's the VM subsystem which is storing pages of memory for the program, then you've got the OS caching the contents of files in memory, then you've got the on-disk buffer of sectors on the harddrive itself. You can probably get the OS caches to be reset, but the on-disk buffer on the drive? I don't know of a way.
@Morten Christiansen said:
One way to make apps start cold-start faster (sort of) is used by e.g. Adobe reader, by loading some of the files on startup, thereby hiding the cold start from the users. This is only usable if the program is not supposed to start up immediately.
That makes the customer pay for initializing our app at every boot even when it isn't used, I really don't like that option (neither does Raymond).
As for simulating reboots, have you considered running your app from a virtual PC? Using virtualization you can conveniently replicate a set of conditions over and over again.
I would also consider some type of profiling app to spot the bit of code causing the time lag, and then making the judgement call about how much of that code is really necessary, or if it could be achieved in a different way.