What determines how often the Wayback Machine crawls a website?
There is no single crawl schedule. The Wayback Machine archive combines data from many different crawls, so how often a site is captured depends on which of them include it:
- Alexa crawls, which appear after a 6-month delay
- Our own crawls, which are seeded from the Alexa top million list and others
- ArchiveTeam crawls, done by volunteers
- Archive-It crawls, done by our 400+ partners, mostly libraries. Many of these partners allow their data to be included in the general Wayback Machine
We have an experimental Wayback Machine search and explore interface at https://web-beta.archive.org/ which shows why each capture was made.
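To get a rough picture of how often a given site has actually been captured, one option is to count its captures per month using the Wayback Machine's CDX query endpoint. The following is a minimal sketch, not an official client: the helper names (`build_cdx_url`, `captures_per_month`, `fetch_rows`) are made up for illustration, and it assumes the endpoint returns JSON rows with a header row first and 14-digit `YYYYMMDDhhmmss` timestamps. It reports how many captures exist, not which crawl made them.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the Wayback Machine CDX server.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(site, year):
    """Build a CDX query URL asking for capture timestamps of one site in one year."""
    params = {
        "url": site,
        "output": "json",    # rows as JSON arrays, header row first (assumed)
        "fl": "timestamp",   # only return the timestamp field
        "from": str(year),
        "to": str(year),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def captures_per_month(rows):
    """Count captures per 'YYYY-MM' from parsed CDX rows."""
    if not rows:
        return Counter()
    header, data = rows[0], rows[1:]
    ts_index = header.index("timestamp")
    return Counter(f"{row[ts_index][:4]}-{row[ts_index][4:6]}" for row in data)


def fetch_rows(site, year):
    """Download and parse the rows for one site and year (needs network access)."""
    with urlopen(build_cdx_url(site, year)) as response:
        return json.load(response)
```

For example, `captures_per_month(fetch_rows("example.com", 2015))` would give a per-month capture count for that year. Months with many captures suggest the site was picked up by several of the crawls listed above.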