Why does arXiv have a multi-day lag between submission and publication?
I guess the arXiv staff have to approve the article before it is published. Is my guess correct?
Yes, arXiv moderates all submissions before posting them (there is some description of this here). This is not anywhere near as rigorous as peer review; rather, someone just flips through the submission to ensure that it is (1) not complete junk and (2) classified correctly. This approach is used by many pre-print platforms (SSRN, SocArXiv, etc.).
To complement the other answer: all submissions to arXiv pass through a rudimentary moderation process, which checks that submitted manuscripts satisfy arXiv's basic technical and content guidelines. In general, this process is quite fast and does not by itself account for the 4-day lag observed.
In general, submissions before 14:00 (Eastern time US) appear online at 20:00 the same working day. Note that no new articles are announced over the weekend, so some of the articles appearing on Monday were submitted on Friday.
Publish time (Eastern US) | Submission time |
---|---|
Monday 20:00 | Friday 14:00 - Monday 14:00 |
Tuesday 20:00 | Monday 14:00 - Tuesday 14:00 |
Wednesday 20:00 | Tuesday 14:00 - Wednesday 14:00 |
Thursday 20:00 | Wednesday 14:00 - Thursday 14:00 |
Friday 20:00 | Thursday 14:00 - Friday 14:00 |
In fact, since the articles appear in order of submission time in the daily digests, some authors make a point of submitting directly after the cut-off time. This ensures that their article appears at the top of the list of new articles for the next announcement. As a result, it is common to see articles submitted a few seconds after 14:00 on Friday among Monday's new articles.
The specific example given by the OP appeared on a Tuesday at 14:00 and was submitted on Friday at 15:18:37. So the usual delay does not account for the 4 day lag in this case.
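To make the table concrete, here is a minimal sketch of the rule it describes. The function name `expected_announcement` is my own, the input is assumed to be a naive datetime already expressed in US Eastern time, a submission at exactly 14:00 is assumed to fall into the next window, and holidays and any extra moderation hold are ignored. The date 4 Dec 2020 is used only because it is a Friday.

```python
from datetime import datetime, time, timedelta

CUTOFF = time(14, 0)    # daily submission cut-off (US Eastern), per the table
ANNOUNCE = time(20, 0)  # daily announcement time (US Eastern), per the table


def expected_announcement(submitted: datetime) -> datetime:
    """Return the announcement time the table predicts for a submission.

    `submitted` is a naive datetime assumed to be in US Eastern time.
    Holidays and moderation holds are not modelled.
    """
    day = submitted.date()
    # weekday(): Monday == 0 ... Saturday == 5, Sunday == 6
    if day.weekday() >= 5 or submitted.time() >= CUTOFF:
        # Missed today's window (or it is the weekend): roll to the next weekday.
        day += timedelta(days=1)
        while day.weekday() >= 5:
            day += timedelta(days=1)
    return datetime.combine(day, ANNOUNCE)


# A Friday submission at 15:18:37 misses the Friday cut-off.
print(expected_announcement(datetime(2020, 12, 4, 15, 18, 37)))
```

Under these assumptions a Friday 15:18:37 submission is predicted for the Monday 20:00 announcement, so an appearance on Tuesday is one announcement later than the schedule alone would give.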
Some submissions (about 15% according to this blog post) are flagged for additional moderation checks. There are various reasons for a submission to be flagged. Common ones include one of the authors having a problematic previous submission history, and the submission being flagged by automated plagiarism checks. This must have happened to the example given by the OP, most likely because it was a PDF-only submission (which means that some of the automated checks done for TeX submissions are not available).
This is mostly, but not entirely, because of the weekend.
The real delay from moderation is very short: you need to submit before 2pm Eastern time in order to be included in the following day's mailing. However, since there are no mailings at weekends, that means you need to submit before 2pm Eastern on Friday to be included in Monday's mailing. So it is quite normal for a paper to be submitted on a Friday afternoon (e.g. 4th Dec 2020) and appear on a Tuesday (e.g. 8th Dec 2020).
In this case, however, the paper was submitted on Friday morning, so would normally have appeared on the 7th, and there must have been some other reason, specific to this submission, for the additional one-day delay.