NTP server architecture
Depending on how critical timekeeping is in your environment, you may not want server1 to be a single point of failure. If you have to take it offline for an extended period for maintenance or repair, its peers will stop syncing. It is all downhill from there.
Why not have server1, server2, server3, and server4 each sync to 4 or 5 Internet peers? Your internal network can then reference all four of these systems.
Conventional wisdom says that 3 is what you need for quorum, but you also need to tolerate at least one of them being flagged as a falseticker or going offline, which is why 4 is the safer number.
Please see: 5.3.3. Upstream Time Server Quantity
Additionally, you mention weirdness and issues with your current configuration. It would help to see the output of ntpq -p for the relevant hosts.
While it's not strictly true that 2 servers are of no use, the Best Current Practices RFC draft recommends 4 as a minimum. NTP's intersection algorithm doesn't depend merely on a quorum in the number of servers, but also on the quality of the time they return, and you can't predict that. So the more the better. There is no problem having up to 10 upstream NTP servers.
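To make the intersection idea a bit more concrete, here is a minimal Python sketch of Marzullo's algorithm, which NTP's intersection algorithm is derived from. It is a simplification and not what ntpd actually runs (the real algorithm also takes the interval midpoints into account). The function names and the sample offsets (in milliseconds) are made up for illustration. Each server is modelled as an interval of offset minus error bound to offset plus error bound, and the code looks for the region that the most servers agree on.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lowest plausible offset, highest plausible offset)


def marzullo(intervals: List[Interval]) -> Tuple[int, float, float]:
    """Return (count, lo, hi): the largest number of intervals that overlap,
    and the first region where that many overlap."""
    edges = []
    for lo, hi in intervals:
        edges.append((lo, -1))  # -1 marks the start of an interval
        edges.append((hi, +1))  # +1 marks the end of an interval
    # Sort by offset; on ties a start (-1) sorts before an end (+1),
    # so intervals that merely touch still count as overlapping.
    edges.sort()

    best = count = 0
    best_lo = best_hi = 0.0
    for i, (offset, kind) in enumerate(edges):
        count -= kind  # a start raises the count, an end lowers it
        if count > best:
            best = count
            best_lo = offset
            best_hi = edges[i + 1][0]  # the region lasts until the next edge
    return best, best_lo, best_hi


def has_majority(intervals: List[Interval]) -> bool:
    """True if more than half of the servers agree on a common region."""
    best, _, _ = marzullo(intervals)
    return best > len(intervals) / 2


# Hypothetical readings: two servers agree, one is far off (a falseticker).
THREE_SERVERS = [(8.0, 12.0), (9.0, 13.0), (28.0, 32.0)]
# The same situation after one good server goes offline.
TWO_SERVERS = [(8.0, 12.0), (28.0, 32.0)]

if __name__ == "__main__":
    print(marzullo(THREE_SERVERS), has_majority(THREE_SERVERS))
    print(marzullo(TWO_SERVERS), has_majority(TWO_SERVERS))
```

With three servers the outlier is outvoted, because two of the three agree on the 9 to 12 ms region. With only two servers that disagree there is no majority, so nothing tells you which one to believe. That is the reason to start with 4 or more.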
As Aaron mentioned, your proposed servers 1-4 should all point to upstream NTP servers, and your internal systems should point to all 4 of them. Servers 1-4 can also peer with each other (in symmetric mode), but that's not strictly required.
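As a rough illustration of that layout, the Python sketch below prints ntp.conf-style lines for each of servers 1-4 and for an internal client. The upstream names (the public pool.ntp.org hosts) and the helper names are my assumptions, not something from your setup. The server, peer and iburst keywords are standard ntp.conf directives, but treat the output as a starting point, not a finished configuration.

```python
from typing import Dict, List

# Hostnames from the question; replace with your real ones.
INTERNAL = ["server1", "server2", "server3", "server4"]
# Assumed upstreams for illustration only.
UPSTREAM = ["0.pool.ntp.org", "1.pool.ntp.org", "2.pool.ntp.org", "3.pool.ntp.org"]


def render_ntp_conf(servers: List[str], peers: List[str]) -> str:
    """Render the time-source lines of an ntp.conf as a string."""
    lines = [f"server {host} iburst" for host in servers]
    lines += [f"peer {host}" for host in peers]  # symmetric mode, optional
    return "\n".join(lines) + "\n"


def build_configs(internal: List[str], upstream: List[str]) -> Dict[str, str]:
    """Build one config per internal server plus one shared client config."""
    configs = {}
    for host in internal:
        # Each internal server uses every upstream and peers with its siblings.
        siblings = [h for h in internal if h != host]
        configs[host] = render_ntp_conf(upstream, siblings)
    # Internal systems point at all of the internal servers, never just one.
    configs["client"] = render_ntp_conf(internal, [])
    return configs


if __name__ == "__main__":
    for name, text in build_configs(INTERNAL, UPSTREAM).items():
        print(f"# {name}")
        print(text)
```

The peer lines can be dropped by passing an empty list, since peering between servers 1-4 is optional.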
It's important to understand why you shouldn't funnel NTP through a single server at any point in your architecture: NTP requires multiple servers for accuracy, not just redundancy (see the NTP docs for a description of the algorithms, which explains why). (Shameless plug: I've written more about this elsewhere, including suggestions for architecture.)