DNS Failover: automatic switching to a backup when the primary server fails

No server runs forever: hardware failures, software bugs, DDoS attacks or even routine maintenance can temporarily stop the service. For large businesses and online stores, every minute of downtime means lost money, customers and reputation. DNS Failover is built to solve exactly this problem by constantly watching the state of the primary server and routing users to the backup when an error is detected.

How DNS Failover works

The core idea is simple: several A or AAAA records are prepared for the domain, and a monitoring system regularly checks the state of each IP address. The check is usually an HTTP or HTTPS request; if no answer arrives within a set time, the server is marked as down. The DNS record is then automatically switched to the backup IP address.
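
The check-and-switch logic described above can be sketched in Python. This is a minimal illustration under stated assumptions, not any provider's actual implementation: the function names, the 5-second timeout and the example addresses (taken from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are hypothetical, and the real record update would go through whatever API the DNS provider offers.

```python
import http.client
import urllib.request


def is_healthy(url: str, timeout_s: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            return 200 <= response.status < 300
    except (OSError, http.client.HTTPException):
        # Covers timeouts, refused connections, DNS errors and HTTP 4xx/5xx
        # (urllib raises HTTPError, a subclass of OSError, for those).
        return False


def choose_record(primary_ip: str, backup_ip: str, primary_healthy: bool) -> str:
    """Pick the IP address the DNS record should point to."""
    return primary_ip if primary_healthy else backup_ip


# Hypothetical addresses: primary is down, so the record should move to the backup.
print(choose_record("192.0.2.10", "198.51.100.20", primary_healthy=False))
```

In a real setup the result of `is_healthy` would feed `choose_record` on every monitoring cycle, and the record would be rewritten only when the chosen address changes.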

TTL plays a major role in the switch. With a TTL of 3600 seconds, resolvers may keep serving the old record for up to one hour, and users will see a broken site during that time. For failover zones it is therefore recommended to lower the TTL to 60-300 seconds, so that changes propagate across the internet within a few minutes.
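
The effect of TTL on recovery time can be estimated with simple arithmetic. The sketch below uses a deliberately simplified model: worst-case switch time is roughly the detection time (check interval multiplied by the number of failed checks required) plus the TTL. The function name, the 30-second interval and the three-failure rule are assumptions for illustration, and the model ignores resolvers that do not honor TTL.

```python
def worst_case_switch_seconds(check_interval_s: int, failures_required: int, ttl_s: int) -> int:
    """Rough upper bound on how long users may still be sent to the failed server."""
    detection_s = check_interval_s * failures_required  # time to confirm the outage
    return detection_s + ttl_s  # plus time for cached records to expire


# Assumed monitoring settings: a check every 30 s, 3 failed checks required.
print(worst_case_switch_seconds(30, 3, ttl_s=3600))  # 3690 seconds, just over an hour
print(worst_case_switch_seconds(30, 3, ttl_s=60))    # 150 seconds
```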

Monitoring and health checks

A quality failover service checks the server state every 30-60 seconds, and the checks are run from different regions of the world. This matters because a single observation point is unreliable: the server may be unreachable from Tashkent but working perfectly from Frankfurt. Most services therefore run checks from 3 or more points, and failover triggers only when several sources confirm the error.
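
The "confirmed by several sources" rule amounts to a simple quorum. The following sketch assumes each monitoring point reports a boolean result; the function name, the third region and the threshold of two failures are illustrative assumptions, not the settings of any particular service.

```python
def confirmed_down(reachable_from: dict[str, bool], min_failures: int = 2) -> bool:
    """Return True only if at least `min_failures` monitoring points saw an error."""
    failures = sum(1 for ok in reachable_from.values() if not ok)
    return failures >= min_failures


# Unreachable from one point only: treated as a local network problem, no failover.
print(confirmed_down({"tashkent": False, "frankfurt": True, "singapore": True}))   # False
# Unreachable from two points: the outage is confirmed.
print(confirmed_down({"tashkent": False, "frankfurt": False, "singapore": True}))  # True
```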

Check methods also vary: a simple ping, a TCP port check, the HTTP response code, or even the presence of a specific string on the page. Complex projects use combined checks: for example, loading the home page and connecting to the database are evaluated together, and only then is a decision made.
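
A few of these check types can be sketched with the Python standard library. All function names, timeouts and probe labels below are hypothetical; a production monitor would also add retries and logging. Ping is left out because ICMP usually needs elevated privileges.

```python
import socket


def tcp_port_open(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """TCP check: can a connection to host:port be opened within the timeout?"""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def http_response_ok(status: int, body: str, must_contain: str = "") -> bool:
    """HTTP check: a 2xx status code and, optionally, an expected string in the page."""
    return 200 <= status < 300 and must_contain in body


def combined_check(results: dict[str, bool]) -> bool:
    """Combined check: healthy only if at least one probe ran and all of them passed."""
    return bool(results) and all(results.values())


# The home page loads, but the database probe failed: the server counts as unhealthy.
print(combined_check({"home_page": True, "database": False}))  # False
```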

Active-Passive and Active-Active modes

Failover can run in two main modes. In Active-Passive mode all traffic goes to the primary server, and the backup takes over only when the primary fails. This is a simple and cheap solution, but the backup server's resources usually sit idle, which is partly wasteful.

In Active-Active mode both servers handle traffic at the same time, and the load is split between them. If one server fails, the other takes the full load and the service does not stop. This mode requires more resources and is harder to set up, but for high-traffic sites it is usually the better choice.
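
The difference between the two modes can be expressed as which records are published at any given moment. This is a sketch under assumptions: the mode strings, the function name and the rule of keeping the primary record when nothing is healthy are illustrative choices, and the addresses come from documentation ranges.

```python
def records_to_publish(mode: str, servers: list[tuple[str, bool]]) -> list[str]:
    """Return the IPs to publish. `servers` is a priority-ordered list of
    (ip, healthy) pairs, with the primary first."""
    if mode not in ("active-passive", "active-active"):
        raise ValueError(f"unknown mode: {mode}")
    if not servers:
        return []
    healthy = [ip for ip, ok in servers if ok]
    if not healthy:
        # Assumption: never publish an empty record set; keep the primary.
        return [servers[0][0]]
    if mode == "active-active":
        return healthy      # every healthy server receives traffic
    return healthy[:1]      # only the highest-priority healthy server


servers = [("192.0.2.10", True), ("198.51.100.20", True)]
print(records_to_publish("active-passive", servers))  # ['192.0.2.10']
print(records_to_publish("active-active", servers))   # ['192.0.2.10', '198.51.100.20']
```

When several records are published, how evenly traffic is split depends on resolver and client behavior, so the split is approximate rather than exact.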

Mistakes in failover setup

The most common mistake is a backup server that is not actually ready. If the primary fails and the backup is running on stale data, users will see a non-functional site. That is why the backup server must be updated continuously and its database kept in sync.

Another mistake is making failover too sensitive. If a one-off network glitch or a temporary slowdown triggers it, users will be flipped back and forth between servers and sessions will break. Failover rules should therefore require the error to repeat several times or the problem to persist for a certain time.
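
One way to express the "repeat several times" rule is a counter of consecutive results that contradict the current state. The class below is a minimal sketch: its name and the thresholds of three are assumptions, and the matching rule for switching back to the primary is an addition for symmetry that the text above does not prescribe.

```python
from dataclasses import dataclass


@dataclass
class FlapGuard:
    fail_threshold: int = 3     # consecutive failures needed to switch to backup
    recover_threshold: int = 3  # consecutive successes needed to switch back
    on_backup: bool = False
    _streak: int = 0            # consecutive results contradicting the current state

    def observe(self, check_ok: bool) -> bool:
        """Feed one check result; return True while traffic should be on the backup."""
        contradicts = check_ok if self.on_backup else not check_ok
        self._streak = self._streak + 1 if contradicts else 0
        threshold = self.recover_threshold if self.on_backup else self.fail_threshold
        if self._streak >= threshold:
            self.on_backup = not self.on_backup
            self._streak = 0
        return self.on_backup


guard = FlapGuard()
# A single glitch (the second result) does not trigger failover; three failures in a row do.
print([guard.observe(ok) for ok in [True, False, True, False, False, False]])
# [False, False, False, False, False, True]
```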

Sayt.uz practice

For Sayt.uz clients the DNS Failover service is part of the business packages and costs 240,000 soums per year. Last year the main server went down for 47 of the 340 clients using this service, and traffic switched to the backup in 18 seconds on average. Active-Active configuration is included in the premium package at 650,000 soums per year and is essentially mandatory for online stores. These client sites delivered 99.97 percent availability over the year, and sales losses were minimal.
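
To put these figures in perspective, the percentages can be converted into more tangible numbers. The calculation below assumes the 99.97 percent figure refers to uptime over a 365-day year; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, assuming a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


def share_percent(part: int, total: int) -> float:
    """Share of `part` in `total`, as a percentage."""
    return 100 * part / total


print(round(downtime_hours_per_year(99.97), 2))  # 2.63 hours per year
print(round(share_percent(47, 340), 1))          # 13.8 percent of clients had an outage
```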