The Double-Edged Sword of Docker: Balancing Benefits and Risks

As a systems administrator, I am deeply concerned about the consequences of today’s widespread adoption of technologies like Docker. I was an early adopter and long-time proponent of containerization: I recognized its potential early on and have been advocating for its use in many of the Linux-based setups I manage.

Initially, this relieved me of some headaches. One recurring issue was dealing with developers requesting “exotic” setups: specific (sometimes multiple) versions of PHP on the same VPS, or unusual combinations of PHP and MySQL (or MariaDB) that required adding external repositories of all sorts. That creates problems down the road, when one of those repositories ceases to exist or stops being updated, leaving us with an unstable, dangerous, or non-upgradable system.

In many cases, I resolved these issues by partitioning components into FreeBSD jails (one jail per service, one for data, with bind mounts as needed—perfect efficiency, excellent upgradability and stability, maximum security). However, this wasn’t always feasible. Sometimes, the explicit use of Linux was required, prompting the need for an alternative solution. In the past, I separated components using LXC, similar to FreeBSD jails, but then Docker arrived, and the approach changed.
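To give an idea of what that jail layout looks like in practice, here is a minimal sketch of an /etc/jail.conf entry: one jail for the web service, with the data kept outside it and exposed through a nullfs mount (the FreeBSD equivalent of a bind mount). The hostname, paths, interface, and address are invented for the example.

    # /etc/jail.conf (sketch; names, paths and addresses are invented)
    www {
        host.hostname = "www.example.local";
        path = "/jails/www";
        interface = "em0";
        ip4.addr = 192.168.10.10;
        exec.start = "/bin/sh /etc/rc";
        exec.stop = "/bin/sh /etc/rc.shutdown";
        # the data lives in a separate data jail and is exposed here via nullfs
        mount = "/jails/data/www $path/usr/local/www nullfs rw 0 0";
    }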

At that point, the problem seemed solved: I just needed to provide a VPS with Docker, handle backups, data, monitoring, etc., but leave developers the freedom to include the specific versions of components they needed in their setups.
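Concretely, the split looks something like the following docker-compose.yml sketch: the developers own the file and pin the exact PHP and MariaDB versions they need, while the host, the named volume and its backups, and the monitoring remain my responsibility. Image tags, service names, and paths here are only illustrative.

    # docker-compose.yml (sketch; image tags, names and paths are illustrative)
    services:
      app:
        image: php:8.2-fpm            # developers pin the exact PHP version they need
        volumes:
          - ./src:/var/www/html
        depends_on:
          - db
      db:
        image: mariadb:10.11          # and the exact database version
        environment:
          MARIADB_ROOT_PASSWORD: change-me
        volumes:
          - db-data:/var/lib/mysql    # named volume: what actually gets backed up
    volumes:
      db-data: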

But…

Developers often make poor system administrators, and fair enough: system administrators often make poor developers. However, this leads to a series of medium-term problems:

  • Continued use of outdated (or, at the opposite extreme, bleeding-edge and thus unstable) component versions in Dockerfiles, creating stability issues or, worse, security vulnerabilities (see the Dockerfile sketch after this list).
  • A habit of treating software crashes as normal. Well-developed software should not crash; it should handle problems on its own. When a crash really is unavoidable, it should signal a situation so severe that it requires a system administrator’s intervention. Instead, the world is full of unstable stacks that crash at the slightest exception, backed by the mentality of “the container will just restart.” This, to me, is unacceptable.
  • Lack of optimization: I often hear, “I’ve maxed out the resources on MySQL, we need to scale up.” On review, it usually turns out the configuration was never tuned; after a few adjustments the load often drops by 90% and becomes entirely manageable (see the my.cnf sketch after this list). Yet we live in an era dominated by major cloud players whose goal is not to optimize our costs (as they claim) but to make us spend more: tools like Kubernetes (and autoscaling) appear to simplify our work, while in practice they encourage us to complicate our infrastructure unnecessarily, spend more, and consume more resources, contradicting the ecological awareness that marks our times.
  • Lack of big-picture thinking: As system administrators managing the overall system, we always keep a holistic view. Developers, rightly focused on their own projects, often lack the context to identify the bottleneck in the setup as a whole. A typical comment I hear is, “the site is slow, we need a more powerful server.” In 90% of cases this is unnecessary and largely ineffective: a misimplemented feature that launches 50 concurrent long-running PHP processes is not fixed by going from 4 to 8 cores. That would help, sure, but it would not be a solution; capping the concurrency at two would change everything (see the PHP-FPM sketch after this list).
  • Lack of backup strategy: The average developer focuses on the reproducibility of their setup, maybe keeping a database dump (not always), but seldom addresses the issue of recovery time. I recently had a discussion with a colleague (who calls himself a DevOps) who told me he had “production database dumps, a .tar.gz of individual web app directories, and notes on how he set up that server.” When I asked how many systems he managed, he said “over 100, on the same cluster.” Asked about disaster recovery, he believed it was “impossible” for such a cluster to be unreachable for long (though the OVH Strasbourg fire should have taught us that nothing is impossible when data is concentrated in one place). Nevertheless, he thought he could restore operations in “about 2 hours per server”—thus, 100 servers would require 200 hours of work. For 99% of setups, these would be totally unacceptable timelines.
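On the first point, the fix is often a single line in the Dockerfile: pinning a currently maintained release instead of an end-of-life one (or a moving “latest”). The versions below are examples, not recommendations.

    # Dockerfile (sketch; versions are examples only)
    # Risky: an end-of-life PHP release that no longer receives security fixes
    # FROM php:7.2-apache
    # Safer: a maintained release, pinned explicitly instead of a moving tag
    FROM php:8.2-apache
    RUN docker-php-ext-install pdo_mysql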
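On the MySQL point, most of the tuning I am referring to fits in a handful of lines of configuration. The values below are placeholders to be sized against the real workload and the available RAM, not recommendations.

    # MariaDB/MySQL option file (sketch; values are placeholders, not recommendations)
    [mysqld]
    innodb_buffer_pool_size = 4G      # keep the working set in RAM: usually the single biggest win
    innodb_log_file_size    = 512M    # larger redo logs smooth out write-heavy bursts
    max_connections         = 200     # bound concurrency instead of letting the box swap
    slow_query_log          = 1       # find the queries that actually need an index
    slow_query_log_file     = /var/log/mysql/slow.log
    long_query_time         = 1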
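And on the concurrency example: with PHP-FPM the sane fix is to cap the pool rather than buy cores. A minimal pool sketch follows; the pool name, user, socket, and values are invented for illustration.

    ; PHP-FPM pool configuration (sketch; pool name, user, socket and values are invented)
    [www]
    user = www-data
    group = www-data
    listen = 127.0.0.1:9000
    ; hard cap on concurrent PHP workers for this pool
    pm = static
    pm.max_children = 2
    ; kill runaway requests instead of letting them pile up
    request_terminate_timeout = 60s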

Today’s hardware is so powerful that it can handle loads that would have been unimaginable just a few years ago. A bit of planning, optimization, and design can greatly reduce costs and increase productivity, stability, and reliability.

Thus, I remain in favor of solutions like Docker, but the turn the entire IT industry is taking towards them worries me: it improves some aspects while worsening others. We are simply shifting the problem elsewhere.

In my view, there is no one-size-fits-all solution to any problem; each requires its own study and implementation.

We have only ever known one solution to every problem: 42. And we all know how that turned out.

