<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Mostafa-DE]]></title><description><![CDATA[Software Engineer]]></description><link>https://blog.mostafade.com</link><generator>RSS for Node</generator><lastBuildDate>Tue, 14 Apr 2026 01:47:27 GMT</lastBuildDate><atom:link href="https://blog.mostafade.com/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Security Defaults That Save You Before You Need Them]]></title><description><![CDATA[Most security conversations start with “how do we protect this?” as if protection is something you add later.But security isn’t an afterthought. It’s the sum of the defaults you began with.
Every new system, no matter how small, starts with assumptio...]]></description><link>https://blog.mostafade.com/security-defaults-that-save-you-before-you-need-them</link><guid isPermaLink="true">https://blog.mostafade.com/security-defaults-that-save-you-before-you-need-them</guid><category><![CDATA[Devops]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Mostafa-DE]]></dc:creator><pubDate>Sat, 11 Oct 2025 16:53:34 GMT</pubDate><content:encoded><![CDATA[<p>Most security conversations start with <em>“how do we protect this?”</em> as if protection is something you add later.<br />But security isn’t an afterthought. It’s the sum of the defaults you began with.</p>
<p>Every new system, <strong>no matter how small,</strong> starts with assumptions: which ports stay open, who gets access, what gets logged, and what alarms exist. Those defaults decide whether you’ll sleep well when something breaks.</p>
<blockquote>
<p>If you start new infra tomorrow, don’t think “how do I secure this?” think “what defaults would save me next time?”</p>
</blockquote>
<h2 id="heading-monitoring-comes-first">Monitoring Comes First</h2>
<p><strong>Monitor everything.</strong> Big things, small things, boring things. If it can change state or affect users, it’s worth observing.</p>
<ul>
<li><p><strong>Metrics</strong> for health and capacity (rates, errors, latency, saturation).</p>
</li>
<li><p><strong>Logs</strong> for actions and decisions (who, what, when, where).</p>
</li>
<li><p><strong>Traces</strong> for flow across boundaries (services, queues, data stores).</p>
</li>
<li><p><strong>Alarms</strong> for deviations (not noise).</p>
</li>
</ul>
<p>You’ll thank yourself later. Detection beats speculation.</p>
<blockquote>
<p>A security checklist is only as strong as the next engineer who forgets to read it.</p>
</blockquote>
<p>That’s why monitoring and alerts must be defaults, created automatically with every new component, not a to-do you remember later.</p>
<h2 id="heading-trust-nobody-including-yourself">Trust Nobody (Including Yourself)</h2>
<p>Work cultures thrive on trust and ownership, but security has a different rule:</p>
<ul>
<li><p><strong>Assume compromise is possible.</strong></p>
</li>
<li><p><strong>Assume you’ll make mistakes.</strong></p>
</li>
<li><p><strong>Design so mistakes are contained.</strong></p>
</li>
</ul>
<p>Concretely:</p>
<ul>
<li><p><strong>Least privilege by default.</strong> Roles grant the minimum required, nothing more.</p>
</li>
<li><p><strong>Access expires by default.</strong> Temporary elevation beats permanent power.</p>
</li>
<li><p><strong>Separation of duties.</strong> No single actor can deploy, approve, and access sensitive data.</p>
</li>
<li><p><strong>Logs and alarms everywhere.</strong> Trust is verified by evidence, not optimism.</p>
</li>
</ul>
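<p>As a minimal sketch of what “access expires by default” can look like in code (the names and TTL here are hypothetical, not from any specific IAM system):</p>

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class AccessGrant:
    """A grant that is temporary by construction."""
    role: str
    granted_at: datetime
    ttl: timedelta = timedelta(hours=8)  # temporary elevation, not permanent power

    def is_active(self, now: datetime) -> bool:
        # A grant simply stops working after its TTL; nothing to remember.
        return now < self.granted_at + self.ttl

# Elevation is time-boxed: active now, automatically dead after the TTL.
grant = AccessGrant(role="db-admin", granted_at=datetime.now(timezone.utc))
```

<p>The point is not the data structure but the default: nobody has to remember to revoke anything, because expiry is the baseline behaviour.</p>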
<h2 id="heading-log-actions-all-of-them">Log Actions. All of Them.</h2>
<p>Logs are your closest friends. Keep them around.</p>
<ul>
<li><p><strong>Log every action</strong> that changes state or touches sensitive paths.</p>
</li>
<li><p><strong>Make logs structured</strong> (machine-readable keys, not prose).</p>
</li>
<li><p><strong>Centralise and retain</strong> with sensible lifetimes for forensics.</p>
</li>
<li><p><strong>Alert on behaviour, not just errors</strong> (for example, unusual access patterns).</p>
</li>
</ul>
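<p>A minimal sketch of a structured audit record (the field names are illustrative, not a standard):</p>

```python
import json
from datetime import datetime, timezone

def audit_event(actor: str, action: str, resource: str) -> str:
    """One action, one machine-readable line: who, what, when, where.
    No secrets, no personal data; just evidence of a decision."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # when
        "actor": actor,                                # who
        "action": action,                              # what
        "resource": resource,                          # where
    }
    return json.dumps(record)

line = audit_event("deploy-bot", "config.update", "payments-service")
```

<p>Because every line is valid JSON, centralised tooling can filter and alert on behaviour (“who touched what”) instead of grepping prose.</p>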
<blockquote>
<p>Log everything that’s an action. Not secrets, not personal data, evidence of decisions. That’s your paper trail when you need it most.</p>
</blockquote>
<h2 id="heading-automate-principles-not-tools">Automate Principles, Not Tools</h2>
<p>Security that depends on memory is security that won’t scale.</p>
<ul>
<li><p><strong>Infrastructure-as-Code modules or templates</strong> should bake in secure defaults (deny-by-default networking, baseline logging, required alarms).</p>
</li>
<li><p><strong>Service or application scaffolds</strong> should enable safe configs out of the box (sane auth flows, rate limits, health checks).</p>
</li>
<li><p><strong>Pipelines or policies</strong> should enforce reviews for risky changes (identity, network, data).</p>
</li>
</ul>
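<p>Baking a default into code can be as small as this sketch of a deny-by-default ingress check (the allowlist is a made-up example, not a recommendation):</p>

```python
# Deny by default: only rules on an explicit allowlist pass.
ALLOWED_INGRESS = {("tcp", 443), ("tcp", 22)}  # hypothetical: HTTPS and SSH only

def rule_permitted(protocol: str, port: int) -> bool:
    """Anything not explicitly allowed is rejected; opening a port becomes
    a deliberate change to the allowlist, reviewed like any risky change."""
    return (protocol, port) in ALLOWED_INGRESS
```
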
<p>The message: make the right thing the easy thing.</p>
<h2 id="heading-beware-security-theater">Beware Security Theater</h2>
<p>More rules can mean more latency, more cost, more complexity, and sometimes less clarity.</p>
<ul>
<li><p>If twenty fragile rules slow the system, ask: <em>Can I remove the attack surface instead?</em></p>
</li>
<li><p>If a mitigation “works” but hides root causes, refactor the design.</p>
</li>
<li><p>Prefer <strong>simpler architecture</strong> and <strong>clear boundaries</strong> over complex patchwork.</p>
</li>
</ul>
<p>Be mindful, keep learning, and be willing to replace yesterday’s “must-have” with a cleaner idea tomorrow.</p>
<h2 id="heading-security-is-a-mindset-on-and-off-the-clock">Security Is a Mindset (On and Off the Clock)</h2>
<p>Security is a habit loop:</p>
<ul>
<li><p><strong>Default to scepticism</strong> of new permissions and open surfaces.</p>
</li>
<li><p><strong>Default to visibility</strong> (metrics, logs, traces).</p>
</li>
<li><p><strong>Default to containment</strong> (blast-radius thinking).</p>
</li>
<li><p><strong>Default to reflection</strong> (post-incident learning, not blame).</p>
</li>
</ul>
<p>It’s not just at work. The same instincts (least privilege, scepticism, and good hygiene) also make sense in daily life.</p>
<h2 id="heading-a-tool-agnostic-default-baseline">A Tool-Agnostic Default Baseline</h2>
<h3 id="heading-identity-amp-access">Identity &amp; Access</h3>
<ul>
<li><p>Roles are least-privilege and scoped.</p>
</li>
<li><p>Strong authentication and multiple factors; break-glass account exists and is monitored.</p>
</li>
<li><p>Access grants are time-boxed and reviewed.</p>
</li>
</ul>
<h3 id="heading-network">Network</h3>
<ul>
<li><p>Deny by default; open only what’s required.</p>
</li>
<li><p>Segment internal vs. external paths; control egress.</p>
</li>
<li><p>Encrypt in transit by default.</p>
</li>
</ul>
<h3 id="heading-data">Data</h3>
<ul>
<li><p>Encrypt at rest by default; rotate keys on a schedule.</p>
</li>
<li><p>Backups exist, are isolated, and restores are tested.</p>
</li>
<li><p>Access to sensitive data is logged and alerted.</p>
</li>
</ul>
<h3 id="heading-runtime-app">Runtime / App</h3>
<ul>
<li><p>Minimal surface area; healthy rate limits; sane timeouts.</p>
</li>
<li><p>Health checks and circuit breakers to avoid cascading failures.</p>
</li>
<li><p>Safe defaults for configuration; secrets never in code or logs.</p>
</li>
</ul>
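<p>“Healthy rate limits” can be as simple as a token bucket; a self-contained sketch (the numbers are placeholders to tune per service):</p>

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: refills at `rate` tokens per second,
    allows bursts up to `capacity`, rejects once the bucket is empty."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, then try to spend one token.
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate=5.0, capacity=10)  # hypothetical: 5 req/s, bursts of 10
```
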
<h3 id="heading-observability">Observability</h3>
<ul>
<li><p>Metrics, logs, traces on day one.</p>
</li>
<li><p>Alarms tuned to behaviour, with clear owners and runbooks.</p>
</li>
<li><p>Centralized storage and practical retention.</p>
</li>
</ul>
<h3 id="heading-process-amp-culture">Process &amp; Culture</h3>
<ul>
<li><p>Reviews required for identity, network, and data changes.</p>
</li>
<li><p>Incident channel, calm communication, and blameless postmortems.</p>
</li>
<li><p>Regular drills for recovery and access revocation.  </p>
</li>
</ul>
<p>Security isn’t a set of tools. It’s a set of defaults that reflect how seriously you take mistakes before they happen. The stronger your defaults, the less you need to think about security day to day.</p>
<blockquote>
<p>If you start new infra tomorrow, don’t think “how do I secure this?” think “what defaults would save me next time?”</p>
</blockquote>
<p>That's all for now. If you're interested in more advanced content about WSGI, Apache, or other DevOps topics, there are many articles in the DevOps series that you might find interesting.</p>
<p>Mostafa-DE Fayyad</p>
<p>Software Engineer</p>
]]></content:encoded></item><item><title><![CDATA[Setting Up Celery for Production: Isolated Queues, Autoscaling, and Smarter Task Management]]></title><description><![CDATA[When deploying Celery in production, using the default configuration can quickly become a bottleneck. It’s tempting to spin up a few workers with a fixed concurrency and assume the job is done. But as the number of background tasks grows, so do the c...]]></description><link>https://blog.mostafade.com/setting-up-celery-for-production-isolated-queues-autoscaling-and-smarter-task-management</link><guid isPermaLink="true">https://blog.mostafade.com/setting-up-celery-for-production-isolated-queues-autoscaling-and-smarter-task-management</guid><category><![CDATA[celery]]></category><category><![CDATA[deployment]]></category><category><![CDATA[configuration]]></category><category><![CDATA[scale]]></category><category><![CDATA[Python]]></category><category><![CDATA[Django]]></category><dc:creator><![CDATA[Mostafa-DE]]></dc:creator><pubDate>Tue, 20 May 2025 16:44:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747759603963/20c9f17d-8173-49a7-af8f-a8129ac54401.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When deploying Celery in production, using the default configuration can quickly become a bottleneck. It’s tempting to spin up a few workers with a fixed concurrency and assume the job is done. But as the number of background tasks grows, so do the challenges: blocked queues, resource spikes, and unpredictable performance. In this article, I’ll walk you through how I structured Celery for production, focusing on queue isolation, autoscaling, and smarter load handling.</p>
<h3 id="heading-the-default-celery-setup">The Default Celery Setup</h3>
<p>Most tutorials show you how to get Celery running with a single worker and a shared queue. That works for a while. But in practice, this is what you're really getting:</p>
<ul>
<li><p>One or two Celery workers with a fixed number of processes</p>
</li>
<li><p>Tasks from <strong>all</strong> queues are being pulled in a round-robin fashion</p>
</li>
<li><p>No prioritization: critical tasks and background jobs are treated the same</p>
</li>
</ul>
<p>This might be fine at the beginning. Tasks get processed, and everything seems okay. However, once your application grows and starts generating hundreds or thousands of background jobs, problems appear fast.</p>
<p>The fixed workers continue to pull from all queues, blindly, regardless of task importance. If you suddenly have a spike in long-running tasks like reports, it can block urgent operations like order processing or payment confirmation. Your most critical tasks now have to wait in line behind non-essential ones, and that’s a problem.</p>
<p>Worse, if all tasks are routed into just one or two queues and those queues feed into a single shared worker, there’s no way to prioritize one over the other. In this kind of setup, your users will feel the impact. Delays become visible, responsiveness drops, and user satisfaction, one of the top priorities for any software engineer, starts to suffer.</p>
<p>This is exactly the kind of setup I started with, and exactly what I wanted to improve.</p>
<h3 id="heading-what-i-changed-and-why">What I Changed and Why</h3>
<p>Before diving into the worker configuration, I focused first on getting the queues set up properly. Since all tasks are pushed into queues and workers simply pull from them, it’s essential to ensure that each task lands in the right queue. Think of it this way: we have multiple queues, let’s call them Q1, Q2, Q3, and Q4, and it’s the application’s responsibility to assign tasks to the appropriate ones.</p>
<p>Each queue should be dedicated to a specific type of workload. For example, report-related tasks should go to Q2, background or low-priority tasks to Q3 and Q4, and all other general-purpose tasks to Q1 by default.</p>
<p>With task routing in place, I then moved to the worker configuration. The goal was to ensure that each group of related tasks is processed by a dedicated worker, optimized for the nature and expected volume of tasks in that queue.</p>
<p>I created three separate workers:</p>
<ul>
<li><p><code>default</code>: Handles high-priority operational tasks (orders, payments, etc.) from Q1.</p>
</li>
<li><p><code>reports</code>: Reserved for generating large or time-consuming reports (tasks in Q2).</p>
</li>
<li><p><code>others</code>: Manages low-priority or background jobs (e.g., syncing, logs) (tasks in Q3 &amp; Q4).</p>
</li>
</ul>
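<p>This task-to-queue assignment can be expressed declaratively through Celery’s <code>task_routes</code> setting. A sketch with hypothetical task names (in a real project this dict is assigned to <code>app.conf.task_routes</code> on your Celery instance):</p>

```python
# Hypothetical task names; the queue names match Q1-Q4 from above.
task_routes = {
    "orders.process": {"queue": "Q1"},
    "payments.confirm": {"queue": "Q1"},
    "reports.generate": {"queue": "Q2"},
    "sync.external": {"queue": "Q3"},
    "logs.archive": {"queue": "Q4"},
}

def queue_for(task_name: str) -> str:
    """Anything without an explicit route falls back to Q1, the default queue."""
    return task_routes.get(task_name, {"queue": "Q1"})["queue"]
```
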
<p>This separation ensures that no queue can interfere with another. For example, a burst in report generation won’t delay time-sensitive updates in the default queue.</p>
<p>So here is the entire architecture shown in the image below:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1746888080374/7429e224-2450-4ecc-97de-1c3d2c517e1c.png" alt class="image--center mx-auto" /></p>
<p>Each worker is assigned only the queues it’s responsible for:</p>
<ul>
<li><p><code>default</code> handles only <code>Q1</code></p>
</li>
<li><p><code>reports</code> handles only <code>Q2</code></p>
</li>
<li><p><code>others</code> handles <code>Q3</code> and <code>Q4</code></p>
</li>
</ul>
<p>This explicit routing prevents low-priority queues from blocking more urgent ones and keeps the system predictable.</p>
<p>Now that each worker is assigned to specific queues, it only processes tasks from those queues, and nothing else. But what happens when there are no tasks available for a given worker? Simply put: nothing.</p>
<p>The worker becomes idle; it doesn’t consume CPU or memory beyond what’s needed to stay alive. This is a huge advantage of queue isolation: each worker is scoped to a known workload and doesn’t compete for tasks from unrelated queues. This keeps the system organized and minimizes unnecessary load.</p>
<p>But there’s more to it. What if you suddenly get a spike in one queue, say a flood of report generation requests? You’d want the system to respond dynamically, without you manually increasing concurrency or restarting workers. That’s exactly where the <code>--autoscale</code> flag becomes valuable.</p>
<p>The <code>--autoscale</code> option allows each worker to scale its concurrency between a minimum and maximum number of worker processes, depending on task volume. In practice, it means Celery will spin up more child processes when there's a queue buildup and scale them back down when demand drops.</p>
<p>Here’s how I configured autoscaling for each worker:</p>
<ul>
<li><p><strong>default</strong>: 1 to 4 concurrent processes</p>
</li>
<li><p><strong>reports</strong>: 1 to 4 concurrent processes</p>
</li>
<li><p><strong>others</strong>: 1 to 2 concurrent processes</p>
</li>
</ul>
<p>This gives each worker enough headroom to handle short-term spikes without wasting resources during quiet periods. Instead of running 4 workers all the time, we start with 1 and let Celery increase concurrency only when needed. It’s a balance between responsiveness and efficiency.</p>
<h3 id="heading-managing-multiple-workers-with-celery-multi">Managing Multiple Workers with <code>celery multi</code></h3>
<p>With queue isolation and autoscaling in place, the next challenge is: how do you manage multiple workers efficiently?</p>
<p>You could run each worker with a separate systemd service file, but that quickly becomes tedious and error-prone. Instead, I opted to use <code>celery multi</code>, which is built specifically for running and controlling multiple named Celery workers as a group. It gives you a way to start, stop, and monitor all your workers together using a single service.</p>
<p>Each worker is defined by name and configured independently through <code>celery multi</code>, including its queues, concurrency, autoscaling settings, and individual PID/log files.</p>
<p>Here’s how I configured this using <code>systemd</code>:</p>
<h3 id="heading-celeryservice"><code>celery.service</code></h3>
<pre><code class="lang-plaintext">[Unit]
Description=Celery Multi-Worker Service
After=network.target

[Service]
Type=forking
User=ubuntu
Group=ubuntu
EnvironmentFile=/etc/conf.d/celery
WorkingDirectory=/&lt;Path to your project&gt;
ExecStart=/bin/bash -c "${CELERY_BIN} -A ${CELERY_APP} multi start default reports others --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} -Q:default Q1 --autoscale:default=4,1 --time-limit:default=3000 -Q:reports Q2 --autoscale:reports=4,1 --time-limit:reports=3000 -Q:others Q3,Q4 --autoscale:others=2,1 --time-limit:others=10000"
ExecStop=/bin/bash -c "${CELERY_BIN} multi stopwait default reports others --pidfile=${CELERYD_PID_FILE}"
ExecReload=/bin/bash -c "${CELERY_BIN} -A ${CELERY_APP} multi restart default reports others --pidfile=${CELERYD_PID_FILE} --logfile=${CELERYD_LOG_FILE} --loglevel=${CELERYD_LOG_LEVEL} -Q:default Q1 --autoscale:default=4,1 --time-limit:default=3000 -Q:reports Q2 --autoscale:reports=4,1 --time-limit:reports=3000 -Q:others Q3,Q4 --autoscale:others=2,1 --time-limit:others=10000"
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
</code></pre>
<hr />
<p>This starts three named workers: <code>default</code>, <code>reports</code>, and <code>others</code>. Each worker is isolated and independently configured:</p>
<ul>
<li><p><code>--pidfile</code> / <code>--logfile</code>: Uses <code>%n</code> in the config file <code>/etc/conf.d/celery</code> to assign a unique file per worker.</p>
</li>
<li><p><code>-Q:&lt;worker name&gt; &lt;queue name&gt;</code>: Each worker is assigned to specific queues.</p>
</li>
<li><p><code>--autoscale:&lt;worker name&gt;=X,Y</code>: Enables dynamic scaling of concurrency between Y (min) and X (max).</p>
</li>
<li><p><code>--time-limit</code>: Sets a hard timeout (in seconds) for task execution. If a task exceeds this time, the worker is killed and the task is marked as failed.</p>
</li>
</ul>
<h3 id="heading-why-this-works-well-in-production">Why This Works Well in Production</h3>
<p>With this setup:</p>
<ul>
<li><p>Each worker has its <strong>own name</strong>, <strong>queue bindings</strong>, <strong>autoscale policy</strong>, and <strong>log/PID files</strong>.</p>
</li>
<li><p>You can manage all workers together using simple systemd commands (<code>start</code>, <code>stop</code>, <code>restart</code>).</p>
</li>
<li><p>You don’t need multiple <code>.service</code> files, which keeps your deployment clean.</p>
</li>
<li><p>It scales seamlessly with your queue design; you can always add another named worker later.</p>
</li>
<li><p>Because everything runs under one service, “self-healing” with <code>Restart=always</code> and <code>RestartSec=5</code> covers all workers at once: if Celery dies for any reason, systemd brings it back automatically.</p>
</li>
</ul>
<p>This structure not only improves reliability and performance, but also simplifies operational overhead when you need to update or monitor your task workers.</p>
<p>I didn’t want to write yet another article on how to install Celery. Instead, this post dives into the real improvements that make a difference when your system starts scaling. By isolating queues, autoscaling intelligently, and assigning proper limits, Celery becomes a stable and predictable part of your infrastructure.</p>
<p>That's all for now. If you're interested in more advanced content about WSGI, Apache, or other DevOps topics, there are many articles in the DevOps series that you might find interesting.</p>
<p>Mostafa-DE Fayyad</p>
<p>Software Engineer</p>
]]></content:encoded></item><item><title><![CDATA[Tricky Apache & mod_wsgi Issues I’ve Run Into (So You Don’t Have To)]]></title><description><![CDATA[I always say this (Thank god we have default values). They save thousands of developers every day. Imagine if we didn’t have default values that were set based on the common cases, we would face a really hard time doing everything over and over.
But ...]]></description><link>https://blog.mostafade.com/tricky-apache-and-modwsgi-issues-ive-run-into-so-you-dont-have-to</link><guid isPermaLink="true">https://blog.mostafade.com/tricky-apache-and-modwsgi-issues-ive-run-into-so-you-dont-have-to</guid><category><![CDATA[Devops]]></category><category><![CDATA[apache]]></category><category><![CDATA[mod_wsgi]]></category><category><![CDATA[deployment]]></category><category><![CDATA[configuration]]></category><dc:creator><![CDATA[Mostafa-DE]]></dc:creator><pubDate>Tue, 29 Apr 2025 18:49:49 GMT</pubDate><content:encoded><![CDATA[<p>I always say this (Thank god we have default values). They save thousands of developers every day. Imagine if we didn’t have default values that were set based on the common cases, we would face a really hard time doing everything over and over.</p>
<p>But not for everything. In my opinion, you should never trust default values. Notice that I didn’t say don't use them; I said don't trust them. You may still use default values, but never blindly trust them. One day, they will fail you if you don't properly understand what you are using.</p>
<p>I’ve been there. I’ve seen production systems rely on default values for most things, and I’ve seen those values fail more often than not. This matters beyond the topic of this article: never trust default values, especially in the DevOps world. Always understand what you are configuring and put it under proper testing; otherwise you are taking unnecessary risks.</p>
<p>With that said, let’s get started on the topic of this article.</p>
<p>Running <code>mod_wsgi</code> in daemon mode with MPM Event is a great setup, but it’s also easy to misconfigure Apache in ways that tank performance, create stability issues, or even break your application. A lot of these issues come from poorly tuned process/thread settings, misaligned worker configurations, or because you are using the default values.</p>
<p>This article goes over real-world examples of <strong>bad configurations</strong>, <strong>why they fail</strong>, and how to <strong>properly configure Apache and mod_wsgi</strong> for efficient production workloads.</p>
<h2 id="heading-common-misconfigurations-amp-how-to-fix-them"><strong>Common Misconfigurations &amp; How to Fix Them</strong></h2>
<h3 id="heading-misaligned-startservers-maxrequestworkers-threadsperchild-underutilisation-of-resources"><strong>→</strong> Misaligned <code>StartServers</code>, <code>MaxRequestWorkers</code>, and <code>ThreadsPerChild</code>: Underutilisation of Resources</h3>
<p>A common mistake is setting <code>MaxRequestWorkers</code> too low compared to available worker processes and threads, leaving potential performance on the table.</p>
<pre><code class="lang-apache"><span class="hljs-section">&lt;IfModule mpm_event_module&gt;</span>
    <span class="hljs-attribute">StartServers</span> <span class="hljs-number">3</span>
    <span class="hljs-attribute">MinSpareThreads</span> <span class="hljs-number">25</span>
    <span class="hljs-attribute">MaxSpareThreads</span> <span class="hljs-number">75</span>
    <span class="hljs-attribute">ThreadsPerChild</span> <span class="hljs-number">50</span>
    <span class="hljs-attribute">MaxRequestWorkers</span> <span class="hljs-number">75</span>
<span class="hljs-section">&lt;/IfModule&gt;</span>

<span class="hljs-attribute">WSGIDaemonProcess</span> myapp processes=<span class="hljs-number">4</span> threads=<span class="hljs-number">25</span>
<span class="hljs-attribute">WSGIProcessGroup</span> myapp
</code></pre>
<h3 id="heading-whats-wrong">What's Wrong?</h3>
<ul>
<li><p><code>StartServers</code> tells Apache how many worker processes to create when it first starts; this helps it absorb a burst of requests right after startup</p>
</li>
<li><p><code>ThreadsPerChild</code> specifies how many threads each worker can use to handle requests</p>
<ul>
<li>In simple terms, each thread handles one request at a time</li>
</ul>
</li>
<li><p><code>MinSpareThreads</code> and <code>MaxSpareThreads</code> set the minimum and maximum number of idle threads Apache keeps around for future requests.</p>
</li>
<li><p><code>MaxRequestWorkers</code> is the most important setting, and the one to be most careful with: it tells Apache the maximum number of requests it can serve at the same time</p>
<ul>
<li><p>It limits Apache from consuming all of the server’s resources</p>
</li>
<li><p>When I say to be careful setting this, I mean it. Here’s why.</p>
</li>
</ul>
</li>
</ul>
<p>When you set <code>StartServers</code>, Apache creates that many workers, and each worker creates threads based on <code>ThreadsPerChild</code>. The problem comes from <code>MaxRequestWorkers</code>: it tells Apache how many requests it may handle at once. In the config above, Apache sees 3 workers that can each handle 50 requests, for a total of <code>3 × 50 = 150</code>. But with the maximum set to <code>75</code>, Apache starts killing threads to get down to the limit you specified, which will definitely cause problems (under-utilisation, zombie processes, and so on). So be careful.</p>
<p><code>MaxRequestWorkers</code> (75) is <strong>too low</strong>. MPM Event can handle <code>3 processes × 50 threads = 150 requests</code>, but it's capped at 75.</p>
<p><strong>Wasted capacity</strong>: Apache can handle more concurrent requests, but it's artificially restricted.</p>
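<p>The arithmetic is worth spelling out; a quick sketch of the mismatch in this config:</p>

```python
start_servers = 3         # worker processes created at startup
threads_per_child = 50    # threads per worker process
max_request_workers = 75  # the misconfigured cap

# What the workers could actually serve vs. what the cap allows.
capacity = start_servers * threads_per_child   # 3 x 50 = 150 concurrent requests
wasted_slots = capacity - max_request_workers  # request slots that can never be used
```
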
<h3 id="heading-to-fix-this-align-settings-for-proper-utilization">→ To Fix This: Align Settings for Proper Utilization</h3>
<pre><code class="lang-apache"><span class="hljs-section">&lt;IfModule mpm_event_module&gt;</span>
    <span class="hljs-attribute">StartServers</span> <span class="hljs-number">3</span>
    <span class="hljs-attribute">MinSpareThreads</span> <span class="hljs-number">25</span>
    <span class="hljs-attribute">MaxSpareThreads</span> <span class="hljs-number">75</span>
    <span class="hljs-attribute">ThreadsPerChild</span> <span class="hljs-number">50</span>
    <span class="hljs-attribute">MaxRequestWorkers</span> <span class="hljs-number">150</span>  # Matches <span class="hljs-number">3</span> processes × <span class="hljs-number">50</span> threads
<span class="hljs-section">&lt;/IfModule&gt;</span>

<span class="hljs-attribute">WSGIDaemonProcess</span> myapp processes=<span class="hljs-number">3</span> threads=<span class="hljs-number">50</span>  # Matches MPM Event
<span class="hljs-attribute">WSGIProcessGroup</span> myapp
</code></pre>
<hr />
<h3 id="heading-not-setting-processes-for-modwsgi-blocking-requests-due-to-slow-processing"><strong>→</strong> Not Setting <code>processes</code> for mod_wsgi: Blocking Requests Due to Slow Processing</h3>
<p>Running mod_wsgi in daemon mode means Apache acts as a proxy: requests are forwarded to the WSGI daemon processes and handled at the application level. Now imagine a slow application with huge requests to process. That means slowness, dropped requests, and in many cases an outage.</p>
<p>One way to mitigate this is to add more servers to handle requests, but that’s wasteful if your existing server isn’t utilised at 100%. Imagine this scenario:</p>
<ul>
<li><p>You have a server with 4 cores and 16 GB of RAM. This is a mid-level server, which is really good for most applications</p>
<ul>
<li><p>But if one application on that server is processing everything coming from Apache, at some point things get delayed and your application can’t keep up with Apache</p>
</li>
<li><p>In cases like this you could just add another server to help, but what about the cost? Why pay for something you can easily avoid? If you aren’t fully utilising the server you already have, try to utilise it fully before adding more</p>
</li>
</ul>
</li>
<li><p>This is my argument: always try to get more out of what you already have before introducing something new to the equation.</p>
</li>
</ul>
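<p>The fix follows the same pattern as the earlier examples: set <code>processes</code> (and <code>threads</code>) explicitly on the daemon group so mod_wsgi can use the cores you already pay for. The numbers below are illustrative; tune them against your own CPU, memory, and workload:</p>
<pre><code class="lang-apache">WSGIDaemonProcess myapp processes=4 threads=25
WSGIProcessGroup myapp
</code></pre>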
<hr />
<h3 id="heading-missing-or-incorrect-listen-backlog-too-many-dropped-requests"><strong>→</strong> Missing or incorrect <code>listen-backlog</code>, too Many Dropped Requests</h3>
<p>mod_wsgi sets <code>listen-backlog</code> to 128 by default, which can mean dropped connections during traffic spikes. I’ve seen many production configs leave this untouched. To be honest, it isn’t that big a deal: the default of 128 is fine in most cases, especially if production runs on two servers that are always up. But when your app runs on only one server, I’d recommend increasing it. I can’t tell you the exact number to set; as always, it depends on your app, your resources, and so on. That said, don’t set a really high number either, as that can make things worse during spikes if your application is slow. Find the number that works for your app.</p>
<h3 id="heading-to-fix-increase-listen-backlog">→ To Fix: Increase <code>listen-backlog</code></h3>
<pre><code class="lang-apache"><span class="hljs-attribute">WSGIDaemonProcess</span> myapp processes=<span class="hljs-number">4</span> threads=<span class="hljs-number">25</span> listen-backlog=<span class="hljs-number">500</span>
<span class="hljs-attribute">WSGIProcessGroup</span> myapp
</code></pre>
<ul>
<li>Setting <code>listen-backlog</code> to 500 ensures that <strong>requests are queued</strong> rather than rejected immediately, which helps your app’s reputation (no one likes seeing 502/503 errors when hitting an application).</li>
</ul>
<hr />
<h3 id="heading-modremoteip-cant-parse-client-ip"><strong>→</strong> <code>mod_remoteip</code> can't parse Client IP</h3>
<p>When using Apache behind an AWS ALB, the <code>X-Forwarded-For</code> header forwarded by the ALB includes both the IP and the port (e.g., <code>203.0.113.1:54321</code>). This causes the <code>mod_remoteip</code> module to fail when trying to parse the real client IP.</p>
<h3 id="heading-whats-wrong-1">What's Wrong?</h3>
<ul>
<li><p><code>mod_remoteip</code> expects just the IP (without the port). When it can't parse the header, it defaults to logging and treating the request as if it came from the <strong>ALB IP</strong>.</p>
</li>
<li><p><strong>All requests appear to originate from the load balancer</strong>, which completely breaks IP-based logic.</p>
</li>
</ul>
<p>And of course, this will lead to:</p>
<ul>
<li><p><strong>Incorrect logging</strong>: All client logs will show the ALB IP, not the actual user IP.</p>
</li>
<li><p><strong>Broken rate limiting</strong>: Tools like <code>mod_evasive</code> or WAFs that rely on accurate IP tracking will see all requests coming from one source (the ALB) and could start blocking <strong>legitimate users</strong>.</p>
</li>
</ul>
<h3 id="heading-to-fix-prevent-alb-from-appending-port">→ To Fix: Prevent ALB from Appending Port</h3>
<p>Modify the ALB configuration so it stops appending the client port to the header. In Terraform this is the <code>enable_xff_client_port</code> argument; via the AWS CLI it is the load balancer attribute <code>routing.http.xff_client_port.enabled</code>:</p>
<pre><code class="lang-plaintext">enable_xff_client_port = false
</code></pre>
<p>This ensures <code>X-Forwarded-For</code> contains only the IP. Once the ALB is fixed, make sure Apache knows how to extract the real IP:</p>
<pre><code class="lang-plaintext">RemoteIPHeader X-Forwarded-For
RemoteIPTrustedProxy &lt;ALB_IP/CIDR&gt;
</code></pre>
<p>This way, Apache replaces the client IP with the value from <code>X-Forwarded-For</code>, but <strong>only if</strong> it came from your trusted ALB. This restores proper logging, WAF accuracy, and rate limiting logic.</p>
<hr />
<h3 id="heading-modremoteip-and-multi-proxy-setups-how-misconfiguration-can-expose-you-to-ip-spoofing"><strong>→</strong> mod_remoteip and Multi-Proxy Setups: How Misconfiguration Can Expose You to IP Spoofing</h3>
<p>When using <code>mod_remoteip</code> with reverse proxies like an ALB, Cloudflare, or NGINX, many setups mistakenly trust whatever is in the <code>X-Forwarded-For</code> header, <strong>without validating who sent it</strong>. This makes it easy for attackers to spoof IPs.</p>
<h3 id="heading-whats-wrong-2">What's Wrong?</h3>
<ul>
<li><p><code>X-Forwarded-For</code> can contain multiple IPs (e.g. <code>client, proxy1, proxy2</code>).</p>
</li>
<li><p>If Apache isn’t explicitly told which proxy IPs it should trust, <strong>it may trust attacker-controlled headers</strong>.</p>
</li>
<li><p>That means your logs, WAF rules, rate limiters (e.g., <code>mod_evasive</code>), and even audit trails might all <strong>log fake IPs</strong>.</p>
</li>
</ul>
<h3 id="heading-to-fix-define-who-you-actually-trust">→ To Fix: Define Who You Actually Trust</h3>
<h4 id="heading-1-use-remoteipheader-and-remoteipinternalproxy-properly">1. Use <code>RemoteIPHeader</code> and <code>RemoteIPInternalProxy</code> properly</h4>
<pre><code class="lang-plaintext">RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 10.0.0.0/8 192.168.0.0/16  # Only trust your internal proxy network
</code></pre>
<ul>
<li><p>This tells Apache: “Only trust this header if the request came from one of these trusted internal proxies.”</p>
</li>
<li><p>If you're using an ALB, make sure to list its <strong>private IP or CIDR block</strong>.</p>
</li>
<li><p>If you're behind multiple layers (e.g., Cloudflare → ALB → Apache), list the address ranges of every trusted layer, not just the one closest to Apache.</p>
</li>
</ul>
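<p>To see why the trusted-proxy list matters, here's a minimal Python sketch (not Apache's actual code) of the same idea <code>mod_remoteip</code> implements: walk the address chain from the nearest hop backwards, skip hops you trust, and stop at the first one you don't. The networks below mirror the hypothetical internal ranges from the config above.</p>
<pre><code class="lang-python">import ipaddress

# Hypothetical internal ranges; keep in sync with RemoteIPInternalProxy.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8"),
                   ipaddress.ip_network("192.168.0.0/16")]

def real_client_ip(xff_header, peer_ip):
    # Full hop chain: the X-Forwarded-For entries plus the actual TCP peer.
    chain = [hop.strip() for hop in xff_header.split(",")] + [peer_ip]
    # Walk right to left; the first untrusted hop is the real client.
    for hop in reversed(chain):
        if not any(ipaddress.ip_address(hop) in net for net in TRUSTED_PROXIES):
            return hop
    return peer_ip  # every hop was trusted; fall back to the peer
</code></pre>
<p>Note how an attacker connecting directly with a forged <code>X-Forwarded-For: 1.2.3.4</code> still gets logged under their own peer address: the spoofed entry is only honored when it was appended by a proxy you trust.</p>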
<hr />
<h3 id="heading-bonus-tip-use-custom-logging-formats">Bonus Tip: Use Custom Logging Formats</h3>
<p>You’re not stuck with the defaults. You can actually <strong>define your own log formats</strong> and log extra info that’s helpful when debugging.</p>
<pre><code class="lang-apache"><span class="hljs-attribute">LogFormat</span> <span class="hljs-string">"%h %l %u %t \"%r\" %&gt;s %b %{ms}T"</span> custom
<span class="hljs-attribute">CustomLog</span> /var/log/apache<span class="hljs-number">2</span>/access.log custom
</code></pre>
<ul>
<li><p>In the example above, <code>%{ms}T</code> logs the <strong>request duration in milliseconds</strong>—super helpful when tracking performance issues.</p>
</li>
<li><p>You can log headers, cookies, query strings—whatever helps you trace problems.</p>
</li>
</ul>
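<p>On that note, once <code>mod_remoteip</code> is in play, it can be worth logging the raw forwarded header next to the resolved client address so you can audit both. A sketch (the format specifiers are standard Apache ones; the exact layout is just a suggestion):</p>
<pre><code class="lang-apache">LogFormat "%a %{X-Forwarded-For}i %u %t \"%r\" %&gt;s %b %{ms}T" proxyaware
CustomLog /var/log/apache2/access.log proxyaware
</code></pre>
<p>Here <code>%a</code> is the client IP after <code>mod_remoteip</code> has rewritten it, while <code>%{X-Forwarded-For}i</code> records the header exactly as it was received.</p>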
<p>That's all for now. If you're interested in more advanced content about WSGI, Apache, or other DevOps topics, there are many articles in the DevOps series that you might find interesting.</p>
<p>Mostafa-DE Fayyad</p>
<p>Software Engineer</p>
]]></content:encoded></item><item><title><![CDATA[Understanding Apache Worker Types: Prefork, Worker, and Event]]></title><description><![CDATA[When deploying applications with Apache, understanding the differences between worker types is critical. Many developers configure Apache without fully considering the impact of their choice, which can lead to poor performance, resource exhaustion, a...]]></description><link>https://blog.mostafade.com/apache-worker-types</link><guid isPermaLink="true">https://blog.mostafade.com/apache-worker-types</guid><category><![CDATA[apache]]></category><category><![CDATA[wsgi]]></category><category><![CDATA[mod_wsgi]]></category><category><![CDATA[deployment]]></category><dc:creator><![CDATA[Mostafa-DE]]></dc:creator><pubDate>Tue, 21 Jan 2025 16:44:55 GMT</pubDate><content:encoded><![CDATA[<p>When deploying applications with Apache, understanding the differences between worker types is critical. Many developers configure Apache without fully considering the impact of their choice, which can lead to poor performance, resource exhaustion, and even downtime. In this article, I’ll explain the three main Apache Multi-Processing Module (MPM) worker types (<strong>Prefork</strong>, <strong>Worker</strong>, and <strong>Event</strong>) and share my perspective on why Event is the best choice for most modern setups.</p>
<h2 id="heading-overview-of-apache-worker-types">Overview of Apache Worker Types</h2>
<h3 id="heading-prefork">Prefork</h3>
<p>Prefork is a process-based MPM where each connection is handled by a dedicated process.</p>
<ul>
<li><p><strong>How It Works</strong>: Every request spawns a separate process, each handling one connection at a time.</p>
</li>
<li><p><strong>Pros</strong>:</p>
<ul>
<li><p>Non-threaded, making it ideal for non-thread-safe applications.</p>
</li>
<li><p>Simple to debug and configure.</p>
</li>
</ul>
</li>
<li><p><strong>Cons</strong>:</p>
<ul>
<li><p>Extremely resource-intensive due to the high memory overhead of each process.</p>
</li>
<li><p>Poor scalability for handling large numbers of simultaneous connections.</p>
</li>
</ul>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737476484310/ecd32123-32d3-4e42-8b21-92b477d5e3be.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-worker">Worker</h3>
<p>Worker is a hybrid MPM that uses threads within processes to handle connections.</p>
<ul>
<li><p><strong>How It Works</strong>: Each process can handle multiple threads, and each thread manages a single connection.</p>
</li>
<li><p><strong>Pros</strong>:</p>
<ul>
<li>More efficient than Prefork, with significantly reduced memory usage.</li>
</ul>
</li>
<li><p><strong>Cons</strong>:</p>
<ul>
<li><p>Requires thread-safe applications and libraries.</p>
</li>
<li><p>Threads remain tied to specific connections, which wastes resources when those connections sit idle.</p>
</li>
</ul>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737476497251/a7e929b1-6ff7-4d12-94ea-026359dae54c.jpeg" alt class="image--center mx-auto" /></p>
<h3 id="heading-event">Event</h3>
<p>Event is an improvement over Worker, designed to handle idle connections more efficiently.</p>
<ul>
<li><p><strong>How It Works</strong>: Idle connections (e.g., KeepAlive) are moved to an event queue, freeing up threads to handle new requests.</p>
</li>
<li><p><strong>Pros</strong>:</p>
<ul>
<li><p>Reduces resource usage by avoiding tying up threads for idle connections.</p>
</li>
<li><p>Highly scalable and optimized for modern workloads.</p>
</li>
</ul>
</li>
<li><p><strong>Cons</strong>:</p>
<ul>
<li><p>Requires thread-safe applications, similar to Worker.</p>
</li>
<li><p>Configuration can be slightly more complex.</p>
</li>
</ul>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1737476513321/bb083eaf-8cfd-4308-8f6d-08b86c874e2d.jpeg" alt class="image--center mx-auto" /></p>
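<p>If you do go with Event, here's what a tuning block might look like. This is a minimal sketch with illustrative numbers only; the right values depend entirely on your traffic and memory budget:</p>
<pre><code class="lang-apache">&lt;IfModule mpm_event_module&gt;
    StartServers             2
    MinSpareThreads         25
    MaxSpareThreads         75
    ThreadsPerChild         25
    MaxRequestWorkers      150
    MaxConnectionsPerChild   0
&lt;/IfModule&gt;
</code></pre>
<p><code>MaxRequestWorkers</code> caps the number of concurrent request threads across all processes, and it's the main knob protecting you from memory exhaustion under load.</p>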
<h2 id="heading-why-i-recommend-event-mode">Why I Recommend Event Mode</h2>
<h3 id="heading-my-experience-with-worker-and-event-modes">My Experience with Worker and Event Modes</h3>
<p>Before Apache 2.4 introduced the Event MPM, Worker was often the default choice. However, Event has since proven to be far superior, particularly when Apache is used as a proxy (e.g., with mod_wsgi for Python applications). The key advantage of Event lies in how it handles idle connections—by freeing up threads for new requests, it minimizes CPU and memory utilization. This makes it ideal for high-traffic environments.</p>
<p>One major issue with Worker mode is that threads stay tied to specific connections even when those connections are idle. This wastes resources, especially under heavy load. Event eliminates this inefficiency, making it the clear choice.</p>
<p>Thread safety is often a reason to avoid Worker or Event modes. However, I’ve found that most applications can be adjusted to handle thread safety issues at the application level. Modern frameworks and libraries typically offer thread-safe options or ways to modify behavior to ensure compatibility.</p>
<p>Many developers default to Prefork without realizing the performance overhead it introduces. Using Prefork for high-traffic applications or as a proxy is a recipe for disaster. In my experience, it’s not uncommon to wake up to a crashed server or resource spikes caused by poor scalability with Prefork.</p>
<p>Some assume Worker and Event offer similar benefits, but the difference in handling idle connections is significant. Event is far better suited for scenarios with KeepAlive enabled, as it can efficiently manage idle connections and prevent resource bottlenecks.</p>
<h2 id="heading-choosing-the-right-worker-type">Choosing the Right Worker Type</h2>
<h3 id="heading-when-to-use-each-type">When to Use Each Type</h3>
<ul>
<li><p><strong>Prefork</strong>: Only use Prefork if your application requires non-thread-safe modules and cannot be adjusted to work with threads.</p>
</li>
<li><p><strong>Worker</strong>: Suitable for thread-safe applications in environments where KeepAlive is not heavily used.</p>
</li>
<li><p><strong>Event</strong>: The go-to choice for modern setups, especially when using Apache as a proxy or handling high-traffic applications with KeepAlive enabled.</p>
</li>
</ul>
<p>Always aim for Event mode if your setup allows. It offers the best balance of scalability, resource efficiency, and modern functionality. If thread safety is a concern, look for ways to address it within your application rather than defaulting to Prefork.</p>
<p>In future articles, I’ll dive into configuration and tuning tips for Event mode to help you get the most out of your Apache setup.</p>
<p>That's all for now. If you're interested in more advanced content about WSGI, Apache, or other DevOps topics, there are many articles in the DevOps series that you might find interesting.</p>
<p>Mostafa-DE Fayyad</p>
<p>Software Engineer</p>
]]></content:encoded></item><item><title><![CDATA[Should You Always Use Daemon Mode?]]></title><description><![CDATA[mod_wsgi is a great tool for deploying Python web applications with Apache. While it offers two modes embedded and daemon, understanding how they work is essential before deciding which one to use. Spoiler alert: daemon mode is the clear winner for m...]]></description><link>https://blog.mostafade.com/should-you-always-use-daemon-mode</link><guid isPermaLink="true">https://blog.mostafade.com/should-you-always-use-daemon-mode</guid><category><![CDATA[apache]]></category><category><![CDATA[wsgi]]></category><category><![CDATA[mod_wsgi]]></category><category><![CDATA[deployment]]></category><dc:creator><![CDATA[Mostafa-DE]]></dc:creator><pubDate>Tue, 07 Jan 2025 17:49:11 GMT</pubDate><content:encoded><![CDATA[<p>mod_wsgi is a great tool for deploying Python web applications with Apache. While it offers two modes <strong>embedded</strong> and <strong>daemon</strong>, understanding how they work is essential before deciding which one to use. Spoiler alert: daemon mode is the clear winner for most use cases. In this article, we’ll explain both modes, how they operate, and why daemon mode is your best bet for scalable and reliable deployments.</p>
<p>Not sure what WSGI or mod_wsgi means? Check out <a target="_blank" href="https://blog.mostafade.com/understanding-wsgi">Understanding WSGI and mod_wsgi</a>.</p>
<h3 id="heading-embedded-mode">Embedded Mode</h3>
<p>In <strong>embedded mode</strong>, mod_wsgi runs your Python application directly within Apache’s worker processes. Essentially, the application becomes part of Apache itself, sharing resources and processes.</p>
<h4 id="heading-how-embedded-mode-works">How Embedded Mode Works</h4>
<ul>
<li><p>Apache receives an HTTP request.</p>
</li>
<li><p>The worker process handling the request also loads and executes your Python application code.</p>
</li>
<li><p>The response is returned to the client.</p>
</li>
</ul>
<pre><code class="lang-plaintext">Client -&gt; Apache Worker (Runs Python Application) -&gt; Response to Client
</code></pre>
<ul>
<li><p>Applications run within Apache’s processes.</p>
</li>
<li><p>All applications share the same memory space.</p>
</li>
</ul>
<h3 id="heading-daemon-mode">Daemon Mode</h3>
<p>In <strong>daemon mode</strong>, mod_wsgi spawns dedicated processes to run your Python application. These processes are isolated from Apache’s main worker processes, providing better control and flexibility.</p>
<h4 id="heading-how-daemon-mode-works">How Daemon Mode Works</h4>
<ul>
<li><p>Apache receives an HTTP request.</p>
</li>
<li><p>The request is passed to a separate process managed by mod_wsgi.</p>
</li>
<li><p>The Python application processes the request and returns the response to Apache, which then delivers it to the client.</p>
</li>
</ul>
<pre><code class="lang-plaintext">Client -&gt; Apache -&gt; mod_wsgi Daemon Process (Runs Python Application) -&gt; Response to Client
</code></pre>
<ul>
<li><p>Applications run in isolated processes.</p>
</li>
<li><p>Each application can have its own environment, dependencies, and configuration.</p>
</li>
</ul>
<h2 id="heading-why-you-should-always-use-daemon-mode">Why You Should Always Use Daemon Mode</h2>
<h3 id="heading-the-downsides-of-embedded-mode">The Downsides of Embedded Mode</h3>
<ul>
<li><p>Apache both serves HTTP requests and runs the Python application, which leads to resource bottlenecks.</p>
</li>
<li><p>Since all Apache workers share the same memory space, scaling up for traffic spikes becomes challenging. It can still scale, but daemon mode scales far better while using fewer resources.</p>
</li>
<li><p>And since the entire app is deployed and managed by Apache, a serious error in your Python application can bring down the entire server.</p>
</li>
</ul>
<h3 id="heading-the-advantages-of-daemon-mode">The Advantages of Daemon Mode</h3>
<ul>
<li><p><strong>Isolated environments</strong>: each application runs in its own process, avoiding conflicts and improving stability.</p>
</li>
<li><p><strong>Better scalability</strong>: you can control the number of processes and threads independently of Apache.</p>
</li>
<li><p>Crashes or errors in the application won’t affect Apache’s ability to handle HTTP requests.</p>
</li>
<li><p>Starting with daemon mode ensures you’re ready to scale when needed.</p>
</li>
</ul>
<p>As Graham Dumpleton, the creator of mod_wsgi, puts it: <strong>“Friends don’t let friends use embedded mode.”</strong> Even for small applications, daemon mode sets you up for success.</p>
<h2 id="heading-configuring-modwsgi-for-daemon-mode">Configuring mod_wsgi for Daemon Mode</h2>
<p>Here’s how you can set up daemon mode for your application:</p>
<pre><code class="lang-apache"><span class="hljs-section">&lt;VirtualHost *<span class="hljs-number">:80</span>&gt;</span>
    <span class="hljs-attribute"><span class="hljs-nomarkup">ServerName</span></span> example.com

    <span class="hljs-attribute">WSGIRestrictEmbedded</span> <span class="hljs-literal">On</span>
    <span class="hljs-attribute">WSGIDaemonProcess</span> myapp user=www-data processes=<span class="hljs-number">2</span> threads=<span class="hljs-number">25</span> python-home=/ .../env
    <span class="hljs-attribute">WSGIProcessGroup</span> myapp
    <span class="hljs-attribute">WSGIScriptAlias</span> / /path/to/app.wsgi

    <span class="hljs-section">&lt;Directory /path/to/&gt;</span>
        <span class="hljs-attribute">Require</span> <span class="hljs-literal">all</span> granted
    <span class="hljs-section">&lt;/Directory&gt;</span>
<span class="hljs-section">&lt;/VirtualHost&gt;</span>
</code></pre>
<ul>
<li>I will explain more about this configuration and talk about the best way to set up and tweak Apache and mod_wsgi, but I will leave this for another article</li>
</ul>
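<p>For completeness, the <code>/path/to/app.wsgi</code> script referenced above just needs to expose a callable named <code>application</code>. Here's a minimal hand-written sketch (a real Django or Flask project generates this entry point for you):</p>
<pre><code class="lang-python"># app.wsgi: the entry point mod_wsgi loads into the daemon process
def application(environ, start_response):
    body = b"Hello from daemon mode!\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
</code></pre>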
<p>Quick tips for you:</p>
<ul>
<li><p>Always keep an eye on CPU and Memory utilization to adjust the number of processes and threads (More about this in another article)</p>
</li>
<li><p>For CPU-intensive tasks, reduce the thread count. For I/O-heavy tasks, increase the number of threads.</p>
</li>
<li><p>Always use a virtual environment to isolate dependencies and avoid conflicts</p>
</li>
</ul>
<p>That's all for now. If you're interested in more advanced content about WSGI, Apache, or other DevOps topics, there are many articles in the DevOps series that you might find interesting.</p>
<p>Mostafa-DE Fayyad</p>
<p>Software Engineer</p>
]]></content:encoded></item><item><title><![CDATA[Understanding WSGI and mod_wsgi: A Simple Guide]]></title><description><![CDATA[Introduction
Deploying Python applications hasn't always been as straightforward as it is today. In the early days, web developers faced significant challenges integrating their Python applications with web servers. There was no standard way for web ...]]></description><link>https://blog.mostafade.com/understanding-wsgi</link><guid isPermaLink="true">https://blog.mostafade.com/understanding-wsgi</guid><category><![CDATA[mod_wsgi]]></category><category><![CDATA[Devops]]></category><category><![CDATA[wsgi]]></category><category><![CDATA[deployment]]></category><category><![CDATA[apache]]></category><dc:creator><![CDATA[Mostafa-DE]]></dc:creator><pubDate>Tue, 24 Dec 2024 22:51:44 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1735921856297/b7ad8b0f-610a-4ea6-acee-d2682ecff3d5.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-introduction">Introduction</h2>
<p>Deploying Python applications hasn't always been as straightforward as it is today. In the early days, web developers faced significant challenges integrating their Python applications with web servers. There was no standard way for web servers to communicate with Python applications. In this article, we'll discuss WSGI and mod_wsgi and look at why deploying a Python app has long been considered a challenge.</p>
<p>Before WSGI, deploying Python web apps was genuinely hard, especially because learning resources were scarce. Even today you won't find many articles on this corner of deployment or DevOps; to find what you need, you have to dig through old articles and books and watch old conference talks just to get your head around it.</p>
<p>Not to mention, Python itself can be a challenging language to work with. The Global Interpreter Lock (GIL) introduces complexities, and it requires a carefully configured environment to run properly. Add to that the package management and the potential for dependency conflicts. Because of these factors, deploying a Python app can be tricky. If you’re not sure what you’re doing, it could lead to a disaster once your app hits production and starts serving users around the globe.</p>
<h2 id="heading-what-is-wsgi">What is WSGI?</h2>
<p>WSGI stands for Web Server Gateway Interface. It is essentially a specification that outlines how to implement web server gateways. Remember, it is a specification, not an implementation. When someone mentions that an application is deployed on WSGI, they usually mean it is deployed using a web server implementation like mod_wsgi, Gunicorn, etc.</p>
<p>It's crucial to understand the distinction between implementation and specification. A specification provides the guidelines and rules for implementing something specific, while an implementation is the actual execution of those guidelines.</p>
<h2 id="heading-what-is-modwsgi">What is mod_wsgi?</h2>
<p>mod_wsgi is an Apache module that implements the WSGI specification, enabling seamless deployment of Python applications on Apache web servers.</p>
<h3 id="heading-why-modwsgi-became-popular">Why mod_wsgi Became Popular</h3>
<ul>
<li><p><strong>Integration with Apache</strong>: Apache was (and still is) a widely used web server. mod_wsgi’s integration with Apache made it a natural choice for deploying Python applications.</p>
</li>
<li><p><strong>Performance</strong>: Compared to others, mod_wsgi offered significant performance improvements by using persistent processes to handle multiple requests.</p>
</li>
</ul>
<h3 id="heading-how-modwsgi-works">How mod_wsgi Works</h3>
<p>At a high level, mod_wsgi acts as a bridge between Apache and your Python application. You could say it acts as a proxy in the middle (although I don't love that term, since it has a different meaning), but anyway, let's put it this way:</p>
<ol>
<li><p>Apache receives an HTTP request.</p>
</li>
<li><p>mod_wsgi translates the request into a format that the Python application understands (using the WSGI standard).</p>
</li>
<li><p>The application processes the request and returns a response.</p>
</li>
<li><p>mod_wsgi sends the response back to Apache, which delivers it to the client.</p>
</li>
</ol>
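<p>Step 2 is worth seeing in code. The format the application understands is just a Python dict called <code>environ</code> plus a <code>start_response</code> callback, as defined by the WSGI spec (PEP 3333). A tiny illustrative app that reads from it:</p>
<pre><code class="lang-python"># Minimal WSGI app: mod_wsgi builds `environ` from the Apache request
# (step 2) and relays whatever we return back to Apache (step 4).
def application(environ, start_response):
    path = environ.get("PATH_INFO", "/")
    body = ("You requested: %s\n" % path).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]
</code></pre>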
<p>This interaction flow is specific to daemon mode. We'll discuss the different mod_wsgi modes in a separate article, but for now, let's stick with daemon mode.</p>
<p>That's all for now. If you're interested in more advanced content about WSGI, Apache, or other DevOps topics, there are many articles in the DevOps series that you might find interesting.</p>
<p>Mostafa-DE Fayyad</p>
<p>Software Engineer</p>
]]></content:encoded></item></channel></rss>