Zero Downtime Restarts with PM2

When you run a live application cluster under PM2, you need to be able to deploy code changes without interrupting users. PM2's reload command does this for apps running in cluster mode: it restarts the worker processes one at a time instead of all at once. Here's how to use it.


The Scenario:

Your server cluster is running, and you need to make a change in your server code without affecting the live users. For example:

  • Change the timer sound from "ding" to "beep".

  • Reduce the timer delay from 9 seconds to 4 seconds.

Traditionally, applying a change like this means restarting the application, which takes every process offline for a moment and interrupts users. Instead, we’ll use PM2’s reload feature to apply the changes with zero downtime.


Steps to Implement Zero Downtime Restarts:

  1. Make the Code Changes: Update your server code as required. For example:

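     // Before: "Ding" response after a 9-second delay
     app.get('/timer', (req, res) => {
         setTimeout(() => {
             res.send(`Ding Ding Ding ${process.pid}`);
         }, 9000);
     });
    
     // After: "Beep" response after a 4-second delay.
     // Edit the handler above in place rather than adding a second one:
     // Express runs the first matching route, so the old handler would win.
     app.get('/timer', (req, res) => {
         setTimeout(() => {
             res.send(`Beep Beep Beep ${process.pid}`);
         }, 4000);
     });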
    

    Save your changes to the server file (e.g., server.js).

  2. Verify Current Code Behavior: Before reloading, confirm that the running processes still serve the old code. Saving the file does not change processes that are already running, so a request to the endpoint should still take 9 seconds and return "Ding Ding Ding".

  3. Avoid Traditional Restart: Using a standard restart (e.g., pm2 restart server) would:

    • Terminate all processes at once.

    • Cause downtime while the server restarts.

  4. Perform Zero Downtime Reload: Use the pm2 reload command to restart processes one by one:

     pm2 reload server
    
    • In cluster mode, this keeps at least one process online at all times.

    • New requests go to the processes that are still online, so users can keep using the application during the reload.

  5. Monitor the Reload Process:

    • Use the pm2 monit command to observe the reload in real time.

    • The processes restart one after another, and each process's uptime resets as it comes back online.

  6. Verify Updated Behavior: After the reload, confirm that the updated server code is running. For example:

    • Make a request to the server endpoint.

    • Confirm that the response arrives after about 4 seconds and contains the new text ("Beep Beep Beep").
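
The checks in steps 2 and 6 can also be scripted. The sketch below times a single request and reports which version of the handler answered, based on the response text used in this article. The helper names timeRequest and classifyResponse, the script name check-timer.js, and the port in the usage comment are assumptions for illustration; the article does not state which port the server listens on.

```javascript
const http = require('http');

// Request a URL once; resolve with the response body and the elapsed time in ms.
function timeRequest(url) {
  return new Promise((resolve, reject) => {
    const started = Date.now();
    http
      .get(url, (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => resolve({ body, elapsedMs: Date.now() - started }));
      })
      .on('error', reject);
  });
}

// Work out which version of the /timer handler produced a response body.
// The handlers in this article respond with "<word> <word> <word> <pid>".
function classifyResponse(body) {
  const text = body.trim();
  const pid = Number.parseInt(text.split(' ').pop(), 10);
  let version = 'unknown';
  if (text.startsWith('Beep Beep Beep')) version = 'new';
  else if (text.startsWith('Ding Ding Ding')) version = 'old';
  return { version, pid: Number.isNaN(pid) ? null : pid };
}

// Runs only when a URL is passed on the command line, e.g.
//   node check-timer.js http://localhost:3000/timer   (port 3000 is an assumption)
if (process.argv[2]) {
  timeRequest(process.argv[2])
    .then(({ body, elapsedMs }) => {
      console.log(classifyResponse(body), `${elapsedMs} ms`);
    })
    .catch((err) => {
      console.error(`Request failed: ${err.message}`);
      process.exitCode = 1;
    });
}
```

Running it before the reload should report the old version after roughly 9 seconds, and running it afterwards should report the new version after roughly 4 seconds. Because the response includes process.pid, repeated runs also show which worker answered.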


Benefits of Zero Downtime Reload:

  • Ensures continuous application availability, even during code updates.

  • Ideal for time-sensitive or high-traffic applications.

  • Improves the user experience: there are no maintenance windows to announce and no unexpected outages during deploys.
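
These benefits assume that each worker finishes the requests it is already handling before it exits. According to PM2's documentation, a process being stopped or reloaded first receives SIGINT and is force-killed if it has not exited within the kill timeout (1600 ms by default), which is shorter than the 4-second timer in this example. A minimal sketch of a graceful shutdown hook follows, assuming server is the http.Server returned by app.listen(...). The helper name installGracefulShutdown is hypothetical and not part of PM2 or Express.

```javascript
// Hypothetical helper: on SIGINT, stop accepting new connections and run
// onClosed (by default, exit) once the existing connections have ended.
function installGracefulShutdown(server, onClosed = () => process.exit(0)) {
  const handler = () => {
    // server.close() refuses new connections; its callback fires after
    // the connections that were already open have finished.
    server.close(onClosed);
  };
  process.once('SIGINT', handler);
  return handler;
}

// Usage in server.js (assumed shape; the port is an assumption):
//   const server = app.listen(3000);
//   installGracefulShutdown(server);
```

For requests that can run longer than the default timeout, PM2's kill_timeout setting (or the --kill-timeout flag) can be raised, for example to 5000 ms for the 4-second timer used here.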


Monitoring and Debugging:

  • Use pm2 monit for a live dashboard of CPU, memory usage, and process status.

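  • If you started the app with a log file option (e.g., -l logs.txt), its output is written to that file for troubleshooting. The pm2 logs command streams the same output to your terminal.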
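
For a scriptable alternative to the dashboard, the process list can be read as JSON with pm2 jlist. The sketch below reduces that output to one row per process, which makes the uptime reset after a reload easy to spot. The function name summarizeProcesses and the script name pm2-summary.js are hypothetical, and the field names (pm2_env.status, pm2_env.restart_time, pm2_env.pm_uptime) are assumed from the jlist output format; check them against the output of your PM2 version.

```javascript
const fs = require('fs');

// Turn the JSON array printed by `pm2 jlist` into one summary row per process.
function summarizeProcesses(jlistJson, now = Date.now()) {
  return JSON.parse(jlistJson).map((proc) => ({
    name: proc.name,
    pid: proc.pid,
    status: proc.pm2_env.status,
    restarts: proc.pm2_env.restart_time,
    // pm_uptime is the start timestamp in ms, so uptime is its distance from now.
    uptimeSeconds: Math.round((now - proc.pm2_env.pm_uptime) / 1000),
  }));
}

// Run as: pm2 jlist | node pm2-summary.js --stdin   (script name is hypothetical)
if (process.argv[2] === '--stdin') {
  console.table(summarizeProcesses(fs.readFileSync(0, 'utf8')));
}
```

Run during a reload, the rows should show each worker's uptime dropping back toward zero in turn while the others stay online.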


By using PM2’s reload command instead of a full restart, you can deploy updates to a running cluster without disrupting user access.