Common Questions (FAQ)

If I haven’t answered your question here, email me!

Plans & Installation

What kinds of apps can use Rails Autoscale?

You must have a Ruby web application running on Heroku. It doesn’t have to be Rails, but it does need to use Rack.

How do I install Rails Autoscale?

Just run heroku addons:create rails-autoscale from a terminal. Check out the Getting Started guide for step-by-step instructions.

What’s the difference between Rails Autoscale plans?

Every Rails Autoscale plan supports the same features and the same service. The only difference is the maximum number of dynos supported.

For example, if you might need to autoscale to four dynos or more, you’ll need at least the Silver plan. For more detail, see the plans & pricing docs.

How do I use Rails Autoscale on multiple Heroku apps?

Rails Autoscale does not support attaching to multiple apps. You must install the add-on separately for each app.

Can I use other autoscalers together with Rails Autoscale?

You can use different autoscalers for different processes. For example, you could use Heroku’s native autoscaling for web dynos and Rails Autoscale for worker dynos.

Do not use multiple autoscalers on the same process. This results in very unpredictable scaling behavior.

How is Rails Autoscale different from Heroku’s autoscaler?

Heroku offers a native autoscaling solution that’s worth a try if you run performance dynos and you only need to autoscale web dynos. Here’s what makes Rails Autoscale different:

  • Web autoscaling based on request queue time instead of total response time. This means more reliable autoscaling.
  • Worker autoscaling for Sidekiq, Delayed Job, Que, and Resque (beta).
  • Works great on standard and performance dynos.
  • Personalized customer support from the developer who built it.

Can I use Rails Autoscale with a non-Rails app?

You must be running a Rack-based Ruby app on Heroku. If your Rack-based app is not running Rails, see these instructions on setting up rails_autoscale_agent.

What is the performance impact of the middleware agent?

The agent has no noticeable impact on response time. It collects the queue time for each request in memory—a very simple operation—and an async reporter thread periodically posts those queue times to the Rails Autoscale service. Check out the middleware code on GitHub if you’re interested.
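To make that "very simple operation" concrete, here is a rough sketch of the general pattern: a Rack middleware that records queue time in memory, plus a background thread that periodically drains the buffer. This is not the agent's actual code. The class names, the injected clock, and the reporter block are placeholders for illustration. The one outside fact it relies on is that Heroku's router sets the X-Request-Start header to a Unix timestamp in milliseconds.

```ruby
# Illustrative sketch only: all names are hypothetical, not the real agent.
class QueueTimeBuffer
  def initialize
    @mutex = Mutex.new
    @samples = []
  end

  # Called on the request path: a single array push under a lock.
  def record(queue_time_ms)
    @mutex.synchronize { @samples << queue_time_ms }
  end

  # Called by the reporter: returns the collected samples and resets the buffer.
  def flush
    @mutex.synchronize do
      drained = @samples
      @samples = []
      drained
    end
  end
end

class QueueTimeMiddleware
  # clock returns the current time in milliseconds (injectable for testing).
  def initialize(app, buffer, clock: -> { (Time.now.to_f * 1000).to_i })
    @app = app
    @buffer = buffer
    @clock = clock
  end

  def call(env)
    # Heroku's router timestamp, in milliseconds since the Unix epoch.
    started = env["HTTP_X_REQUEST_START"]
    if started
      queue_time = @clock.call - started.to_i
      @buffer.record(queue_time) if queue_time >= 0
    end
    @app.call(env)
  end
end

# Background reporter: wakes up periodically and hands any samples to the
# given block. A real agent would post them to a service instead.
def start_reporter(buffer, interval_seconds: 10, &on_flush)
  Thread.new do
    loop do
      sleep interval_seconds
      samples = buffer.flush
      on_flush.call(samples) unless samples.empty?
    end
  end
end
```

The request path only does a subtraction and an array push. Everything slower happens on the reporter thread, which is why this kind of design adds so little to response time.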

Why is the autoscale agent not running?

Perhaps you’re running a worker-only app, or an app with very little web traffic? If not, check out the troubleshooting guide.

What happens if I change my RAILS_AUTOSCALE_URL config var?

Bad Things will happen. The Rails Autoscale add-on manages this config var, so it’s best to leave it alone.

Also note that if you fork a Heroku app, it will copy the config vars, including RAILS_AUTOSCALE_URL. This also results in Bad Things, because Rails Autoscale doesn’t know about the forked app. If you do fork a Heroku app with Rails Autoscale installed, be sure to remove the RAILS_AUTOSCALE_URL config var.

Dashboard & Settings

Why don’t I see any data in my dashboard?

Most likely, the agent is not running. The troubleshooting guide will help you resolve this.

Can I have different autoscaling behavior on the weekend (or some other schedule)?

Rails Autoscale does not support this natively. This request usually comes from wanting a higher minimum number of dynos during busy times while scaling down further during quiet times. My recommendation is to allow your app to scale down, even during busy times. If you scale down too far, or if traffic picks up, you’ll immediately scale back up. Trust the autoscaler and give it a try!

If you really need different settings on some kind of schedule, you can create something yourself using the Rails Autoscale API alongside Heroku Scheduler or some other job scheduler.
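As a starting point, the scheduling half of that could look something like the sketch below. The hours and dyno counts are made-up values, and the method name is hypothetical. The request to the Rails Autoscale API is deliberately left out, since this FAQ doesn't describe its endpoints.

```ruby
# Hypothetical schedule: keep a higher floor during weekday business hours.
# All numbers here are illustration values, not recommendations.
def desired_dyno_range(time)
  weekday = (1..5).cover?(time.wday)          # Monday through Friday
  business_hours = (8...18).cover?(time.hour) # 08:00 up to 17:59

  if weekday && business_hours
    { min: 2, max: 8 }
  else
    { min: 1, max: 8 }
  end
end
```

A job run by Heroku Scheduler could call desired_dyno_range(Time.now.utc) and then send the result to the API.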

Web Autoscaling

What is request queue time?

Put simply, request queue time is the time between Heroku’s router receiving a request and your app beginning to process the request. It includes network time between the router and application dyno, and it includes time waiting within the dyno for an available application process. The latter is what we care about—if requests are waiting for more than a few milliseconds, there’s a capacity issue.

This is why Rails Autoscale only scales based on queue time. Web requests can be slow for lots of reasons, but queue time always reflects capacity.

How quickly does Rails Autoscale respond to a capacity issue?

When your request queue time breaches your upscale threshold, Rails Autoscale will send an upscale request to Heroku within 20 seconds. The agent reports metrics every 10 seconds, and it can take up to 10 more seconds for this data to be processed.

After sending the request to Heroku, it’ll take between 20 and 60 seconds (depending on the startup time for your app) for your new dyno to begin receiving requests. In the worst case, that adds up to about 80 seconds from the threshold breach to new capacity.

What if my app struggles to recover from a capacity issue?

Apps that receive steep spikes in traffic should consider scaling up by multiple dynos at a time. This option is available in your advanced settings.

Why is the queue time in Rails Autoscale so much higher than what I see in New Relic or Scout?

Most APM tools like New Relic and Scout show you the average for a given metric. Averages might provide smoother charts for overall trends, but they aren’t useful for detecting a capacity issue. Rails Autoscale uses the 95th percentile, so it will almost always be higher.
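A small worked example shows how far apart the two numbers can be. The queue times below are made up, and the nearest-rank percentile method is an assumption, since this FAQ doesn't say how Rails Autoscale computes its percentile.

```ruby
# Nearest-rank percentile: the smallest sample that at least pct% of
# samples are less than or equal to.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

def average(samples)
  samples.sum.to_f / samples.size
end

# Made-up queue times in milliseconds: 18 fast requests, 2 that waited a long time.
queue_times = [2] * 18 + [400, 500]

average(queue_times)        # => 46.8
percentile(queue_times, 95) # => 400
```

The average hides the fact that some requests waited close to half a second, while the 95th percentile puts it front and center.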

Why does my queue time spike whenever I deploy my app?

Unless you’re using Heroku’s Preboot feature, your app will be temporarily unavailable while it boots, such as during deploys and daily restarts. During this time, requests are routed to your web dynos, where they wait. All this waiting is reflected in your request queue time, which will likely trigger an upscale for your app.

This is not a bad thing! Your app autoscaling during a deploy means it’ll quickly recover from the temporary downtime during boot, and of course, it’ll autoscale back down once it catches up.

Why isn’t my app scaling down when I have almost no traffic?

Two possible reasons:

  • The autoscale agent starts up when your app receives its first request. If you’ve recently deployed or restarted—and your app is not receiving traffic—then the agent is probably not running.
  • Even if the agent is running, Rails Autoscale has a safeguard in place that prevents downscaling an app that hasn’t reported any web traffic for five minutes. This prevents downscaling an app that might be having a problem with the reporting agent.

Both of these issues are most common on staging/demo apps. The expectation is that your app is receiving some regular traffic, even if just a trickle, as is the case for most production apps. If you need a workaround, the best approach is to use an uptime monitor to regularly ping your app, or you can do it manually if you’re testing on a staging app.
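The two reasons above boil down to one check, sketched here in simplified form. This is only an illustration of the rule as described in this FAQ, not the service's actual logic, and the method name and arguments are hypothetical.

```ruby
# Hypothetical sketch of the "no recent traffic" safeguard.
def downscale_allowed?(last_traffic_report_at, now, window_seconds: 5 * 60)
  # Reason 1: the agent never started, so nothing has ever been reported.
  return false if last_traffic_report_at.nil?

  # Reason 2: no traffic has been reported for the whole safeguard window.
  (now - last_traffic_report_at) < window_seconds
end

now = Time.now
downscale_allowed?(nil, now)       # => false (agent never reported)
downscale_allowed?(now - 30, now)  # => true  (traffic seen 30 seconds ago)
downscale_allowed?(now - 600, now) # => false (silent for 10 minutes)
```

This is why a regular ping from an uptime monitor is enough to unblock downscaling: it keeps the last report inside the window.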

Worker Autoscaling

Can I use Rails Autoscale for a worker-only app (no web process)?

The Rails Autoscale agent only runs in a web process, so you must be running at least one web dyno. Even worker metrics are collected from the agent running in your web process.

Also note that a web request is what initially starts the agent process. If your app receives little or no web traffic, this could result in the agent never starting and never reporting metrics to Rails Autoscale. To work around this limitation, use an uptime monitor (FreshPing is a free option) to continually ping your site.

Can I autoscale multiple worker processes independently?

Yes! If you have multiple worker process types defined in your Procfile, they will all be available for autoscaling. Configuration for each process type is completely independent. Look for “autoscale additional worker dynos” in your settings page.

Which Rails background worker libraries are supported?

Sidekiq, Delayed Job, Que, and Resque (beta) are currently supported.

Why are my long-running jobs being terminated when scaling down?

Anytime you restart or shut down a worker dyno (such as downscaling, deploying, or restarting), you risk killing long-running jobs. Autoscaling often magnifies this issue because you’re shutting down worker dynos much more frequently.

Your worker backend will typically re-enqueue these jobs after being terminated, so you must ensure that your jobs are reentrant—that they can successfully re-run after a previous, interrupted run. You can also configure Rails Autoscale to prevent downscaling during long-running jobs.
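Here is one way a reentrant job might look, as a minimal sketch. The job class, the sent log, and the mailer are all hypothetical stand-ins. In a real app the log would live in your database rather than in memory.

```ruby
require "set"

# Hypothetical job that emails a list of users. It records each user it has
# finished, so a re-run after an interruption skips the completed work.
class SendDigestJob
  def initialize(sent_log:, mailer:)
    @sent_log = sent_log # ids already handled (a database table in practice)
    @mailer = mailer
  end

  def perform(user_ids)
    user_ids.each do |id|
      next if @sent_log.include?(id) # finished in a previous run
      @mailer.call(id)
      @sent_log.add(id)
    end
  end
end

# Simulate a dyno shutting down partway through the first run.
sent_log = Set.new
delivered = []
calls = 0
flaky_mailer = lambda do |id|
  calls += 1
  raise "dyno shut down" if calls == 3
  delivered << id
end

job = SendDigestJob.new(sent_log: sent_log, mailer: flaky_mailer)
begin
  job.perform([1, 2, 3, 4])
rescue RuntimeError
  # The worker backend would re-enqueue the job at this point.
end
job.perform([1, 2, 3, 4]) # the re-run only handles users 3 and 4

delivered # => [1, 2, 3, 4]
```

Note that an interruption between sending and recording could still repeat a single email, so this sketch gives at-least-once behavior rather than exactly-once.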

I’ve enabled worker autoscaling. Why don’t I see any data?

The agent takes a snapshot of job latency (queue time) every 10 seconds. If your job latency frequently hovers at 0 milliseconds, this might look like missing data in Rails Autoscale.
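To see why a healthy queue reports zeros, here is a small sketch of a latency snapshot. It assumes latency means the age of the oldest waiting job, which is how Sidekiq defines queue latency. This FAQ doesn't spell out the agent's exact definition, and the method name is hypothetical.

```ruby
# Latency at one instant: how long the oldest waiting job has been enqueued.
# An empty queue has a latency of 0 milliseconds.
def queue_latency_ms(enqueued_at_times, now)
  oldest = enqueued_at_times.min
  return 0 if oldest.nil?

  ((now - oldest) * 1000).round
end

now = Time.at(100)
queue_latency_ms([], now)                            # => 0
queue_latency_ms([Time.at(97.5), Time.at(99)], now)  # => 2500
```

If your workers pick up jobs almost immediately, the queue is usually empty at the moment of each snapshot, so most data points land at 0.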

If you do expect to see some worker queue time in Rails Autoscale, it’s possible the agent is not running. Do you see queue time for your web dynos? If not, you’re probably running a worker-only app or an app that receives very little web traffic.

If you do see web queue times but no data for your worker dynos, email me.