Tag Archives: linux

Making the Correct Insanely Difficult

tl;dr

If you’re trying to configure nginx on Elastic Beanstalk to redirect http requests to https, here’s what I learned.

  • During deployment, the nginx configuration for your app is located at this file path: /tmp/deployment/config/#etc#nginx#conf.d#00_elastic_beanstalk_proxy.conf
  • Using a container command, you can edit that nginx configuration file right before it gets deployed.
  • I used a little perl one-liner to insert the redirect.
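As a rough illustration of that last step, here is a minimal sketch of a perl one-liner that inserts a redirect rule. It is not necessarily the exact one-liner used for this app. It assumes the load balancer sets the X-Forwarded-Proto header, and it runs against a made-up stand-in config, because the real generated file is not shown here.

```shell
#!/bin/sh
# Stand-in for the staged proxy config. The contents below are made up for
# illustration; the real file is generated by Elastic Beanstalk.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
server {
    listen 8080;
    location / {
        proxy_pass http://nodejs;
    }
}
EOF

# Append a redirect rule right after the line that opens "location /".
# Assumption: the load balancer sets X-Forwarded-Proto, which nginx exposes
# as $http_x_forwarded_proto.
perl -pi -e '$_ .= qq|        if (\$http_x_forwarded_proto = "http") { return 301 https://\$host\$request_uri; }\n| if m|^\s*location / \{|' "$conf"

cat "$conf"
```

In a container command, the target would be the staged file path from the first bullet instead of a temp file. Note that running the one-liner twice inserts the rule twice, so it only suits a freshly generated file.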

Background

So... we're using Amazon Web Services Elastic Beanstalk for one of the apps I'm working on. It's pretty easy to get started, but it's also really easy to find that you’re fighting Elastic Beanstalk to get it to stop doing something stupid.

I was fighting one of those "stupid" things the other day: http-to-https redirect.

Let's say you have a web application that requires users to log in with a name and a password. You don't want users' passwords getting sent over the internet without being encrypted, of course. So you enable SSL and serve content over https.

But sometimes, users type your domain name (like, “google.com”) into the address bar, which defaults to http. Or they follow a link to your app that mistakenly uses http instead of https. In any event, you don’t want users who are trying to get to your app to get an error message telling them there’s nothing listening on the other end of the line, so you need to be listening for http requests but redirecting them to https for security.

Now, our app is written in Node.js, and we’ve configured Elastic Beanstalk to point internet traffic to an Elastic Load Balancer, which terminates SSL and proxies traffic to the backing servers, which are running our app behind nginx. This might sound like too many levels of indirection, but nginx is optimized for serving static content, while Node.js is optimized for dynamic content, so this is a pretty common setup.

And this is where Elastic Beanstalk gets stupid.

When we configured our app to listen for both http and https traffic, Elastic Beanstalk directed all of that traffic to nginx — and configured nginx to direct all of that traffic to our app — without giving us any way to redirect http traffic to https.

I imagine lots of apps want to respond to both http and https traffic while redirecting insecure http requests to secure https requests. Maybe I’m wrong.

Anyway, I want to do that. And I found it insanely difficult to accomplish.

Or: How I Learned to Stop Worrying and Love the Memory Leak

I received a "high memory usage" alert. Already panicking, I logged into New Relic and saw this terrifying graph:

Memory Leak?

That's a graph of memory usage, starting from when the server was created. For the uninitiated, when memory usage grows and grows and grows like that, chances are very, very high that you've got a nasty memory leak on your hands. Eventually, your server is going to run out of memory, and the system will start killing processes to recover some memory and keep running -- or just crash and burn.

The funny thing about this particular server is that I had already identified that this server was leaking resources, and I thought I'd fixed it.

Issue Closed

So, I started to investigate.

Running free -m confirmed that nearly all the memory was in use. But top (sorted by MEM%) indicated that none of the server processes were using much memory. Huh?

After some time on Google and Server Fault, I ran slabtop and saw that nearly all server memory was being cached by the kernel for something called dentry. This server has 16GB of RAM -- I'm no expert, but I'm pretty sure it does not need 14GB of cached directory entries. I knew I could free this RAM, and with some more help from Google I found the magic incantation:

echo 2 > /proc/sys/vm/drop_caches

After 5 terrifying seconds during which the server seemed completely locked up, the memory had been freed! But apparently, something about the way this server was acting was causing the kernel to keep all these directory entries cached. In other words, this was probably going to keep happening. I didn't want to have to create a cron job to manually clear the cache every 4 hours, but I wasn't above it.
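For a quick check of how much memory is sitting in reclaimable kernel caches (dentries included) without opening slabtop, /proc/meminfo reports it on the SReclaimable line. Here is a minimal sketch; the function name is made up for illustration, and the optional file argument only exists so it can be pointed at a saved copy.

```shell
# Print the kernel's reclaimable slab memory in MiB.
# Reads /proc/meminfo by default; pass another file to inspect a saved copy.
# (slab_reclaimable_mib is a hypothetical helper name.)
slab_reclaimable_mib() {
    awk '/^SReclaimable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}
```

Calling slab_reclaimable_mib before and after dropping the caches should show the number fall, assuming the memory really was held by reclaimable slab.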

More reading told me that maybe I was worried about nothing. Looking closely at the peaks of that graph, I saw that the kernel was freeing up memory.

Not a leak!

So maybe I was worried about nothing! Still, I didn't want New Relic alarms going off all the time. And what if the server needs memory more quickly than the kernel can free it? It seemed like something I shouldn't have to worry about.

Yet more Google-noodling, and I found that you can indeed tell the kernel how aggressively to clear the caches. (That latter post captured my thoughts almost exactly, and traced my experience tracking down this issue to a tee.)

So, after some tweaking, I settled on setting the following sysctl configuration in /etc/sysctl.conf (edit the file, then load it with sysctl -p):

vm.vfs_cache_pressure=100000
vm.overcommit_ratio=2
vm.dirty_background_ratio=5
vm.dirty_ratio=20

It seemed like the higher I set vm.vfs_cache_pressure, the earlier (that is, at lower memory usage) the kernel would free up the cache.
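To confirm what the running kernel is actually using, each sysctl key maps to a file under /proc/sys, with the dots turned into slashes. Here is a small sketch that prints the four settings above; both function names are made up for illustration.

```shell
# Map a sysctl key such as vm.vfs_cache_pressure to its file under /proc/sys.
# (sysctl_path and show_vm_settings are hypothetical helper names.)
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print "key = value" for each of the settings above that this kernel exposes.
show_vm_settings() {
    for key in vm.vfs_cache_pressure vm.overcommit_ratio \
               vm.dirty_background_ratio vm.dirty_ratio; do
        path="$(sysctl_path "$key")"
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(cat "$path")"
        fi
    done
    return 0
}
```

Running show_vm_settings after sysctl -p is a cheap way to check that the edited file was loaded.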

Here's a sweet graph showing three states:

  • [A] untweaked
  • [B] manually clearing the cache with echo 2 > /proc/sys/vm/drop_caches
  • [C] memory usage using the tweaked sysctl configuration

Slab Annotated

Those saw teeth on the right? That's the kernel freeing memory. Just like it was doing before, but more aggressively. This is a "memory leak" I can live with.

Using watch with a bash alias

I love the Unix watch command. On OS X, you can install it easily with Homebrew:

brew install watch

Something I didn't realize until 10 minutes ago is that if you try to watch the output of one of your bash aliases, watch will complain that it cannot find the command. This is because watch runs the command you pass to it with 'sh -c', which knows nothing about your aliases. However, if you alias watch to itself with a trailing space, bash also checks the word after watch for alias expansion before watch ever runs, so aliases will work. So, you can add the following to your .bashrc:

alias watch='watch '

Note the trailing space inside the quotation marks.
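To see the trailing-space rule on its own, without watch in the picture, here is a small throwaway demo. The alias names (show, plain, target) are made up for illustration. The demo is written to a temporary script because non-interactive bash only expands aliases once expand_aliases is switched on.

```shell
# Write a tiny bash script that defines three aliases, then run it.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
shopt -s expand_aliases   # scripts do not expand aliases by default
alias show='echo '        # trailing space, like alias watch='watch '
alias plain='echo'        # same command, no trailing space
alias target='expanded'
show target               # next word is alias-checked: prints "expanded"
plain target              # next word is left alone:    prints "target"
EOF

result="$(bash "$demo")"
printf '%s\n' "$result"
rm -f "$demo"
```

The first line of output comes from the alias with the trailing space, which is the same mechanism that lets watch see your aliases.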

Fixing Node.js v0.8.2 Build on Linux

There's a nasty gcc bug on RedHat (RHEL 6) and CentOS Linux (and related) that gets triggered when you try to build Node.js v0.8.2: pure virtual method called.

Solution: Run make install CFLAGS+=-O2 CXXFLAGS+=-O2 instead of just make install.

Resolution of trouble with CentOS 4 + PHP 5.1 + Zend Optimizer

I have a server running CentOS 4.4 updated with the PHP 5.1.6 from the testing repository. When I tried to install Zend Optimizer 3.0.2, I ran into this problem:

Failed loading [path]/ZendOptimizer.so: [path]/ZendOptimizer.so: undefined symbol: match

Then, when I tried to downgrade and install Zend Optimizer 3.0.1, I ran into this problem:

Failed loading [path]/ZendExtensionManager.so: [path]/ZendExtensionManager.so: failed to map segment from shared object: Permission denied

I noticed some talk about SELinux being a possible culprit, but rather than disable SELinux, I decided to try and solve the problem.

This article about SELinux and ColdFusion MX in Red Hat Linux 4 had the information I needed to quickly solve the problem. Basically, I needed to change the security context of the Zend extension manager and optimizer files so that Zend runs in the same security domain as the web server (in my case, Apache).

Here's what I did (note, I installed Zend Optimizer in a non-standard location -- /usr/include/php/Zend rather than /usr/local/Zend):


cd /usr/include/php/Zend/lib
chcon -R --reference=/usr/sbin/httpd *.so
service httpd restart

Et voila!
Zend Working Screen Cap

Bash script: ted

I just wrote a little bash script, "ted" (for Tracking EDitor), which I am loving. You call ted like you would call your usual text editor, and ted backs up the file you're editing, appending the original timestamp as a suffix. Then, when you're done editing the file, ted runs diff to keep a running log of changes you've made to the file. If you've made no changes, ted removes the backup file and exits without running diff.
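The workflow described above can be sketched roughly as follows. This is not the actual ted script, only an approximation under a few assumptions: the backup suffix comes from the file's modification time via GNU date -r, the change log lives next to the file as FILE.changelog (a made-up name), and the editor is taken from $EDITOR.

```shell
# ted: rough sketch of a "tracking editor" wrapper (not the original script).
ted() {
    local file="$1" stamp backup log

    # Nothing to back up for a new file: just open the editor.
    [ -f "$file" ] || { "${EDITOR:-vi}" "$file"; return; }

    # Suffix the backup with the file's last-modified time.
    # Assumption: GNU date, where -r FILE reads the file's mtime.
    stamp="$(date -r "$file" +%Y%m%d%H%M%S)"
    backup="$file.$stamp"
    log="$file.changelog"    # hypothetical log location
    cp -p "$file" "$backup" || return 1

    "${EDITOR:-vi}" "$file"

    if cmp -s "$backup" "$file"; then
        rm -f "$backup"      # no changes: remove the backup and skip diff
    else
        {
            printf '=== %s edited %s ===\n' "$file" "$(date)"
            diff -u "$backup" "$file"
        } >> "$log"          # append this edit to the running log
    fi
    return 0
}
```

Usage would look like ted /etc/hosts. Keeping the timestamped backup after a real edit means each entry in the log has a matching earlier version on disk.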

Introducing: ted.