Genx

Born in 1969, I am a member of the so-called “Generation X”.

I recall at one point that Madison Ave. had given up on trying to reach Generation X.

But it didn’t really give up - it adapted. Sort of.

But I (and my generation) adapted too. Sort of.

And I buy things I think I need from companies with whom [1] I might have ethical, moral or other reasons not to do business.

War of the Worlds

I remember as a kid being fascinated by the story around the radio broadcast of “The War of the Worlds”.

It shows how strong media can be in affecting people’s beliefs. Since then, media has only gotten better, more pervasive, more persuasive, more trusted, more ubiquitous. The War of the Worlds incident showed that media has to be careful to not abuse the power of the medium.

And people like stories. We get so fascinated by them. We believe them – we want to believe them. They pull us out of this “real” world to a place that is in any number of ways better.

I guess people do this all the time on a smaller, personal level. We are each like little media outlets - “broadcasting” out a message that we kinda know or hope will get interpreted in a certain way. Sometimes we do this to help the communication, and sometimes to mislead or create a false impression.

This reminds me of the Seinfeld where George realizes that if he acts angry at work then his coworkers will think he’s working really hard and too busy to be bothered with new stuff.

Vim-and-such

I had all these windows open that I intended to read and study. Here is a snip from each one.

  • Git for Computer Scientists In simplified form, git object storage is “just” a Directed Acyclic Graph of objects … So, armed with that knowledge on how git stores the version history, how do we visualize things like merges?
  • Vim Find and replace examples Example 11. Alter sequence number in a numbered list while inserting a new item. This can be achieved by the following vim substitution command. :4,$s/\d\+/\=submatch(0) + 1/
  • Readline shortcuts Transpose
     keys    Action          Scope                   Direction/Place
     Ctrl-t  Transpose/drag  char. before the cursor ↷ over the character at the cursor
     Alt-t   Transpose/drag  word before the cursor  ↷ over the word at/after the cursor
  • iTerm2 keybindings Readline also supports jumping between words, that is re-positioning the cursor before or after the current/next word. The default binding for this is ESC-f for forward jumping and ESC-b for backward jumping.
  • VimGolf Real Vim ninjas count every keystroke - do you? (Danger! Quicksand.)
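The Vim substitution quoted in the find-and-replace snip can be read as: on every line from 4 to the end of the buffer, replace the first run of digits with that number plus one. Here is a rough Python equivalent, as a sketch only. The function name renumber_from is made up for illustration, and Vim's own string-to-number conversion has corner cases this ignores.

```python
import re


def renumber_from(lines, start=4):
    r"""Mimic :{start},$s/\d\+/\=submatch(0) + 1/ on a list of lines.

    Lines are 1-indexed as in Vim. Only the first number on each
    affected line changes, because the command has no g flag.
    """
    def bump(match):
        return str(int(match.group(0)) + 1)

    out = []
    for number, line in enumerate(lines, start=1):
        if number >= start:
            line = re.sub(r"\d+", bump, line, count=1)
        out.append(line)
    return out
```

Calling it on a five-item numbered list leaves items 1 to 3 alone and bumps the rest, which opens a gap at position 4 for the new item.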

MMM Master-master-mysql

I started trying to set up a test environment for feeling out master-master replication with MySQL. I can do all this on one machine by running 2 mysqld processes from different configurations, on different ports, sockets, data directories, users, etc.
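A sketch of what "different configurations" can look like: the snippet below renders one option file per instance, varying most of the things listed above. The directory layout under /tmp/mmm-test and the instance numbering are made up for illustration. server-id, log-bin, auto_increment_increment and auto_increment_offset are standard MySQL options commonly set for master-master replication, but treat the values as a starting point rather than a tested setup.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath("/tmp/mmm-test")  # hypothetical scratch directory


def render_cnf(instance_id, instance_count=2, base_port=3306):
    """Return my.cnf text for one of several mysqld instances on one host."""
    home = BASE / f"mysql{instance_id}"
    options = {
        "port": base_port + instance_id,
        "socket": home / "mysql.sock",
        "datadir": home / "data",
        "pid-file": home / "mysql.pid",
        "server-id": instance_id,   # must be unique per instance
        "log-bin": "mysql-bin",     # a master needs its binary log
        # Interleave auto-increment values so the two masters never collide.
        "auto_increment_increment": instance_count,
        "auto_increment_offset": instance_id,
    }
    lines = ["[mysqld]"] + [f"{key} = {value}" for key, value in options.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for instance_id in (1, 2):
        print(f"# --- instance {instance_id} ---")
        print(render_cnf(instance_id))
```

Each rendered file can then be handed to its own mysqld process with --defaults-file.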

Turning on apc.stat

It is common in a LAMP setup to configure APC with apc.stat = 0 so that once your opcodes are loaded into the cache they stay there – till you flush the cache or restart the web server. This makes sense if you have relatively infrequent updates. At jobX we are deploying dozens of times a day, but generally, each deploy consists of changes to just a small number of files. Despite this we send Apache a graceful restart after each deploy, which flushes the APC cache. This isn’t causing any big problems but we can do this a bit better.

We actually want apc.stat = 1.

There were all sorts of fears and doubts about doing this. It seemed the exact opposite of what you should do on a large site. We feared that all those extra stat calls would slow things down or even cause the site to crash. But rather than rely on our intuition about this, we decided to try it out and see what actually happens.
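To make the trade-off concrete, here is a toy model of what the setting controls. It is not APC's code, just a sketch of the idea: with stat checking on, every lookup costs one stat call and a changed file gets picked up; with it off, the cached copy is served until the cache is cleared. The class name ToyOpcodeCache and the "compile" step (which only reads the file) are invented for illustration.

```python
import os


class ToyOpcodeCache:
    """Toy stand-in for an opcode cache; 'compiling' is just reading the file."""

    def __init__(self, stat=True):
        self.stat = stat
        self.entries = {}    # path -> (mtime_ns, compiled)
        self.stat_calls = 0
        self.compiles = 0

    def _compile(self, path):
        self.compiles += 1
        with open(path) as handle:
            return handle.read()

    def load(self, path):
        cached = self.entries.get(path)
        if cached is not None and not self.stat:
            return cached[1]               # stat off: trust the cache blindly
        mtime = os.stat(path).st_mtime_ns  # stat on: one stat call per lookup
        self.stat_calls += 1
        if cached is not None and cached[0] == mtime:
            return cached[1]               # unchanged on disk, reuse the entry
        compiled = self._compile(path)
        self.entries[path] = (mtime, compiled)
        return compiled
```

The counters make the cost visible: the stat-on cache pays one stat call per request, while the stat-off cache keeps serving the old version of a file that has been redeployed.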

First we confirmed that apc.stat (and apc.stat_ctime) would behave the way we expected when we dropped new code on the server. Relatively easy and worked as expected. Next we needed to figure out how big our cache should be and if we wanted a ttl or not. In order to do that I needed a bit of a deeper understanding of how APC operates.

It’s pretty well documented that when APC fills up, it dumps the whole cache and starts over. Generally you want to avoid this, for much the same reasons that we wanted to get away from doing graceful restarts. What was less well documented was how apc.ttl really works. Most of what I read encouraged apc.ttl = 0 to avoid fragmentation.

If you have apc.ttl = 0, APC will not purge anything from your cache (unless the cache fills up, at which point APC will purge everything from the cache and start over). It may not be obvious, but with apc.stat = 1 and apc.ttl = 0, APC will hold onto older versions of files that have changed. If you’ve got lots of frequent changes, this could quickly lead to your cache filling up with old versions of your opcodes.

This behavior changes when setting apc.ttl to something larger than zero. APC will not do any cleanup unless it needs to. So if you set your ttl to 3600, that does not mean that APC will automatically dump all files older than an hour. APC will remove items older than the ttl…

  • if it needs the space for other new items.
  • if a newer version of the same file gets inserted and an old version is older than the ttl.

For example, your config.php is cached and you make a change and deploy a new config.php. APC will cache the new version of the file and it will also hold onto the previous version of the file. However, if apc.ttl is > 0, apc will opportunistically check to see if it can clear out any of those old versions of that file. This helps keep cache size in check but it comes at a cost, namely fragmentation.
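The purge rules above are easier to follow in a toy model. This is a sketch of the behavior as described in this post, not APC's real implementation: entries age from the moment they are cached, and the dump-everything fallback for ttl > 0 when expired entries don't free enough room is my assumption. The class name ToyTtlCache is made up.

```python
class ToyTtlCache:
    """Toy model of the purge rules described above (not APC's real code)."""

    def __init__(self, capacity, ttl):
        self.capacity = capacity
        self.ttl = ttl
        self.entries = []  # each entry: (path, version, size, cached_at)

    def used(self):
        return sum(size for _, _, size, _ in self.entries)

    def _expired(self, entry, now):
        # With ttl = 0 nothing ever counts as expired.
        return self.ttl > 0 and now - entry[3] > self.ttl

    def insert(self, path, version, size, now):
        # A newer version arrives: expired old versions of the same file go.
        self.entries = [
            e for e in self.entries
            if not (e[0] == path and self._expired(e, now))
        ]
        if self.used() + size > self.capacity:
            # Space is needed: drop anything past its ttl.
            self.entries = [e for e in self.entries if not self._expired(e, now)]
        if self.used() + size > self.capacity:
            # Still no room (with ttl = 0 nothing could be expired):
            # dump everything. For ttl > 0 this is an assumption of the sketch.
            self.entries = []
        self.entries.append((path, version, size, now))


if __name__ == "__main__":
    cache = ToyTtlCache(capacity=100, ttl=3600)
    cache.insert("config.php", 1, 10, now=0)
    cache.insert("config.php", 2, 10, now=60)    # v1 is younger than the ttl, so it stays
    cache.insert("config.php", 3, 10, now=4000)  # v1 and v2 are past the ttl, so they go
    print([version for _, version, _, _ in cache.entries])
```

Running the same three inserts with ttl=0 keeps all three versions around, which is the slow fill-up described earlier.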

As old items are purged from the cache you start to get fragmentation. If fragmentation gets too high (more than 50%) you’re likely to start to see some performance impact. One way to keep fragmentation in check is to have a large enough cache. Some of the research I found suggests that a cache size about twice as large as the size of your files will go a long way towards keeping fragmentation low. I’m not sure but it sounds like APC might do some relocation of items in the cache which goes easier when it has large empty blocks to use for reshuffling.

Armed with this understanding I took a look at our historical graphs of APC (you do have graphs for everything, right?) to see how big the cache should be. Over the weekends, when there generally aren’t any deploys, the cache grows to just under 200M. Our cache was sized at 384M to start. I wanted to keep the cache comfortably large so I started ours at 500M.
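The sizing reasoning is simple enough to write down. A sketch, assuming the "about twice the size of your files" rule of thumb mentioned above and the numbers from our graphs; the helper name is made up:

```python
def min_cache_size_mb(working_set_mb, headroom_factor=2.0):
    """Smallest cache size that satisfies the 'twice your files' rule of thumb."""
    return working_set_mb * headroom_factor


working_set_mb = 200  # weekend high-water mark (the graphs show just under this)
old_size_mb = 384
new_size_mb = 500

target = min_cache_size_mb(working_set_mb)
print(f"rule-of-thumb target: {target:.0f}M")
print(f"old size meets it: {old_size_mb >= target}")
print(f"new size meets it: {new_size_mb >= target}")
```

By that rule the old 384M cache was slightly under the target and 500M clears it with some room to spare.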

To get a ttl to start with, I looked at our deploy frequency, which files change most frequently, and how many files change per deploy, and went with 3600 (one hour).

Here are a set of articles with details about apc:

Apc

I had to do a fairly deep dive into APC for some setup changes at work. These were related to our deployment process, which requires us to gracefully restart Apache after each deploy to deal with clearing APC. We’ve got a large code base and lots of deploys per day (30+), so while lots of files change every day, there are more that stay the same. Allowing the cache to endure through deploys is what we want, so we actually want to change from apc.stat = 0 to apc.stat = 1. With that comes making sure our cache size is large enough and setting a reasonable ttl.

Here are a set of articles with details about apc:

In a pool of 75 servers, 3 soon started showing increased fragmentation. It turns out those 3 servers didn’t have our app loaded on the ram disk (tmpfs) but were reading it from the disk.
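A check along these lines would have flagged those servers earlier. It is a Linux-only sketch that parses text in the format of /proc/mounts and reports the filesystem type of the mount that holds a given path. The function name is made up, and it ignores the octal escapes /proc/mounts uses for mount points containing spaces.

```python
import posixpath


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the longest mount point containing path.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Longest match wins; on a tie the later mount shadows the earlier.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type
```

On a server, feed it the contents of /proc/mounts and the directory the app is deployed to; anything other than tmpfs means the code is being read from disk.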

Chef

I set up this server to be managed (or manageable) by chef using the hosted opscode platform. This post is for some notes about how things are set up here.

I basically followed the quick-start guide from OpsCode.

I cloned the OpsCode repo skeleton and it’s sitting in /home/jkolber/chef-repo. All the settings are in the .chef directory inside of that dir: /home/jkolber/chef-repo/.chef