Cruiskeen Consulting LLC

Drupal time in the midwest

It's that time of year again, when Drupal camps spring up around the midwest.  I'm happy to say that Cruiskeen Consulting is sponsoring two different Drupal camps in the area, and that I am presenting at Drupal Camp Wisconsin tomorrow.

Drupal Camp Wisconsin is taking place this Friday and Saturday (July 25 and 26). It is always great fun, with good talks, a nice party, and a great opportunity to meet other Drupalistas. And it's always free. More information can be found at http://drupalcampwi.com/ . I will be presenting a talk tomorrow about Drupal Migrate for the non-coder.

Drupal Camp Twin Cities is also happening soon, August 7-10. The Twin Cities camp is a bigger affair, with more attendees and a really high-powered lineup of first-rung Drupal folks. I'm not speaking there, but I am an individual sponsor and will be there soaking up as much Drupal knowledge as I can. There is also a spectacular party every year thrown by Advantage Labs. I'd love to meet you at either venue.

Regression in Drupal 7.29 can lead to data loss

Thanks to Acquia for the heads-up.   There's a serious reported regression in Drupal 7.29 that can lead to data loss.

Because of this, we are not currently doing automatic Drupal upgrades for clients who are on maintenance and are running the Drupal 7 branch. We're waiting until this problem is more fully understood and, we hope, a patch or new release is available. We think the potential security risk of not doing the upgrade for a few days is less than the risk of clients destroying data on their sites. We want to at least understand the issue better.

We ARE in the process of doing Drupal 6 upgrades and filefield module upgrades for maintenance service customers, however.

A simple rule to block some Drupal spammers

We've all been there - your Drupal site is being overwhelmed by people and robots trying to post to your site, or trying to create accounts.  These attempts can really hammer your web server, and it's the sort of traffic that isn't going to be helped any by the thirty-five different kinds of caching you have set up on your site.  

We've seen this get much worse recently with some of our clients, so we set out to do something about it that's simple and reliable. We currently use fail2ban on all of our servers to cut back on attempts to break into the servers through ssh, ftp, etc. Fail2ban reads your server logs and looks for matching patterns. If a pattern matches, the server can carry out any number of actions, including temporarily banning an IP address through iptables. We thought "why not use fail2ban to block bots that are hammering the server?" So we came up with this simple plan.
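To make the idea concrete, here is a rough Python sketch of the kind of bookkeeping involved: scan log lines, pick out the client address on lines that match a pattern, and flag an address once it has matched often enough within a time window. This is only an illustration, not fail2ban's actual code. The names `POST_LINE`, `find_offenders`, `max_hits` and `window_seconds` are made up for the example, and the log layout is assumed to start with the client IP.

```python
import re
from collections import defaultdict, deque

# Hypothetical pattern: client address at the start of the line, then a POST request.
POST_LINE = re.compile(r'^(?P<host>\S+) -.*"POST ')


def find_offenders(entries, max_hits=5, window_seconds=600):
    """Return the hosts that reach max_hits matching lines within window_seconds.

    entries is an iterable of (timestamp_in_seconds, log_line) pairs in time order.
    """
    recent = defaultdict(deque)  # host -> timestamps of its recent matching lines
    offenders = set()
    for timestamp, line in entries:
        match = POST_LINE.search(line)
        if match is None:
            continue
        hits = recent[match.group("host")]
        hits.append(timestamp)
        # Forget hits that have fallen out of the time window.
        while hits and timestamp - hits[0] > window_seconds:
            hits.popleft()
        if len(hits) >= max_hits:
            offenders.add(match.group("host"))
    return offenders
```

A real ban would then be handed off to something like an iptables rule. The sketch stops at deciding which addresses qualify.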

On our servers that have been having particular difficulty, we have implemented a couple of fail2ban rules designed to protect the web server from incoming spam traffic and email address crawlers.  The email address crawler rule is part of the fail2ban package, and we just enabled it.  

We wrote a customized rule that reads our Varnish logs on the server (though you can use the same rule with slight modification to read the web server logs as well). This file was placed into the /etc/fail2ban/filter.d directory. It reads the Varnish log and looks for POSTs to comments, POSTs to the registration page, and POSTs to a particular domain (this is the reverse lookup domain from our colocation site; we find that a certain number of posts come in to that domain, and this is NOT normal traffic). The file looks like this:

----------------------------------------------------------------

# Fail2Ban configuration file
#
# Author: Cyril Jaquier
#
# $Revision: 569 $
#

[Definition]

# Option:  failregex
# Notes.:  regex to match the password failure messages in the logfile. The
#          host must be matched by a group named "host". The tag "<HOST>" can
#          be used for standard IP/hostname matching and is only an alias for
#          (?:::f{4,6}:)?(?P<host>\S+)
# Values:  TEXT
#

failregex = ^<HOST> -.*POST.*comment/reply
        ^<HOST> -.*POST.*register
        ^<HOST> -.*POST.*airstreamcomm

# Option:  ignoreregex
# Notes.:  regex to ignore. If this regex matches, the line is ignored.
# Values:  TEXT
#

ignoreregex =

-------------------------------------------------------------------------

We then set up another rule that uses this set of regexes. This is edited into the jail.conf file in /etc/fail2ban.

_____________________

[apache-posts]
enabled  = true
filter   = apache-posts
action   = iptables-multiport[name=posts, port="http,https"]
           sendmail-buffered[name=posts, lines=5, dest=shanson@cruiskeenconsulting.com]
logpath  = /var/log/varnish/varnishncsa.log
bantime  = 172800
maxretry = 5

____________________________

The gist of all of this is that if the expressions in the first rule set are triggered 5 times (maxretry = 5) by the same address within 10 minutes (the findtime window, which this jail does not override), then we ban the address for 48 hours (bantime = 172800 seconds). This is fairly draconian, but has worked quite well for us. In the grand scheme of things that is a lot of attempts to post from a single IP, but your mileage may vary. I'd suggest you start with a considerably higher maxretry setting, and then slowly cinch it down to make sure you're not causing yourself a lot of grief.

Whenever an IP is banned, fail2ban fires off an email to me (which gets automatically foldered in my email account). This is helpful for tracing back when an IP has been banned by accident. The fail2ban system also produces its own logs, which help as well. Read about fail2ban at the link at the beginning of the article. Happy IP blocking!
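Before turning a filter like this loose, it is worth checking what the three failregex lines actually catch. Below is a small Python sketch that does this against made-up log lines. Everything in it beyond the three patterns is an assumption: the `HOST` group is a simplified stand-in for fail2ban's `<HOST>` tag, example.com is a placeholder, and the sample lines only imitate an NCSA-style layout with the client address first.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (assumption: the real tag is
# stricter about what counts as an IP address or hostname).
HOST = r"(?P<host>\S+)"

# The three patterns from the filter file.
FAILREGEX = [
    r"^<HOST> -.*POST.*comment/reply",
    r"^<HOST> -.*POST.*register",
    r"^<HOST> -.*POST.*airstreamcomm",
]
PATTERNS = [re.compile(p.replace("<HOST>", HOST)) for p in FAILREGEX]


def matched_host(line):
    """Return the client address if any pattern matches the line, else None."""
    for pattern in PATTERNS:
        m = pattern.search(line)
        if m:
            return m.group("host")
    return None


# Made-up sample lines; addresses come from the documentation ranges.
SAMPLES = [
    '203.0.113.7 - - [24/Jul/2014:10:00:00 -0500] "POST http://example.com/comment/reply/12 HTTP/1.1" 200 512 "-" "bot"',
    '203.0.113.8 - - [24/Jul/2014:10:00:01 -0500] "POST http://example.com/user/register HTTP/1.1" 200 512 "-" "bot"',
    '198.51.100.9 - - [24/Jul/2014:10:00:02 -0500] "GET http://example.com/comment/reply/12 HTTP/1.1" 200 512 "-" "browser"',
    '198.51.100.10 - - [24/Jul/2014:10:00:03 -0500] "POST http://example.com/user/login HTTP/1.1" 302 0 "-" "browser"',
]

for line in SAMPLES:
    print(matched_host(line), "<-", line[:70])
```

Note that the patterns are deliberately loose: the `.*` means a POST whose referrer or user agent happens to contain "register" would also match, which is one more reason to start with a generous maxretry. Fail2ban also ships a `fail2ban-regex` command for running a filter file against a real log, which is the better test once the filter is in place.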