Wednesday, 12 June 2013

MySQL: "Truncated incorrect DOUBLE value" issue

Man oh man this had me going for quite a while.
Guess I must be getting old.
Anyway, I have a rails site that uses ActiveRecord to a MySQL db.
One table has two columns that are integers, but have to be unsigned as they hold values > 2147483647.
(See http://dev.mysql.com/doc/refman/5.5/en/integer-types.html for more detail)
Anyway, the insertions went just fine.
An example output of the table structure and last few rows is as follows:

mysql> show columns from ranges;
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| id        | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| starts_at | int(10) unsigned | NO   | PRI | NULL    |                |
| ends_at   | int(10) unsigned | NO   | PRI | NULL    |                |
| iso3_code | varchar(50)      | YES  |     | NULL    |                |
+-----------+------------------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

mysql> select * from ranges order by id desc limit 5;
+--------+------------+------------+-----------+
| id     | starts_at  | ends_at    | iso3_code |
+--------+------------+------------+-----------+
| 126602 | 4278190080 | 4294967295 | ZZZ       |
| 126601 | 4261412864 | 4278190079 | ZZZ       |
| 126600 | 4244635648 | 4261412863 | ZZZ       |
| 126599 | 4227858432 | 4244635647 | ZZZ       |
| 126598 | 4211081216 | 4227858431 | ZZZ       |
+--------+------------+------------+-----------+
5 rows in set (0.00 sec)

My code gets a value and does a search for the row that includes that number.
So if for instance I want the iso3_code for value 4261412864:

mysql> select * from ranges where starts_at >= 4261412864 
order by starts_at limit 1;
+--------+------------+------------+-----------+
| id     | starts_at  | ends_at    | iso3_code |
+--------+------------+------------+-----------+
| 126601 | 4261412864 | 4278190079 | ZZZ       |
+--------+------------+------------+-----------+
1 row in set (0.00 sec)

The problem was the generated SQL from ActiveRecord shown below:

SELECT `ranges`.* FROM `ranges` WHERE ("starts_at" >= 4261412864) 
ORDER BY `id` LIMIT 1

In the code I got a nil row set.
My first thought was: WTF?
So I cut and pasted it into the mysql command line and saw this:

mysql> SELECT `ranges`.* FROM `ranges` WHERE ("starts_at" >= 4261412864) 
ORDER BY `id` LIMIT 1;
Empty set, 1 warning (0.00 sec)

Wait. Warnings?

mysql> show warnings;
+---------+------+-----------------------------------------------+
| Level   | Code | Message                                       |
+---------+------+-----------------------------------------------+
| Warning | 1292 | Truncated incorrect DOUBLE value: 'starts_at' |
+---------+------+-----------------------------------------------+
1 row in set (0.00 sec)

What? DOUBLE? But the column starts_at is an integer...
So I started changing column types, and generally wasting my time for a couple of hours.

To cut a long story short it's programmer blindness.
It's obvious in retrospect.
The select statement is incorrect and I was interpreting the warning incorrectly.
I assumed the message referred to the column starts_at.
Nope.
What the select where clause is actually saying:
"Oh. You want rows from ranges where the string 'starts_at' has a value that is greater or equal to 4261412864. Er. Ok. I'll try to cast the STRING 'starts_at' to a DOUBLE and then do a comparison. Er... Ok. That didn't produce any rows and by the way I tried to cast that STRING to a double and failed."
Dammit.
So I looked at my code (simplified of course):

range = Range.where('"starts_at" >= ?', val).order(:starts_at).first

Oh how could I have been so dumb?
Quick change to test:

range = Range.where('`starts_at` >= ?', val).order(:starts_at).first

And all is fine.

Sunday, 9 June 2013

Capistrano Gotchas: 401 http request failed


In my last post I covered the basics of deploying a simple Sinatra app to a hosted server using Capistrano.
Later that week I had occasion to use Capistrano for real.
A repo on Github was in a private repository owned by someone else.
I was a contributor to that repo and was using Capistrano to deploy it.

During that exercise I encountered a problem.
I kept getting:
error: The requested URL returned error: 401 while accessing 
# https://github.com/GITHUB_USER/GITHUB_REPO.git/info/refs?service=git-receive-pack 
# fatal: HTTP request failed 

What confused me was that because I wasn't using ssh keys, I was being asked to enter the username and password for the repo despite them being defined in the config/deploy.rb.
I struggled with this for some time running debugs, traces and what not.
In the end the answer dawned on me like a large incontinent beast.

The version of git on my local machine (OS-X) was 1.7.7 and on the server (Ubuntu 10.04) was 1.7.0.
Bugger.
So I downloaded the latest version of git source code which at the time I did it was 1.8.3.
Doing:
make prefix=/opt/git all doc info 

Caused a slew of missing components for generating documentation, and since I didn't need it on the server, I just did:
make prefix=/opt/git all 

After that completed, I did:
sudo make prefix=/opt/git install 

Which installed all the components into /opt/git.

I then adjusted the config/deploy.rb thus:
# Specify the EXACT locations of git 
set :scm_command, '/opt/git/bin/git' 
set :local_scm_command, '/usr/bin/git' 

And voila! The cap deploy worked as advertised.

For reference I did read: https://help.github.com/articles/https-cloning-errors

I have to say that Capistrano seems to have the 'growing like topsy' disease.
The options in the config/deploy.rb seem to be organic rather than structured.
What I mean by that is that the scm_username and scm_password are structured and give a clear indication of their purpose, while the remote server variables user and password do not.
I would have perhaps named variables in a hierarchical manner using yml for example:

application:
    name: [APP_NAME]
repositories:
    scm:
      uses: git
      username: [GIT_USER]
      password: [GIT_PASS]
      branch: master
local:
  command: /usr/bin/git
remotes:
  [SERVER_NAME]:
    web: www.somedomain.com
    app: www.somedomain.com
    db: www.somedomain.com
    username: [SERVER_USERNAME]
    password: [SERVER_PASSWORD]
    deploy_to: /path/to/site/folder
    command: /opt/git/bin/git
    keep_releases: 5
    use_sudo: false
    before:
      ...set of tasks...
    after:
      ...set of tasks...
    term_options: { pty: false } # or whatever
  [ANOTHER_SERVER]:
    ...etc...

Just a thought.

Monday, 3 June 2013

Let's Deploy! Sinatra-DataMapper-Sqlite-MySQL-Hosted-Capistrano

Ok. This is a long post and mirrors the README.md in https://github.com/ZenGirl/Sinatra-DataMapper-Sqlite-MySQL-Hosted-Capistrano so go there if you want to view the application files.

Local development

Example Sinatra app using Sqlite development, MySQL production deployed to a hosted server via capistrano

To run, do:
bundle install

Then simply type:
rackup

For testing production, simply use:
RACKUP_ENV=production rackup

For reference purposes the output of bundle show is as follows:
  Gems included by the bundle:
    * addressable (2.2.8)
    * bcrypt-ruby (3.0.1)
    * bundler (1.3.5)
    * data_mapper (1.2.0)
    * data_objects (0.10.12)
    * dm-aggregates (1.2.0)
    * dm-constraints (1.2.0)
    * dm-core (1.2.0)
    * dm-do-adapter (1.2.0)
    * dm-migrations (1.2.0)
    * dm-mysql-adapter (1.2.0)
    * dm-serializer (1.2.2)
    * dm-sqlite-adapter (1.2.0)
    * dm-timestamps (1.2.0)
    * dm-transactions (1.2.0)
    * dm-types (1.2.2)
    * dm-validations (1.2.0)
    * do_mysql (0.10.12)
    * do_sqlite3 (0.10.12)
    * fastercsv (1.5.5)
    * json (1.8.0)
    * json_pure (1.8.0)
    * log4r (1.1.10)
    * multi_json (1.7.4)
    * mysql (2.9.1)
    * rack (1.5.2)
    * rack-protection (1.5.0)
    * sinatra (1.4.2)
    * sqlite3 (1.3.7)
      * stringex (1.5.1)
      * tilt (1.4.1)
      * uuidtools (2.1.4)

The system does not install the database rows by default.
You can cause this by making a call to http://whatever.com/idl/SEED_SECRET
Obviously this is FAR from secure, and is shown simply to illustrate the point.
The seed data itself is in the seed_data folder.
Be aware that the IpToCountry.2013-05.27.csv is huge.
It was drawn down from http://software77.net/geo-ip and massaged to create the addresses.csv and countries.csv.
In that folder is an example shell script to create new csv files based on a new original file.

I strongly suggest reading the application.rb file as it is chock full of comments.

Production

Passenger

Installing Passenger is simple assuming you have all the required packages installed.
The Passenger install users guides are here:

http://www.modrails.com/documentation/Users%20guide%20Apache.html
http://www.modrails.com/documentation/Users%20guide%20Nginx.html

The users guides are *long* but the gist is:

1) gem install passenger
2) passenger-install-apache2-module

To ensure you have the Apache headers and linux modules, review: http://www.modrails.com/documentation/Users%20guide%20Apache.html#troubleshooting.
For NGinx users, the install is virtually identical.

In any case, all the installer does is use your ruby install to create some files:
  [/etc/apache2/mods-available/passenger.conf]
PassengerRoot /opt/ruby-2.0.0-p195/lib/ruby/gems/2.0.0/gems/passenger-4.0.2
PassengerRuby /opt/ruby-2.0.0-p195/bin/ruby
PassengerDefaultRuby /opt/ruby-2.0.0-p195/bin/ruby
PassengerMaxPoolSize 6
PassengerPoolIdleTime 0
PassengerMaxRequests 1000

and
  [/etc/apache2/mods-available/passenger.load]
LoadModule passenger_module /opt/ruby-2.0.0-p195/lib/ruby/gems/2.0.0/gems/passenger-4.0.2/libout/apache2/mod_passenger.so

OBVIOUSLY you'll want to change the path to the gems and ruby2!

You'll have to `a2enmod passenger` to create the links from `/etc/apache2/mods-enabled` to `/etc/apache2/mods-available`.
After that, just restart Apache.

DNS

You can't make your site run properly (I'm excluding accessing it from an IP address) without a name.
So access your DNS zone settings (netregistry or whatever) and ensure you have the A or CNAME records configured for that name.
In the case of this application the name is `sdshmc.mydomain.com` and the record in DNS looks like this:
sdshmc 3600 IN A 192.168.170.115

Once the name has propagated, you can check it like this:
my_remote_name@my_remote_host:~$ nslookup sdshmc.mydomain.com
Server:  74.207.242.5
Address: 74.207.242.5#53

Non-authoritative answer:
Name: sdshmc.mydomain.com
Address: 192.168.170.115

Cool.

Database

The site uses MySQL in production, so you will have to ensure that the database defined in config/database.yml exists and is accessible.
Here is a mysql command line example:
mysql> create database sdshmc;
Query OK, 1 row affected (0.01 sec)

mysql> show create database sdshmc;
+----------+-----------------------------------------------------------------+
| Database | Create Database                                                 |
+----------+-----------------------------------------------------------------+
| sdshmc   | CREATE DATABASE `sdshmc` /*!40100 DEFAULT CHARACTER SET utf8 */ |
+----------+-----------------------------------------------------------------+
1 row in set (0.06 sec)

mysql> grant all on sdshmc.* to 'sdshmc'@'localhost' identified by 'sdshmc';
Query OK, 0 rows affected (0.21 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Don't worry about creating tables as DataMapper will do that.

Web Server

I'm illustrating Apache2 here, but the steps are similar for NGinx.
(I'll cover NGinx setup in a separate post)

First you need to configure the directory where your site will run from.
Before we use capistrano, we'll just test whether your site even functions correctly.
So create your site folder, such as `sdshmc` and use scp to copy your sites files into it.

This is only a test step to make sure your web server configuration is working at all.
Although you could use this method (copying individually changed files) to work on your remote site, it is easy to forget a change.
It's also a PITA.

An example transcript of this looks like this:
my_home_machine:sdshmc my_user_name$ scp -r * my_remote_name@my_remote_host:/home/my_remote_name/sdshmc
The authenticity of host 'my_remote_host (192.168.170.115)' can't be established.
RSA key fingerprint is d6:60:3e:51:07:e0:43:82:cf:5b:49:76:e7:1a:ef:d6.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'my_remote_host' (RSA) to the list of known hosts.
my_remote_name@my_remote_host's password:
Gemfile                      100% 1027     1.0KB/s   00:00
Gemfile.lock                 100% 2000     2.0KB/s   00:00
LICENSE                      100% 1061     1.0KB/s   00:00
README.md                    100% 7036     6.9KB/s   00:00
application.rb               100%   16KB  15.7KB/s   00:00
database.yml                 100%  938     0.9KB/s   00:00
config.ru                    100%  587     0.6KB/s   00:00
...elided for brevity...
IpToCountry.2013-05-27.csv   100% 8827KB  76.1KB/s   01:56
make_seed_data.sh            100%  654     0.6KB/s   00:00

Obviously you'll have different user names, hosts, IP addresses, fingerprints etc.
Once you're done your folder, probably `/home/yourname/sdshmc` will look something like this:
my_remote_name@my_remote_host:~/sdshmc$ tree .
.
├── application.rb
├── config
│   └── database.yml
├── config.ru
├── db
│   └── development.db
├── Gemfile
├── Gemfile.lock
├── LICENSE
├── logs
│   ├── access.log
│   ├── development.log
│   └── production.log
├── models
│   ├── Country.rb
│   ├── IPv4.rb
│   └── Vendor.rb
├── public
│   └── folder_must_exist
├── README.md
└── seed_data
    ├── addresses.csv
    ├── countries.csv
    ├── IpToCountry.2013-05-27.csv
    └── make_seed_data.sh

5 directories, 18 files

BTW: The tree program is very useful to see the structure of a folder.

Now there is an important step.
You need to ensure your gems are installed.
First check your ruby version is valid:
my_remote_name@my_remote_host:~/sdshmc$ export PATH=/opt/ruby-2.0.0-p195/bin:$PATH
my_remote_name@my_remote_host:~/sdshmc$ which ruby
/opt/ruby-2.0.0-p195/bin/ruby
my_remote_name@my_remote_host:~/sdshmc$ ruby -v
ruby 2.0.0p195 (2013-05-14 revision 40734) [x86_64-linux]

In my case I have a global install of ruby2.
You might have an RVM version.
Whatever.
Now we do a bundle install:
my_remote_name@my_remote_host:~/sdshmc$ which bundle
/opt/ruby-2.0.0-p195/bin/bundle
my_remote_name@my_remote_host:~/sdshmc$ bundle install
Fetching gem metadata from https://rubygems.org/.........
Fetching gem metadata from https://rubygems.org/..
Installing addressable (2.2.8)
Installing bcrypt-ruby (3.0.1)
Installing dm-core (1.2.0)
Installing dm-aggregates (1.2.0)
Installing dm-constraints (1.2.0)
Installing dm-migrations (1.2.0)
Installing fastercsv (1.5.5)
Using json (1.8.0)
Installing json_pure (1.8.0)
Installing multi_json (1.7.4)
Installing dm-serializer (1.2.2)
Installing dm-timestamps (1.2.0)
Installing dm-transactions (1.2.0)
Installing stringex (1.5.1)
Installing uuidtools (2.1.4)
Installing dm-types (1.2.2)
Installing dm-validations (1.2.0)
Installing data_mapper (1.2.0)
Installing data_objects (0.10.12)
Installing dm-do-adapter (1.2.0)
Installing do_mysql (0.10.12)
Installing dm-mysql-adapter (1.2.0)
Installing do_sqlite3 (0.10.12)
Installing dm-sqlite-adapter (1.2.0)
Installing log4r (1.1.10)
Installing mysql (2.9.1)
Using rack (1.5.2)
Installing rack-protection (1.5.0)
Using tilt (1.4.1)
Installing sinatra (1.4.2)
Installing sqlite3 (1.3.7)
Using bundler (1.3.5)
Your bundle is complete!
Use `bundle show [gemname]` to see where a bundled gem is installed.

So... How do we see if it's working?
Well first we create a virtual host definition.
For Apache2, this is done by creating a file in /etc/apache2/sites-available named after your site.
In this case we're creating a site named sdshmc.mydomain.com, so the file will be /etc/apache2/sites-available/sdshmc.mydomain.com.
I have included an example Apache2 VHost config here:
<VirtualHost *:80>
  ServerAdmin     your.name@some.email.service.com

  ServerName      sdshmc.mydomain.com
  # If you want an alias for your site using CNAME for example, do womthing like this:
  #ServerAlias     sdshmc.your-app-name.com

  ServerSignature Off

  # Points to your site files
  # NOTE: You must have a public folder even if it's empty
  DocumentRoot    /home/some_user/sdshmc/public

  # Only interested in warnings and above
  LogLevel        warn

  # For access and error logging
  # Note that you'll have to ensure this folder is wriatble by www-data
  ErrorLog        /home/some_user/sdshmc/logs/error.log
  CustomLog       /home/some_user/sdshmc/logs/access.log combined

  # If you're using cgi-bin programs
  #ScriptAlias     /cgi-bin/ /usr/lib/cgi-bin/

  # The directory where the site is stored
  # NOTE the trailing slash!
  <Directory /home/some_user/sdshmc/public/>
    Options       Indexes FollowSymLinks MultiViews
    AllowOverride All
    # We allow first, then deny
    Order         Allow,Deny
    # For security during testing, put your home IP address here
    #Allow from    200.200.200.200
    # Otherwise, use this:
    Allow from     All
  </Directory>
</VirtualHost>

Edit your file like this:
sudo vi /etc/apache2/sites-available/sdshmc.mydomain.com

SideStep: The logs folder will need to be writable by Apache, so do this:
sudo chown www-data.www-data logs

Enable the site:
sudo a2ensite sdshmc.mydomain.com

You'll see a message about reloading Apache. So we need to do that:
sudo /etc/init.d/apache2 reload

You should see something like this:
sudo /etc/init.d/apache2 reload
 * Reloading web server config apache2   [ OK ]

Sometimes you make a syntax error like I did doing this. An example might be:
Syntax error on line 30 of /etc/apache2/sites-enabled/sdshmc.mydomain.com:
order takes one argument, 'allow,deny', 'deny,allow', or 'mutual-failure'
   ...fail!

As you can see, I made a mistake with the Order directive.
In my case I had a space between the 'Allow,' and the 'Deny'

Ok. You should now be able to browse to http://sdshmc.mydomain.com and see your first message:
{"errors":["You need to provide an IPv4 address"]}

Now try `http://sdshmc.mydomain.com/60.240.233.28/this_app/df76f1e54f63eae442ebf3b4d6c46531`:
{"errors":["Unknown vendor"]}

What's wrong?
Well the database tables have not been seeded.
To see that, go to your mysql command prompt and look at the created tables:
mysql> show table status;
+-----------+--------+---------+------------+------+-...-+
| Name      | Engine | Version | Row_format | Rows | ... |
+-----------+--------+---------+------------+------+-...-+
| addresses | InnoDB |      10 | Compact    |    0 | ... |
| countries | InnoDB |      10 | Compact    |    0 | ... |
| vendors   | InnoDB |      10 | Compact    |    0 | ... |
+-----------+--------+---------+------------+------+-...-+
3 rows in set (0.00 sec)

The tables are created but are empty.
Now we check the logs to see we now have a production.log:
my_remote_name@my_remote_host:~/sdshmc/logs$ ls -l
total 16
-rw-r--r-- 1 my_remote_name my_remote_name 5446 2013-06-03 00:09 access.log
-rw-r--r-- 1 my_remote_name my_remote_name    0 2013-06-02 23:20 development.log
-rw-r--r-- 1 root           root            999 2013-06-02 23:48 error.log
-rw-r--r-- 1 my_remote_name my_remote_name  576 2013-06-03 00:07 production.log

If you tail it while we use the http://sdshmc.mydomain.com/idl/SEED_SECRET url you'll see the tables filling up with data.
Afterwards you can use the mysql command line to check the results:
mysql> show table status;
+-----------+--------+---------+------------+--------+-...-+
| Name      | Engine | Version | Row_format | Rows   | ... |
+-----------+--------+---------+------------+--------+-...-+
| addresses | InnoDB |      10 | Compact    | 127157 | ... |
| countries | InnoDB |      10 | Compact    |    238 | ... |
| vendors   | InnoDB |      10 | Compact    |      2 | ... |
+-----------+--------+---------+------------+--------+-...-+
3 rows in set (0.00 sec)

Now a call to http://sdshmc.mydomain.com/60.240.233.28/this_app/df76f1e54f63eae442ebf3b4d6c46531 yields:
{"iso3":"AUS","country":"Australia"}

And a call to http://sdshmc.mydomain.com/60.240.233.28/this_app/df76f1e54f63eae442ebf3b4d6c46531.xml yields:
<response>
  <iso3>AUS</iso3>
  <country>Australia</country>
</response>

In browsers you'll see this:
This XML file does not appear to have any style information associated with it. The document tree is shown below.

The error is because we didn't include an <xml .../> header.

We'll be deleting all the code later and pointing to a capistrano structured layout later, but this proves our app works.

Capistrano

Finally. We're close.
First you need to ensure the capistrano gem is installed:
gem install capistrano
Fetching: highline-1.6.19.gem (100%)
Successfully installed highline-1.6.19
Fetching: net-ssh-2.6.7.gem (100%)
Successfully installed net-ssh-2.6.7
Fetching: net-sftp-2.1.2.gem (100%)
Successfully installed net-sftp-2.1.2
Fetching: net-scp-1.1.1.gem (100%)
Successfully installed net-scp-1.1.1
Fetching: net-ssh-gateway-1.2.0.gem (100%)
Successfully installed net-ssh-gateway-1.2.0
Fetching: capistrano-2.15.4.gem (100%)
Successfully installed capistrano-2.15.4
6 gems installed

Next we need to capify our app.
If you're doing this yourself, you'll have to do this:
capify .
[add] writing './Capfile'
[add] writing './config/deploy.rb'
[done] capified!

If you're using this application, I've already done it.
You'll see that two files get created; Capfile and config/deploy.rb.

Capfile is very basic and we won't need to fiddle with it:
load 'deploy'
# Uncomment if you are using Rails' asset pipeline
    # load 'deploy/assets'
load 'config/deploy' # remove this line to skip loading any of the default tasks

The config/deploy.rb is where most of the work is done.
Initially it will be like this:
set :application, "set your application name here"
set :repository,  "set your repository location here"

# set :scm, :git # You can set :scm explicitly or Capistrano will make an intelligent guess based on known version control directory names
# Or: `accurev`, `bzr`, `cvs`, `darcs`, `git`, `mercurial`, `perforce`, `subversion` or `none`

role :web, "your web-server here"                          # Your HTTP server, Apache/etc
role :app, "your app-server here"                          # This may be the same as your `Web` server
role :db,  "your primary db-server here", :primary => true # This is where Rails migrations will run
role :db,  "your slave db-server here"

# if you want to clean up old releases on each deploy uncomment this:
# after "deploy:restart", "deploy:cleanup"

# if you're still using the script/reaper helper you will need
# these http://github.com/rails/irs_process_scripts

# If you are using Passenger mod_rails uncomment this:
# namespace :deploy do
#   task :start do ; end
#   task :stop do ; end
#   task :restart, :roles => :app, :except => { :no_release => true } do
#     run "#{try_sudo} touch #{File.join(current_path,'tmp','restart.txt')}"
#   end
# end

I've made some changes to the deploy.rb to illustrate what you need to change:
set :application, 'IPv42Country'

# I'm using git. If you use svn, put that here
set :scm, :git
set :repository, 'https://github.com/YOUR_GIT_USERNAME/YOUR_GIT_PROJECT.git'
set :scm_username, 'YOUR_GIT_LOGIN'
set :scm_passphrase, 'YOUR_GIT_PASSWORD'

# Must be set for the password prompt from git to work
default_run_options[:pty] = true

# The server user and password
set :user, 'YOUR_REMOTE_SERVER_USERNAME'

# We always deploy the master branch
set :branch, 'master'

# Where we are going to deploy the code
set :deploy_to, '/home/your_remote_folder/sdshmc'

# Now we set roles
role :web, 'sdshmc.mydomain.com'
role :app, 'sdshmc.mydomain.com'
role :db,  'sdshmc.mydomain.com', :primary => true # This is where Rails migrations will run
# We could have done this:
# server 'sdshmc.mydomain.com', :app, :web, :db, :primary => true


# if you want to clean up old releases on each deploy uncomment this:
# after "deploy:restart", "deploy:cleanup"

# if you're still using the script/reaper helper you will need
# these http://github.com/rails/irs_process_scripts

# If you are using Passenger mod_rails uncomment this:
namespace :deploy do
  task :start do ; end
  task :stop do ; end
  task :restart, :roles => :app, :except => { :no_release => true } do
    run "#{try_sudo} touch #{File.join(current_path,'tmp','restart.txt')}"
  end
end

You'll have to fix all the locations I have used of course.
Now we have to get rid of that test app we used scp to load.
So go to your server and get rid of all the files except logs.
Your tree should look like this:
my_remote_name@my_remote_host:~/sdshmc$ tree .
.
└── logs
    ├── access.log
    ├── development.log
    ├── error.log
    └── production.log

1 directory, 4 files

Now we run our first capistrano command!
On your local machine, in the root folder of your site you do the initial setup:
my_machine:your_root_site_folder your_username$ cap deploy:setup
  * 2013-06-03 11:55:08 executing `deploy:setup'
  * executing "sudo -p 'sudo password: ' mkdir -p /home/remote_user/sdshmc \
    /home/remote_user/sdshmc/releases \
    /home/remote_user/sdshmc/shared \
    /home/remote_user/sdshmc/shared/system \
    /home/remote_user/sdshmc/shared/log \
    /home/remote_user/sdshmc/shared/pids"
    servers: ["sdshmc.mydomain.com"]
Password:
    [sdshmc.mydomain.com] executing command
 ** [out :: sdshmc.mydomain.com]
    command finished in 903ms
  * executing "sudo -p 'sudo password: ' chmod g+w /home/remote_user/sdshmc \
    /home/remote_user/sdshmc/releases \
    /home/remote_user/sdshmc/shared \
    /home/remote_user/sdshmc/shared/system \
    /home/remote_user/sdshmc/shared/log \
    /home/remote_user/sdshmc/shared/pids"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 695ms

I've split the lines up to make it clear what is happening.
Notice that you had to enter your remote servers password.

Your remote server tree should now look like this:
my_remote_name@my_remote_host:~/sdshmc$ tree .
.
├── logs
│   ├── access.log
│   ├── development.log
│   ├── error.log
│   └── production.log
├── releases
└── shared
    ├── log
    ├── pids
    └── system

6 directories, 4 files

See the new folders?

Ok. Now we do a check on your local machine:
my_machine:your_root_site_folder your_username$ cap deploy:check
  * 2013-06-03 12:05:44 executing `deploy:check'
  * executing "test -d /home/remote_user/sdshmc/releases"
    servers: ["sdshmc.mydomain.com"]
Password:
    [sdshmc.mydomain.com] executing command
    command finished in 650ms
  * executing "test -w /home/remote_user/sdshmc"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 649ms
  * executing "test -w /home/remote_user/sdshmc/releases"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 649ms
  * executing "which git"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 639ms
The following dependencies failed. Please check them and try again:
--> You do not have permissions to write to `/home/remote_user/sdshmc/releases'. (sdshmc.mydomain.com)

Ah. Now see what happened? So we check the permissions for our user:
my_remote_name@my_remote_host:~/sdshmc$ ls -l
total 12
drwxr-xr-x 2 www-data www-data 4096 2013-06-02 23:36 logs
drwxrwxr-x 2 root     root     4096 2013-06-03 01:55 releases
drwxrwxr-x 5 root     root     4096 2013-06-03 01:55 shared

Well there's the problem. root owns releases and shared.
So we need to change the permissions:
my_remote_name@my_remote_host:~/sdshmc$ sudo chown -Rv my_remote_name.my_remote_name releases shared
[sudo] password for my_remote_name:
changed ownership of `releases' to my_remote_name:my_remote_name
changed ownership of `shared/pids' to my_remote_name:my_remote_name
changed ownership of `shared/log' to my_remote_name:my_remote_name
changed ownership of `shared/system' to my_remote_name:my_remote_name
changed ownership of `shared' to my_remote_name:my_remote_name
my_remote_name@my_remote_host:~/sdshmc$ ls -l
total 12
drwxr-xr-x 2 www-data        www-data        4096 2013-06-02 23:36 logs
drwxrwxr-x 2 my_remote_name  my_remote_name  4096 2013-06-03 01:55 releases
drwxrwxr-x 5 my_remote_name  my_remote_name  4096 2013-06-03 01:55 shared

And a test of cap deploy:check locally again:
my_machine:your_root_site_folder your_username$ cap deploy:check
  * 2013-06-03 12:05:44 executing `deploy:check'
  * executing "test -d /home/remote_user/sdshmc/releases"
    servers: ["sdshmc.mydomain.com"]
Password:
    [sdshmc.mydomain.com] executing command
    command finished in 650ms
  * executing "test -w /home/remote_user/sdshmc"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 649ms
  * executing "test -w /home/remote_user/sdshmc/releases"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 649ms
  * executing "which git"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 639ms
You appear to have all necessary dependencies installed

Ok. Next step.
Normally at this point you would create the database and add credentials, but we've already done that above.
So we skip onto actually doing a test push to our server.
So on our local machine we do:
my_machine:your_root_site_folder your_username$ cap deploy:update
  * 2013-06-03 12:14:32 executing `deploy:update'
 ** transaction: start
  * 2013-06-03 12:14:32 executing `deploy:update_code'
    executing locally: "git ls-remote https://github.com/YOUR_GIT_USER/YOUR_GIT_PROJECT.git master"
    command finished in 1840ms
  * executing "git clone -q -b master https://github.com/YOUR_GIT_USER/YOUR_GIT_PROJECT.git \
    /home/remote_user/sdshmc/releases/20130603021433 && \
    cd /home/remote_user/sdshmc/releases/20130603021433 && \
    git checkout -q -b deploy f288a7baf7389fd486a777755a3415bec8a90025 && \
    (echo f288a7baf7389fd486a777755a3415bec8a90025 > /home/remote_user/sdshmc/releases/20130603021433/REVISION)"
    servers: ["sdshmc.mydomain.com"]
Password:
    [sdshmc.mydomain.com] executing command
 ** [sdshmc.mydomain.com :: out] Unpacking objects:   1% (1/74)
 ** [sdshmc.mydomain.com :: out] Unpacking objects:   2% (2/74)
 ** [sdshmc.mydomain.com :: out] Unpacking objects:   4% (3/74)
 ** [sdshmc.mydomain.com :: out] Unpacking objects:   5% (4/74)
 ...elided for brevity...
 ** [sdshmc.mydomain.com :: out] Unpacking objects:  97% (72/74)
 ** [sdshmc.mydomain.com :: out] Unpacking objects:  98% (73/74)
Unpacking objects: 100% (74/74), done.] Unpacking objects: 100% (74/74)
    command finished in 4488ms
  * 2013-06-03 12:14:47 executing `deploy:finalize_update'
  * executing "chmod -R -- g+w /home/remote_user/sdshmc/releases/20130603021433 && \
    rm -rf -- /home/remote_user/sdshmc/releases/20130603021433/public/system && \
    mkdir -p -- /home/remote_user/sdshmc/releases/20130603021433/public/ && \
    ln -s -- /home/remote_user/sdshmc/shared/system /home/remote_user/sdshmc/releases/20130603021433/public/system && \
    rm -rf -- /home/remote_user/sdshmc/releases/20130603021433/log && \
    ln -s -- /home/remote_user/sdshmc/shared/log /home/remote_user/sdshmc/releases/20130603021433/log && \
    rm -rf -- /home/remote_user/sdshmc/releases/20130603021433/tmp/pids && \
    mkdir -p -- /home/remote_user/sdshmc/releases/20130603021433/tmp/ && \
    ln -s -- /home/remote_user/sdshmc/shared/pids /home/remote_user/sdshmc/releases/20130603021433/tmp/pids"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
    command finished in 732ms
  * executing "find /home/remote_user/sdshmc/releases/20130603021433/public/images \
    /home/remote_user/sdshmc/releases/20130603021433/public/stylesheets \
    /home/remote_user/sdshmc/releases/20130603021433/public/javascripts -exec touch -t 201306030214.48 -- {} ';'; true"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
 ** [out :: sdshmc.mydomain.com] find: `/home/remote_user/sdshmc/releases/20130603021433/public/images'
 ** [out :: sdshmc.mydomain.com] : No such file or directory
 ** [out :: sdshmc.mydomain.com] find:
 ** [out :: sdshmc.mydomain.com] `/home/remote_user/sdshmc/releases/20130603021433/public/stylesheets': No such file or directory
 ** [out :: sdshmc.mydomain.com]
 ** [out :: sdshmc.mydomain.com] find: `/home/remote_user/sdshmc/releases/20130603021433/public/javascripts'
 ** [out :: sdshmc.mydomain.com] : No such file or directory
    command finished in 669ms
  * 2013-06-03 12:14:48 executing `deploy:create_symlink'
  * executing "sudo -p 'sudo password: ' rm -f /home/remote_user/sdshmc/current && \
    sudo -p 'sudo password: ' ln -s /home/remote_user/sdshmc/releases/20130603021433 /home/remote_user/sdshmc/current"
    servers: ["sdshmc.mydomain.com"]
    [sdshmc.mydomain.com] executing command
 ** [out :: sdshmc.mydomain.com]
    command finished in 897ms
 ** transaction: commit

Whoah! Lot's happened.
I won't go through all the goriness of it.
It's easier to show what happened on our remote server:
my_remote_name@my_remote_host~/sdshmc$ tree .
.
├── current -> /home/remote_user/sdshmc/releases/20130603021433
├── logs
│   ├── access.log
│   ├── development.log
│   ├── error.log
│   └── production.log
├── releases
│   └── 20130603021433
│       ├── application.rb
│       ├── config
│       │   └── database.yml
│       ├── config.ru
│       ├── db
│       │   └── development.db
│       ├── Gemfile
│       ├── LICENSE
│       ├── log -> /home/remote_user/sdshmc/shared/log
│       ├── logs
│       │   ├── access.log
│       │   ├── development.log
│       │   └── production.log
│       ├── models
│       │   ├── Country.rb
│       │   ├── IPv4.rb
│       │   └── Vendor.rb
│       ├── public
│       │   ├── folder_must_exist
│       │   └── system -> /home/remote_user/sdshmc/shared/system
│       ├── README.md
│       ├── REVISION
│       ├── seed_data
│       │   ├── addresses.csv
│       │   ├── countries.csv
│       │   ├── IpToCountry.2013-05-27.csv
│       │   └── make_seed_data.sh
│       └── tmp
│           └── pids -> /home/remote_user/sdshmc/shared/pids
└── shared
    ├── log
    ├── pids
    └── system

18 directories, 23 files

Ah. So now we see that current points to the releases/20130603021433 folder which has all our code.
Each new release will get another timestamp and get a new folder with current pointing to it.
Cool.

Ok. We're not done yet. We have to modify our Apache virtual host config to point to the public folder under current.
So we make these changes:
<VirtualHost *:80>
  ...elided...

  # Points to your site files
  # NOTE: You must have a public folder even if it's empty
  DocumentRoot    /home/some_user/sdshmc/current/public

  ...elided...

  # The directory where the site is stored
  # NOTE the trailing slash!
  <Directory /home/some_user/sdshmc/current/public/>
    ...elided...
  </Directory>
</VirtualHost>

Restart Apache and test it. All should work.

Deploying a new version

Ok. You've made a ton of changes, tested them locally and want to get it live.
Make sure all has been pushed to your repo and then:
cap deploy

You can see what options are available using:
cap -T

Making one-off changes to an existing deployment

Sometimes you have to change just one thing and don't want to do a complete redeployment.
So go ahead, change your file and push it to your repo.
Then on your local machine do:
cap deploy:upload

NOTICE:


The Gemfile has:
group :production do
# Now notice we are using the mysql not mysql2 gem!
gem 'mysql'
gem 'dm-mysql-adapter'
end

Notice that the gem is the mysql gem and not the mysql2 gem.
The gem reference is also in the database.yml file:
production:
  adapter: mysql
  encoding: utf8
  username: sdshmc
  password: sdshmc
  database: sdshmc

Notice that the adapter is mysql and not mysql2.
Supposedly the dm-mysql-adapter does not suffer from the utf8 issue.
Having said that, setting the encoding to utf8, UTF8, utf-8 or UTF-8 causes:
/Users/kim/.rvm/gems/ruby-2.0.0-p0/gems/data_objects-0.10.12/lib/data_objects/connection.rb:79: warning: Encoding utf8 is not a known Ruby encoding for MySQL

This appears to be an issue with the DataMapper DataObject library.
Which of course does not exist.
It's only a warning, but it disturbs me.

If you try to use the mysql2 gem and adapter DataMapper barfs on the DataObject requires.
When run, it attempts to require a mysql2 version of it's code.

Saturday, 1 June 2013

Let's Deploy!

My partner, Ben, is putting together a series of articles for TechRepublic on choosing between Sinatra, Padrino and Rails.
To help him determine the management pros and cons of each system, I decided to do a series of posts about the end to end processes with each.
I started with a simple application that addresses some of the common issues faced in the real world.
The simple application accepts requests in a RESTful fashion for IPv4 addresses and returns the associated country in JSON or XML format.
To make the decisions more interesting, I have added the following versions of each system:

  • ORMs: DataMapper, ActiveRecord and Sequel.
  • DBs:  Development: Sqlite and MySQL, Production: MySQL and PostgreSQL
  • Deployment Targets: Hosted server, Heroku and EngineYard
  • Deployment Systems: Capistrano and Vlad

Topics covered in development and production include:

  • Access logging (Apache style)
  • Execution logging (Log4j style)
  • Migrations
  • Initial data seeding
  • Simple authorization

Source control will be managed using Git.
All code will use Ruby2.

So over the next few weeks I will be posting these articles with all source code.
All the source code will be made available via my GitHub repo.

After that I will consider doing the same for JRuby, Java, Groovy and Grails onto Jetty, Tomcat and Resin.
If I'm feeling perverse enough, I may do the same for PERL and maybe even PHP5 under Quercus (PHP on the JVM)

It's all a learning experience!

Enjoy.

P.S. If anyone has any suggested additions to the above please leave a comment.

Tuesday, 21 May 2013

gem install mysql2 missing mysql.h on OS X

Ok. The problem surfaced after I needed Ben to help on a site I was building.
He didn't have ruby or mysql installed on his laptop.
The site was built using Ruby-2.0.0-p195 and Rails-4.0.0-rc1.
That, in itself, isn't the issue.

We went and installed the x86_64 version of MySQL 5.6 on his machine.
This sets up in /usr/local/mysql.
Got a bit annoyed as the install doesn't have any StartupItems so start/stop is manual.
Ce la vie.

I now installed Ruby-2.0.0-p195 with no dramas.

The site code was loaded up into a folder and "bundle install" was run.
That's when the fun started.
The mysql2 gem wouldn't compile.
You get something like this:

This could take a while...
ERROR:  Error installing mysql2:
 ERROR: Failed to build gem native extension.

    /usr/local/rvm/rubies/ruby-2.0.0-p195/bin/ruby extconf.rb
checking for rb_thread_blocking_region()... yes
checking for rb_wait_for_single_fd()... yes
checking for mysql.h... no
checking for mysql/mysql.h... no
-----
mysql.h is missing.  please check your installation of mysql and try again.
-----
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of necessary
libraries and/or headers.  Check the mkmf.log file for more details.  You may
need configuration options.

Provided configuration options:
 --with-opt-dir
 --with-opt-include
 --without-opt-include=${opt-dir}/include
 --with-opt-lib
 --without-opt-lib=${opt-dir}/lib
 --with-make-prog
 --without-make-prog
 --srcdir=.
 --curdir
 --ruby=/usr/local/rvm/rubies/ruby-2.0.0-p195/bin/ruby
 --with-mysql-config
 --without-mysql-config


Gem files will remain installed in /usr/local/rvm/gems/ruby-2.0.0-p195/gems/mysql2-0.3.11 for inspection.
Results logged to /usr/local/rvm/gems/ruby-2.0.0-p195/gems/mysql2-0.3.11/ext/mysql2/gem_make.out

Wait. What? The mysql.h does exist (and is valid) in the /usr/local/mysql/include folder.
I, of course, realised that I had not set the --with-mysql-include folder and tried that along with some others.
Nope.
No luck.

I did some googling and the options where confusing and ultimately useless.
The most common options are:
  • Install mysql server using homebrew. Wait. What?
  • Setting env ARCHFLAGS="-arch x86_64" before running gem.
  • Using bundle to create a .bundle/config file.
  • Including every --with-mysql option available on the gem install.
I didn't reinstall mysql as that seemed like silly talk.
I did try the --with-mysql-config=/usr/local/mysql/bin/mysql_config option with no luck.
I had this tickling sensation in my brain that indicated that I was buggerising around in the wrong 'mode' and that I was missing something gob smackingly obvious.
But I persevered.

I tested the /usr/local/mysql/bin/mysql_config program and it's output was like this:

Usage: /usr/local/mysql/bin/mysql_config [OPTIONS]
Options:
        --cflags         [-I/usr/local/mysql/include  -Wno-unused-private-field -Os -g -fno-strict-aliasing -arch x86_64]
        --cxxflags       [-I/usr/local/mysql/include  -Wno-unused-private-field -Os -g -fno-strict-aliasing -arch x86_64]
        --include        [-I/usr/local/mysql/include]
        --libs           [-L/usr/local/mysql/lib -lmysqlclient]
        --libs_r         [-L/usr/local/mysql/lib -lmysqlclient_r]
        --plugindir      [/usr/local/mysql/lib/plugin]
        --socket         [/tmp/mysql.sock]
        --port           [0]
        --version        [5.6.11]
        --libmysqld-libs [-L/usr/local/mysql/lib -lmysqld]
        --variable=VAR   VAR is one of:
                pkgincludedir [/usr/local/mysql/include]
                pkglibdir     [/usr/local/mysql/lib]
                plugindir     [/usr/local/mysql/lib/plugin]

So I did some digging around in the gems folder and under the ext/mysql2 folder there was the mkmf.log file.
As I was stuffed up with flu, surrounded by used tissues and it was 3am it took me a little while to see the issue.
Then I saw it:

cc1: error: unrecognized command line option "-Wno-null-conversion"

It's cc1.
It's failing because it has been provided with a -W gcc option that isn't supported by the compiler.
I tried without luck to get "gem" and the "extconf.rb" to use the ENV['CC'] variable.
So using the gcc-4.2 compiler was the only option.
I checked the compiler version:

gcc -v
Using built-in specs.
Target: i686-apple-darwin11
Configured with: /private/var/tmp/llvmgcc42/llvmgcc42-2335.15~25/src/configure --disable-checking --enable-werror --prefix=/Developer/usr/llvm-gcc-4.2 --mandir=/share/man --enable-languages=c,objc,c++,obj-c++ --program-prefix=llvm- --program-transform-name=/^[cg][^.-]*$/s/$/-4.2/ --with-slibdir=/usr/lib --build=i686-apple-darwin11 --enable-llvm=/private/var/tmp/llvmgcc42/llvmgcc42-2335.15~25/dst-llvmCore/Developer/usr/local --program-prefix=i686-apple-darwin11- --host=x86_64-apple-darwin11 --target=i686-apple-darwin11 --with-gxx-include-dir=/usr/include/c++/4.2.1
Thread model: posix
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)

Then it struck me that I was an idiot.
I edited the /usr/local/mysql/bin/mysql_config file and removed the offending -W compiler options.
(It's around line 120 or so)
The original looks like this:

cflags="-I$pkgincludedir  -Wall -Wno-null-conversion -Wno-unused-private-field -Os -g -fno-strict-aliasing -DDBUG_OFF -arch x86_64 " #note: end space!
cxxflags="-I$pkgincludedir  -Wall -Wno-null-conversion -Wno-unused-private-field -Os -g -fno-strict-aliasing -DDBUG_OFF -arch x86_64 " #note: end space!

And the changed version:

cflags="-I$pkgincludedir  -Wall -Os -g -fno-strict-aliasing -DDBUG_OFF -arch x86_64 " #note: end space!
cxxflags="-I$pkgincludedir  -Wall -Os -g -fno-strict-aliasing -DDBUG_OFF -arch x86_64 " #note: end space!

Now the install works just peachy.

Oh. One last point.
I had to link the dylib to /usr/local/lib:

ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib











Tuesday, 14 May 2013

Benji and Jesse are off to find their forever homes


Ok. Just took back the latest foster cats to the Animal Welfare League of Queensland.
You can see the fostering site at: http://www.facebook.com/AWLQLD or the main site at http://www.awlqld.com.au/

That makes 12 so far! 4 cats and 8 kittens.
All except the last two have found forever homes.
For cat lovers I have included pictures below.
Or you could just go to my Facebook page to see the album...

We picked these two up from the AWLQ  as they had been stuck in a cage for several weeks with cat flu. That's not cruelty BTW, but rather the foster centre had simply run out of space!

The Tabby (Benji) is a total companion cat and for the first week followed me around everywhere wanting head butts and cuddles.
After he got over the flu his 'instincts' took over and he was constantly trying to 'get it on' with me.
Now he's back at the fostering centre, they'll give him the snip and he'll just be a fantastic cat with no requirement to h__p his new mom. :-)
He would make a very welcome addition to a family with a large house as he likes to run around getting involved in everyones business.

 

 


















Jesse below is a year old and is a huge ball of fluff with a small cat embedded in it.
She's a very private cat but does like to sleep a lot. I suspect that an older couple would find her just peachy.

 

Tuesday, 30 April 2013

Bible Thumpers: Y U No Check Your Pamphlets?

Got one of these in the letter box on the weekend:


Can you see it? Oh. It's a bit small. Here's a close up:


Wait. What? Oh. So that's why I could never see the kingdom. It's over there, beyond the sparkley fountains of milk and honey in... in... HAD.

Seriously? It's the primary quote on the pamphlet for Cthulhus name.
The thing that grabs peoples attention.
And you got it wrong?


I did try google maps by the way. Gave up after a bit because it was boring. And stupid.
Mainly because of the semi-useless word "HAD."
The results where quite humorous and gave results focused on Israeli restaurants in:

Ash Sharqiyah, Oman
A possible I suppose.

Al Anbar, Iraq
Hmm. Seems to stretch the idea a bit don't you think?

And surprisingly:

Casper, WY, United States

Yeah. Real place. Loads of Israeli restaurants there according to Google.
Although I would hazard a guess and say that this kingdom is not likely to be at the Ramada Plaza Riverside Hotel and Convention Center unless they prefer the decor.
And the big question is, why Israeli?
I'm babbling.

Monday, 22 April 2013

Displaying a summary of Apache2 access logs by country using Ruby

Recently I had a case where a site was being hammered by script kiddies.
Nothing compromised I might add.
First off I limited access to the site by country (Australia in this case).
Note I wasn't blocking countries, but blocking all countries except Australia.

Then I wanted to get a summary of the Apache2 access log to see where these annoying little herberts where from.
An Apache2 access log entry looks something like this:


180.76.5.62 - - [22/Apr/2013:12:19:46 +1000] "GET /index.php?title=User:Coombayah HTTP/1.1" 403 409 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"


Now where was that access from?
So I knocked up a *very* simple ruby script to get that data.
The code below is 100% inelegant.
It's not meant to be elegant.
It's meant to illustrate the point.
A 'real' script would have much better coding and options for dates and ranges.

Output for today would look something like this:


2013-04-22 - 128
  119.63.193.131      2    0    2 2000667011 Japan
  119.63.193.132      2    0    2 2000667012 Japan
  123.125.71.112      2    0    2 2071807856 China
  123.125.71.74       2    0    2 2071807818 China
  176.31.9.218        4    0    4 2954824154 France
  192.80.187.162      4    0    4 3226516386 United States
  216.152.250.163     4    0    4 3633904291 United States
  216.152.250.187     4    0    4 3633904315 United States
  218.30.103.31       2    0    2 3659425567 China
  220.181.108.155     2    0    2 3702877339 China
  46.246.60.177       7    0    7  787889329 Sweden
  46.28.64.213       25    0   25  773603541 Ukraine
  66.249.74.135      10    0   10 1123633799 United States
  IPs with 1 access and failure: ["119.63.193.195", "119.63.193.196", "123.125.71.107", 
  "123.125.71.113", "123.125.71.69", "123.125.71.72", "123.125.71.75", "123.125.71.76", 
  "123.125.71.83", "123.125.71.91", "173.255.217.233", "180.76.5.10", "180.76.5.15", 
  "180.76.5.162", "180.76.5.62", "180.76.5.7", "180.76.6.227", "202.46.48.27", 
  "202.46.61.34", "220.181.108.143", "220.181.108.147", "220.181.108.148", 
  "220.181.108.158", "220.181.108.159", "220.181.108.161", "220.181.108.162", 
  "220.181.108.178", "220.181.108.181", "220.181.108.183", "220.181.108.186", 
  "42.98.185.25", "46.161.41.24", "78.46.250.165", "80.93.217.42", "85.216.108.89"]


As you can see, pretty basic.
The principal rows show the IP address, accesses, successes, failures, IP-Value and country.

First off you need a Ip-To-Country list.
In a prior post I mentioned getting zone range files.
So go to http://software77.net/geo-ip/ and get the full list.
The file will have a lot of useful comments at the top.
Make a copy of the file excluding the comments as IpToCountry.csv.
You should see a file roughly 126,000 lines long where each line looks something like this:


"16777216","16777471","apnic","1313020800","AU","AUS","Australia"


The fields are 'Address From', 'Address To', 'Registrar', 'Date Assigned', 'Country2', 'Country3', 'Country'.

And now the ruby code.

First the requires:


require 'date'
require 'csv'


Since we are using *BRUTE FORCE* and not bothering to be elegant, we simply read the csv file into an array of hashes.


printf "Loading Ip to Country map\n"
ip_to_country = []
CSV.foreach('IpToCountry.csv') do |row|
  ip_to_country << { :from => row[0].to_i, :to => row[1].to_i, :country => row[6] }
end


The first pass now reads the access log into a hash of hashes:


unique_days = {}
f = File.new('logs/access.log', 'r')
while (line = f.gets)
  ip,tmp,tmp,dt,offset,verb,url,http,rcode,sz,tmp,browser = line.split
  d = DateTime.strptime( "#{dt} #{offset}", "[%d/%b/%Y:%H:%M:%S %Z]")
  date = d.strftime("%Y-%m-%d")
  if ! unique_days.has_key? date
    unique_days[date] = { :ip => {}, :total => 0 }
  else
    if ! unique_days[date][:ip].has_key? ip
      unique_days[date][:ip][ip] = { :total => 0, :succeeded => 0, :failed => 0 }
    end
    unique_days[date][:ip][ip][:total] += 1
    if rcode == '200'
      unique_days[date][:ip][ip][:succeeded] += 1
    else
      unique_days[date][:ip][ip][:failed] += 1
    end
    unique_days[date][:total] += 1
  end
end
f.close


I have eschewed elegance here for brute force and clarity.

So. Now we have everything we need.
Now we traverse the data dumping the report out:


unique_days.each do |date,h|
  printf "#{date} - #{h[:total]}\n"
  only_one_failed = []
  h[:ip].sort.map do |k,data|
    octets = k.split('.')
    if data[:total] == 1 && data[:failed] == 1
      only_one_failed << k
      next
    end
    ip_value = (octets[0].to_i * 256 * 256 * 256) + (octets[1].to_i * 256 * 256) + (octets[2].to_i * 256) + (octets[3].to_i)
    country = 'Unknown'
    ip_to_country.each do |row|
      if ip_value >= row[:from] && ip_value <= row[:to]
        country = row[:country]
        break
      end
    end
    printf "  %-16s %4d %4d %4d %10d %s\n", k, data[:total], data[:succeeded], data[:failed], ip_value, country
  end
  printf "  IPs with 1 access and failure: #{only_one_failed}\n"
end


And you have your report.

Configuring an Apache2 instance to only allow access from a specific set of countries

Ok. This is pretty simple, but I'm documenting it here for myself.
The issue arose for me when one of our test servers started getting hammered by script kiddies from China and so on.
They didn't get in, but that wasn't the point.
The point was that the web site was for Australian users only.
Now to short circuit any comments I can say:
1) The solution had to work for several frameworks (Elgg, Joomla, etc)
2) I wasn't redirecting traffic to a special page
3) I wanted to black all access except Australian IPv4 ranges
I could have gone to MaxMind or its ilk, but that would make '1' problematic.

The first step was to locate a *correct* and *regularly updated* list of CIDR and range formatted IPv4 list.
I tried several including http://www.ipdeny.com/ipblocks/ with no luck.
Most have gaps in their lists because they use 'official' sources.
And ISP ranges are not always covered in those lists.

Eventually I found http://software77.net/geo-ip/
I downloaded the zone and CIDR lists for the countries I was interested in.
Then I modified the /etc/apache2/sites-available/my-site.something.com file to look like this:
<caveat>
Abbreviated for this post. :-)
</caveat>

<VirtualHost *:80>
  ServerAdmin     my.email@my.email.service.com
  ServerName      my-site.something.com
  ServerSignature Off
  DocumentRoot    /home/my-user/my-site/site
  LogLevel        warn
  ErrorLog        /home/my-user/my-site/logs/error.log
  CustomLog       /home/my-user/my-site/logs/access.log combined
  ScriptAlias     /cgi-bin/ /usr/lib/cgi-bin/
  <Directory /home/my-user/my-site/site/>
    Options       Indexes FollowSymLinks MultiViews
    AllowOverride All
    Order         Deny,Allow
    # See .htaccess for country limiting
  </Directory>
</VirtualHost>

Then in the /home/my-user/my-site/site/.htaccess file I added the following to the bottom:

<Limit GET HEAD POST>
  Order Deny,Allow
  Deny from all
  # Report generated on Mon Apr 22 01:02:36 2013
  # by http://software77.net/geo-ip/
  # Report Type  : CIDR format
  # Country      : Australia
  # ISO 3166 CC  : ALPHA-2 AU; ALPHA-3 AUS
  # Registry     : APNIC
  # Records found: 6,160 BEFORE flattening (As they appear in the database)
  # Records      : 3,999 AFTER flattening (Adjoining CIDR blocks concatenated into single blocks where possible)
  Allow from 1.0.0.0/24
  Allow from 1.0.4.0/22
  ...elided for brevity...  
  Allow from 223.255.248.0/22
  Allow from 223.255.255.0/24
</Limit>

The block from under the 'Deny from all' to the end is the CIDR formatted file for a country.
You have to add the 'Allow from ' in front of the address, but if you use vi that's trivial.
If you want to allow another country access, then simply add the CIDR formatted file below the first.

Then I ran '/etc/init.d/apache2 reload' and all is well.

IMPORTANT NOTE: Things change. You have to check for additions and deletions on the CIDR files on a regular basis - say, once a month to ensure that you .htaccess file is up to date.

Some readers may have noticed that I suggest downloading the zone files as well.
This is to allow you to read your apache logs and see which country that access came from.
I will be posting another entry soon that shows how to do that in Ruby.

Sunday, 21 April 2013

I've decided to stop watching debates about the existence or nature of god for a while


We foster cats and catlings.
"Wait. What?" I hear you say.
Just give me a few moments to explain.
Over the last year Ben and I have been fostering sick cats, mothers with kittens and the like.
Loads of them.
And, like 2 year olds, you have to watch them like a hawk.
They are always being mischevious, investigative and learning about the world they are in.
Since they are often sick or just kittens, we have to keep them inside for their own safety.
So they learn about the world that is the our house.

You can't concentrate on much, read a book and so on.
Some things you can do includes watching TV.
And I've been watching debates about the existence and nature of god while keeping a watchful eye on the fuzz balls.
So while you listen to Dawkins, Krause, Lennox et al, you observe the cats, their habits, their kittie-language and so on.

One thing that leaps out is the solid block that is the limit to their intelligence.
They can jump on chairs without knowing that it is a chair.
They eat food from a mechanical food chain beyond their grasp.
They cathramorphise us as their parent cats.
They can't read.
They may watch TV sometimes but have no capability of understanding any of the massive technological edifice behind those flickering images.
They are limited.
I suspect that even with evolution in play they will never be able to:
Open a can of cat food;
Read a book;
Build a car or fly a jet fighter within any reasonable time span.

So when I see a bunch of atheists, theists and deists debating I get an image in my head.
And that image is a bunch of house cats, sitting on a carpet discussing the nature of Dog.

I'm not saying that the debates are pointless and shouldn't be done.
They should continue, but on the understanding that a hundred years from now they will be debating the same thing.

So I'm laying off the debates for a while.

For cat-lovers here are some photos of the latest bunch: