Debugging Rails with pry within a Docker container

If you are running your Rails apps in production with Docker, you may be tempted to still run them locally outside of Docker. One of the big reasons to potentially hold back is the use of pry.

However, it's actually very simple to attach to a running Docker process, which makes using pry a breeze.

 

First, ensure that you have the following lines in your docker-compose file for the service you want to attach to:
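The two options that matter are tty and stdin_open, which keep a TTY and STDIN attached so pry can take over the terminal. A minimal sketch, assuming your service is called web:

    web:
      tty: true
      stdin_open: true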

Now, rebuild your container using docker-compose up -d, which will recreate the container if needed.

Next, use docker ps to find the container ID and then docker attach to attach to the running process.
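For example (the container ID is whatever docker ps reports for your app):

    $ docker ps
    $ docker attach <container-id>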

Then insert a binding.pry somewhere in your code…
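For example, in a hypothetical controller (the class and model names here are placeholders):

    class PostsController < ApplicationController
      def show
        @post = Post.find(params[:id])
        binding.pry # execution pauses here in the attached terminal
      end
    end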

 

You can now perform your usual debug commands. When done debugging, type exit to leave the pry session.

To detach from the container without exiting, press control + p and then control + q. Note that if you hit control + c instead of the escape sequence, the container process will exit.

Docker for Developers Part 2

The Saga continues

This is a continuation of my previous post, which was an introduction to Docker geared for developers. In the previous post, we got somewhat familiar with the Docker ecosystem and built a very simple hello world application. In this part we are going to get DynamoDB running locally, run a one-off script to preload data, build a sample API service that reads from Dynamo, and finally a sample website that reads from that API service. As with the first part, you can retrieve a copy of the completed code base from GitHub or directly download the assets for part 2 here.

Running DynamoDB

Before we can really start building our API server, we need a place to store data. As we learned in the previous part, we can use docker-compose files to orchestrate the various services that make up our application. This could be a Postgres or MySQL instance; in this case, however, the application is going to leverage DynamoDB for storage. When developing the application it doesn't make sense to create an actual Dynamo table on AWS, as you would incur costs for something that is only going to be used for development. There are some caveats and limitations to this, however, which you can read about here. Since we would never want a local DynamoDB container running in any environment other than development, we want to go ahead and edit docker-compose.override.yml with the following:
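Something along these lines; the dynamodb service name is an assumption (the later link and the seed script refer to it), and amazon/dynamodb-local is the current official image, so the original post likely used one of the community DynamoDB Local images of the time:

    dynamodb:
      image: amazon/dynamodb-local
      restart: always
      expose:
        - "8000"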

Save that file and run docker-compose up -d, and you should see output showing that the dynamodb container is running alongside our movie-api instance.

Now that DynamoDB is running, we need to link our API container to it so that they can communicate. Our Compose file needs one small change:
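That is, add a links entry to the movie-api service, roughly:

    movie-api:
      links:
        - dynamodb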

Rerun docker-compose up -d, which will recreate the movie-api-1 container.

Loading data into DynamoDB

Now that we have DynamoDB running, we can show how to run a one-off command to invoke a script that will seed data into Dynamo. First, we can create the directory structure for storing our sample data and a place for our seed scripts. Within the movie-api directory, create two new directories: one called scripts and the other data. We can use an Amazon-published sample data set for building this application. You can retrieve it here and extract it to the data directory.

The application directory structure should now look like this:
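Roughly like this (moviedata.json is the file inside the Amazon sample archive):

    movie-api/
    ├── Dockerfile
    ├── app.py
    ├── requirements.txt
    ├── data/
    │   └── moviedata.json
    ├── demo/
    │   ├── __init__.py
    │   └── services/
    │       ├── __init__.py
    │       └── api.py
    └── scripts/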

Now, within the scripts directory, let's write a simple script for seeding this data into our local DynamoDB instance. Within the movie-api/scripts directory, create a file called seed_movie_data.py and open it for editing.
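A sketch using boto's dynamodb2 API; the table name, key schema, and the choice to store the info blob as a JSON string are all assumptions, and the original script may differ:

    import json
    import os

    from boto.dynamodb2.fields import HashKey, RangeKey
    from boto.dynamodb2.layer1 import DynamoDBConnection
    from boto.dynamodb2.table import Table
    from boto.dynamodb2.types import NUMBER

    # Connect to the linked dynamodb container rather than real AWS
    conn = DynamoDBConnection(
        host='dynamodb',
        port=8000,
        aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID', 'foo'),
        aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY', 'bar'),
        is_secure=False)

    # Create the movies table if it does not already exist
    if 'movies' not in conn.list_tables()['TableNames']:
        movies = Table.create(
            'movies',
            schema=[HashKey('year', data_type=NUMBER), RangeKey('title')],
            throughput={'read': 5, 'write': 5},
            connection=conn)
    else:
        movies = Table('movies', connection=conn)

    # Batch write the sample movies into the table
    with open('data/moviedata.json') as f:
        data = json.load(f)

    with movies.batch_write() as batch:
        for movie in data:
            batch.put_item(data={
                'year': movie['year'],
                'title': movie['title'],
                'info': json.dumps(movie.get('info', {})),
            })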

 

This script is pretty straightforward: it will create the table within DynamoDB if it doesn't exist, and then seed it with a little over 4,000 movies from the JSON file in our data directory.

Before we can run our script, we need to set a few environment variables on our movie-api instance. Go ahead and open up docker-compose.override.yml and adjust it to reflect the following:
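The movie-api override entry grows an environment section; the rest is what we set up in part 1, plus the link from above:

    movie-api:
      build: ./movie-api
      volumes:
        - ./movie-api:/code
      ports:
        - "8080:5000"
      links:
        - dynamodb
      environment:
        AWS_ACCESS_KEY_ID: foo
        AWS_SECRET_ACCESS_KEY: bar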

The AWS credentials do not need to be anything real; in fact, leave them as above (foo and bar). They just need to be set to prevent boto from barfing when connecting to the local DynamoDB instance. In a production setting we would leverage IAM roles on the instance to connect, which is far more secure than setting credentials via environment variables.

Once you have created the script and set the environment variables, we can run the following to recreate the container with the new environment variables and then run the script.
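Something along these lines; docker-compose run executes a one-off command in a fresh container for the service:

    $ docker-compose up -d
    $ docker-compose run --rm movie-api python scripts/seed_movie_data.py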

As you can see, we ran the script that we created within the container. We do not need to have our Python environment set up locally. This is great, because the container's environment is isolated from every other application we may be developing (and even from other services within this application). This provides a high degree of certainty that if we deploy this image, it will function as we expect. Furthermore, we also know that if another developer pulls this code down and runs the project, it will work.

Building our API

Now that we have the data in our datastore, we can build a simple, lightweight API to expose this information to a client. To keep things simple, we are going to create a single endpoint which will return all the movies that were released in a specified year. Let's go ahead and open up movie-api/demo/services/api.py in our IDE and add a bit of code to make this happen.
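A sketch of what that could look like; the /movies/<year> route and the shape of the JSON payload are assumptions:

    import os

    from boto.dynamodb2.layer1 import DynamoDBConnection
    from boto.dynamodb2.table import Table
    from flask import Flask, jsonify

    app = Flask(__name__)

    conn = DynamoDBConnection(
        host='dynamodb',
        port=8000,
        aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID', 'foo'),
        aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY', 'bar'),
        is_secure=False)

    movies = Table('movies', connection=conn)


    @app.route('/')
    def index():
        return 'Hello World!'


    @app.route('/movies/<int:year>')
    def movies_by_year(year):
        # year is the table's hash key, so this is a simple key query
        results = movies.query_2(year__eq=year)
        return jsonify(movies=[item['title'] for item in results])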

Save this and then we can try querying our service with a simple curl command:
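Assuming the route from the sketch above:

    $ curl http://192.168.99.100:8080/movies/1993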

Building our Website

Now that we have an API that returns the data we need, we can create a simple web interface with a form that will present this data in a nice fashion. There are a few ways to make this happen; normally I'd recommend using something like Angular, however to further demonstrate container linking and such I will use a separate Flask app.

Within the movie-web directory we need to create our app skeleton. To simplify things, copy the Dockerfile, app.py, and requirements.txt files from movie-api to movie-web. Besides copying those three files, go ahead and create the following directory structure and empty files so that the output of tree matches the below.
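Roughly the following; the templates directory is where Flask will look for the two templates we create further down:

    movie-web/
    ├── Dockerfile
    ├── app.py
    ├── requirements.txt
    └── demo/
        ├── __init__.py
        └── services/
            ├── __init__.py
            ├── site.py
            └── templates/
                ├── index.html
                └── results.html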

In requirements.txt remove the boto reference and add in requests.
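So requirements.txt ends up something like this (the version pins are illustrative):

    flask==0.10.1
    requests==2.7.0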

Open up app.py and edit the third line to reflect the below
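Presumably importing the app from the new site module instead of api, i.e. something like:

    from demo.services.site import app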

Go ahead and create demo/services/site.py and open it up within your IDE.
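For now, a placeholder along the lines of the api service's hello world will do; we will build out the real endpoints shortly:

    from flask import Flask

    app = Flask(__name__)


    @app.route('/')
    def index():
        return 'Hello World! I am the movie-web service.'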

Now to get things running we just need to edit our docker-compose.yml and docker-compose.override.yml files.

docker-compose.yml:
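The base file gains a movie-web service alongside the existing movie-api definition, roughly:

    movie-web:
      restart: always
      expose:
        - "5000"
      links:
        - movie-api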

docker-compose.override.yml:
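And the override file gains the development conveniences for movie-web, mirroring what we did for movie-api (port 8081 on the docker machine, per the URL used below):

    movie-web:
      build: ./movie-web
      volumes:
        - ./movie-web:/code
      ports:
        - "8081:5000"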

With those changes saved, we can run docker-compose up -d, which should launch our container. We can verify connectivity with curl.
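For example:

    $ curl http://192.168.99.100:8081/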

Now let's create a couple of templates and build out the endpoints on our webapp so that we can perform a simple search.

Open up demo/services/templates/index.html:
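A minimal form will do; the field name year and the /search action are assumptions that the site.py sketch below matches:

    <!DOCTYPE html>
    <html>
      <head><title>Movie Search</title></head>
      <body>
        <h1>Search for movies by year</h1>
        <form action="/search" method="post">
          <input type="text" name="year" placeholder="e.g. 1993">
          <input type="submit" value="Search">
        </form>
      </body>
    </html>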

Open up demo/services/templates/results.html:
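A simple Jinja2 template that lists whatever the API returned:

    <!DOCTYPE html>
    <html>
      <head><title>Results for {{ year }}</title></head>
      <body>
        <h1>Movies released in {{ year }}</h1>
        <ul>
          {% for movie in movies %}
            <li>{{ movie }}</li>
          {% endfor %}
        </ul>
        <a href="/">Search again</a>
      </body>
    </html>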

And finally edit demo/services/site.py:
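A sketch that ties it together; the movie-api hostname comes from the container link, and the /movies/<year> route matches the API sketch from earlier:

    from flask import Flask, render_template, request
    import requests

    app = Flask(__name__)

    # movie-api resolves via the compose link; 5000 is the port the
    # API listens on inside its own container.
    API_URL = 'http://movie-api:5000/movies/%s'


    @app.route('/')
    def index():
        return render_template('index.html')


    @app.route('/search', methods=['POST'])
    def search():
        year = request.form['year']
        response = requests.get(API_URL % year)
        movies = response.json().get('movies', [])
        return render_template('results.html', year=year, movies=movies)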

If you visit http://192.168.99.100:8081/ in your web browser, you should see the following:

[Screenshot: movie-db-index]

Enter 1993 and you should see the following results:

[Screenshot: movie-db-results]

At this point, that completes our application and this tutorial on Docker for developers. To recap: we installed and set up the Docker Toolbox, then demonstrated how to use docker-machine, docker, and docker-compose to build a sample application that uses DynamoDB, an API service, and finally a web application to view a sample dataset. You should now be familiar with creating a Dockerfile to build an image, and with using Compose to orchestrate and run your application. You have defined environment variables, container links, and ports, and even leveraged the ability to map a volume, coupled with Flask's ability to reload on file changes, to rapidly speed up development.

One thing you may have noticed is that we spent most of the time dealing with application code, and not a whole lot of time working with Docker itself. That's kind of the point. One of the greatest strengths of Docker is that it simply gets out of your way. It makes it incredibly easy to rapidly iterate and start working on your application code.

While this wraps up my two-part series on Docker for developers, I'll be writing additional posts centered around Docker for QA and operations engineers. These will focus on testing, CI/CD, production deployments, service discovery, and more.

Docker for Developers Part 1

Summary

 

Recently I started working on a few projects where Docker seemed like a great fit to rapidly speed up development. In one case we wanted to build a prototype service that contained an API endpoint backed by four microservices. The Docker landscape is still green, with a lot of toolsets not even a year old. While I feel the development side of things is great, production deployment, auto scaling, and release management are still lacking. One of the projects I have been following closely is Rancher, which seems to be on track to solve all of these things. This will be a series of posts initially focusing on development and the building of a fully featured sample application demonstrating the power of Docker running locally. I will add posts documenting CI with Jenkins, through to a production deploy and management on AWS.

What will we do?

This tutorial is going to walk through the creation of a sample web application that utilizes a sample API service backed by DynamoDB. Specifically we will:

  1. Layout the structure of our application and explain why things are laid out like they are.
  2. Build a sample hello world flask app that shows the initial power of docker and docker-compose.
  3. Run a DynamoDB container locally for development.
  4. Load some data into that local dynamodb install.
  5. Build a sample flask API that reads from that local dynamodb instance.
  6. Build a sample flask website that reads from the API and returns some basic data.

Setup

1. So to dive right into it: you will need two things installed on your machine to work through this tutorial. Currently everything below assumes you are running on OS X, however it should work just fine under Linux as well.

  • Install Virtualbox: https://www.virtualbox.org/wiki/Downloads
  • Install the docker toolbox: https://docs.docker.com/installation/mac/

2. Assuming that you have never used boot2docker before (and if you did, you should be prompted with instructions on how to convert to docker-machine), run the following command to set up a default docker machine. This will be a virtual machine where all the various containers you launch will run. More on that in a minute.
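That command is:

    $ docker-machine create --driver virtualbox default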

3. You can now run eval "$(docker-machine env default)" to set the required environment variables. If you will be launching docker containers often, you might even elect to put this in your .bashrc or .zshrc file.

4. You should be able to run docker ps and see the following:
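With no containers running yet, that is just the empty header row:

    $ docker ps
    CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES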

 

Helpful Commands

This area serves as a quick reference for various commands that may be helpful as a developer working with Docker. A lot of these may not make sense just yet, and that's fine. You will learn more about them below and can always come back to this one spot for reference.
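The original list was likely along these lines; all of these are standard docker, docker-compose, and docker-machine commands:

    # list running containers (add -a to include stopped ones)
    docker ps
    # tail the logs of a container
    docker logs -f <container-id>
    # open a shell inside a running container
    docker exec -it <container-id> bash
    # list local images
    docker images
    # remove a container / an image
    docker rm <container-id>
    docker rmi <image-id>
    # build and start everything defined in the compose files
    docker-compose up -d
    # rebuild images after a Dockerfile change
    docker-compose build
    # print the environment variables for the default machine
    docker-machine env default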

 

Building our application

So now that the tooling is set up, we can discuss what our project structure will look like. You can see a complete, working copy of the below project on GitHub, and you can grab just the files we create here in part 1.

Initial Skeleton

  1. First create a directory called docker-movie-db-demo somewhere.
  2. Within that directory create two directories: one called movie-api and the other movie-web.

It should look like this:
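Just the project root with the two empty application directories:

    docker-movie-db-demo/
    ├── movie-api/
    └── movie-web/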

We created two directories that are going to house two separate applications. The first, movie-api, is going to be a simple Flask API server backed by data in DynamoDB. The second application, movie-web, is going to have a simple web interface with a form and allow the user to list movies from a certain year.

Our first Flask App

Within movie-api go ahead and create a few empty files and directories so that your structure matches the below. We will go through and touch these files one by one.
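Something like the following; the __init__.py files are empty and are assumed here so that demo and demo.services import as Python packages:

    movie-api/
    ├── Dockerfile
    ├── app.py
    ├── requirements.txt
    └── demo/
        ├── __init__.py
        └── services/
            ├── __init__.py
            └── api.py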

app.py

Open app.py up and let's toss a few lines in.
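Reconstructed from the description below, it would be something like:

    import sys

    from demo.services.api import app

    if __name__ == '__main__':
        # The port to listen on comes in as the first argument,
        # e.g. "python app.py 5000". debug=True turns on Flask's reloader.
        app.run(host='0.0.0.0', port=int(sys.argv[1]), debug=True)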

This is pretty basic: it will initialize a Flask app from demo/services/api.py and listen on the port specified as the first argument when running python app.py 5000.

requirements.txt

Open requirements.txt and add in the following
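Along these lines; the version pins are illustrative, not the post's originals:

    flask==0.10.1
    boto==2.38.0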

This is also pretty straightforward, but we are ensuring we install Flask. Boto is installed for interfacing with DynamoDB. I'll have to write a separate article on why it's important to pin versions and the headaches that can solve down the line.

demo/services/api.py

For now let's just add in the following:
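Per the description below, a hello world route:

    from flask import Flask

    app = Flask(__name__)


    @app.route('/')
    def index():
        return 'Hello World!'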

We are adding a simple route for the index page of the api service that for now just returns the text “Hello World!”

Dockerfile

The Dockerfile is where the magic happens. Depending on how you are used to doing development, you might create a virtualenv somewhere, or possibly a Vagrant image. This certainly works, however you often end up with a bunch of files scattered everywhere, mismatches between your virtualenv and someone else's (if you aren't careful), and/or multiple Vagrant images floating around that slow down your machine.

Open the Dockerfile (Note that the Dockerfile should have a capital D) up and paste in the following:
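Reconstructed from the step-by-step description that follows:

    FROM python:2.7
    ADD . /code
    WORKDIR /code
    RUN apt-get update -y
    RUN pip install -r requirements.txt
    CMD ["python", "app.py", "5000"]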

When running the build command, this tells the system how to build an image. In this case it will use a Python 2.7 base image, copy the CWD (movie-api) to /code in the container, set the CWD to /code, run an apt-get update, pip install our requirements, and then finally run our application. If you want more details, you can read the Dockerfile reference here, which explains what's going on in detail and what's possible.

At this point we have enough of a skeleton to actually build an image and run a container if we wanted to.
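From within movie-api, that round trip looks roughly like this (the image and container names are arbitrary):

    $ docker build -t movie-api .
    $ docker run -d -p 8080:5000 --name movie-api movie-api
    $ curl $(docker-machine ip default):8080
    $ docker stop movie-api
    $ docker rm movie-api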

We just built an image, spun up a container based off that image, queried the service and got a response, stopped the service, and deleted the container. However, every time you make a code change you're going to have to rerun the build command and then relaunch your container. If you're doing quick iterative development this can get annoying quickly. There is a better way.

 

Introducing docker-compose

The docker-compose files are how we orchestrate the building and running of our containers in an easier fashion. We can leverage how these files work, together with Flask's built-in reload-on-file-change system, to enable rapid iterative development.

docker-compose.override.yml is special. When you run the docker-compose command, it will look for docker-compose.yml and docker-compose.override.yml. If present, it will go ahead and merge them and then perform actions based on that merged data.

We leverage this behavior to build our development environment. If we wanted a test environment for example, we would add a docker-compose.test.yml file, and when running docker-compose target that environment with docker-compose -f docker-compose.yml -f docker-compose.test.yml. However, this is generally only done by build systems, and so we use the override file for development as it keeps the command for developers simpler as they don’t need to specify -f.

docker-compose.yml

Within the root of our project directory (docker-movie-db-demo) lets create a file called docker-compose.yml and make it look like so:
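Something like the following, matching the description below (compose v1 file format, with services at the top level):

    movie-api:
      restart: always
      expose:
        - "5000"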

We have just defined a movie-api service that for now has no image, will always restart on failure, and exposes the container port 5000. You can see the full docker-compose file reference here.

As mentioned above, the override file will allow us to override some things in the movie-api base compose definition to make development a little bit faster.

Create and edit a file called docker-compose.override.yml and make it look like so:
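Matching the description below; the build path and mount target are assumptions, with /code being where the Dockerfile puts the application:

    movie-api:
      build: ./movie-api
      volumes:
        - ./movie-api:/code
      ports:
        - "8080:5000"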

If you remember, in our Dockerfile we copy the movie-api files into the image during its build. This is great when you want to make a container that you start shipping around to various environments such as test, staging, and production. However, when you just want to do local development, building that same container every time is time consuming and annoying. With our override file, we have made it so that we mount our code base within the running container. This allows us to use our favorite IDE locally to do development and immediately see those changes reflected. We also have exposed port 5000 on the container and mapped it to port 8080 on our docker-machine. This makes it a little bit easier to debug and test. In production you generally wouldn't do this; I'll detail more in a separate article focusing on production deployment of this workflow.

Starting our movie-api app.

Now, from the root of our project directory (docker-movie-db-demo) run the following command:
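That is:

    $ docker-compose up -d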

 

You can tail the logs of the container by running:
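For example:

    $ docker-compose logs movie-api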

So already you can see that starting up the container is simpler. However, things really shine when you start editing code. Fire up vim or your favorite IDE and edit movie-api/demo/services/api.py:
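For example, change the return value of the index route (the new message is arbitrary):

    @app.route('/')
    def index():
        return 'Hello Docker!'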

If you kept tailing the logs, you will see that it instantly reloaded, and if you run curl 192.168.99.100:8080 again, you will see that the output changes.

[Screenshot: docker-movie-db-demo-1]

Wrapping it up

This concludes part 1 of the tutorial. In summary, we laid out the structure for our project, went through how to set up a machine for Docker, and built a sample Flask application that returns a hello world message. We also walked through how to make changes to the application and test those changes in real time without having to rebuild the image and redeploy the container.

In the next post, I'll focus on adding a local DynamoDB instance, how to run one-off data scripts to do things like loading data, and building a sample web interface that interacts with the API.

Part 2 Here

Akismet: Simple Comment Spam Filtering

So I have developed a few websites in the past year that included blogs. One of the big issues is comment spam. Normally the standard procedure has been to implement reCAPTCHA. While I do love helping to digitize books, recently several methods have come to light for bypassing reCAPTCHA. The amount of Viagra comments started to get to an extreme. There are certainly some less creative solutions, such as automatically dropping comments with certain keywords or blocking comments from outside of the country. You could also simply block all comments by default and approve them as they come in, but that inhibits active discussion on posts and frankly is kind of crappy.

I did some research and happened upon Akismet. This service offers a pay-as-much-as-you-want model, which I love. You can pay as little as nothing or as much as $120/yr. Now that I had found a service, the question came down to implementation. I found a fantastic gem called rakismet which offers easy integration and provides helper methods for checking if a message is spam and, best of all, letting Akismet know whether a comment was truly spam or not via API calls.

I won't dive into the setup of the gem because the instructions are really fantastic. I do want to briefly talk about a great way to implement this functionality to place comments marked as spam into a queue for approval or deletion.

My Comment Model includes:
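A sketch of the relevant pieces; the column names mapped in rakismet_attrs are assumptions about the schema, and the approved boolean column defaults to true in its migration:

    class Comment < ActiveRecord::Base
      include Rakismet::Model

      # Map our columns onto the fields Akismet expects
      rakismet_attrs author: :name,
                     author_email: :email,
                     content: :body
    end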

The important thing here is really the approved bit, as by default, comments are approved. When you iterate over your comments you can check to see if approved is true before displaying the comment in your view.

Our comment controller might look something like…
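A hypothetical shape for the create action; the model associations and redirect targets are placeholders:

    class CommentsController < ApplicationController
      def create
        post = Post.find(params[:post_id])
        comment = post.comments.build(params[:comment])

        # spam? calls out to the Akismet API; when it returns true we
        # unapprove the comment so it lands in the moderation queue.
        comment.approved = false if comment.spam?

        if comment.save
          redirect_to post, notice: 'Comment submitted.'
        else
          render :new
        end
      end
    end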

The important thing to note in the example above is the spam method. When we call it on a comment, this reaches out to the Akismet API and checks the comment to see if it's considered spam. The API will return either true or false; if true, the comment is spam and we set its approved status to false. You can now leverage this data in your admin panel or wherever to see your comments that are pending approval.

So the best thing about this service is that you get to help improve it. You can use something like the below to help Akismet learn which comments are spam and which aren't.
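rakismet exposes this as the spam! and ham! methods:

    # Tell Akismet it missed one: this comment really is spam
    comment.spam!

    # Report a false positive: this comment is legitimate (ham)
    comment.ham!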

 

Overall my experience with the service so far has been excellent. Integration is easy with any Rails app, documentation is excellent, and you can't beat the pricing model of pay whatever you want. Your users are happy because they don't have to try to guess the hard-to-read words, solve a math problem, or select the cats.

TL;DR: Captchas are dead. Get rid of them, and start effortlessly filtering comments with Akismet.

Multi SSH (MuSSH)

Recently I ran into a situation where I needed to run a command on a large number of servers. There are existing libraries out there, but I found that they wouldn't run under OS X for one reason or another, or they didn't handle a large number of servers very well. Normally I do this with something along the lines of:
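Roughly this sort of loop (the hosts file is a placeholder):

    for host in $(cat hosts.txt); do
      ssh "$host" 'uptime'
    done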

And while the above does work, it suffers from being a big loop. It connects to a server, runs the command, disconnects, connects to the next, and so on.

 

I ended up putting something together in Ruby which can either take a comma-delimited list on the command line, or read from a file if you have a large number of hosts or happen to use the same hosts often.
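Not the actual code (that's linked below), but a sketch of the idea using net/ssh with one thread per host:

    require 'net/ssh'

    # ARGV[0] is either a file of hosts or a comma-delimited host list
    hosts = if File.exist?(ARGV[0])
              File.readlines(ARGV[0]).map(&:strip)
            else
              ARGV[0].split(',')
            end
    command = ARGV[1]

    hosts.map do |host|
      Thread.new do
        Net::SSH.start(host, ENV['USER']) do |ssh|
          puts "#{host}:\n#{ssh.exec!(command)}"
        end
      end
    end.each(&:join)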

You can see the latest code here on github.

Resolving a large list of domains from CSV

This script is pretty simple: I was tasked with taking a CSV file that had about 5,000 domains and seeing if any resolved to a particular IP address.
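A sketch with Ruby's resolv and csv libraries; the file name and target IP are placeholders:

    require 'csv'
    require 'resolv'

    target = '203.0.113.10' # the IP we are checking for

    CSV.foreach('domains.csv') do |row|
      domain = row.first
      begin
        puts domain if Resolv.getaddress(domain) == target
      rescue Resolv::ResolvError
        # domain did not resolve; skip it
      end
    end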

Straightforward, and it does the trick.

Installing PG gem under OS X when Bundle install fails

The other day a coworker ran into an issue installing the postgres gem on her laptop while doing a bundle install. I have run into the issue a couple of times myself and it's a quick fix; however, the first time I encountered it, it took a few minutes longer than it probably should have to figure out.


 

This is a pretty simple fix. First, we need to ensure that you have Postgres installed. You can download and install it here: http://www.postgresql.org/download/

After that is installed, simply locate your pg_config file (an alternative would be to run find / -name pg_config, although this may take a while). You could also do which psql91 (or whatever version you have installed) and look at the symlink in that directory. Generally your install should be in /opt/local/lib/.

Once we have located the directory containing pg_config, we simply install the gem manually by running the below.
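The flag to pass is --with-pg-config; the path below is an example and should point at your own install:

    gem install pg -- --with-pg-config=/opt/local/lib/postgresql91/bin/pg_config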

And there we go: the gem is installed, and a bundle install should now complete without issue.

Mass Updating Amazon Web Services S3 Permissions

Recently I ran into an issue where I needed to update the Access Control Lists (ACLs) for about 1.5 million pictures. Amazon currently doesn't provide an easy way to do this, and current tools such as 3Hub can't handle such a large volume.

By far the biggest flaw when it comes to mass updating in current S3 utilities is the lack of threading. With 3Hub I was able to update the permissions on around 400k images in a 30-hour period. I did not want to leave my laptop at work yet again, so I decided to just write a script that could do this quicker and run on a server.
Here is the code that I used to accomplish this.
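The original script predates the current AWS SDK, so here is a rough modern-equivalent sketch; the bucket layout, key naming, and thread count are all assumptions:

    require 'aws-sdk-s3'
    require 'yaml'

    config = YAML.load_file('config/s3.yml')

    s3 = Aws::S3::Client.new(
      region: config['region'],
      access_key_id: config['access_key_id'],
      secret_access_key: config['secret_access_key'])

    # Quick and dirty threading: ten threads, each taking a fixed
    # slice of the 1..9999 id range.
    threads = (0...10).map do |n|
      Thread.new do
        ((n * 1000 + 1)..[(n + 1) * 1000, 9999].min).each do |id|
          # put_object_acl REPLACES the entire ACL on the object
          s3.put_object_acl(bucket: config['bucket'],
                            key: "pictures/#{id}.jpg",
                            acl: 'public-read')
        end
      end
    end
    threads.each(&:join)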
The config file is pretty straightforward and goes into a directory called config
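Something along these lines (all values are placeholders):

    # config/s3.yml
    region: us-east-1
    access_key_id: YOUR_KEY
    secret_access_key: YOUR_SECRET
    bucket: my-picture-bucket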
Now obviously the script above would need to be adapted; in my case I had a list of ids from 1 to 9999. My method above is just a quick and dirty way to thread the application without having to pull a list of the IDs and feed it out. In our development environment, which contains about 300,000 images, I was able to update all of the permissions in about an hour.
Something to note: if you run this script and update the ACLs, it replaces your ACLs with the above. So if you have 5 different ACLs, be sure to set them all above so that they are applied. If you need to change only a specific ACL and not the others, then this script would need to be adapted to do that.

Using tmux for collaboration

Summary
An old coworker of mine recently showed me the wonders of using tmux when pairing up. Using tmux you can easily share your screen with multiple people who can both observe and make changes. Rather than pull a coworker from across the hall or use a join.me session, you can easily show them something with just a couple of quick commands.
Setup
Copy the below to your .bash_profile and then source that file (source ~/.bash_profile).
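The original functions weren't preserved here, so this is a reconstruction of what they likely looked like based on the descriptions below; the socket naming and the listing logic are assumptions:

    # Start a named session backed by a world-accessible socket in /tmp
    tmux-start () {
      tmux -S "/tmp/$1" new-session -s "$1" -d
      chmod 777 "/tmp/$1"
      tmux -S "/tmp/$1" attach -t "$1"
    }

    # List candidate sessions by showing the sockets in /tmp (OS X stat syntax)
    tmux-list () {
      for s in /tmp/*; do
        [ -S "$s" ] && echo "$(stat -f '%Su' "$s")  $(basename "$s")"
      done
    }

    # Join a session by name
    tmux-join () {
      tmux -S "/tmp/$1" attach -t "$1"
    }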
Usage
Creating
To start a session that someone can join, simply type tmux-start and a session name.
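For example:

    $ tmux-start pairing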
The way tmux works is it drops a socket file in /tmp with your session name. Normally this socket is only accessible by the user that created it. The chmod 777 changes the file permissions to allow it to be used by all.
Listing
You can list the current sessions by typing tmux-list. This will spit out a list of the users and the names of the sessions you can join.
Joining
You can join a session by typing tmux-join and the session name.
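For example:

    $ tmux-join pairing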