In my previous post I discussed the benefits of investing time in personal projects. As a continuation, I’ll present one of my latest personal projects in this post. In addition to presenting the details, I’ll list the benefits I gained while hacking on the project.
At the beginning of the year, I was looking for a topic for my next home project. I’ve always been fascinated by machine vision and image processing in general, but I did not have much real experience with image processing methods. I had seen fancy Instagram filters which make it possible, for example, to draw cat ears on an image or video feed. It’s amazing how well these features work these days. During the previous year, I had manually created so-called thug memes from images of my colleagues because I found it funny. Combining all these thoughts, the obvious topic for the next personal project was a Thug Meme generator.
Simply put, the high-level requirements for the project were:
- identify facial landmarks from input image
- draw “thug elements” on top of the original image based on the detected landmarks
- add meme texts to the original image
From the beginning it was clear that I’d do this using my favorite programming language, Python. Some quick research suggested two widely used options for identifying facial landmarks: dlib and Haar cascades with OpenCV. I decided to support both because I thought it would be a nice way to evaluate the two methods. I wanted to wrap the functionality into a project which would be open source and have continuous integration in place. Since GitHub seems to be the de facto SCM for OSS and it integrates with Travis swimmingly, the choice of SCM and CI was clear.
I started the implementation by creating the functionality for landmark detection. With dlib this was rather trivial because the API provides high-level functions that suited my use case. With OpenCV Haar cascades, however, I had to build some custom logic to get decent results. Overall, dlib beat the OpenCV-based detector, which also required a lot of fine-tuning of the detection parameters. This is understandable because dlib uses more modern methods and considerably heavier model files (see dlib vs haarcascades).
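The post does not spell out what the extra logic for the Haar cascade detector was. As a rough illustration of the kind of post-filtering such a detector tends to need, here is a minimal sketch that picks a plausible pair of eyes from raw detections. Everything in it is an assumption made for the example, not the project's actual code: the function name `pick_eye_pair`, the "upper half of the face" rule, and the preference for the two largest candidates. Rectangles are `(x, y, w, h)` tuples, the shape OpenCV cascade detectors return, and the eye rectangles are assumed to be in the same coordinate system as the face rectangle.

```python
def pick_eye_pair(face, eyes):
    """Pick the two most plausible eye rectangles inside a face rectangle.

    Hypothetical helper for illustration only. `face` and every item of
    `eyes` are (x, y, w, h) rectangles in the same coordinate system.
    Returns (left_eye, right_eye) ordered by x, or None if fewer than two
    candidates survive the filtering.
    """
    fx, fy, fw, fh = face

    # Keep only detections whose center lies in the upper half of the face;
    # this drops typical false positives such as nostrils or mouth corners.
    candidates = []
    for ex, ey, ew, eh in eyes:
        cx = ex + ew / 2
        cy = ey + eh / 2
        if fx <= cx <= fx + fw and fy <= cy <= fy + fh / 2:
            candidates.append((ex, ey, ew, eh))

    if len(candidates) < 2:
        return None

    # Prefer the two largest remaining rectangles, then order them by x.
    candidates.sort(key=lambda rect: rect[2] * rect[3], reverse=True)
    left_eye, right_eye = sorted(candidates[:2], key=lambda rect: rect[0])
    return left_eye, right_eye
```

A heuristic like this trades recall for precision: a face with only one detected eye is skipped entirely, which may or may not be what a meme generator wants.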
After getting the landmark detection in place, drawing the thug elements and meme texts was straightforward, although rotating an image while keeping track of the position of a certain pixel required a bit of digging into the math. I needed this for rotating the thug glasses to match the angle of the detected face, and I also wanted a random angle for the thug cigar because I thought it would look more thug in an image with multiple faces. Once all the functionality was in place, I added a simple CLI using Click.
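The rotation math can be sketched in a few lines of plain Python. This is not the project's code, just one way to do it under stated assumptions: the image is rotated about its center, the canvas is enlarged so nothing is cropped, and positive angles rotate counter-clockwise on screen (the convention OpenCV's `cv2.getRotationMatrix2D` uses). The function names `eye_angle_degrees` and `rotate_point_with_canvas` are made up for this example.

```python
import math


def eye_angle_degrees(left_eye, right_eye):
    """Angle of the line from left_eye to right_eye, in degrees.

    Points are (x, y) in image coordinates, where y grows downwards, so a
    positive result means the right eye sits lower in the image.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point_with_canvas(point, size, angle_degrees):
    """Track where `point` ends up when an image of `size` is rotated.

    The image is rotated about its center by `angle_degrees`
    (counter-clockwise on screen) and the canvas grows to fit the result.
    Returns ((new_x, new_y), (new_width, new_height)).
    """
    width, height = size
    cos_a = math.cos(math.radians(angle_degrees))
    sin_a = math.sin(math.radians(angle_degrees))

    # Bounding box of the rotated image.
    new_width = abs(width * cos_a) + abs(height * sin_a)
    new_height = abs(width * sin_a) + abs(height * cos_a)

    # Rotate the offset from the old center, then place it relative to
    # the center of the enlarged canvas.
    off_x = point[0] - width / 2
    off_y = point[1] - height / 2
    new_x = cos_a * off_x + sin_a * off_y + new_width / 2
    new_y = -sin_a * off_x + cos_a * off_y + new_height / 2
    return (new_x, new_y), (new_width, new_height)


# Example: a glasses image whose anchor pixel is its top-left corner,
# rotated to match a face whose right eye sits lower than the left one.
# The sign flips because image y grows downwards.
angle = eye_angle_degrees((10, 10), (20, 20))
anchor, canvas = rotate_point_with_canvas((0, 0), (100, 50), -angle)
```

The returned anchor is what lets the rotated overlay be pasted so that the tracked pixel still lands on the intended landmark.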
As for the results, I guess the image below is worth more than 1000 words of my explanations. On the left are the detection results with OpenCV, in the middle those with dlib, and on the right an example Thug Meme.
Here are some miscellaneous points that I learned along the way:
- Wrapping your sources inside a src directory is an effective way to make sure you run your tests against the installed version of your package.
- Using tox with usedevelop=true seems to be an easy way to test the installed version. In addition, tox is the obvious choice for testing against multiple Python and dependency versions.
- Click is my new best CLI friend.
- I decided to write the readme file in reStructuredText (I had only used Markdown before) because PyPI did not support Markdown. However, I noticed that Markdown support was added at about the same time I was releasing the first version.
- I wanted to try pipenv to see how it fits with a Python library. After playing with it for a while, I noticed that it’s better suited for deployable applications which need to support only a single Python version.
- I used Travis Build Stages to split tests and deployment into separate build steps. Build Stages was still in beta at the time of writing. I think the feature has a lot of potential and is definitely something that Travis was lacking compared to e.g. GitLab and Bitbucket pipelines.
- I found it easy to start playing with image processing using Python and existing libraries. The internet is also full of examples; PyImageSearch, for instance, is a good place to start.
To summarize, I had loads of fun implementing the project and will definitely continue improving it in my spare time. As a positive surprise, the project has got some attention on GitHub. If you want to know more, you can view and clone the project here.
Happy coding memers!