Category Archives: Technology


Play(book) with Ansible


Recently I was in a meeting with a customer discussing the software selection for an automation product. The usual suspects were there: Puppet, Chef, Ansible, HP Server Automation/Operations Orchestrator.

A few months ago I tried Puppet in my spare time, and just setting it up took a considerable amount of time. Then I tried Ansible and started having fun writing playbooks right away: in no time I put together some useful playbooks to prepare an entire infrastructure without learning a new language! Declarative, YAML-based files are a good choice for me since they can also double as documentation.

Puppet and Chef are surely good solutions, but for a quick start I was really impressed by Ansible and I definitely recommend it (at three years old it is relatively new compared to the others).

For example, to install some Perl modules and check out the app from its repository:

- name: CPANM prerequisites
  cpanm: name={{ item }}
  with_items:
    - Term::ReadLine::Perl
    - Term::ReadKey
    - YAML
    - DBD::mysql
    - DateTime
    - Mojo::mysql
    - EV
    - IO::Socket::Socks
    - IO::Socket::SSL
    - Net::DNS::Native
    - Data::Dumper
    - CGI
    - Socket
    - Switch
    - Config::Abstract::Ini
    - List::Util
    - MIME::Lite
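  environment:
    # assumption: http_proxy, https_proxy and perlbrew_root are variables defined elsewhere (vars or inventory)
    http_proxy: "{{ http_proxy }}"
    https_proxy: "{{ https_proxy }}"
    PERLBREW_ROOT: "{{ perlbrew_root }}"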
  tags: [ cpan_prerequisites ]

- name: Copy the code from repository
  git: repo={{ repository }} dest={{ app_home }} force=yes

Give me six hours to chop down a tree and I will spend the first four sharpening the axe.

Abraham Lincoln



Get rid of Nagios (just kidding)


I had this post in draft for quite a long time, and now I finally have some time to publish it. Beware: it is based on quite old articles, but I’d like to give my opinion since this kind of all-or-nothing claim should be taken with a grain of salt.

Let’s start with Gartner (1,2), for those who believe in the almighty “magic quadrant”. Besides stating a lot of obvious things (“Nagios is free and you can’t beat that”), the articles have some faults. One of the biggest: for every product, free or not, you (or your outsourcer) must have the skills to use and maintain it. That holds even for HP OpenView Operations or IBM Tivoli. And if a valuable person leaves your company and takes the skills on a particular product away, you cannot blame… the product. Maybe start by not relying on a “one-person team”.

Nagios helped me augment several HP Operations installations, with benefits for both the customer (who gets a better monitoring environment) and HP (which keeps its leadership in the large enterprise and gets feedback about possible enhancements). Where Nagios (or a similar OSS solution) is unbeatable is where you can’t spend a lot of money on licenses (but hopefully are willing to pay for support).

Speaking of tech support: I prefer honest support through mailing lists and GitHub issues, without guarantees but with a lot of goodwill, to commercial support that is formally correct, very rigid, and still comes without any guarantee that the problem will be resolved. Good tech support is difficult to achieve even for big names, mostly because of company size: to solve problems fast it’s imperative to reach the technical people, who are often buried under contract options, level 1 help desks, support teams, escalation managers and so on.

A different and more interesting (even funny) approach is Andy’s:

 

OK, Nagios is dated (I began using it back when it was still called NetSaint). So what?

It means that it is mature and you can rely on it. And for a lot of customers who are not on the cutting edge, that’s fine.

For a reasonable amount of money (for support and for someone who gets it running… like me) you’ll get a product that you can control, with lots of documentation and without forums hidden behind… a support contract.

I won’t go into the technical details, but Andy has a good point: Nagios is starting to feel a little dated, but there are a lot of add-ons that deliver very good solutions. Now. And given the LEGO-style approach (you pick the pieces you like and combine them to reach your goal) it wins hands down when compared to all-in-one solutions (which usually solve a few problems well and leave a lot unsolved).



Your custom data visualization


By building on the available software it’s incredibly easy to add value for the customer.

Let’s take, as an example, a client request to map some internal data onto an on-premises cloud infrastructure: there isn’t an off-the-shelf solution for that, and a custom development like this can provide meaningful insights at a glance in a small amount of time!

Building with open source often blows your mind!



Collaboration tools

Speed up your team collaboration by using next-gen tools and stop sending Excel/Word attachments!
A centralized, fully searchable work history shared across a team is the best way to keep everyone updated and to achieve your goals faster (never heard of DevOps?).

We chose these tools to complement our activities and to provide 24/7 feedback to our clients:

Slack + Screenhero
Redmine

If Redmine falls short (for example with some big spreadsheets) then we can switch to OpenOffice or Google Docs!



Your feeds on feedsushi.com

In mid-2013 we started looking for a solution to the demise of Google Reader and found Tiny Tiny RSS. Instead of using it only on our servers, we created a hosted version on Amazon Web Services with a dedicated iOS app (in collaboration with indie developer Claudia Grilli).

 

This is the result.

Do you want to know more about our usage of AWS? Contact us!


Use case: a robust, distributed and scalable Nagios infrastructure


Today I want to share with you one of my recent[1] implementations of a distributed, scalable and fault-tolerant Nagios infrastructure: an open source solution (backed by commercial support if you need it) that is gaining ever more traction in enterprise environments worldwide.

When working on a large-scale server farm it became obvious that a single Nagios installation on a single machine can’t properly support a large number of host and service checks.
There are of course many different solutions to this problem. The first is to set up distributed (master-slave) monitoring with multiple Nagios instances (I built such a solution for a former client with 5 geographically distributed branches and a small, stable number of checks). This kind of solution is not the best if you want to scale quickly to a higher number of hosts and services, and its configuration is also somewhat difficult to maintain (to ease this problem you have to do a little scripting to keep everything in sync).

A running Gearman


Today there are newer technologies that offer a more interesting solution, first of all the wonderful Gearman. To link Nagios and Gearman another great piece of software was used: mod_gearman. This broker module puts every job (checks, event handlers, performance data) into the appropriate queue on the Gearman server, ready to be processed by workers on an arbitrary number of remote hosts (which need not be dedicated exclusively to monitoring, if you prefer).
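To make the queue idea concrete, here is a toy, in-process model in Python. It does not talk to a real Gearman server and is not how mod_gearman is implemented; it only sketches the routing described above. The queue names (host, service, eventhandler, perfdata, hostgroup_<name>) are modeled on mod_gearman's as far as I recall them, so treat them as assumptions, and ToyJobServer, queue_for and the job fields are hypothetical names invented for the illustration.

```python
from collections import defaultdict, deque


def queue_for(job):
    """Pick a queue name for a job dict (toy routing rules, see assumptions above)."""
    if job["type"] in ("eventhandler", "perfdata"):
        return job["type"]
    if job.get("hostgroup"):
        # Checks for a dedicated hostgroup get their own queue.
        return "hostgroup_" + job["hostgroup"]
    return job["type"]  # plain "host" or "service" checks


class ToyJobServer:
    """Stands in for the Gearman server: one FIFO queue per queue name."""

    def __init__(self):
        self.queues = defaultdict(deque)

    def submit(self, job):
        """What the broker does: put the job into the appropriate queue."""
        name = queue_for(job)
        self.queues[name].append(job)
        return name

    def fetch(self, subscribed):
        """What a worker does: take the next job from the queues it serves."""
        for name in subscribed:
            if self.queues[name]:
                return self.queues[name].popleft()
        return None  # nothing to do right now
```

A worker that serves only ["hostgroup_branch_a"] would never see jobs from the generic host and service queues, which is what lets you place workers close to the machines they check.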

For our needs the basic Nagios Core was sufficient (since no one sees the web interface; more on this in a different post). We started with 4 Red Hat Enterprise Linux 6.x virtual machines (per the customer’s specifications) with 4 cores each, which are responsible for executing checks and event handlers. Two of them are “special”: they also run a gearmand and a Nagios Core instance; one is the production instance and the other is the standby (which can be used as a second web console). The first mechanism that provides fault tolerance is built on VMware (vMotion/HA). In the (relatively unlikely) event that the first Nagios Core becomes unavailable, the second one (which is kept in sync) can take over with a few manual steps (by choice, not automatically). When we wanted to further increase the capacity of this infrastructure we cloned one of the worker machines, adding a fifth worker in a matter of minutes.

An example of a logarithmic graph


On top of that, two parallel PNP4Nagios installations deliver performance data to the RRD files on each Nagios Core server (this also avoids having to keep the huge perfdata directory in sync). A few custom templates were developed to visualize data from different RRD files in a single dynamically generated graph. As a useful complement we also added the MK Livestatus broker to efficiently query Nagios from external scripts.

Each additional piece of software (gearmand, the mod_gearman NEB module and workers, check commands) was compiled from scratch to allow maximum flexibility in pairing fully functional versions.

The result serves more than 1500 hosts and more than 11000 services (and counting).
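As an illustration of what such an external script can look like, here is a minimal Python sketch that builds a Livestatus query and sends it over the broker's UNIX socket. It is not one of the scripts used in this installation. The helper names build_query and query_livestatus are made up for the example, and the socket path in the usage comment is an assumption: it depends on how the broker module is configured on your system.

```python
import json
import socket


def build_query(table, columns, filters=()):
    """Build a Livestatus query string asking for JSON output."""
    lines = ["GET " + table, "Columns: " + " ".join(columns)]
    lines += ["Filter: " + f for f in filters]
    lines.append("OutputFormat: json")
    # A blank line terminates the query.
    return "\n".join(lines) + "\n\n"


def query_livestatus(socket_path, query):
    """Send one query to the Livestatus UNIX socket and return the parsed rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal that the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))


# Example usage (hypothetical socket path, adjust to your setup):
# rows = query_livestatus(
#     "/var/lib/nagios/rw/live",
#     build_query("services", ["host_name", "description", "plugin_output"],
#                 ["state = 2"]),
# )
# for host, service, output in rows:
#     print(host, service, output)
```

With the filter state = 2 the example asks only for services in CRITICAL state, so the monitoring core does the filtering and the script receives just the rows it needs.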

Nagios infrastructure

Want to know more? Contact us.

  1. Well, it’s almost two years old and showing quite an impressive uptime.