CDR/CEL Processing – Climbing the Beanstalk

One of the most annoying tasks in Asterisk (or VoIP in general) is CDR and event processing. Why is it so annoying? Well, depending on your infrastructure, problems can arise from any of the following:

  • Row locking within the database
  • Handling of multiple input points
  • Handling a constantly changing data set
  • Split brain processing issues with clusters
  • Replication of data records between multiple data processing points
  • Data synchronizing
  • Data uniqueness and data consistency
  • etc.

The primary issue is to make sure that when you process an event or a CDR, it is processed once and only once.

Asterisk provides multiple backends for processing CDR and CEL records. These include log files, MySQL, Postgres, ODBC, and others. However, all of these are more or less prone to the same caveats listed above.

When we started developing our cloud platform, cloudonix.io, we were confronted with the following issues for processing CDR/CEL records:

  • Our platform had an unknown number of Asterisk servers, autoscaling at unknown times.
  • Our platform was split into multiple cloud zones and regions.
  • Our platform was spitting out multiple event types that required different types of processing.

Thus, we took the normal track everybody takes: logging into an RDBMS of some sort. The result was annoying. Processing was slow, record locking was always an issue, and, worst of all, providing multiple processing points to handle the events was becoming increasingly difficult. The solution: beanstalkd (http://kr.github.io/beanstalkd).

For lack of a better description, beanstalkd is simply a job queue. It accepts a free-form string into a queue (called a ‘tube’ in the beanstalk world). Jobs can be pushed into the queue (put), extracted from the queue (reserve), deleted from the queue (delete), and more. Our idea for operation was really simple: CDR records can be processed relatively slowly, while CEL records need to be processed relatively quickly. We want multiple clients to reserve work from the queue, process each job exactly once, and then either delete the job or return it to the queue upon failure.
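To make the put/reserve/delete vocabulary concrete, here is a minimal sketch of how those commands are framed in beanstalkd's plain-text protocol. It only builds and parses the byte strings and does not open a connection. The function names, the 60-second time-to-run default, and the sample payload are assumptions made for illustration; they are not taken from the Asterisk modules.

```python
from typing import Tuple

CRLF = b"\r\n"

def use_command(tube: str) -> bytes:
    """Producer side: select the tube that following 'put' commands go into."""
    return f"use {tube}".encode("ascii") + CRLF

def put_command(body: bytes, priority: int = 99, delay: int = 0, ttr: int = 60) -> bytes:
    """Frame a 'put': a header line with priority, delay, time-to-run and
    body length, followed by the job body itself."""
    header = f"put {priority} {delay} {ttr} {len(body)}".encode("ascii")
    return header + CRLF + body + CRLF

def watch_command(tube: str) -> bytes:
    """Consumer side: add a tube to the list that 'reserve' pulls from."""
    return f"watch {tube}".encode("ascii") + CRLF

# 'reserve' takes no arguments; the server answers "RESERVED <id> <bytes>".
RESERVE = b"reserve" + CRLF

def delete_command(job_id: int) -> bytes:
    """Remove a job for good once it has been processed."""
    return f"delete {job_id}".encode("ascii") + CRLF

def release_command(job_id: int, priority: int = 99, delay: int = 0) -> bytes:
    """Hand a reserved job back to the queue, e.g. after a processing failure."""
    return f"release {job_id} {priority} {delay}".encode("ascii") + CRLF

def parse_reserved(header: bytes) -> Tuple[int, int]:
    """Parse a 'RESERVED <id> <bytes>' reply line into (job_id, body_length)."""
    word, job_id, size = header.strip().split()
    if word != b"RESERVED":
        raise ValueError(f"unexpected reply: {header!r}")
    return int(job_id), int(size)

if __name__ == "__main__":
    # Hypothetical payload; the real record format is whatever the backend emits.
    print(use_command("asterisk-cdr") + put_command(b"sample cdr record", priority=1))
```

In practice a client library hides this framing, but seeing it shows why the queue can accept any free-form string: the server only needs the byte count of the body.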

Thus, cdr_beanstalkd and cel_beanstalkd were born.

The beanstalk backends are fairly similar in nature, and provide a means of inserting CDR/CEL events directly from Asterisk to a remote beanstalkd server. So, how do we work with beanstalkd? Let’s have a look.

Step 1: Install the beanstalkd server

Some of the major distributions provide beanstalkd as a precompiled package, so try that first; otherwise, download the source. Pay attention to the installation and configuration, as you will need those details later on. Normally, the installation will make the beanstalkd server available on port 11300.

Note: beanstalkd doesn’t include any type of authorization or security, so make sure you block this port from the outside world. Remember, security is on you!

Step 2: Configure cdr_beanstalkd.conf

The following sample shows the cdr_beanstalkd.conf file:

;
; Asterisk Call Management CDR via Beanstalkd job queue
;
; Beanstalkd is a simple job queue server, that is highly versatile and simple to use.
; Beanstalkd includes the capability of using multiple queues at the same time, with priorities.
;
; This module requires that your server has the beanstalk-client library installed. The library
; can be downloaded from - https://github.com/deepfryed/beanstalk-client
;

[general]
enabled = yes

host = 127.0.0.1    ; Specify the remote IP address of the Beanstalkd server
port = 11300        ; Specify the remote PORT of the Beanstalkd server
tube = asterisk-cdr ; Specify the default CDR job queue to use
priority = 99       ; Specify the default job priority for the queue. This parameter is useful when building
                    ; a platform with multiple Asterisk servers that are used for different functions. For example,
                    ; non-billable CDR records can be inserted with a priority of 99, while billable ones can be
                    ; inserted with a priority of 1

You will need to uncomment the lines to activate the backend. The important configuration parameters here are tube and priority. The backend enables the insertion of events to a single assigned tube. That means that all CDR records will be inserted to the asterisk-cdr tube. When a job is inserted, you can assign a priority. The lower the number, the higher the priority. The priority mechanism is especially useful when you have multiple Asterisk servers with varying functionality, which require different levels of priority for processing. For example, an Asterisk server that deals with wholesale routing requires a higher priority than one doing voicemail.
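The "lower number wins" rule can be illustrated with a small sketch. This is not beanstalkd code; it models the hand-out order with a heap, assuming that jobs of equal priority come out in the order they were inserted. The function name and the sample payloads are hypothetical.

```python
import heapq
from itertools import count

def drain_in_priority_order(jobs):
    """Return payloads in the order a priority queue would hand them out.

    jobs is an iterable of (priority, payload) pairs. A lower priority
    number is served first; ties keep their insertion order.
    """
    seq = count()  # insertion counter used as a tie-breaker
    heap = [(priority, next(seq), payload) for priority, payload in jobs]
    heapq.heapify(heap)
    ordered = []
    while heap:
        _, _, payload = heapq.heappop(heap)
        ordered.append(payload)
    return ordered

if __name__ == "__main__":
    incoming = [
        (99, "voicemail cdr"),   # inserted first, low priority
        (1, "wholesale cdr"),    # inserted second, high priority
        (99, "ivr cdr"),         # inserted last, low priority
    ]
    for payload in drain_in_priority_order(incoming):
        print(payload)
```

With this ordering, the wholesale record is handed to a worker before either of the priority-99 records, even though it was inserted after one of them.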

Step 3: Test your configuration

Using the console, make sure your new CDR backend is working:

*CLI> cdr show status

Call Detail Record (CDR) settings
----------------------------------
 Logging: Enabled
 Mode: Simple
 Log unanswered calls: No
 Log congestion: No

* Registered Backends
 -------------------
 Adaptive ODBC
 cdr_manager (suspended) 
 cdr-custom
 cdr_beanstalkd
 csv

A similar configuration is available for cel_beanstalkd.conf and the same concepts apply.

Once your configuration is alive and tested, you can use the beanstalkd client libraries to start processing your jobs. One of my favorite libraries for PHP is called Pheanstalk (https://github.com/pda/pheanstalk). It’s a little old, but provides all the functionality that you could want. The overall beanstalkd job life cycle can be described as follows:

put with delay                 release with delay
 −−−−−−−−−−−−−−−−−→ [DELAYED]←−−−−−−−−−−−.
                       |                  |
                       |   (time passes)  |
                       |                  |
 put                   v     reserve      |        delete
−−−−−−−−−−−−−−−−−→ [READY]−−−−−−−−−−−→ [RESERVED]−−−−−−−−−→ *poof*
                      ^  ^                |  |
                      |   \    release    |  |
                      |    `−−−−−−−−−−−−'    |
                      |                      |
                      | kick                 |
                      |                      |
                      |           bury       |
                   [BURIED] ←−−−−−−−−−−−−−−−'
                      |
                      | delete
                       `--------→ *poof*

This means that to process each job exactly once, a worker reserves a job, then deletes it when processing finishes, or releases it if processing fails. A released job goes back to the ready queue, where any worker can reserve it again. The nice thing is that you don’t need to worry about queue integrity or priorities; beanstalkd takes care of that for you. In addition, you can configure beanstalkd to persist the queue to disk, so queued jobs survive a restart and you can continue processing even after a failure.
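The reserve, delete, and release cycle can be sketched as a worker step. To keep the example runnable without a server, it uses a toy in-memory stand-in for a tube; `InMemoryTube` and `work_once` are hypothetical names, and a real worker would make the same three calls through a client library such as Pheanstalk. Unlike this toy, a real server blocks on reserve until a job is available and also returns a job to the ready state if its time-to-run expires.

```python
import heapq
from itertools import count

class InMemoryTube:
    """Toy stand-in for a beanstalkd tube: a READY heap plus a RESERVED map."""

    def __init__(self):
        self._ready = []      # heap of (priority, job_id, body)
        self._reserved = {}   # job_id -> (priority, body)
        self._ids = count(1)

    def put(self, body, priority=99):
        job_id = next(self._ids)
        heapq.heappush(self._ready, (priority, job_id, body))
        return job_id

    def reserve(self):
        """Move the best READY job to RESERVED; return None if the tube is empty."""
        if not self._ready:
            return None
        priority, job_id, body = heapq.heappop(self._ready)
        self._reserved[job_id] = (priority, body)
        return job_id, body

    def delete(self, job_id):
        """Drop a reserved job permanently."""
        del self._reserved[job_id]

    def release(self, job_id):
        """Return a reserved job to READY so any worker can pick it up again."""
        priority, body = self._reserved.pop(job_id)
        heapq.heappush(self._ready, (priority, job_id, body))

def work_once(tube, handler):
    """Reserve one job, run handler on it, then delete or release it."""
    job = tube.reserve()
    if job is None:
        return "empty"
    job_id, body = job
    try:
        handler(body)
    except Exception:
        tube.release(job_id)  # processing failed: hand the job back
        return "released"
    tube.delete(job_id)       # processing succeeded: the job is done
    return "done"

if __name__ == "__main__":
    tube = InMemoryTube()
    tube.put("sample cdr record", priority=1)
    print(work_once(tube, print))
```

Because a reserved job is invisible to other workers until it is deleted or released, several workers can run this loop against the same tube without handling the same job at the same time.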

9 Responses

  1. I noticed that the code is available on GitHub, but not in the Asterisk source package.

    Is it still in development? When will it be distributed with the code?

    Regards,
    Marcelo

  2. Why did you pick beanstalkd? There has been no development since 2014.
    Can you compare it to Apache Kafka (pros/cons)?

  3. We originally created the first version of this CDR handler back in 2010, for usage with Asterisk 1.6. When created, it was used internally for one of our side projects, which was later on abandoned and was unmaintained for a while. Back then, the contribution process wasn’t as straightforward as today, thus, it never got properly submitted up-stream.

    Recently, we’ve updated the code base to support latest Asterisk, and thus, at the same time we’ve decided to fully release it to the master branch. I admit, it is a bit dated as a technology tool – but, having said that, it is still a viable tool to use. Even PHPAGI hadn’t been updated in years, yet, it is still the predominant tool with AGI developers.

    As I see it, stability is the most important factor – as a telephone needs to be, first and foremost, stable. It’s not that Kafka isn’t stable; the technical footprint that Kafka requires is far bigger than Beanstalk’s, and thus introduces potential instability factors into the system. Personally, I don’t believe that every new/shiny/cool tech should be used right off the bat. I prefer my technology a bit more seasoned.

    Internally, as a company, we look at technology and evaluate it all the time. For example, we’ve recently reviewed the various ARI libraries and deemed all of them as – improperly maintained, lacking proper documentation, lacking proper deployment guidelines and more. Thus, we decided to create a new one. Yes, I am the creator of PHP-ARI, but we’ve decided not to use it and write something new (which will be released soon enough), simply because it doesn’t fit our new use-case.

    The comparison with Kafka is pointless – these are two fundamentally different tools.

  4. Does beanstalk-client support saving events in case beanstalk-server is temporarily unavailable? I didn’t find it in any description.

    Best regards,
    Daniil

  5. >>> The comparison with Kafka is pointless – these are two fundamentally different tools.

    Not pointless at all. There’s value in people understanding that Kafka is much bigger and more complex, but also has advanced and interesting features. I think beanstalkd is a good pragmatic choice for this; simplicity is a feature, after all. But it is not true that the comparison is pointless: they do overlap on some use cases, and Kafka could be used for this same purpose.

  6. Hello!
    I’m currently trying to build Asterisk with Beanstalkd support, but I keep getting the ‘PJPROJECT BEANSTALK… fail’ error message while running ‘configure’.
    What is the path that I should specify with the --with-beanstalk option? I tried the beanstalk src, the installation bin, and the configuration files, but it just doesn’t get rid of the message 🙁

    Any help would be appreciated
