SocketSink: a tool for 2600Hz Kazoo users to gather and store websocket events


Hello all! I thought I'd share some progress I've made on an app that connects to 2600Hz websockets, listens for the events that come across, and saves them, then stores those files on Backblaze for later processing. Today I can only give you the 10,000-foot view of how it works and basic instructions for getting it going. I'll be adding another agent later that commits the events to a MySQL DB and does some processing to make them user readable. If you want to join in the fun, though, you might want to get this agent up and running now so you can start gathering the data.

Repo: https://github.com/wildernesstechie/socketsink

10,000 foot view:

2 VPS machines run sink.py - These agents connect to your 2600Hz cluster and record the events coming off of it to memory. After 30 minutes, each agent terminates, saving the events to a gzipped file and uploading it to Backblaze.
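Conceptually, each sink agent's cycle looks something like the sketch below. This is my own paraphrase, not the repo's actual code: the function names are made up, the 2600Hz subscription handshake is elided, and the websocket/B2 calls are only loaded when used.

```python
import gzip
import json
import time


def serialize_events(events):
    """Pack a list of event dicts into a gzipped, newline-delimited JSON blob."""
    raw = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    return gzip.compress(raw)


def run_sink(ws_url, lifetime_secs=30 * 60):
    """Buffer incoming events in memory for `lifetime_secs`, then return the blob.

    The real agent also sends a subscription message with your auth token and
    bindings after connecting; that handshake is omitted here.
    """
    import websocket  # third-party: pip install websocket-client

    events = []
    deadline = time.time() + lifetime_secs

    def on_message(ws, message):
        events.append(json.loads(message))
        if time.time() >= deadline:
            ws.close()  # 30 minutes are up; stop collecting

    ws = websocket.WebSocketApp(ws_url, on_message=on_message)
    ws.run_forever()
    return serialize_events(events)


def upload_to_b2(blob, key_id, app_key, bucket_name, file_name):
    """Push the gzipped blob to a Backblaze B2 bucket via b2sdk."""
    from b2sdk.v2 import B2Api, InMemoryAccountInfo  # pip install b2sdk

    api = B2Api(InMemoryAccountInfo())
    api.authorize_account("production", key_id, app_key)
    api.get_bucket_by_name(bucket_name).upload_bytes(blob, file_name)
```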

1 VPS running dbupload.py - This agent collects files from the unprocessed folder on Backblaze and commits them to a MySQL database. A series of SQL statements then transforms the raw events into friendlier data like hold time, park time, ringing time, missed, answered, etc.
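To give a feel for the kind of transform involved, here's a toy version of one metric in plain Python. The event and field names (`CHANNEL_CREATE`, `CHANNEL_ANSWER`, `call_id`, `timestamp`) are assumptions on my part; the real transforms run as SQL against the imported tables.

```python
def ringing_times(events):
    """Pair CHANNEL_CREATE with CHANNEL_ANSWER per call ID and compute
    ringing duration in seconds. Calls with no answer come back as None,
    i.e. a missed call."""
    created, answered = {}, {}
    for ev in events:
        call_id = ev["call_id"]
        if ev["event_name"] == "CHANNEL_CREATE":
            created[call_id] = ev["timestamp"]
        elif ev["event_name"] == "CHANNEL_ANSWER":
            answered[call_id] = ev["timestamp"]

    results = {}
    for call_id, t0 in created.items():
        if call_id in answered:
            results[call_id] = answered[call_id] - t0  # rang for this long
        else:
            results[call_id] = None  # never answered
    return results
```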

1 VPS running a WebGUI - You'll be on your own for this one. I'm using Databik for this, but it's not free or open source, so I can't give it out to you all.

 

Basic Install Instructions:

Install Python 3.6+

Clone the repo

Set up a venv

pip install the required packages (websocket-client, kazoo-sdk, b2sdk, mysql-connector-python)

Get yourself a Backblaze B2 account and set up API keys for it

Go into sink_settings.py and put in your Backblaze API key and ID

Name your unprocessed folder (it can be anything)

Put your account ID where it says 'KazooAccountIdHere' and your API key where it says 'ApiKeyHere' (sorry, only API keys work ATM)

If you aren't on hosted, you'll need to update your API base and webhook base URLs.

Change to another CA bundle if you'd like.
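Put together, the settings edits above amount to something like this. Every variable name here is illustrative - check the actual sink_settings.py in the repo for the real ones:

```python
# sink_settings.py -- names are illustrative, not necessarily the repo's actual ones

B2_KEY_ID = "your-backblaze-key-id"
B2_APP_KEY = "your-backblaze-api-key"
B2_UNPROCESSED_FOLDER = "unprocessed"  # can be anything

KAZOO_ACCOUNT_ID = "KazooAccountIdHere"  # your 2600hz account ID
KAZOO_API_KEY = "ApiKeyHere"             # only API keys work ATM

# Only needed if you're not on 2600hz hosted:
API_BASE = "https://your-cluster.example.com:8443/v2"
WEBHOOK_BASE = "wss://your-cluster.example.com:5443"

CA_BUNDLE = "/etc/ssl/certs/ca-certificates.crt"  # swap in another bundle if you'd like
```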

Then set up cron to run the agent at minutes 0 and 30.

Rinse and repeat on a second VPS, except cron at minutes 15 and 45.
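The two crontabs would look roughly like this, assuming the repo is cloned to /path/to/socketsink with the venv inside it (adjust paths to taste). Since each agent lives for 30 minutes, the 15-minute offset means the two machines' capture windows always overlap.

```shell
# VPS 1 crontab: start a sink at :00 and :30
0,30 * * * * cd /path/to/socketsink && ./venv/bin/python sink.py

# VPS 2 crontab: same thing, offset by 15 minutes
15,45 * * * * cd /path/to/socketsink && ./venv/bin/python sink.py
```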

 

Sit back and watch the data flow into your Backblaze! I'll be back soon with the agent that imports to a DB. (It's already done, it just needs some cleanup.)

Merry Gift Giving Holiday Of Your Choosing and New Year!!! :)

Edited by Rick Guyton (see edit history)
4 minutes ago, FASTDEVICE said:

@Rick Guyton have you considered using https://kafka.apache.org/ to store the data?

Yea, it's mentioned on the git, actually. For people familiar with Kafka administration, I'm sure it's a great fit. I'm interested in generating reports on a daily/weekly/monthly basis. If I were doing realtime, that'd be the ticket. But I'm not sure what it offers otherwise to balance out the headache of another component to learn and admin.


I wasn't thinking real-time reporting but an account (or several for that matter) can generate a lot of messages to capture. Kafka appears to be a good way to continually capture the websocket data and generate reports later.  Here is a guide for developing a Kafka connector.  https://docs.confluent.io/current/connect/devguide.html

 


14 hours ago, FASTDEVICE said:

I wasn't thinking real-time reporting but an account (or several for that matter) can generate a lot of messages to capture. Kafka appears to be a good way to continually capture the websocket data and generate reports later.  Here is a guide for developing a Kafka connector.  https://docs.confluent.io/current/connect/devguide.html

 

I mean, it would be cool to use it and gain familiarity with it. Maybe someone who knows Kafka can fork my project or use bits and pieces of it to make the connectors? I personally have zero experience with it, so I couldn't say whether it's even possible or whether the tech is a good fit. From what I've heard, it's more for creating full pub/sub microservices architectures, much like how 2600Hz uses RabbitMQ, and that's a bit overkill for my purposes. I'm currently using this to gather events for all of my sub-accounts. Extrapolating from that, I think this can easily accommodate 15k+ devices, more if you drop the subscriptions I have in there for Qubicle. And if you have that many devices, you should probably be on private cloud or global infrastructure, where you can listen directly to RabbitMQ instead of using websockets.

