Tom, posted October 1, 2017 (edited October 10, 2017)

I have developed an MIT-licensed set of Ansible roles for Kazoo called kazoo-ansible. The goal of kazoo-ansible is to give the entire community an easy way to manage a repeatable installation of a Kazoo cluster, so we can spend less time setting up and maintaining our Kazoo installations and more time on the unique offerings of our VoIP endeavors.

kazoo-ansible has the following features:

- Automatically clusters CouchDB, Freeswitch, Kamailio, and Kazoo
- Let's Encrypt TLS certificate generation for Monster UI, including support for multiple Monster UI hosts
- Uses CouchDB instead of BigCouch
- Splits up roles for CouchDB, Freeswitch, Kamailio, Kazoo, Monster UI, and RabbitMQ to allow lots of cluster customization
- Publishes roles to Ansible Galaxy to allow easy integration into custom playbooks
- Easy to install using the included bootstrap scripts

I have added installation documentation to make the installation more intuitive.

kazoo-ansible is currently in a pre-release state. It is feature-complete and working for my needs; however, I'd really like to get feedback from the community before I release 1.0.0.

Check out kazoo-ansible on GitHub: https://github.com/kazoo-ansible/kazoo-ansible
mc_ (Administrator), posted October 2, 2017

This looks really nice!

I would consider adding, either on the main README or in an INSTALL or similar file, copy/paste-able shell commands for each of the installation steps. What is intuitive and obvious to you may not be to others, and having an example shell session will provide more feedback as people try setting this up for themselves. For instance, if I wanted to play with this, I would need to dig up how to add the user for SSH and sudo, how to run the Kazoo commands, etc.

If there are more hooks into Kazoo that would make life easier for setting this up, do let us know! Anything we can do to facilitate easier setup and maintenance, we'd like to know about.

Thanks again for this work and for sharing it with the community.
Tom (author), posted October 2, 2017

Thanks for the suggestion to improve the documentation. I always find documentation to be more difficult than the actual programming.

Kazoo actually made it very easy to automate this because I was able to use the sup command to perform the clustering. The output of some sup commands is easier for humans to read than for machines, but a quick regex was able to fix that for me. Clustering CouchDB was actually the only hard part.
Uzair Mahmud, posted October 6, 2017

I am currently using your playbooks. I had to disable the firewalld settings because I got a CouchDB access error:

    fatal: [10.0.1.1]: FAILED! => {"changed": false, "failed": true, "msg": "<urlopen error [Errno 111] Connection refused>"}
    fatal: [10.0.1.2]: FAILED! => {"changed": false, "failed": true, "msg": "<urlopen error [Errno 111] Connection refused>"}

Could you elaborate on that? Also, could you tell me more about the role of the kazoo domain variable in group vars?
Tom (author), posted October 6, 2017

The common role should add an exception so that each node in the cluster can communicate. What does /etc/firewalld/zones/kazoo-zone.xml look like on one of the servers in your cluster? Does it match the IP address that will be resolved for the server names you used in /etc/ansible/hosts?

The kazoo domain variable is used to create an Nginx configuration for the domain where Monster UI will be hosted. For example, if your domain is monsterui.example.com, the kazoo domain variable would be monsterui.example.com.
Uzair Mahmud, posted October 6, 2017

OK, so the zones file looks fine. All the proper IPs are there. Even after removing the firewalld commands I still got the error:

    TASK [kazoo-ansible.couchdb : Cluster CouchDB] ********************
    fatal: [10.0.1.1]: FAILED! => {"changed": false, "failed": true, "msg": "<urlopen error [Errno 111] Connection refused>"}
    fatal: [10.0.1.2]: FAILED! => {"changed": false, "failed": true, "msg": "<urlopen error [Errno 111] Connection refused>"}

However, after waiting a bit and running it again, it went through. Maybe CouchDB needs some time to start up? (My machine is a dual Xeon E5 with SSDs.)
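A "Connection refused" that disappears on a second run is what you would expect if the clustering task fires before CouchDB is listening. One way to guard against that is an explicit wait before the clustering step. The following is only a minimal sketch, not the task used by the kazoo-ansible roles: the task name, host, and timing values are assumptions, and port 5984 is the CouchDB HTTP port that appears in the curl commands later in this thread.

```yaml
# Hypothetical task, not taken from the kazoo-ansible roles.
- name: Wait for CouchDB to accept connections
  wait_for:
    host: 127.0.0.1   # assumption: CouchDB is reachable locally on each node
    port: 5984        # CouchDB HTTP port used elsewhere in this thread
    delay: 2          # seconds to wait before the first check (illustrative)
    timeout: 120      # fail the task after two minutes (illustrative)
```

Placing a task like this directly before the clustering task makes the play block until the port is open instead of relying on a fixed pause.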
Uzair Mahmud, posted October 6, 2017

Same kind of error with Freeswitch:

    RUNNING HANDLER [kazoo-ansible.freeswitch : Gracefully Restart FreeSwitch] ********************
    fatal: [10.0.1.1]: FAILED! => {"changed": true, "cmd": "fs_cli -x 'fsctl shutdown asap restart'", "delta": "0:00:00.010998", "end": "2017-10-06 17:01:25.362917", "failed": true, "rc": 255, "start": "2017-10-06 17:01:25.351919", "stderr": "[ERROR] fs_cli.c:1659 main() Error Connecting [Socket Connection Error]", "stderr_lines": ["[ERROR] fs_cli.c:1659 main() Error Connecting [Socket Connection Error]"], "stdout": "", "stdout_lines": []}
    fatal: [10.0.1.2]: FAILED! => {"changed": true, "cmd": "fs_cli -x 'fsctl shutdown asap restart'", "delta": "0:00:00.014279", "end": "2017-10-06 17:01:25.404357", "failed": true, "rc": 255, "start": "2017-10-06 17:01:25.390078", "stderr": "[ERROR] fs_cli.c:1659 main() Error Connecting [Socket Connection Error]", "stderr_lines": ["[ERROR] fs_cli.c:1659 main() Error Connecting [Socket Connection Error]"], "stdout": "", "stdout_lines": []}
    to retry, use: --limit @/root/kazoo-ansible/site.retry

I ran it again and it disappeared.
Tom (author), posted October 6, 2017

It does sound like the Ansible script is running quicker than the components can come online, but that's surprising because your server is really good :). I tested this on a few VMs running on a laptop and on Google Cloud. I'd like to think about how I might be able to replicate your test case.

Were you able to complete the installation?
Uzair Mahmud, posted October 6, 2017

Still working on it. Trying to get past this right now:

    TASK [kazoo-ansible.kazoo : Install Kazoo] ********************
    failed: [10.0.1.2] (item=[u'kazoo-applications-4.1-34.el7.centos', u'kazoo-application-*-4.1-34.el7.centos']) => {"changed": true, "failed": true, "item": ["kazoo-applications-4.1-34.el7.centos", "kazoo-application-*-4.1-34.el7.centos"], "msg": "Error: Package: kazoo-applications-4.1-34.el7.centos.noarch (2600hz-stable)\n Requires: kazoo-core = 4.1-34.el7.centos\n Available: kazoo-core-4.0-0.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-0.el7.centos\n Available: kazoo-core-4.0-1.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-1.el7.centos\n Available: kazoo-core-4.0-2.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-2.el7.centos\n Available: kazoo-core-4.0-3.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-3.el7.centos\n Available: kazoo-core-4.0-4.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-4.el7.centos\n Available: kazoo-core-4.0-5.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-5.el7.centos\n Available: kazoo-core-4.0-6.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-6.el7.centos\n Available: kazoo-core-4.0-7.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-7.el7.centos\n Available: kazoo-core-4.0-8.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-8.el7.centos\n Available: kazoo-core-4.0-9.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-9.el7.centos\n Available: kazoo-core-4.0-10.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-10.el7.centos\n Available: kazoo-core-4.0-11.el7.centos.x86_64 (2600hz-stable)\n kazoo-core = 4.0-11.el7.centos\n

I ended up changing the play command to

    yum install kazoo-applications kazoo-applications-*

removing the versioning and el7.centos.
Tom (author), posted October 6, 2017

I thought hard-coding the versioning would make it more stable, but it seems that the dependencies break when you aren't using the latest version. I will update the roles tonight to use the latest version, and you'll have to update them. I'll post back here when I've done so.

Thank you so much for testing this and helping me identify issues I haven't run into.
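The dependency failure above comes from pinning kazoo-applications to 4.1-34 while the repository only offers kazoo-core 4.0-x. An unpinned install task avoids that mismatch. The following is a rough sketch of what such a task might look like, assuming the package names shown in the error output; it is not the actual task from the kazoo-ansible.kazoo role, and the real role may be structured differently.

```yaml
# Hypothetical replacement for a version-pinned install task.
- name: Install Kazoo
  yum:
    name:
      - kazoo-applications       # package name taken from the error output above
      - kazoo-application-*      # glob taken from the error output above
    state: latest                # let yum resolve a mutually consistent set of versions
```

The trade-off is reproducibility: without a pinned version, two runs on different days can install different Kazoo releases.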
Uzair Mahmud, posted October 6, 2017

No problem! A good Ansible install will help all of us. Thanks for taking the time to make this. I am learning Ansible as I go and understanding the clustering aspects of Kazoo through your work!

I did get an error for SELinux, which I had to disable. I wonder if it's possible to tell Ansible not to do certain things, something like:

    ansible-playbook site.yml -nofirewall -noselinux

I also created a requirements.yml file with all the roles so I can download and install them using

    ansible-galaxy install -r requirements.yml

I usually have it unsecured in a test environment so I can play around with different settings.

I got this error a few times. I just kept rerunning the script and it went OK after 2-3 tries:

    TASK [kazoo-ansible.kazoo : Cluster Freeswitch] ********************
    fatal: [10.0.1.1]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Shared connection to 10.0.1.1 closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_4H0Vm1/ansible_module_freeswithcluster.py\", line 42, in <module>\r\n main()\r\n File \"/tmp/ansible_4H0Vm1/ansible_module_freeswithcluster.py\", line 39, in main\r\n module.fail_json(msg=error)\r\n File \"/tmp/ansible_4H0Vm1/ansible_modlib.zip/ansible/module_utils/basic.py\", line 1997, in fail_json\r\n File \"/tmp/ansible_4H0Vm1/ansible_modlib.zip/ansible/module_utils/basic.py\", line 1977, in _return_formatted\r\n File \"/tmp/ansible_4H0Vm1/ansible_modlib.zip/ansible/module_utils/basic.py\", line 414, in remove_values\r\n File \"/tmp/ansible_4H0Vm1/ansible_modlib.zip/ansible/module_utils/basic.py\", line 414, in <genexpr>\r\n File \"/tmp/ansible_4H0Vm1/ansible_modlib.zip/ansible/module_utils/basic.py\", line 425, in remove_values\r\nTypeError: Value of unknown type: <type 'exceptions.IOError'>, Command failed: {'EXIT',{noproc,{gen_server,call,[ecallmgr_fs_nodes,{connected_nodes,false}]}}}\r\n\r\n\r\n", "msg": "MODULE FAILURE", "rc": 0}
    ok: [10.0.1.2]

Now I have everything installed, but I have two problems.

First, Monster UI is working over HTTPS, but Nginx is not redirecting HTTP to HTTPS for Monster UI. It turns out I have to enter my IP in the HTTP conf file for Nginx, since I don't have DNS.

Second, the sup command is not working. Freeswitch is connected to both ecallmgr nodes. Kamailio sees both Freeswitch servers as dispatchers. CouchDB has all the required DBs.

    epmd: up and running on port 4369 with data:
    name kazoo-rabbitmq at port 25672
    name freeswitch at port 8031
    name ecallmgr at port 11501
    name kazoo_apps at port 11500
    name couchdb at port 33480

But I get this error for the sup command:

    Failed to connect to service kazoo_apps@xyz.pbx with cookie change_me
    Possible fixes:
    * Ensure the Kazoo service you are trying to connect to is running on the host
    * Ensure that you are using the same cookie as the Kazoo node, `sup -c <cookie>`
    * Verify that the hostname being used is a Kazoo node
    kazoo_apps is not running!

/etc/kazoo/core/config.ini has the right cookie set for everything.

Attachment: requirements.yml
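For anyone who wants to reproduce the requirements.yml approach, the file is a plain list of Galaxy roles. The attached file itself is not reproduced in the thread, so the following is only a sketch of what it might contain. The three role names are the ones that appear in the task output above; the names of the remaining kazoo-ansible roles are not shown in this thread and are deliberately left out.

```yaml
# requirements.yml (hypothetical contents, not the attached file)
# Only role names visible in the task output of this thread are listed;
# add the remaining kazoo-ansible roles under the names they are published with.
- src: kazoo-ansible.couchdb
- src: kazoo-ansible.freeswitch
- src: kazoo-ansible.kazoo
```

The file is then consumed with the command quoted above, `ansible-galaxy install -r requirements.yml`.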
Uzair Mahmud, posted October 6, 2017 (edited October 6, 2017)

OK, so I have to run the sup command with -c "my_cookie" to make it work. It seems the default cookie did not get set to the new one.

I am also having trouble logging in to CouchDB from the browser with the given user. I get the following error:

    {gen_server,call, [config, {set,"couch_httpd_auth","secret", "c5f3fe1a02a91777514373a7527b45c5",true,nil}, 30000]}

Through the terminal I also get the following errors from CouchDB:

    [root@navy1 ~]# curl -X PUT http://couchdb:password@localhost:5984/test1234
    {"error":"error","reason":"internal_server_error"}
    [root@navy1 ~]# curl -X PUT http://couchdb:password@localhost:5984/test1234
    {"error":"file_exists","reason":"The database could not be created, the file already exists."}

I am on a LAN without a DNS server, so that might explain some of this. I also ran this playbook multiple times, which might have broken some things as well.

It's pretty late here, so that's it for today from me. I really appreciate the CouchDB clustering and the rest of the clustering work. Reading through your playbooks deepened my understanding of Kazoo setup to another level in record time. This also gave me a reason to finally learn and start using Ansible! Thanks for your efforts. Let's make this more robust!
Uzair Mahmud, posted October 7, 2017 (edited October 7, 2017)

I spun up a bunch of VMs for testing and only ran the script fresh on them. I got rid of all the errors by adding pauses before those commands. I also figured out how to use tags in Ansible to control which parts of the scripts to run and which parts to exclude.

One problem is that I can't log in to Fauxton on port 5984 or 5986. I get the following error:

    CRASH REPORT Process config (<0.5859.0>) with 0 neighbors exited with reason: no match of right hand value {error,eacces} at config_writer:save_to_file/2(line:38) <= config:handle_call/3(line:242) <= gen_server:try_handle_call/4(line:629) <= gen_server:handle_msg/5(line:661) <= proc_lib:init_p_do_apply/3(line:240) at gen_server:terminate/7(line:826) <= proc_lib:init_p_do_apply/3(line:240); initial_call: {config,init,['Argument__1']}, ancestors: [config_sup,<0.87.0>], messages: [], links: [<0.88.0>], dictionary: [], trap_exit: false, status: running, heap_size: 17731, stack_size: 27, reductions: 76043

The final problem I am having is that the sup command is not picking up the cookie I specify in group_vars/all. It still tries to use the default cookie, change_me.

The rest is working extremely well.
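Tags are the standard Ansible answer to the earlier "-nofirewall -noselinux" idea. The sketch below shows the general pattern only; the task names and the tag names `firewall` and `selinux` are made up for illustration, and the kazoo-ansible roles may tag their tasks differently or not at all.

```yaml
# Hypothetical tasks showing how tags can gate optional steps.
- name: Enable firewalld
  service:
    name: firewalld
    state: started
    enabled: yes
  tags:
    - firewall        # illustrative tag name

- name: Put SELinux in enforcing mode
  selinux:
    policy: targeted
    state: enforcing
  tags:
    - selinux         # illustrative tag name

# Run the playbook while skipping the tagged tasks:
#   ansible-playbook site.yml --skip-tags "firewall,selinux"
```

`--skip-tags` excludes matching tasks, while `--tags` runs only the matching ones, which is handy for re-running a single stage on a test cluster.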
Uzair Mahmud, posted October 7, 2017 (edited October 7, 2017)

Figured out the CouchDB login problem from Fauxton. I had to run

    chown -R couchdb:couchdb /opt/couchdb

Something is resetting the permissions on the /opt/couchdb/ folder. It's probably just my setup that is causing this.

Also, the sup problem goes away after a restart. User error on that one.
Tom (author), posted October 8, 2017 (edited October 9, 2017)

I was able to fix:

- CouchDB clustering errors: allow time for CouchDB to start before attempting to cluster it
- FreeSwitch clustering errors: wait 30 seconds after DB creation to give Kazoo time to fully start
- Dependency errors: do not hard-code Kazoo versions

CouchDB login problem: I ran into this issue as well, and it seems to be Kazoo-related. I changed the owner of the /opt/couchdb folder and restarted Kazoo. It appears that Kazoo is changing the ownership. We might need to ask the 2600hz team for help with this.

You can re-install all of the Ansible roles with --force (for example, ansible-galaxy install -r requirements.yml --force). Let me know if you run into any more issues.
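Until the cause of the ownership change is found, the manual `chown -R couchdb:couchdb /opt/couchdb` workaround can be expressed as an Ansible task so it is easy to re-apply. This is a sketch only, assuming the `couchdb` user and group from the chown command earlier in the thread; it is not part of the kazoo-ansible roles, and it treats the symptom rather than the cause.

```yaml
# Hypothetical workaround task, equivalent to: chown -R couchdb:couchdb /opt/couchdb
- name: Restore ownership of the CouchDB installation
  file:
    path: /opt/couchdb
    state: directory
    owner: couchdb
    group: couchdb
    recurse: yes      # apply to everything below /opt/couchdb
```

If something resets the ownership again after the play finishes, the task has to be re-run, so it is a stopgap at best.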
Uzair Mahmud, posted October 9, 2017

Thanks for the updates. Checking them out right now.

I changed the cookie to "change_me", and that's how I got the installation to work. If I change the Erlang cookie to anything else, then install and reboot, I keep getting the error:

    Failed to connect to service kazoo_apps@xyz.pbx with cookie change_me

For CouchDB, the installation folder could be changed. That's the only other option for now.
Tom (author), posted October 9, 2017

Are you thinking that we should retarget the installation to /opt/kazoocouchdb or something like that?

I haven't been able to replicate the cookie issue on a brand new cluster, I'm afraid. If I'm able to find out anything more, I'll let you know.
Uzair Mahmud, posted October 9, 2017 (edited October 9, 2017)

Thanks.

Another thing I noticed is that my Kamailio is registered in the ACL with subnet /32, while I am on subnet /8. I wonder if this can be extracted automatically from the network settings.

I will do more testing on my machines to figure out the sup cookie thing. As far as I know, you are supposed to change the cookie in config.ini in the Kazoo folder, and everything there is set to the right cookie.
Tom (author), posted October 9, 2017 (edited October 9, 2017)

Subnets: /32 is the correct ACL, since /32 is a single IP address. Since the clustering is automated, we can ensure that only the exact Kamailio IP addresses are whitelisted.

Cookie: The Kazoo role does change the cookie. I'm really confused why it's not working. You might have to restart kazoo-applications and kazoo-ecallmgr for the cookie file used by sup to actually be written.
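If a restart is indeed what causes the cookie file used by sup to be written, the restart can be made part of the play rather than a manual step. A minimal sketch, assuming the two services are managed by systemd under the names used in this thread; this is an illustration, not a confirmed fix and not a task from the kazoo-ansible roles.

```yaml
# Hypothetical task: restart both Kazoo services after the cookie is changed.
- name: Restart Kazoo services so the new cookie takes effect
  systemd:
    name: "{{ item }}"
    state: restarted
  with_items:
    - kazoo-applications
    - kazoo-ecallmgr
```

In the meantime, passing the cookie explicitly with `sup -c <cookie>`, as the sup error message suggests, works around the problem.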
Uzair Mahmud, posted October 9, 2017

I see what you are saying about the network. Learned something new.

I have noticed that it's not just the couchdb folder that gets its ownership changed. I installed net-data for server stats in the /opt folder, and that one also kept changing its owner.
Tom (author), posted October 9, 2017

We may need to ask a Kazoo developer for help with this. The Ansible roles don't go around changing these permissions. The permissions get changed once you restart kazoo-applications or kazoo-ecallmgr.
Sean Wysor (2600Hz employee), posted October 9, 2017

What is the owner being changed to? I do not know of anything offhand that sets permissions in /opt outside of /opt/kazoo.
Tom (author), posted October 10, 2017

Sean Wysor said:
> What is the owner being changed to? I do not know of anything offhand that sets permissions in opt outside of /opt/kazoo.

The kazoo user takes ownership of everything under /opt:

    [tnewman@kazoo opt]$ ls -la /opt
    total 0
    drwxr-xr-x.  4 kazoo root     34 Oct 10 01:08 .
    dr-xr-xr-x. 17 root  root    224 Oct  9 23:33 ..
    drwxr-xr-x.  9 kazoo couchdb 122 Oct 10 01:06 couchdb
    drwxr-xr-x.  8 kazoo daemon  107 Oct 10 01:11 kazoo

Here are the steps to verify this:

1. Fix the ownership:

    sudo chown -R couchdb /opt/couchdb

   The permissions are now correct:

    [tnewman@kazoo opt]$ ls -la /opt
    total 0
    drwxr-xr-x.  4 kazoo   root     34 Oct 10 01:08 .
    dr-xr-xr-x. 17 root    root    224 Oct  9 23:33 ..
    drwxr-xr-x.  9 couchdb couchdb 122 Oct 10 01:06 couchdb
    drwxr-xr-x.  8 kazoo   daemon  107 Oct 10 01:11 kazoo

2. Restart Kazoo:

    sudo systemctl restart kazoo-applications

   The permissions are incorrect again:

    [tnewman@kazoo opt]$ ls -la /opt
    total 0
    drwxr-xr-x.  4 kazoo root     34 Oct 10 01:08 .
    dr-xr-xr-x. 17 root  root    224 Oct  9 23:33 ..
    drwxr-xr-x.  9 kazoo couchdb 122 Oct 10 01:06 couchdb
    drwxr-xr-x.  8 kazoo daemon  107 Oct 10 01:11 kazoo
Tom (author), posted October 10, 2017

On 10/9/2017 at 6:56 PM, Sean Wysor said:
> What is the owner being changed to? I do not know of anything offhand that sets permissions in opt outside of /opt/kazoo.

The kazoo-applications and kazoo-ecallmgr startup scripts take recursive ownership of /opt/. I have submitted a pull request at https://github.com/2600hz/kazoo-configs-core/pull/8, and I hope someone can review it for me.
Sean Wysor (2600Hz employee), posted October 11, 2017

Oh, good catch. I will merge this and we will update the packages related to this.