The 4th post in the ‘Automate Leaf and Spine Deployment’ series goes through the creation of the base and fabric config snippets and their deployment to devices. Loopbacks, NVE and intra-fabric interfaces are configured and both the underlay and overlay routing protocol peerings are formed, leaving the fabric in a state ready for services to be added.
- Automate Leaf and Spine Deployment Part1 - introduction and structure
- Automate Leaf and Spine Deployment Part2 - input variable validation
- Automate Leaf and Spine Deployment Part3 - fabric variables and dynamic inventory
- Automate Leaf and Spine Deployment Part4 - deploying the fabric with ansible
- Automate Leaf and Spine Deployment Part5 - fabric services: tenant, interface, route
- Automate Leaf and Spine Deployment Part6 - post validation
- github repo
The following sections start with the Ansible host and N9K pre-configurations and go right through to the deployment of the fabric using Ansible.
Prerequisites
The deployment has been tested on NXOS 9.2(4) and NXOS 9.3(5) using Ansible 2.10.6 and Python 3.6.9 (in theory 9.3(6) and 9.3(7) should also be fine). There are a few nuances when running the different versions of code; see the caveats section in Part1 for more details.
    git clone https://github.com/sjhloco/build_fabric.git
    mkdir ~/venv/venv_ansible2.10
    python3 -m venv ~/venv/venv_ansible2.10
    source ~/venv/venv_ansible2.10/bin/activate
    pip install -r build_fabric/requirements.txt
Once the environment has been set up with all the packages installed, run napalm-ansible to get the location of the napalm-ansible paths and add them to ansible.cfg under [defaults].
Before any configuration can be deployed using Ansible a few things need to be manually configured on all N9K devices:
- Management IP address and default route
- The features nxapi and scp-server are required for Napalm replace_config
- Image validation can take a while on NXOS so is best done beforehand
    interface mgmt0
      ip address 10.10.108.11/24
    vrf context management
      ip route 0.0.0.0/0 10.10.108.1
    feature nxapi
    feature scp-server
    boot nxos bootflash:/nxos.9.3.5.bin sup-1
- Leaf and border switches also need the TCAM allocation changed to allow for arp-suppression. This can differ depending on the device model; any changes made also need correcting in /roles/base/templates/nxos/bse_tmpl.j2 to keep it idempotent
    hardware access-list tcam region racl 512
    hardware access-list tcam region arp-ether 256 double-wide
    copy run start
    reload
The default username/password for all devices is admin/ansible and is stored in the variable bse.users.password. Swap this out for the encrypted type5 password taken from the running config. The username and password used by Napalm to connect to devices is stored in ans.creds_all and will also need changing to match (either in plain text or using vault).
Before the playbook can be run, the devices' SSH keys need adding on the Ansible host. ssh_key_playbook.yml (in the ssh_keys directory) can be run to add these automatically; you just need to populate the devices' management IPs in the ssh_hosts file.
    sudo apt install ssh-keyscan
    ansible-playbook ssh_keys/ssh_key_add.yml -i ssh_keys/ssh_hosts
Base and Fabric role
Both roles are set up in a similar manner, using the variables defined in base.yml, fabric.yml and host_vars to render jinja templates that create the config snippets. Tags are defined on the import_role task in the playbook rather than within the role, so they apply to all tasks within the role.
There are no filter plugins for these roles so the templates do contain a little bit of programmability. This is kept to the bare minimum and is only for differences in configuration between spine and leaf/border and the optional settings in bse.services.
    vars_files:
      - vars/ansible.yml
      - vars/base.yml
    tasks:
      - name: Builds the base config snippet
        import_role:
          name: base
        tags: [bse, bse_fbc, bse_fbc_tnt, bse_fbc_tnt_intf, full]
      - name: Builds the fabric config snippet
        import_role:
          name: fabric
        tags: [fbc, bse_fbc, bse_fbc_tnt, bse_fbc_tnt_intf, full]
The role tasks are simple: they generate the config snippet from the role template and save it to file. The configuration is saved in a device-specific folder within ~/device_configs/device_name/config; the parent directory location can be changed with ans.dir_path.
changed_when: False stops Ansible reporting a change every time the template is rendered, and check_mode: False allows the configuration to still be written to file when the playbook is run in check-mode.
    - name: "FBC >> Generating fabric config snippets"
      template:
        src: "{{ ansible_network_os }}/fbc_tmpl.j2"
        dest: "{{ ans.dir_path }}/{{ inventory_hostname }}/config/fabric.conf"
      changed_when: False
      check_mode: False
Interface cleanup role (intf_cleanup)
To keep the playbook truly declarative, any interfaces that are not used need to be reset to the default settings. For example, if the interfaces used for the MLAG were changed without interface cleanup, the old interfaces would not be wiped, breaking the idempotency.
Interfaces used by the fabric can be defined in the following locations:
- Fabric interfaces: Defined under fbc.adv.bse_intf and turned into host_vars by the inventory_plugin
- MLAG peer-link: Defined under fbc.adv.bse_intf.mlag_peer and turned into host_vars by the inventory_plugin
- MLAG keepalive: Defined under fbc.adv.bse_intf.malg_kalive and turned into host_vars by the inventory_plugin
- End host interfaces: Defined under svc.adv.single_homed and svc.adv.dual_homed and manipulated by the svc_intf_dm method within the format_dm.py filter_plugin
The first task in the intf_cleanup role passes three arguments into the get_intf.py filter_plugin:
- hostvars[inventory_hostname]: Device host_vars which has the total number of physical interfaces and the used fabric interfaces
- fbc.adv.bse_intf: Interface naming format (intf_fmt) for filtering and configuration (for example Ethernet1/)
- flt_svc_intf: End host interfaces (service_interface.yml) using the method flt_svc_intf within format_dm.py filter_plugin from the services role
    - name: "Getting interface list"
      block:
        - name: "INTF_CLN >> Getting list of unused interfaces"
          set_fact:
            flt_dflt_intf: "{{ hostvars[inventory_hostname] | get_intf(fbc.adv.bse_intf, flt_svc_intf | default(None)) }}"
These arguments are used to create a list of used interfaces and a list of available interfaces, which are converted into sets. The symmetric_difference of the two sets (the interfaces that appear in only one of them, so the unused interfaces) is returned to Ansible and stored in the flt_dflt_intf Ansible fact. This fact is used by the role's second task to render the dflt_intf_tmpl.j2 template and generate a config snippet of all the unused interfaces.
    - name: "INTF_CLN >> Generating default interface config snippet"
      template:
        src: "{{ ansible_network_os }}/dflt_intf_tmpl.j2"
        dest: "{{ ans.dir_path }}/{{ inventory_hostname }}/config/dflt_intf.conf"
      changed_when: False
      check_mode: False
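The set logic behind flt_dflt_intf can be sketched in a few lines of Python. This is only an illustration of the symmetric difference idea and not the actual get_intf.py code: the function name, its arguments and the example interface numbers are all assumptions made for the example.

```python
def get_unused_interfaces(intf_fmt, total_intf, used_intf):
    """Return the interfaces that exist on the device but are not in use.

    Hypothetical helper: intf_fmt is the naming prefix (e.g. 'Ethernet1/'),
    total_intf the number of physical ports, used_intf the ports in use.
    """
    # Every physical interface on the device, Ethernet1/1 .. Ethernet1/N
    all_intf = {f"{intf_fmt}{num}" for num in range(1, total_intf + 1)}
    used = set(used_intf)
    # Symmetric difference keeps the elements found in only one of the sets
    unused = all_intf.symmetric_difference(used)
    # Sort on the port number so Ethernet1/10 comes after Ethernet1/9
    return sorted(unused, key=lambda name: int(name.rsplit("/", 1)[1]))


unused = get_unused_interfaces(
    "Ethernet1/", 8, ["Ethernet1/1", "Ethernet1/2", "Ethernet1/7"]
)
print(unused)
# ['Ethernet1/3', 'Ethernet1/4', 'Ethernet1/5', 'Ethernet1/6', 'Ethernet1/8']
```

The symmetric difference only equals "unused interfaces" when every used interface is also in the full list of physical interfaces; a used interface that falls outside that range would show up in the result as well.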
The template renders the default interface configuration as taken from show run all. It must match exactly, including the hashed out (!#) lines.
    {% for intf in flt_dflt_intf %}
    interface {{ intf }}
      !#shutdown
      !#switchport
      switchport mode access
      !#switchport trunk allowed vlan 1-4094
    {% endfor %}
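To see what the rendered snippet looks like without running the playbook, the template loop can be approximated in plain Python. This is a rough sketch rather than the Jinja rendering itself: the function name is made up for the example and the two-space indent under each interface is an assumption about how the device displays the lines.

```python
# Default settings for an unused port, copied from the template above.
# The two-space indent added when rendering is an assumption.
DEFAULT_INTF_LINES = [
    "!#shutdown",
    "!#switchport",
    "switchport mode access",
    "!#switchport trunk allowed vlan 1-4094",
]


def render_default_intf(interfaces):
    """Build the reset snippet for every interface in the list."""
    lines = []
    for intf in interfaces:
        lines.append(f"interface {intf}")
        lines.extend(f"  {cmd}" for cmd in DEFAULT_INTF_LINES)
    # No interfaces means an empty snippet rather than a lone newline
    return "\n".join(lines) + "\n" if lines else ""


snippet = render_default_intf(["Ethernet1/3", "Ethernet1/4"])
print(snippet)
```

Each unused interface produces the same five-line stanza, so the size of dflt_intf.conf grows linearly with the number of free ports.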
The intf_cleanup role is automatically run (using tags) whenever either the fabric or service_interface roles are run.
Assembling config snippets
The config snippets are saved within device-specific directories (~/device_configs/device_name/config) with an extension of .conf. This directory is deleted and recreated at every playbook run. The parent directory location can be changed using ans.dir_path.
Ansible assemble takes all files within the config directory that have an extension of .conf and creates a unified configuration file (config.cfg). For the most part the order of the configuration in the file does not matter, as NXOS is smart enough to work out what is needed. The only gotcha is the order of operation, the same as with manual configuration; for example, a VLAN must be created before it is assigned to an interface.
    - name: "SYS >> Joining config snippets into one file"
      assemble:
        src: "{{ ans.dir_path }}/{{ inventory_hostname }}/config"
        dest: "{{ ans.dir_path }}/{{ inventory_hostname }}/config/config.cfg"
        regexp: '\.conf$'
      changed_when: False
      check_mode: False
      tags: [bse_fbc, bse_fbc_tnt, bse_fbc_tnt_intf, full, merge]
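What the assemble task does can be approximated with a short Python sketch. It is not how the Ansible module is implemented, just the same idea: pick out the files matching the regexp, join them in filename order and write the result to config.cfg. The function name, file contents and hostname are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def assemble_config(config_dir, dest_name="config.cfg", pattern=r"\.conf$"):
    """Join every file in config_dir whose name matches pattern into one file."""
    config_dir = Path(config_dir)
    regex = re.compile(pattern)
    # Fragments are joined in filename order; config.cfg itself ends in
    # .cfg, so it never matches the pattern and is not re-included
    fragments = sorted(
        (p for p in config_dir.iterdir() if p.is_file() and regex.search(p.name)),
        key=lambda p: p.name,
    )
    parts = []
    for fragment in fragments:
        text = fragment.read_text()
        # Make sure one snippet cannot run into the first line of the next
        if text and not text.endswith("\n"):
            text += "\n"
        parts.append(text)
    dest = config_dir / dest_name
    dest.write_text("".join(parts))
    return dest


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "base.conf").write_text("hostname DC1-N9K-LEAF01\n")
    Path(tmp, "fabric.conf").write_text("feature bgp")
    Path(tmp, "notes.txt").write_text("not a snippet\n")
    result = assemble_config(tmp).read_text()

print(result)
```

Because only the file extension decides what is included, any stray .conf file left in the directory would end up in the device configuration, which is one reason the directory is recreated at every run.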
Napalm
Napalm replace_config is used to replace the device's current configuration with the configuration from config.cfg. It is stateless: it doesn't care what is already configured, just what the end result will be. The device is clever enough to work out the difference and ONLY apply the changes needed. Unless the change is disruptive to a feature (for example changing the BGP ASN) there will be no downtime.
- If something isn’t relevant anymore it is cleaned (wiped from the device)
- Changes are only made for the differences; existing config that is already in place is not touched
Napalm uses SCP to copy over candidate_config.txt and checkpoint to create sot_file and rollback_config.txt. Use show file xx to view these.
- show file sot_file: Device config, equivalent of show run all
- show file candidate_config.txt: Config transferred by Napalm that is to be applied
- show file rollback_config.txt: Rollback config, which is the same as the SOT
- show diff rollback-patch file sot_file file candidate_config.txt: Check the diff between the device config and the declared config
API calls are used to copy files over, get the diff and apply the configuration. By default Napalm expects a response to each API call within 60 seconds. This has been increased to 360 seconds as it can take up to 6 minutes to deploy the full configuration (with service roles). If it takes longer (N9Kv running 9.2(4) is very slow) Ansible will report the build as failed, but it is likely the process is still running on the device. Give it a minute and run the playbook again; it should pass with no changes needed.
The applied configuration is automatically saved to ~/device_configs/diff/device_name.txt and optionally printed to screen.
Napalm commit_changes is set to True as Ansible check-mode is used to do a dry run. Check-mode will show you what changes would be made by the playbook if committed, it does everything except actually applying the configuration.
    - name: "CFG >> Applying changes using replace config"
      napalm_install_config:
        provider: "{{ ans.creds_all }}"
        dev_os: "{{ ansible_network_os }}"
        timeout: 360
        config_file: "{{ ans.dir_path }}/{{ inventory_hostname }}/config/config.cfg"
        commit_changes: True
        replace_config: True
        diff_file: "{{ ans.dir_path }}/diff/{{ inventory_hostname }}.txt"
        get_diffs: True
      register: changes
      tags: [bse_fbc, bse_fbc_tnt, bse_fbc_tnt_intf, full]
    - debug: var=changes.msg.splitlines()
      tags: [diff]
Ansible Napalm does not have a dedicated method for rolling back changes so requires a separate task to do so by applying rollback_config.txt.
    - name: "NET >> Rolling back configuration"
      block:
        - net_get:
            src: rollback_config.txt
            dest: "{{ ans.dir_path }}/{{ inventory_hostname }}/config/rollback_config.txt"
          check_mode: False
          connection: network_cli
        - napalm_install_config:
            provider: "{{ ans.creds_all }}"
            dev_os: "{{ ansible_network_os }}"
            timeout: 360
            config_file: "{{ ans.dir_path }}/{{ inventory_hostname }}/config/rollback_config.txt"
            commit_changes: True
            replace_config: True
            diff_file: "{{ ans.dir_path }}/diff/{{ inventory_hostname }}_rollback.txt"
            get_diffs: True
          register: changes
      tags: [rb]
The separate Configure NXOS with Napalm post goes into more detail on deploying with Napalm and some of the issues you are likely to come across.
Running playbook
Tags allow only certain roles, or combinations of roles, to be run. The table below lists the tags that are useful up to this point; there are other tags related to the services roles which are discussed in more detail in Automate Leaf and Spine Deployment Part5 - fabric services: tenant, interface, route.
The base and fabric roles are intrinsically linked so when deploying the only option is to deploy them both (and intf_cleanup).
Ansible tag | Playbook action |
---|---|
bse | Generates the base configuration snippet saved to device_name/config/base.conf |
fbc | Generates the fabric and intf_cleanup configuration snippets saved to fabric.conf and dflt_intf.conf |
bse_fbc | Generates, joins and applies the base, fabric and intf_cleanup config snippets |
rb | Reverses the last applied change by deploying the rollback configuration (rollback_config.txt) |
diff | Prints the differences between the current_config (on the device) and desired_config (applied by Napalm) to screen |
- bse and fbc will only generate the config snippet and save it to file. No connections are made to devices or changes applied
- The diff tag can be used with bse_fbc or rb to print the configuration changes to screen
- Changes are always saved to file no matter whether diff is used or not
- -C or --check will do everything except actually apply the configuration
Generate the base config: Creates the base config snippet and saves it to base.conf
    ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tags bse
Generate the fabric config: Creates the fabric and interface cleanup config snippets and saves them to fabric.conf and dflt_intf.conf
    ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tags fbc
Generate the complete config: Creates the config snippets, assembles them in config.cfg, compares against the device config and prints the diff
    ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tags 'bse_fbc,diff' -C
Apply the config: Replaces the current config on the device, with the changes made automatically saved to ~/device_configs/diff/device_name.txt
    ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tags bse_fbc