Backstory:
I'm making an Ansible role that creates an SSH key for hosts that need one and automatically updates my configuration file with the latest information: LAN IP, username, key, etc. I can't get Ansible to delete the existing block correctly before adding the new one, though. Then, when it adds the new one, it overwrites the other hosts' blocks for some reason (so at the end, only 1 block exists even if there are 10 hosts). I'll fix the latter later; my main concern is to get the old block deleted correctly.
Question(s) & tl;dr:
Is there a better approach to doing this, does anyone see why this regex doesn't work in Ansible, and could my regex be simplified/improved?
Here are the relevant tasks from my Ansible role:
- name: Read existing SSH config
  slurp:
    src: "{{ ssh_config_dir }}/config"
  register: ssh_config_file

- name: Decode existing SSH config
  set_fact:
    ssh_config_content: "{{ ssh_config_file.content | b64decode }}"

- name: Parse existing SSH config into lines
  set_fact:
    ssh_config_lines: "{{ ssh_config_content.split('\n') }}"

- name: Check if existing host entry matches
  set_fact:
    host_entry_valid: >
      {{ ssh_config_lines | select('match', '^Host {{ inventory_hostname }}$') | list | length > 0 and
      ssh_config_lines | select('match', '^\\s*Hostname {{ hostvars[inventory_hostname].ansible_host }}$') | list | length > 0 and
      ssh_config_lines | select('match', '^\\s*User {{ ssh_remote_user }}$') | list | length > 0 and
      ssh_config_lines | select('match', '^\\s*Port {{ ssh_port }}$') | list | length > 0 and
      ssh_config_lines | select('match', '^\\s*IdentityFile {{ ssh_key_dir }}/{{ inventory_hostname }}{{ ssh_key_name_suffix }}$') | list | length > 0 }}

- name: Debug host entry validity
  debug:
    var: host_entry_valid

- name: Backup the existing SSH config
  copy:
    src: "{{ ssh_config_dir }}/config"
    dest: "{{ ssh_config_dir }}/config.bak"
  when: not host_entry_valid

- name: Define the regex pattern
  set_fact:
    my_regex: '^(\s+)?Host\s+{{ inventory_hostname }}(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n^(\s+)?$'

- name: Print regex pattern
  debug:
    msg: "{{ my_regex | quote }}"

- name: Remove existing host entry if it doesn't match
  lineinfile:
    path: "{{ ssh_config_dir }}/config"
    state: absent
    regexp: '^(\s+)?Host\s+{{ inventory_hostname }}(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n^(\s+)?$'
  when: not host_entry_valid
The "Check if existing host entry matches" task needs work, but it should trigger the "Remove existing host entry if it doesn't match" task, which is the one I'm having problems with. The regex matches a block perfectly in VSCode, but in Sublime it matches multiple blocks, and in Ansible it doesn't seem to work at all.
This started out as a ChatGPT creation, but I tweaked some of it and ended up writing the regex myself.
This works in VSCode:
^(\s+)?Host test(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n
but the version with a lookahead (strictly speaking a positive one, since it is written (?=...)) does not. It does work in Sublime, but Sublime matches multiple blocks for some reason.
^(\s+)?Host test(\s+)?$\n^(((\s+)?[A-Za-z0-9./_-]|#)+(\s+)?)$\n(?=Host)
The idea is to match Host hostname, then any lines that contain text until there is a blank line, and then use the lookahead to check that a Host line follows the blank line. I'm not familiar with lookaheads, though, and this won't work if the block is the last one in the file, so I'll be removing it.
Here's the relevant Ansible output; the old SSH block doesn't get removed:
TASK [ssh-keys : Remove existing SSH agents and known hosts] ********************************************************************************************
included: /scripts/ansible/roles/ssh-keys/tasks/remove_existing_ssh_agents.yml for test
TASK [ssh-keys : Remove all SSH agents] *****************************************************************************************************************
skipping: [test]
TASK [ssh-keys : Remove host from known hosts] **********************************************************************************************************
skipping: [test]
TASK [ssh-keys : Copy SSH key to remote server] *********************************************************************************************************
included: /scripts/ansible/roles/ssh-keys/tasks/copy_key.yml for test
TASK [ssh-keys : Copy SSH key to remote server] *********************************************************************************************************
skipping: [test]
Example block:
Host test
  HostName 10.0.0.4
  User myuser
  IdentityFile ...

Host test2
  ...
As Zeitounator commented, there's a dedicated Ansible module for managing SSH configuration (community.general.ssh_config, if that is the one meant), which will be far easier and safer than doing it all yourself.
But regarding your question, your regular expression pattern can be rewritten like this:
^[\t ]*host[\t ]+(.+?)[\t ]*\n(?:(?!(?:^[\t ]*#.*\n)*[\t ]*host\b)[\t ]*(?:(\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*)|#.*)?(?:\n|\Z))+
Test it live here: https://regex101.com/r/LiGszL/2
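Since Ansible evaluates regular expressions with Python's re module, here is a minimal Python sketch of how the pattern above could remove one host block. It is only an illustration: the helper name remove_host_block and the sample config are my own assumptions, not part of your role. It also hints at why the original task has no effect: as far as I know, lineinfile tests its regexp against one line at a time, so a pattern containing \n cannot match there, whereas this sketch works on the whole file content at once.

```python
import re

# The pattern from the answer, unchanged. It needs MULTILINE (so ^ matches at
# every line start) and IGNORECASE (so "host" also matches "Host").
HOST_BLOCK = re.compile(
    r"^[\t ]*host[\t ]+(.+?)[\t ]*\n"
    r"(?:(?!(?:^[\t ]*#.*\n)*[\t ]*host\b)"
    r"[\t ]*(?:(\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*)|#.*)?(?:\n|\Z))+",
    re.MULTILINE | re.IGNORECASE,
)


def remove_host_block(config: str, host: str) -> str:
    """Return config without the blocks whose Host line is exactly `host`.

    Hypothetical helper, for illustration only.
    """
    return HOST_BLOCK.sub(
        lambda m: "" if m.group(1) == host else m.group(0),
        config,
    )


# Made-up sample data in the shape of the example block from the question.
SAMPLE = (
    "Host test\n"
    "  HostName 10.0.0.4\n"
    "  User myuser\n"
    "\n"
    "Host test2\n"
    "  HostName 10.0.0.5\n"
)

result = remove_host_block(SAMPLE, "test")
print(result)
```

Only the block for test is dropped, together with the blank line that follows it; the block for test2 is left untouched. Comparing the captured host name in a callback avoids having to interpolate (and escape) the host name into the pattern itself.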
(\s+)? to match optional spaces can be written \s* instead (clearer).
But it still has a problem: \s is equivalent to [\r\n\t\f\v ] (at least for ASCII text), so it will also match new lines. In the Perl/PCRE flavour, one can use \h to match horizontal spaces, but it seems that Python hasn't got it. So we'll replace it with [\t ] (note the 2 different space characters in the class: the normal space and the non-breaking space, U+0020 and U+00A0 in Unicode, respectively; the second one is invisible and easily lost when copying, so it can also be written \xA0). Considering only the normal space would probably be enough.
\n will match a new line. This works fine here since the config file is made for Unix systems and will only contain this character. A Windows file would have \r\n instead, so we could use \r?\n, or even (?:\r\n|\r|\n), as old Mac OS systems used a lone \r in the past (the \r\n alternative has to come first, otherwise the \r is consumed on its own and the \n is left behind). I'll stick to the simple \n for clarity. With the Perl/PCRE flavours, one can use \R to match any type of line separator.
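To make these points concrete, here is a small Python sketch (the sample string and the compiles helper are made up for the demonstration). It shows \s* running across line breaks where a character class stays on the line, that Python's re module refuses the PCRE-only escapes, and that the order of the alternatives matters when matching mixed line endings.

```python
import re

text = "Host test   \n\nHost test2\n"

# \s* also consumes the newlines, so the match runs across the blank line.
with_s = re.match(r"Host test\s*", text).group(0)
# [\t ]* only consumes horizontal whitespace and stays on the Host line.
with_class = re.match(r"Host test[\t ]*", text).group(0)
print(repr(with_s))      # 'Host test   \n\n'
print(repr(with_class))  # 'Host test   '


def compiles(pattern: str) -> bool:
    """Report whether Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Python's re rejects \h and \R as bad escapes.
print(compiles(r"\h"), compiles(r"\R"))  # False False

# Alternation order matters for line endings: \r\n has to be tried first.
mixed = "a\r\nb\rc\nd"
print(re.split(r"\r\n|\r|\n", mixed))  # ['a', 'b', 'c', 'd']
print(re.split(r"\r|\n|\r\n", mixed))  # ['a', '', 'b', 'c', 'd']
```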
The single-line pattern above is the same as this commented original version:
# Host entry:
# Start of line followed by optional horizontal spaces,
# The word "Host" case-insensitive, followed by anything (captured) and a new line.
^[\t ]*host[\t ]+(.+?)[\t ]*\n
# A configuration line or comment, multiple times:
(?:
  # Negative lookahead to avoid matching a new "Host" entry, but
  # also with optional comment lines before it.
  (?!(?:^[\t ]*\#.*\n)*[\t ]*host\b)
  # Optional horizontal spaces.
  [\t ]*
  # Config line, comment or empty line (done with the ? at the end).
  (?:
    # A) A config line, capturing it (with space or equal sign).
    (\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*) |
    # B) Or a comment.
    \#.*
  )?
  # New line or end of the config file.
  (?:\n|\Z)
)+
See it in action with explanation: https://regex101.com/r/LiGszL/1
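The commented form can also be used directly in Python through the re.VERBOSE flag, which ignores whitespace and # comments outside character classes (hence the escaped \# in the pattern). This is just a sketch with a made-up sample; the flags are my assumption of what corresponds to the regex101 settings.

```python
import re

HOST_BLOCK_VERBOSE = re.compile(
    r"""
    ^[\t ]*host[\t ]+(.+?)[\t ]*\n               # Host line, name(s) captured
    (?:
        (?!(?:^[\t ]*\#.*\n)*[\t ]*host\b)       # stop before the next Host entry
        [\t ]*                                   # optional horizontal spaces
        (?:
            (\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*)  # A) a config line
          | \#.*                                 # B) or a comment
        )?                                       # ... or nothing: an empty line
        (?:\n|\Z)                                # new line or end of the file
    )+
    """,
    re.VERBOSE | re.MULTILINE | re.IGNORECASE,
)

# Made-up sample: comments inside a block and right before the next Host line.
SAMPLE = (
    "# main box\n"
    "Host test\n"
    "  User = james\n"
    "  # note\n"
    "  Port 22\n"
    "# next\n"
    "Host test2\n"
    "  Port=23"
)

matches = list(HOST_BLOCK_VERBOSE.finditer(SAMPLE))
hosts = [m.group(1) for m in matches]
blocks = [m.group(0) for m in matches]
print(hosts)  # ['test', 'test2']
```

The comment line "# next" is left out of the first block because the negative lookahead also skips over comment lines that directly precede a Host entry.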
Note that the middle part, which matches config lines or comments, could be simplified: there is no need for all these checks, since we could simply match anything and let the negative lookahead stop us. But the detailed version shows how the host configuration lines or the comments could be read with a second regular expression.
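As a rough Python sketch of that simplification, the block pattern below matches any line until the lookahead stops it, and a second expression then reads the key/value pairs. The names parse_hosts, HOST_BLOCK_SIMPLE and CONFIG_LINE, as well as the sample, are assumptions made for the example.

```python
import re

FLAGS = re.MULTILINE | re.IGNORECASE

# Simplified block pattern: ".*" accepts any line, the negative lookahead
# alone decides where the block ends.
HOST_BLOCK_SIMPLE = re.compile(
    r"^[\t ]*host[\t ]+(?P<host>.+?)[\t ]*\n"
    r"(?P<config>(?:(?!(?:^[\t ]*#.*\n)*[\t ]*host\b).*(?:\n|\Z))+)",
    FLAGS,
)
# Second expression: one "key value" or "key = value" line.
CONFIG_LINE = re.compile(r"^[\t ]*(\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*)", FLAGS)


def parse_hosts(config: str) -> dict:
    """Map each Host line to a dict of its options (comments are skipped)."""
    return {
        m.group("host"): {
            key: value.strip()
            for key, value in CONFIG_LINE.findall(m.group("config"))
        }
        for m in HOST_BLOCK_SIMPLE.finditer(config)
    }


# Made-up sample data.
SAMPLE = (
    "Host test\n"
    "  Hostname test.domain.com\n"
    "  # Comment\n"
    "  Port 22\n"
    "\n"
    "Host dummy\n"
    "  User = user\n"
)

parsed = parse_hosts(SAMPLE)
print(parsed)
```

For this sample, the result maps test to its Hostname and Port and dummy to its User; the comment and the blank line are part of the matched block but produce no key/value pair.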
Full example, in JavaScript:
// JavaScript has no \Z anchor (without the u flag, \Z just matches a
// literal "Z"), so the end of the input is written (?![\s\S]) here.
const regexHostEntry = /^[\t ]*host[\t ]+(?<host>.+?)[\t ]*\n(?<config>(?:(?!(?:^[\t ]*#.*\n)*[\t ]*host\b)[\t ]*(?:(\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*)|#.*)?(?:\n|(?![\s\S])))+)/gim;
const regexConfigLine = /^[\t ]*(\w+)\b(?:[\t ]*=[\t ]*|[\t ]+)(.*)/gim;

const input = `Host test
  Hostname test.domain.com
  User james
  Port 22
  # Comment
  IdentityFile ~/.ssh/key.pub

# With 2 aliases
Host test2 test-2
  Hostname test2.domain.com
  User = james
  Port=22
  # Port 23
  IdentityFile = ~/.ssh/key2.pub

# For all hosts except test2, activate compression and set log level:
Host * !test2
  Compression yes
  LogLevel INFO
  IdentityFile ~/.ssh/id_rsa

Host *.sweet.home
  Hostname 192.168.2.17
  User tom
  IdentityFile "~/.ssh/id tom.pub" # If has spaces, then quote it.

# With a lot of spaces between lines

Host localhost

  Hostname 127.0.0.*

  IdentityFile ~/.ssh/id_rsa

# Without empty lines between Host definitions:
Host dummy
  Hostname ssh.dummy.com
  User user
Host dummy2
  Hostname ssh.dummy2.com
  User user`;

// matchAll() returns an iterator, which is always truthy and has no
// forEach() in many runtimes, so iterate over it with for...of.
for (const match of input.matchAll(regexHostEntry)) {
  console.log(`Found match for Host ${match.groups.host}:`);
  console.log([...match.groups.config.matchAll(regexConfigLine)]);
}