I need to remove the line break and add a comma right before every instance of 'https:' in a file. I believe a regular expression can do this, but I'm not sure how.
The line below is what I would like.
2021-05-11 23:39:30,https://www.tiktokv.com/share/video/
Current format is this:
2021-05-11 23:35:41
https://www.tiktokv.com/share/video/
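Use a regex with the pattern r"(\n)(?=https)". It matches a newline only when that newline is immediately followed by https (the (?=...) part is a lookahead, so the URL itself is not consumed). Replacing each match with a comma joins the two lines.
Example: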
import re

s = """2021-05-11 23:35:41
https://www.tiktokv.com/share/video/
2021-05-11 23:35:41
https://www.tiktokv.com/share/video/
2021-05-11 23:35:41
https://www.tiktokv.com/share/video/"""
print(re.sub(r"(\n)(?=https)", r",", s))
Output:
2021-05-11 23:35:41,https://www.tiktokv.com/share/video/
2021-05-11 23:35:41,https://www.tiktokv.com/share/video/
2021-05-11 23:35:41,https://www.tiktokv.com/share/video/
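Since the question is about a file rather than a string, here is a minimal sketch of the same substitution applied to a file on disk. The function names join_url_lines and rewrite_file are made up for illustration, and the sketch assumes the file is UTF-8 text that is small enough to read into memory at once.

```python
import re
from pathlib import Path

# A newline that is immediately followed by "https:" (lookahead, not consumed).
NEWLINE_BEFORE_URL = re.compile(r"\n(?=https:)")


def join_url_lines(text: str) -> str:
    """Replace the line break before each 'https:' with a comma."""
    return NEWLINE_BEFORE_URL.sub(",", text)


def rewrite_file(path: str) -> None:
    """Hypothetical helper: rewrite the file at `path` in place.

    Assumes UTF-8 text that fits in memory.
    """
    file = Path(path)
    file.write_text(join_url_lines(file.read_text(encoding="utf-8")), encoding="utf-8")
```

Calling rewrite_file on your file would overwrite it, so it may be worth trying it on a copy first.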
Without regex
from io import StringIO
s = StringIO("""2021-05-11 23:35:41
https://www.tiktokv.com/share/video/
2021-05-11 23:35:41
https://www.tiktokv.com/share/video/
2021-05-11 23:35:41
https://www.tiktokv.com/share/video/""")
res = []
for idx, line in enumerate(s, 1):
    if idx % 2 == 0:
        # Even lines are URLs: append them to the preceding timestamp.
        res[-1] += f',{line.strip()}'
    else:
        # Odd lines are timestamps: start a new entry.
        res.append(line.strip())
print(res)
Output:
['2021-05-11 23:35:41,https://www.tiktokv.com/share/video/',
'2021-05-11 23:35:41,https://www.tiktokv.com/share/video/',
'2021-05-11 23:35:41,https://www.tiktokv.com/share/video/']
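The same pairing can also be written with slicing and zip. This is only a sketch: pair_lines is a made-up name, and it assumes the input strictly alternates between a timestamp line and a URL line (blank lines are skipped).

```python
from io import StringIO


def pair_lines(lines):
    """Join each timestamp line with the URL line that follows it.

    Assumes strict alternation: timestamp, URL, timestamp, URL, ...
    """
    # Drop surrounding whitespace and skip blank lines.
    stripped = [line.strip() for line in lines if line.strip()]
    # Even positions are timestamps, odd positions are URLs.
    return [f"{ts},{url}" for ts, url in zip(stripped[0::2], stripped[1::2])]


s = StringIO("""2021-05-11 23:35:41
https://www.tiktokv.com/share/video/
2021-05-11 23:35:41
https://www.tiktokv.com/share/video/""")
print("\n".join(pair_lines(s)))
```

Because pair_lines accepts any iterable of lines, an open file object could be passed in place of the StringIO.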