I'm new to Python and am looking for a slug/URL parameterization library that offers functionality similar to the Ruby Stringex library. For example:
# A simple prelude
"simple English".to_url => "simple-english"
"it's nothing at all".to_url => "its-nothing-at-all"
"rock & roll".to_url => "rock-and-roll"
# Let's show off
"$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
"10% off if you act now".to_url => "10-percent-off-if-you-act-now"
# You don't even wanna trust Iconv for this next part
"kick it en Français".to_url => "kick-it-en-francais"
"rock it Español style".to_url => "rock-it-espanol-style"
"tell your readers 你好".to_url => "tell-your-readers-ni-hao"
I've come across webhelpers.text.urlify, which claims to do this; however, the results weren't close. Any help is much appreciated.
Check out slugify, which is based on Django's own slugify template filter, but with NFKD normalization. Here's the relevant code:
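import re
import unicodedata

# Python 2 code: `unicode` is a builtin there; `string` is the unicode text to slugify.
re.sub(r'[-\s]+', '-',                              # collapse runs of hyphens/whitespace into one hyphen
    unicode(
        re.sub(r'[^\w\s-]', '',                     # drop anything that is not a word character, whitespace, or hyphen
            unicodedata.normalize('NFKD', string)   # decompose accented characters
            .encode('ascii', 'ignore'))             # discard whatever is not ASCII
        .strip()
        .lower()))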
It's not nearly as powerful as Ruby's Stringex, but you could easily extend it to expand those ampersands, dollar symbols, etc. Also take a look at Unidecode, a Python port of the Text::Unidecode Perl module, the same thing Stringex uses for Unicode transliteration.
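
To give an idea of what "extending it" might look like, here is a rough Python 3 sketch using only the standard library. The name to_url, the SYMBOL_WORDS table, and the dollar rule are my own assumptions for illustration, not part of slugify or Stringex, and the symbol handling is English-only.

```python
import re
import unicodedata

# Hypothetical replacement table (not part of slugify or Stringex); extend as needed.
SYMBOL_WORDS = {
    "&": " and ",
    "%": " percent ",
}


def to_url(text):
    """Rough, English-only approximation of Stringex's to_url."""
    # Assumed rule: "$12" becomes "12 dollars" (whole amounts only).
    text = re.sub(r"\$(\d+)", r"\1 dollars", text)
    # Spell out the remaining symbols from the table.
    for symbol, word in SYMBOL_WORDS.items():
        text = text.replace(symbol, word)
    # NFKD splits accented letters into a base letter plus a combining mark;
    # encoding to ASCII with 'ignore' then drops the marks and any other non-ASCII text.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Same two substitutions as in the slugify snippet: strip punctuation, then hyphenate.
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    return re.sub(r"[-\s]+", "-", text)


print(to_url("rock & roll"))
print(to_url("$12 worth of Ruby power"))
print(to_url("kick it en Français"))
```

This should cover the ampersand, dollar, percent, and accented-character examples from the question. It will not produce "ni-hao" for 你好, because NFKD plus the ASCII encode simply discards characters that have no ASCII decomposition. If you install Unidecode, swapping the normalize/encode line for unidecode.unidecode(text) should transliterate such characters instead of dropping them.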