Regex Groups in Python

Federico Viticci has me hooked on Pythonista. I am not sure what took me so long to buy it because the very premise of scripting on my phone sounds like the best idea ever. I bought it and can confirm it is (in my opinion) the best app ever. At the very least, it is my favorite app right now.

This is not an app review; that may come later. No, this is more of a reminder for myself the next time I go to use regex groups in a Python script. I am extremely new to Python, so when I learn how to do something I generally write myself a little text note or store the code snippet for future use. This time I am just going to share it in a blog post.

Groups are incredibly useful. You can use regular expressions to identify a portion of a string and assign it a group name that can be easily referenced when replacing values.

The syntax in Python for defining a regex group is:

(?P<foo>some-regex)

The syntax to reference that group later is:

\g<foo>
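A minimal sketch of the two pieces working together, assuming a made-up input string ("3 apples") chosen purely for illustration:

```python
import re

# (?P<foo>\d+) captures the digits under the name "foo";
# \g<foo> in the replacement string inserts whatever "foo" matched.
result = re.sub(r"(?P<foo>\d+) apples", r"\g<foo> oranges", "3 apples")
print(result)  # 3 oranges
```

Raw strings (the r prefix) keep Python from interpreting the backslashes before the regex engine sees them.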

This is often better described with a real-world example. So here is a link to a tweet on Twitter.com that I am going to break up into multiple useful groups:

https://twitter.com/binaryghost/statuses/261213781294718976

Using regex to identify the main portions of the URL (domain, user, status, and id), I can assign each chunk of the URL an identifiable name:
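One pattern that would name those four chunks is sketched below. The character classes are an assumption ([^/]+ for each path segment, \d+ for the id), and the groups variable is only there to show what was captured:

```python
import re

url = "https://twitter.com/binaryghost/statuses/261213781294718976"

# Each (?P<name>...) labels one segment of the URL.
# [^/]+ means "everything up to the next slash"; \d+ means "one or more digits".
pattern = re.compile(
    r"https?://(?P<domain>[^/]+)/(?P<user>[^/]+)/(?P<status>[^/]+)/(?P<id>\d+)"
)

match = pattern.match(url)
groups = match.groupdict()  # maps each group name to the text it matched
print(groups)
```

Printing the dictionary is a quick way to confirm that each name lines up with the part of the URL it was meant to capture.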

Now that I have my groups identified, I can call them in the substitution pattern to do something like convert the format of the link to one that is compatible with Tweetbot’s URI scheme:

tweetbot://\g<user>/status/\g<id>

So when you put all this together in Python you basically get this little code snippet:
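A sketch of how the pieces could fit together, assuming a search pattern built from [^/]+ and \d+ (the exact regex is an assumption) and the Tweetbot replacement string shown above. The url, pattern, and tweetbot_url names are only for readability; the substitution itself is the single re.sub call:

```python
import re

url = "https://twitter.com/binaryghost/statuses/261213781294718976"
pattern = r"https?://(?P<domain>[^/]+)/(?P<user>[^/]+)/(?P<status>[^/]+)/(?P<id>\d+)"

# Rebuild the link from the named groups; domain and status are captured but unused.
tweetbot_url = re.sub(pattern, r"tweetbot://\g<user>/status/\g<id>", url)
print(tweetbot_url)  # tweetbot://binaryghost/status/261213781294718976
```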

So, disregarding the import statement, it is possible to identify only specific substrings and reuse or replace them at will with only one line of code. As you can see, I did not end up using <domain> or <status>, and I could probably have left them out, but I wanted them to be a part of the example. Every language has its own flavor of regex groups; this is the syntax Python uses, which is why I felt it best to document it with a useful example.

In addition, here is a gist with the above code in a working example specifically written for Pythonista:

*Also check out Viticci’s version over at MacStories