XOR Media

Coding, Operations, Etc.

String Truncate Middle With Ellipsis

Posted on Sun 21 September 2014 by Ross McFarland

There are times when you need to truncate a string in the middle. In many cases it's for UX/human purposes, though in some situations it's the best way to generate a unique string for a length-limited field. That's the case I ran into recently while trying to automate submission of in-app purchases (IAP) to both Google Play and the App Store, which require short unique names for each SKU.

The Setup

Consider the following titles, each of which is 39 characters long.

Midsomer Murders - Series 1 - Episode 1
Midsomer Murders - Series 1 - Episode 2
Midsomer Murders - Series 1 - Episode 3
Midsomer Murders - Series 1 - Episode 4
Midsomer Murders - Series 1 - Episode 5
Midsomer Murders - Series 2 - Episode 1
Midsomer Murders - Series 2 - Episode 2
Midsomer Murders - Series 2 - Episode 3
Midsomer Murders - Series 2 - Episode 4
Midsomer Murders - Series 2 - Episode 5
Midsomer Murders - Series 3 - Episode 1
Midsomer Murders - Series 3 - Episode 2
Midsomer Murders - Series 3 - Episode 3
Midsomer Murders - Series 3 - Episode 4
Midsomer Murders - Series 3 - Episode 5
...

Assume we have to fit these strings into a field we have no control over that requires them to be unique and 32 characters or fewer, or perhaps we're displaying them in a UI where there's not enough room for the full title. The naive approach would be to cut them down to 32 characters, with the last three being an ellipsis to make it clear that the title has been truncated.

Midsomer Murders - Series 1 -...
Midsomer Murders - Series 1 -...
...
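To see the problem concretely, here's a minimal sketch of that naive approach. The name truncate_end is made up for illustration, and it assumes n is at least 3 so there's room for the dots.

```python
def truncate_end(s, n):
    """Cut s down to at most n characters, ending in '...' when shortened."""
    if len(s) <= n:
        # string is already short enough
        return s
    # keep the first n - 3 characters and spend the last 3 on the dots
    return s[:n - 3] + '...'


# the five Series 1 titles from the list above
titles = [
    'Midsomer Murders - Series 1 - Episode {0}'.format(episode)
    for episode in range(1, 6)
]
short = [truncate_end(t, 32) for t in titles]

print(short[0])         # Midsomer Murders - Series 1 -...
print(len(set(short)))  # 1, all five episodes collapse to the same string
```

The episode number lives at the very end of the title, which is exactly the part this approach throws away.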

That doesn't work particularly well, as it results in duplicates within each series. Taking a closer look at the format of the titles, which in this case is consistent, we notice that there are two points that uniquely identify an episode: the series number and the episode number. So what if we truncate in the middle rather than at the end?

Midsomer Murders...1 - Episode 1
Midsomer Murders...1 - Episode 2
...

The code to do this is fairly simple and looks something like the following.

def truncate_middle(s, n):
    if len(s) <= n:
        # string is already short enough
        return s
    # half of the size, minus the 3 .'s (// keeps it an integer on Python 3)
    n_2 = n // 2 - 3
    # whatever's left
    n_1 = n - n_2 - 3
    return '{0}...{1}'.format(s[:n_1], s[-n_2:])
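As a quick sanity check, the sketch below restates the function with integer division, so it also runs on Python 3, and applies it to the fifteen example titles. The titles list is simply rebuilt from the pattern shown earlier.

```python
def truncate_middle(s, n):
    if len(s) <= n:
        return s
    n_2 = n // 2 - 3   # characters kept from the end
    n_1 = n - n_2 - 3  # characters kept from the start
    return '{0}...{1}'.format(s[:n_1], s[-n_2:])


# Series 1-3, Episodes 1-5, as in the example list
titles = [
    'Midsomer Murders - Series {0} - Episode {1}'.format(series, episode)
    for series in range(1, 4)
    for episode in range(1, 6)
]
short = [truncate_middle(t, 32) for t in titles]

print(short[0])  # Midsomer Murders...1 - Episode 1
# every result uses the full 32 characters and none of them collide
print(all(len(t) == 32 for t in short), len(set(short)) == len(titles))
```

With n = 32 this keeps 16 characters from the start and 13 from the end, which happens to be just enough to hold on to the series number. One caveat: for n below 8, n_2 drops to zero or less and the end slice no longer behaves, so this assumes a reasonably sized limit.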

This process isn't perfect, though, as with a different set of titles it may truncate out the series number and result in duplicates. For UI purposes this may be acceptable (a best effort), but for something that requires uniqueness it won't quite be enough. In my particular situation the items have hex UUIDs as unique identifiers, so the simplest thing to do was to append a few characters of the UUID to the end of the title before truncating. For all practical purposes this ensures uniqueness. What other solutions can you think of?
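A rough sketch of the UUID idea might look like the following. The helper name unique_short_name, the space separator, the choice of 6 characters, and the two hex IDs are all made up for illustration; the real code may well differ.

```python
def truncate_middle(s, n):
    if len(s) <= n:
        return s
    n_2 = n // 2 - 3
    n_1 = n - n_2 - 3
    return '{0}...{1}'.format(s[:n_1], s[-n_2:])


def unique_short_name(title, item_id, n=32, id_chars=6):
    # item_id is assumed to be a hex UUID string; tack its first few
    # characters on to the title so they survive in the kept tail
    tagged = '{0} {1}'.format(title, item_id[:id_chars])
    return truncate_middle(tagged, n)


title = 'Midsomer Murders - Series 1 - Episode 1'
# made-up IDs, purely for the example
a = unique_short_name(title, '3f2b8c1e9a4d4e6f8b7a1c2d3e4f5a6b')
b = unique_short_name(title, 'a91d04c7e2b34f5a9c6d7e8f0a1b2c3d')

print(a)
print(b)
print(a != b)  # True, even though the titles are identical
```

The appended characters need to fit inside the part kept from the end (13 characters when n is 32), and uniqueness is only as good as those few characters: two items whose UUIDs start with the same 6 hex digits would still collide, which is unlikely but not impossible.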

In code, tagged: python, coding, examples, and strings.

About the Author

Ross McFarland

Ross is a 13-year veteran of the software industry with experience spanning low-level signal processing, web and mobile user interfaces, and high-scale distributed web services, including Amazon.com's Digital Media Group. He has made extensive contributions to open source, highlighted by his time as a primary maintainer of Gtk2-Perl and as author of the requests-futures and python-asynchttp libraries.