Today we'll be looking at the deftly named 'format a string of names like bart,
lisa & maggie' kata on Codewars.


The problem

Given: an array containing hashes of names

Return: a string formatted as a list of names separated by commas except for the
last two names, which should be separated by an ampersand.

# Example:

namelist([
    {'name': 'Bart'},
    {'name': 'Lisa'},
    {'name': 'Maggie'}
])
# returns 'Bart, Lisa & Maggie'

namelist([
    {'name': 'Bart'},
    {'name': 'Lisa'}
])
# returns 'Bart & Lisa'

namelist([{'name': 'Bart'}])
# returns 'Bart'

namelist([])
# returns ''

Note: all the hashes are pre-validated and will only contain A-Z, a-z, '-' and
'.'.

The aim

My aim with Katas is usually to create the best-performing solution as elegantly as I can. I'm not a fan of answers that attempt to squish everything onto one line.

Katas are extremely useful as beginner exercises, and producing answers that nobody would ever use in production-worthy code can send people in the wrong direction. For this challenge, I've set three aims that I think are reasonable:

  1. Solve in <= 10 lines
  2. Keep logic simple
  3. Find fastest performing solution

First attempt

My first attempt was a weird one, completely violating aim #2. I decided to first extract all of the names into a list, and then build the string backwards, from right to left. I won't explain it in detail; I'll just leave it here:

#!/usr/bin/env python3

def namelist(names):
  extracted = [data['name'] for data in names]
  namestr = ''
  # Walk the names from last to first, prepending each one.
  for idx, name in enumerate(reversed(extracted)):
    if idx == 0:
      namestr = name                   # last name: no separator
    elif idx == 1:
      namestr = f'{name} & {namestr}'  # second-to-last: ampersand
    else:
      namestr = f'{name}, {namestr}'   # everything earlier: comma
  return namestr

 ❯ time python format_names.py
python3 format_names.py  7.77s user 0.02s system 99% cpu 7.806 total

Testing

I have recently discovered assert in Python and have fallen in love! It's perfect for Katas where you don't want to set up boilerplate.

tests = [
  ([{'name': 'Bart'}, {'name': 'Lisa'}, {'name': 'Maggie'}],
   'Bart, Lisa & Maggie'),
  ([{'name': 'Bart'}, {'name': 'Lisa'}],
   'Bart & Lisa'),
  ([{'name': 'Bart'}],
   'Bart'),
  ([],
   ''),
  ([{'name': 'Bart'},{'name': 'Lisa'},{'name': 'Maggie'},{'name': 'Homer'},{'name': 'Marge'}],
   'Bart, Lisa, Maggie, Homer & Marge'
  )
]

for _ in range(1_000_000):
  # 'names' rather than 'input', to avoid shadowing the built-in
  for names, expected in tests:
    result = namelist(names)
    assert expected == result, '{} != {}'.format(expected, result)

Quick basic test, and no import required. Python, you're kinda cool.

Second attempt

My second attempt was more serious. As far as I could tell, there was no way around using an if statement, because there are four cases to handle:

  • If the list is empty, then we have to return ''
  • If the list has one element, return it
  • If the list has two elements, return them separated by an &
  • If the list has more than two elements, separate the last two with an &, and all the others with commas

I decided to take advantage of the fact that join on an empty array produces '', and join on a single-element array returns that element. This meant I could mush the first three cases into one branch of the if statement! Brevity at the cost of performance and readability, so this answer isn't the best either.
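As a quick illustration of that join behaviour (a minimal sketch; the variable names here are only for demonstration and are not part of the kata):

```python
# str.join only places the separator *between* elements, so a list
# with zero or one element never contains the separator at all.
sep = ' & '

empty = sep.join([])                 # no elements
single = sep.join(['Bart'])          # one element, no separator needed
pair = sep.join(['Bart', 'Lisa'])    # separator appears exactly once

print(repr(empty))   # ''
print(repr(single))  # 'Bart'
print(repr(pair))    # 'Bart & Lisa'
```

That is what the `<= 2` branch in the attempt below relies on.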

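def namelist(names):
  if len(names) <= 2:
    # Covers 0, 1 and 2 names: join only adds ' & ' between elements.
    return ' & '.join([data['name'] for data in names])
  else:
    extracted = [data['name'] for data in names]
    # Commas between all but the last two names, '&' between the last two.
    return '{}, {}'.format(
      ', '.join(extracted[:-2]),
      ' & '.join(extracted[-2:])
    )

 ❯ time python format_names.py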
python3 format_names.py  6.28s user 0.02s system 99% cpu 6.309 total

A little faster though! I'm assuming this is because the join operation is quicker than n string interpolations.
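One way to check that assumption in isolation is a small timeit comparison. This is only a sketch: `with_join` and `with_fstrings` are hypothetical helpers written for this comparison, not the kata solutions themselves, and the numbers will vary by machine and Python version.

```python
import timeit

names = ['Bart', 'Lisa', 'Maggie', 'Homer', 'Marge']

def with_join(items):
    # The concatenation happens inside str.join, not in a Python-level loop.
    return ', '.join(items)

def with_fstrings(items):
    # Builds a new string on every iteration, like the first attempt did.
    result = ''
    for idx, item in enumerate(reversed(items)):
        result = item if idx == 0 else f'{item}, {result}'
    return result

# Both helpers must produce the same string before timings mean anything.
assert with_join(names) == with_fstrings(names)

join_time = timeit.timeit(lambda: with_join(names), number=100_000)
fstring_time = timeit.timeit(lambda: with_fstrings(names), number=100_000)
print(f'join:      {join_time:.3f}s')
print(f'f-strings: {fstring_time:.3f}s')
```

Using timeit keeps interpreter start-up and the assert loop out of the measurement, so it isolates the string-building step better than timing the whole script.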

The third attempt

There's got to be a better way! slams fist ;).

  • use join
  • recognise that the last array element is the only special case in the long string.
  • use early returns rather than nesting if statements
  • pop from the end of a list is cheap (constant time in Python)

def namelist(names):
  # Early returns for the two trivial cases.
  if len(names) == 0:
    return ''
  elif len(names) == 1:
    return names[0]['name']
  extracted = [data['name'] for data in names]
  last = extracted.pop()           # the only name that needs an '&'
  first = ', '.join(extracted)
  return ' & '.join([first, last])

 ❯ time python format_names.py
python3 format_names.py  5.27s user 0.02s system 99% cpu 5.300 total

Woohoo we have a winner! It's readable and fast. clicks submission button

Some other solutions

But what have others submitted out there in the wide world?
I've selected three styles of answer for this Kata and benchmarked them under the same conditions.

The first two I find equally heinous, and I love the third.

# 7.27s for 1 mil
def namelist7(names):
  return ", ".join([name["name"] for name in names])[::-1].replace(",", "& ",1)[::-1]

# 9.74s for 1 mil
def namelist8(names):
    l = []
    if len(names) == 0:
        return ''
    else:
        for name in names:
            l.append(''.join(name.values()))
        str = ', '
        if len(l) == 1:
            return l[0]
        else:
            return str.join(l[:-1]) + ' & ' + l[-1]

# 6.65s for 1 mil
def namelist9(names):
    nameList = [elem['name'] for elem in names]
    return ' & '.join(', '.join(nameList).rsplit(', ', 1))
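
For anyone wondering why that last one works, here is a step-by-step sketch of the same join/rsplit idea. The intermediate variable names are mine, added for illustration. It relies on the kata's guarantee that names never contain ', ' themselves.

```python
# Step 1: join everything with commas.
joined = ', '.join(['Bart', 'Lisa', 'Maggie'])
print(joined)   # Bart, Lisa, Maggie

# Step 2: split once, from the right, on the final ', '.
parts = joined.rsplit(', ', 1)
print(parts)    # ['Bart, Lisa', 'Maggie']

# Step 3: glue the two halves back together with an ampersand.
result = ' & '.join(parts)
print(result)   # Bart, Lisa & Maggie

# With zero or one name there is no ', ' to split on, so rsplit returns
# a one-element list and the final join leaves it untouched.
empty_result = ' & '.join(', '.join([]).rsplit(', ', 1))
single_result = ' & '.join(', '.join(['Bart']).rsplit(', ', 1))
```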