Merge sort in Python

Question

I've written a implementation of a merge sort in Python. It seems correct and I have tested it with this:

l = list(range(1000))
random.shuffle(l) # random.shuffle is in-place
mergesort(l) == list(range(1000)) # returns True

# Stable sort
cards = list(itertools.product(range(4), range(13)))
random.shuffle(cards)
cards = mergesort(cards, key=lambda x: x[1]) # Sort by rank
cards = mergesort(cards, key=lambda x: x[0]) # Sort by suit
cards == list(itertools.product(range(4), range(13))) # returns True

I've also tested its performance, comparing it to sorted and the merge sort implementation in rosetta code:

rl = list(range(100))
random.shuffle(rl)

%timeit sorted(rl)
# 100000 loops, best of 3: 11.3 µs per loop

%timeit mergesort(rl) # My code
# 1000 loops, best of 3: 376 µs per loop

%timeit rcmerge_sort(rl) # From rosetta code
# 1000 loops, best of 3: 350 µs per loop

I'm looking for any suggestions on how to improve this code. I suspect there is a better way to do the mergelist function, particularly in how I tried to avoid code duplication like:

if top_a <= top_b:
    nl.append(top_a)
    try:
        top_a = next(it_a)
    except:
        ...
else:
    # duplicates above code

In my code I placed the iterators and first values in a list, then use the variable k as index, but this leads to hacks like abs(k-1) and placing magic numbers 0 and 1 in the code.

def mergesort(l, key=None):
    # Split the list into sublists of length 1
    sublists = [[x] for x in l]

    while len(sublists) > 1:
        new_sublists = []

        # Create a generator that yields two sublists at a time
        sublists_pairs = ((sublists[2*x], sublists[2*x+1]) 
                          for x in range(len(sublists)//2))

        for a, b in sublists_pairs:
            new_sublists.append(mergelist(a, b, key))

        # If the length is odd, then there is one sublist that is not merged
        if len(sublists) % 2 != 0:
            new_sublists.append(sublists[-1])

        sublists = new_sublists

    return new_sublists[0]


def mergelist(a, b, key=None):

    nl = []

    # Iterators that yield values from a and b
    its = iter(a), iter(b)

    # The top of both lists
    tops = [next(it) for it in its]

    while True:
        # Determine the iterator that the next element should be taken from
        if key:
            k = 0 if key(tops[0]) <= key(tops[1]) else 1
        else:
            k = 0 if tops[0] <= tops[1] else 1
        nl.append(tops[k])

        try:
            # Update the top of the iterator
            tops[k] = next(its[k])
        except StopIteration:
            # Unless the iterator is empty, in which case get the rest of
            # the values from the other iterator
            # abs(k-1) is similar to (0 if k == 1 else 1)
            nl.append(tops[abs(k-1)])
            for e in its[abs(k-1)]:
                nl.append(e)

            return nl

Caridorc · Accepted Answer · 2014-10-26 18:48:27Z

1

Using list(x) is better than [x] because it is more explicit.

Your while loop is weird, delete

try:
    # Update the top of the iterator
    tops[k] = next(its[k])
except StopIteration:
    # Unless the iterator is empty, in which case get the rest of
    # the values from the other iterator
    # abs(k-1) is similar to (0 if k == 1 else 1)
    nl.append(tops[abs(k-1)])
    for e in its[abs(k-1)]:
        nl.append(e)

and put something like

if (condition):
    #thing1
else:
    #thing2
    break

answered Oct 26, 2014 at 18:48

Caridorc

28k7 gold badges54 silver badges137 bronze badges

\$\begingroup\$ Thanks for the reply. I think list(x) would make a list from the elements of x, but the program needs to create a list with a single element, which would be done with list as list((x,)) . Also, I don't understand what should I put in the if statement, can you elaborate? \$\endgroup\$
– parchment
Commented Nov 3, 2014 at 0:19
\$\begingroup\$ The if statement works in simpler cases, I think it is easier to read, in your case, I don't think it can be used, I'm not good at using iterators. \$\endgroup\$
– Caridorc
Commented Nov 3, 2014 at 18:33

Add a comment |

Arvind Padmanabhan · Accepted Answer · 2014-11-02 14:37:08Z

1

The while loop can be written as follows (although I think it does not speed things up):

while True:
    # Determine the iterator that the next element should be taken from
    if key:
        low,high = (0,1) if key(tops[0]) <= key(tops[1]) else (1,0)
    else:
        low,high = (0,1) if tops[0] <= tops[1] else (1,0)
    nl.append(tops[low])

    # Update the top of the iterator
    tops[low] = next(its[low], None)
    if not tops[low]:
        # Unless the iterator is empty, in which case get the rest of
        # the values from the other iterator
        nl.append(tops[high])
        nl.extend(its[high])

        return nl

answered Nov 2, 2014 at 14:37

Arvind Padmanabhan

4192 silver badges5 bronze badges

\$\begingroup\$ Thank you for the reply. If does get rid of the ugly call to abs, but I'm not sure about replacing the try - catch block with an if statement. What if there's a 0 in the list to be sorted? \$\endgroup\$
– parchment
Commented Nov 3, 2014 at 0:13
\$\begingroup\$ Good point. Rewrite as: if tops[low]==None \$\endgroup\$
– Arvind Padmanabhan
Commented Nov 4, 2014 at 10:53

Add a comment |

Stack Exchange Network

Merge sort in Python

2 Answers 2

Your Answer

Hot Network Questions

Merge sort in Python

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Related

Hot Network Questions