Why coding a monkey sort or bogosort in ruby isn't so stupid after all

a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare

I’m going to write a monkey sort in ruby. Although pretty much useless for any production systems, running through the practice of implementing a monkey sort will allow us to wonder into a lovely little ruby method called shuffle while at the same time give us pause to consider what a correct monkey sort actually is.

I’m not sure monkeys deserve a bad rep on the sorting front. I suspect they can actually sort quite well. Unfortunately, the infinite monkey theorem has ensured the poor creatures are routinely used as a metaphor for inefficiency.

What is a monkey sort?

If you’ve never heard of a monkey sort before, it might be handy first to define it:

Bogosort or monkey sort randomly shuffles a collection until it is in order

This can be expressed in pseudo-code as follows:

while not isInOrder(input):
	shuffle(input)

The many faces of monkey sort

Let me assure you that you not the first to marvel at the stupidity of a monkey sort. In fact, among this method’s many names (slow sort, random sort, shotgun sort, bogosort), is the appropriately named stupid sort.

It’s a hopeful method of sorting - you can liken it to throwing a deck of cards in the air and then picking them up randomly, hoping they will end up in order. If not, throw the deck in the air again. Daft doesn’t begin to describe it.

Ruby code for monkey sort

At first pass, you might be tempted to write code to randomly sort an array yourself, ruby already has this functionality neatly encapsulated in the array#shuffle method. Here’s the first implementation of the monkey sort in ruby that I came up with:

def monkey_sort(set)
  sorted_set = set.sort
  set.shuffle! until set == sorted_set
  puts "Sorted: #{set.inspect}"
end

There’s not much to it - the method takes an array as its parameter. A sorted version of the array is assigned to the sorted_set array. Using ruby’s shuffle!(note the ! or bang) method on the original array, continue shuffling until the array is equal to sorted_set.

Remember the ! or bang in this case will modify the array/object in place rather than return a new array. So while set.shuffle would return a new array with the elements shuffled, leaving set untouched, set.shuffle! will mutate set itself in place.

Give the monkeys something to sort

Having defined monkey_sort, the next step is to test it. My first thought was to randomly pick five numbers between 0 and 100 and pass them in. There are, as always, many ways to skin a cat or make a monkey bash on a typewriter - here’s one:

arr = (0..100).to_a.shuffle[0..5]
monkey_sort(arr)

I create an array of all values between 0 and 100 using (0..100).to_a. Then, using array#shuffle, I put the array into a random order. Finally, calling [0..5], to select the first 5 elements of the randomly sorted array. This is assigned to a variable named arr and then monkey_sort is called, passing in the randomly selected 5 elements.

That’s a touch verbose. To see our monkey sort in action we really need nothing more complicated than the following:

monkey_sort((0..4).to_a.shuffle)

Lacking the panache of creating a random array of 5 elements between 0 and 100, this will, nevertheless, provide a reasonable array with which to test monkey_sort. Although the very persnickety developer in me would say that it will work, pending shuffle’s propensity to always return a different array to the one being acted upon. That is to say, as long as calling shuffle on a sorted array will never return the array still sorted, we’re on to a winner. This depends on ruby’s implementation of shuffle. I’m not an expert on randomisation, but I think a random ordering of an array could potentially return the original array (please correct me if I’m wrong). I’ve not looked in to shuffle’s implementation, so I’ll just leave it there, as a thought. A very, very picky thought.

Pay peanuts, get monkeys or that implementation is wrong

One could argue that using shuffle to mutate our array each time is incorrect, since we are shuffling a different array each time. This might be subtly wrong because our monkey sort has gone from:

randomly shuffling a set until it is in order

to

randomly shuffling an already random shuffle of our set until it is in order (at least after the first shuffle)

I think that’s a pretty valid concern. The convenience and beauty of an in place mutation should not get in the way of creating a correct algorithm. Let’s try this instead:

def monkey_sort(set)
  sorted_set = set.sort
  shuffled_set = set.shuffle

  until shuffled_set == sorted_set
    shuffled_set = set.shuffle
  end

  puts "Sorted: #{shuffled_set.inspect}"
end

Now we’re ensuring that each time we shuffle it’s on the original array as passed in to our method. Technically, since I pass in an already shuffled array in my test code, I don’t need to call set.shuffle on line 2 of the method. But I’ll call it anyway, just in case a shuffled array isn’t passed in. Is this code better? Is it more correct? I’ll leave that up to you.

That’s not terribly beautiful. It’s also, perhaps, not very idiomatic ruby. It’s readable though. Here’s a less readable, more concise version, though not entirely the same since we sort the set each time we do the comparison which could change the performance of the code:

def monkey_sort(set)
  loop do
    break if set.shuffle == set.sort
  end
  puts "Sorted: #{set.sort.inspect}"
end

You might prefer a recursive approach. That’s fine. A little less understandable for beginners, but go for it:

def monkey_sort_recursive(set)
  monkey_sort_recursive(set) unless set.shuffle == set.sort
end

Stop! there are lots of ways of coding this. I’ll stick with the very verbose slightly less idiomatic version for readability.

No more monkey business

Are those monkeys actually doing anything? Are they lazy? Are they just pretending to sort the array? My monkey sort implementation lacks any visibility that would allow anyone to actually see the sorting algorithm at work. Fix:

def monkey_sort(set)
  sorted_set = set.sort
  shuffled_set = set.shuffle

  until shuffled_set == sorted_set
  	shuffled_set = set.shuffle
  	puts shuffled_set.inspect
  end

  puts "Sorted: #{shuffled_set.inspect}"
end

I’ve changed up the code a little - it should all be pretty readable - I’ve simply added a little output each time we shuffle by printing out our shuffled array. Using inspect just prints the array nicely. Now when I run the algorithm, I can see the various arrangements my monkeys decided to put the array in while attempting to sort it. Of course this sort of output is completely unnecesary, it merely serves to show the random iterations for interest’s sake.

Throw more monkeys at the problem. Or more problems at the monkeys

Stressing the monkey sort with more elements is trivial - a small change to the calling code will pass in 7 elements:

monkey_sort((0..100).to_a.shuffle[0..7])

Now, monkey sort is rather inefficient and with larger arrays it tends not to perform very well. My completely unscientific trials on my machine suggests sorting 5 elements will usually occur in less than 100 iterations. Increasing the number of elements up to 11 basically rendered my sort useless, and returned unsorted. In theory, a monkey sort might never end even with only 2 elements.

When playing with larger array sizes, I like to have a get out of iteration free clause in my code. Something like this:

def monkey_sort(set)
  count = 1
  sorted_set = set.sort
  shuffled_set = set.shuffle

  until shuffled_set == sorted_set
  	shuffled_set = set.shuffle
  	puts "iteration: #{count += 1} #{shuffled_set.inspect}"

	# hard exit if we sort for too long
	if count > 4_999_999
	  puts 'wow, 5 million monkeys could not sort your array'
      return false
	end
  end

  puts "Sorted at last: #{shuffled_set.inspect}"
end

Here, a hard limit on 5 million iterations is in place - any more than that and we fail. You might prefer your hard limit in a block because blocks in ruby are cool:

def monkey_sort(set)
  sorted_set = set.sort

  (1..5_000_000).each do |count|
    shuffled_set = set.shuffle

    if shuffled_set == sorted_set
      puts "Sorted at last: #{shuffled_set.inspect}"
	  return
    end

    puts "iteration: #{count} #{shuffled_set.inspect}"
	return false
  end

  puts 'wow, 5 million monkeys could not sort your array'
end

This implementation isn’t my favourite because the happy path is tucked away in an if statement and I rather have it as the default.

At this point, the code is becoming a little verbose. It would probably be a good time to step back and refactor. I’m not going to do that because I think that’s enough code for today.

More fun than a barrel of monkeys

Although the monkey sort itself is a rather ridiculous algorithm, what I found interesting is that even with this seemingly simple requirement, we can approach the problem two very different ways (one by continually mutating until we are sorted and one by randomly ordering an original array until it is sorted).

My initial stab of mutating an array in place until sorted was, I think, actually incorrect, but only through the process of writing and thinking about the code did I notice that. And _that_ is why coding a monkey sort isn’t that stupid after all.