Monday, May 19, 2008

One reason I like my new job

On my schedule today, the director of engineering will be delivering a presentation with the following topic:
"Solving Sudoku puzzles with recursion"

Knowing this was coming, I started writing an experimental Java solver over the weekend. I had a pretty busy weekend, so I only got as far as reading a puzzle from a file and displaying it. During off-stage time at my two chorus performances, I worked out a possible algorithm in my head, but I'll need a few more days to write it out and debug. I'll probably put it in a Java applet on my web page when it's done, and you'll be able to watch it "think" if it works out the way I'm imagining it.


  1. I hope someday I can get a nice job like that :)


  2. William Ashbless3:41 PM

    Forgive me for rambling a little. I know you almost certainly have a more solid handle on this puzzle already, Kazim, so I speak mostly for other people listening. However, you made me grin because I stopped by as a result of various links on the ACA, and just happened to recently be pondering this very problem so it gives me a reason to leave a comment. :)

    I've done this a couple of times using a couple of strategies. Ultimately, it almost certainly results in recursive-descent-with-backtracking, but that still leaves lots of leeway in data formats and the algorithms to support them.

    Its fairly easy to write one that works 'reasonably well' on simple puzzles, but you should hammer it with the hardest possible ones -- where the branching factor in the search tree is reasonably high. There's files out there with tens of thousands of hard puzzles, basically where only 17 elements are filled in but that still have a unique solution.

    The 'dancing links' algorithm is a generic tactic for 'covering problems' like this, but I haven't tried that yet. My solutions have been in the vein of keeping track of the possibilities for rows, columns, boxes, and cells, in various ways, to allow for the 'choices' at any level of the recursion.

    My initial implementations all had performance problems because of the branching factor. Reducing the branching, and making each step of the branch highly efficient rectified much of this. (We're talking multiple tens of minutes for a solution, or not solving at all in any reasonable time down to seconds or less, on the ultra-hard problems).

    My favourite personal solution doesn't involve any seriously exotic data formats or attempts at vectorization or parallelism yet.

    I haven't done it in Java yet, but I may just try that myself. There are possibly added complications of excessive object allocation and memory fragmentation to slow things down. Should be interesting.

    I'd love to find out what your take is on it, Kazim, at the end. I may have to stop by to check up.

    Anyway, thanks for the neat topic.

  3. The director of engineering demonstrated what was called a "brute force" algorithm, where it started from the upper left corner, tried the possibilities sequentially, and then backtracked at a dead end.

    I improved the algorithm a lot just by intelligently picking which cell to start at. I look for cells with no possible numbers first, to find unsolveable puzzles right away. Then I look for squares with two possibilities, then three possibilities, and so on.

    I've written a visual program, but it's not ready for prime time just yet -- it can work on pre-typed puzzles, but there is no interface for entering your own stuff. That's all I have left to do before I stick it on my web page.

    Anyway, the revised algorithm looks pretty good. The worst case I found (a puzzle on Wikipedia specially intended to confound the brute force program) took 47,000 "moves" to solve it, which sounds like a lot, but when you remove the "sleep" command and just let it go full speed, it only takes a few seconds, even while drawing the graphics. Most puzzles can be viewed going slowly, and it looks pretty cool. A friend of mine at work also implemented a depth-first search, which improves performance but only at the cost of using many megabytes of memory at a time.

    I wanted to try the strategy of alternately testing individual cells, and looking for a home for specific numbers. I'm not quite there yet, though, and may not get there for a while. I think that once I finish the interface, I'll be burned out on the problem for now.

  4. William Ashbless7:36 PM

    Yes, in fact my first attempt tried what your director did. Then I did what you're suggesting, going after cells of fewest possibilities.

    Of course, as you fill in numbers, you constrain the possibilities of other cells, and thus which becomes the 'next cell' to look at (assuming you go for the one with minimum possibilities) can change.

    That made a big difference. It also made a big difference to look at the operations I was using to choose which cell to do next (variation on priority queue) and the operations I was doing to monitor the choices in the cells, rows, boxes (some efficient set implementations).

    A lot of these improvements can be done incrementally.

    If you google for "minimum sudoku" the first entry should be a webpage with a file you can download that was the master list I used.

    Funny you should mention 'looking for a home' for certain numbers. Another one of my friends brought that up. I figure you could have some major speed improvements by working at the puzzle from both directions, choosing the strategy that requires the fewest choices at any given time.

    Anyway, good luck.

  5. Hi, we share some interests at least - i listen to the non-prophets whenever i can, and we very much think alike. Like you I spend my time writing (debugging) code. My poison however is HDL programming (verilog), always interesting to hear what you are doing professionally.

    Sean (Ireland)