M-Space: The Blog


Mon 21 Jan 2008

Moore and more

While reading a computing book from the early 1980s the other day I came across this passage:

"System capacities inexplicably diminish over time. The double-density disk drives that once were the answer to my storage prayers no longer fill the bill. The double-sided, double-density drives I'll get next will seem like a vast frontier — for a while. I've heard guys with 10-megabyte hard disks complain of feeling cramped."
Thinking Forth by Leo Brodie (Spectrum Books, 1984)

I remember my first (borrowed) PC, an 8086 IBM PC with a 20MB hard drive that at the time seemed pretty huge. These days even 20GB would probably be considered rather small, and you'd probably struggle to install a reasonably complete modern Linux system (which is a fair bit smaller than Windows) in less than 2GB. As for floppy disks — double-sided, double-density or otherwise— they are pretty much a thing of the past by now.

It's quite scary how fast computer technology develops, so that yesterday's bleeding edge kit becomes today's obsolescent dinosaur and technology which initially seems to offer more capabilities than you could ask or imagine soon becomes frustratingly limited. Of course, this behaviour fits Moore's Law (read the linked Wikipedia article if you want to find out about that).

One downside of this is that old computer equipment (which could mean things only 2 or 3 years old, and certainly any computer more than 10 years old) may still function perfectly well but be effectively useless, at least for everyday computing, because it lacks the capability to run the modern software we want to use on it or it runs so slowly compared to the newer machines we've become used to.

[/computer] permanent link

Sun 06 Jan 2008

Polyglottal programming

Newsflash: The promised new version of M-Space has now finally been launched. More details to follow soon in the blog, but why not take a look for yourself.

Over the past few days I've found small but useful programming tasks to do with 3 different programming languages, all of them (the languages, not the tasks) beginning with 'P'.

I've recently been doing a fair bit of website work, both on my own site and on the church website that I maintain. Both of those sites are largely driven by PHP, mostly fairly simple stuff with one or two slightly fancier bits thrown in. The most recent change to the church website, involving some of the gnarliest PHP work I've had to do, is an update to the scripts driving the songs database page so that it's possible to select which year you look at songs from. Last year it wasn't a problem since 2007 was the first year that the database appeared on the website. Since today is the first day of services at the church this year, the 2008 database doesn't yet have any songs in it but that situation will be changing in the next day or so. The database itself is in XML, with PHP used to extract the data and turn it into a webpage according to the preferences specified by the user.

While I use a PC running Linux (most of the time) at home and a PC running Windows XP at the church office, my other job entails the use of a Mac. One of my favourite things about MacOS is the terminal emulator, which enables me to work using text-based unixesque commands that I generally find a lot more satisfying (as well as efficient) to use than any graphical interface. A couple of days ago, I was tidying up some files from various meetings over the past few years, and decided the time had come to institute a properly systematic naming convention for minutes, agendas and other documents. In the past I've tried to be systematic, but the system has varied quite a bit with the result that it had become quite difficult to trawl through the collection of documents to find the one I was looking for. In Linux, I would run the 'rename' command, using a simple regular expression to describe a collection of filenames under one system and convert them to the new standard format. Unfortunately the Mac OS version of rename seems to be altogether more simplistic and entirely lacking in support for regular expressions. Undaunted, I decided to try and knock together a perl script to do the renaming I wanted (perl being the obvious choice since regular expressions are an integral part of the language and there is a fairly complete implementation installed on my Mac). While Googling for hints as to how to proceed, I came across a rename script by Larry Wall, the creator of perl, which seems to offer all the core functionality of the Linux command and certainly more than enough to get my job done. Granted, this particular bit of programming wasn't done by me in the sense of actually writing any new code (other than the regexes needed to describe my file names), but it embodies the important programming precept of Don't reinvent the wheel. Incidentally, I'll probably be blogging more about regexes (i.e. regular expressions) soon, as the more I learn about them the more I consider them a really groovy and decidedly useful tool.

My third programming task for the week was a script to facilitate publishing blog entries. The individual posts for this blog are kept as separate text files and the directory structure in which they reside provides the categories for the blog. In order to be able to preview the files locally and then upload them when they are ready but have them appear on the blog at some appropriate future date, it's necessary to slightly change the filenames either before or after uploading them, which makes it slightly tedious to do the job by hand. I decided to write a script to automate the process as far as possible, so that I only have to invoke the script and pass it the source filename and it will take care of working out the correct target filename, opening an ftp connection, switching to the appropriate directory and uploading the file. For this particular programming task I turned to Python, largely because my Python reference books were nearer to hand than my perl ones and also because I'm probably a bit less rusty at it. The script I've put together so far seems to work ok, although it doubtless could be tightened up a little bit.

Perhaps tomorrow I should dig out the Prolog textbook I have stashed away somewhere, or revisit my first post-BASIC programming language, Pascal, just so I can keep the trend going...

[/computer] permanent link

Fri 28 Dec 2007

To blog or not to blog, that is the question?

First there was the website (well, I actually skipped ahead several chapters in the history of the universe, but that's a convenient starting point for the current discussion). Then came the blog...

When blogging first began to take off, in the early years of the 21st century, the phenomenon largely passed me by. Not too many years later, I was getting quite into homebrewing and decided that it would be useful to keep some notes on what I was up to. I thought that a blog might be quite a useful way to keep them organised and would enable other people to read about my experiences too, if anyone was interested (unlikely, I admit). After a brief search around the then-available blog hosting options, I found a free service (Blogger) that seemed quite suitable to my needs and thus my Homebrew Blog was born on 1st January 2005.

It fairly quickly became apparent that I wasn't doing enough brewing to sustain a blog and the stream of entries quickly slowed (dig that oxymoron!) to a trickle, with a few non-brewing related posts creeping in later in the year. The following January I took up knitting and again decided that this might be worth sharing with the world, so I resurrected my blog as a Homebrew Knitblog (same URL as before). For a while most of the entries were about knitting, since my brewing activity was fairly minimal at the time, but since I'd now officially broken the barrier of a single-subject blog it quickly began to diversify to other topics as well. However, the pace of blogging soon dropped off again, partly because I found the website interface for the blog fairly clunky. In fact, I stopped adding to that blog altogether round about May 2006.

In the meantime I'd started a couple of other blogs, also powered by Blogger. The first, entitled Cymraeg Am Byth ("Welsh For Ever"), was my small contribution to combat the dominance of English on the Internet by writing a blog in Welsh. From the very start it was a random collection of mostly short musings, written entirely in Welsh. It is no longer extant, and I'm not sure quite what happened to it. The second blog was based on a suggestion I read somewhere to try writing a simple blog in a language you are learning. In my case, the blog started out in Esperanto but quickly switched to Spanish, as my attention shifted to that language. I forget what it was called initially, but when the language changed it got renamed (somewhat ironically, or perhaps accurately) No hablo español ("I don't speak Spanish"). Unlike the other blogs the aim here was simply to write short articles, sometimes no more than a sentence or two, to practice my Spanish writing skills. I fairly soon gave up on that one too, although it is still (as of this writing) accessible.

Towards the end of 2006 I came across a perl script called Blosxom that enables you to publish a blog as a series of text files, using the directory hierarchy to impose a structure on the blog. I decided to resurrect my main blog yet again, but this time using Blosxom and hosting it directly on my website. The new blog, renamed simply M-Space Blog as it was now part of M-Space (my website), was launched on 31st December 2006, just in time for the new year, and it is the blog you are currently reading. Unfortunately, there were some technical issues with my ISP's server that prevented some of the features such as RSS syndication from working properly. Fortunately my brother, who had set up a Blosxom-based blog on his website (and was probably the one who told me about the system), had no such troubles with his server and was able to find me some space on there from which to host my blog.

By now you'll doubtless have noticed the pattern that my blog seems to change at the start of every new year, so you may be wondering what's in store for it this time. As it happens, I'm not planning any major changes to the structure of the blog although I do hope to be a bit more regular (and frequent) in my blogging next year. A bigger change should be coming soon in the "parent" M-Space website (although it's not strictly a parent as the blog is now hosted elsewhere), which I haven't done much work on for the last couple of years. I'm currently in the process of revamping it, using some techniques which I've developed while working on my church website, and hope to have the new version online soon. But I wouldn't hold your breath...

[/computer] permanent link

Fri 27 Jul 2007

Sudoku recycling #2

As I blogged yesterday, I tried a while ago to write a computer program to generate sudoku squares by randomly filling in the grid, but swiftly concluded that my algorithm was hideously ineffecient and gave up on it. After that, I didn't really think much at all about the problem of sudoku generation until a random conversation at a party last week brought it back to mind. I now have a new strategy for generating sudoku squares and I have developed some of the mathematical tools necessary to show that it will work.

My plan this time is to start with a known sudoku square and then use a series of transformations that preserve the sudoku conditions (i.e. the uniqueness of each digit within any given row, column or block) to convert it into a new square. It is easy to see that swapping arbitrary squares, or rows, columns or blocks can swiftly break the conditions. However, if you restrict yourself to swapping rows or columns without crossing block boundaries, or swapping whole rows (or columns) of blocks, the conditions are preserved. Also, all instances of one digit can be swapped for all instances of another digit (e.g. swap all 1's with 7's) or all columns and rows can be interchanged (viewing the grid as a matrix, you're taking its transpose) without breaking the conditions.

Using fairly elementary mathematics, it is quite straightforward to prove that all those operations preserve the well-formedness of a sudoku square, although to do so it would be necessary to be slightly more rigorous with definitions and notation. Thus we get a set of tools that can be applied at random to any valid sudoku square to get another one. It should be fairly straightforward to implement this as a computer program, and I intend to give it a try soon (probably using Python again). This will effectively be a way of recycling sudokus by converting them into new ones.

However, as usually happens in mathematical (or any other?) research, when you answer a question you find it leads to several questions. While I have established that you can transform one sudoku into another with these tools, it is not clear whether you can generate all sudokus in this way from a given starting point (although, all these tools are reversible so it would be sufficient to show that an arbitrary sudoku can be reduced to some standard form). Also, it is possible that some of the tools could be duplicated using some of the others, so there might be a certain amount of redundancy within the toolset (even if that is the case, it may still be convenient to retain them all - consider the analogous case of logic gates, where all gates can be build from a combination of nand gates; this renders all other gates technically superfluous, but in practice it is much easier to allow and, or, not and others separately as well).

Additionally, there remains the problem of taking a complete sudoku square and converting it into an interesting yet soluble problem. This also raises a stack of questions: how many squares can you delete and still guarantee being able to find a solution? how many can you delete before it becomes impossible to solve? (hopefully no fewer than for the previous question!), is it just a matter of how many, or is the distribution (numerical or physical) important as well?

I suspect that all these questions will be somewhat harder to answer than the previous one (in case you didn't spot the implicit question it was "can I find a set of transformations that will enable me to turn one sudoku square into another one?") so don't expect any results from me anytime soon. Of course, if you have any clever insights into how to solve them, please drop me a line.

[/computer] permanent link

Thu 26 Jul 2007

Sudoku recycling #1

As a mathematician, I've found a fair amount of enjoyment in tackling sudoku puzzles. As well as completing puzzles I come across in magazines or on websites, I'm interested in how you go about generating puzzles as well as theoretical questions about their solubility and stuff like that (to use a technical phrase), although I haven't yet done any serious research on it.

A year or so back I decided to have a go at writing a computer program to generate sudoku puzzles. Realising that most 9x9 grids with random numbers sprinkled about them are not going to yield complete solutions as sudokus (i.e. all squares filled in, satisfying the conditions of uniqueness within any row, column or block), I decided that the way forward would be to generate completed sudoku squares and then randomly delete digits to provide a challenging puzzle. Having broken the problem down into two separate sections, I decided to start by writing a program to generate the completed squares and worry later about how to turn them into puzzles that would present the right level of challenge (i.e. soluble but not trivial).

My initial attempts, which were done using the Python programming language (by which, of course, I mean a language called Python, not a language for programming snakes), were based on the idea of randomly assigning a digit to the first grid square and then proceeding through the grid, square by square and row by row, assigning digits at random from the collection of digits not previously employed in the row, column or block under consideration. For the first few rows and columns, this is easy enough but there comes a point when for many squares the constraints of row/column uniqueness clash with that of block uniqueness and it becomes impossible to assign a digit that won't fail one of the conditions. Although at this stage you could just abandon the square and start again, it seems that you potentially have to wait a very long time for the computer to churn out a valid sudoku square. I suspect, although I haven't yet proved, that most squares generated at random like this will not be valid sudoku squares, so the process will usually fail at some point. It could be that my program had other problems that were stopping it working, but I quickly became convinced that this was a horribly inefficient algorithm in any case. So, I stopped working on it and my attention was, as usual, quickly captured by other, totally unrelated, things.

To be continued...

[/computer] permanent link

Next 5 entries

Powered by Blosxom    :     Go to blog home     :     Go to M-Space home