Fixing Bugs with Genetic Algorithms
Wow, check out this preprint: A Genetic Programming Approach to Automated Software Repair. Essentially, the researchers used a suite of positive and negative unit tests as the fitness function for a genetic algorithm that operates directly on the code, mutating branches of the program tree. More interestingly, they did this on off-the-shelf legacy C programs.
Genetic programming is combined with program analysis methods to repair bugs in off-the-shelf legacy C programs. Fitness is defined using negative test cases that exercise the bug to be repaired and positive test cases that encode program requirements. Once a successful repair is discovered, structural differencing algorithms and delta debugging methods are used to minimize its size. Several modifications to the GP technique contribute to its success: (1) genetic operations are localized to the nodes along the execution path of the negative test case; (2) high-level statements are represented as single nodes in the program tree; (3) genetic operators use existing code in other parts of the program, so new code does not need to be invented. The paper describes the method, reviews earlier experiments that repaired 11 bugs in over 60,000 lines of code, reports results on new bug repairs, and describes experiments that analyze the performance and efficacy of the evolutionary components of the algorithm.
Literally, they wrote some small samples of code that said “here’s what I want this buggy program to do” and then their genetic algorithm actually went off and hacked away at the code (much like many of us flesh-and-blood programmers) and made it work. They have several nice examples, including one on automatically fixing the infamous Zune date bug.
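To make the "tests as fitness" idea concrete, here is a minimal Python sketch, not the authors' code. It scores a candidate program as a weighted count of the tests it passes. The names `fitness`, `W_POS`, `W_NEG`, and the toy absolute-value "programs" are hypothetical, and the weights are made-up values chosen only so that passing the bug-exercising negative test counts for more.

```python
from typing import Callable, List, Tuple

# Hypothetical weights: assumed values, not taken from the paper.
W_POS = 1.0   # credit for each positive (requirement) test passed
W_NEG = 10.0  # credit for each negative (bug-exercising) test passed

TestCase = Tuple[int, int]  # (input, expected output)


def fitness(candidate: Callable[[int], int],
            positive: List[TestCase],
            negative: List[TestCase]) -> float:
    """Score a candidate program variant by the tests it passes."""
    def passes(case: TestCase) -> bool:
        arg, expected = case
        try:
            return candidate(arg) == expected
        except Exception:
            # A crashing variant simply fails the test.
            return False

    pos_passed = sum(passes(c) for c in positive)
    neg_passed = sum(passes(c) for c in negative)
    return W_POS * pos_passed + W_NEG * neg_passed


# Toy stand-ins for program variants (illustrative only).
def buggy_abs(x: int) -> int:
    return x  # bug: forgets to negate negative inputs


def repaired_abs(x: int) -> int:
    return -x if x < 0 else x


POSITIVE = [(3, 3), (0, 0)]   # behaviour that must be preserved
NEGATIVE = [(-4, 4)]          # the input that exercises the bug

print(fitness(buggy_abs, POSITIVE, NEGATIVE))     # 2.0
print(fitness(repaired_abs, POSITIVE, NEGATIVE))  # 12.0
```

A search procedure would keep the higher-scoring variants and mutate them further; a variant that passes every test is a candidate repair.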
The dream of automatic programming has eluded computer scientists for at least 50 years. Although the methods described in this paper do not evolve new programs from scratch, they do show how to evolve legacy software to repair existing faults.
WP SuperCache .htaccess mod_rewrite rules for Blogs in Subdomains/Subdirectories
I have a unique problem: I installed WordPress in a subdirectory and symlinked httpdocs from several subdomains to that directory. The structure looks like this:
httpdocs/wp/ -> WP Install
subdomains/gadgets/httpdocs/ -> /elliottback.com/httpdocs/wp/
subdomains/books/httpdocs/ -> /elliottback.com/httpdocs/wp/
This means that on my main domain, URLs always carry an extra /wp prefix, while on the subdomains requests go straight into the wp-content directories from the root, in both the relative and the absolute sense. I consolidated my subdomains this way so that I could run a single WP install and maintain them together. Here’s the .htaccess file that lets WP Super Cache work on all of them:
# BEGIN WPSuperCache
<ifmodule mod_rewrite.c>
RewriteEngine On
AddDefaultCharset UTF-8
RewriteBase /
RewriteCond %{REQUEST_URI} !^.*[^/]$
RewriteCond %{REQUEST_URI} !^.*//.*$
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} !.*=.*
RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteCond %{REQUEST_URI} ^(/wp)?/
RewriteCond %{DOCUMENT_ROOT}%1/wp-content/cache/supercache/%{HTTP_HOST}%1/$1/index.html.gz -f
RewriteRule ^(.*) %1/wp-content/cache/supercache/%{HTTP_HOST}%1/$1/index.html.gz [L]
RewriteCond %{REQUEST_URI} !^.*[^/]$
RewriteCond %{REQUEST_URI} !^.*//.*$
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{QUERY_STRING} !.*=.*
RewriteCond %{HTTP:Cookie} !^.*(comment_author_|wordpress|wp-postpass_).*$
RewriteCond %{REQUEST_URI} ^(/wp)?/
RewriteCond %{DOCUMENT_ROOT}%1/wp-content/cache/supercache/%{HTTP_HOST}%1/$1/index.html -f
RewriteRule ^(.*) %1/wp-content/cache/supercache/%{HTTP_HOST}%1/$1/index.html [L]
</ifmodule>
# END WPSuperCache
Let me know what you think. Performance stats show that it’s working fine for both the /wp subdirectory and the subdomains!
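To see which cache file a given request would be checked against, the rules above can be modelled in a few lines of Python. This is only a sketch of my reading of the rules, not something Apache or WP Super Cache provides: the function names, document roots, and host names are hypothetical, and it assumes the .htaccess file sits in the directory that the optional /wp prefix points at, so that $1 is the rest of the path.

```python
import posixpath
import re

# Mirrors the cookie test: logged-in users and commenters skip the cache.
_COOKIE_RE = re.compile(r"comment_author_|wordpress|wp-postpass_")
# Mirrors RewriteCond %{REQUEST_URI} ^(/wp)?/ ; group 1 plays the role of %1.
_PREFIX_RE = re.compile(r"^(/wp)?/")


def is_cacheable(method, uri, query="", cookie=""):
    """Return True when the request passes the simple RewriteCond tests."""
    return (
        uri.endswith("/")            # URI must end with a slash
        and "//" not in uri          # no doubled slashes
        and method != "POST"         # never serve cached pages to POSTs
        and "=" not in query         # no key=value query parameters
        and not _COOKIE_RE.search(cookie)
    )


def supercache_path(doc_root, host, uri):
    """Build the uncompressed cache file path the -f test would look for."""
    match = _PREFIX_RE.match(uri)
    if match is None:
        return None
    prefix = match.group(1) or ""           # "/wp" on the main domain, "" on subdomains
    rest = uri[len(prefix):].lstrip("/")    # assumed value of $1
    raw = (f"{doc_root}{prefix}/wp-content/cache/supercache/"
           f"{host}{prefix}/{rest}/index.html")
    return posixpath.normpath(raw)          # collapse the doubled slashes


# Main domain: WordPress lives under /wp.
print(supercache_path("/srv/httpdocs", "example.com", "/wp/2009/hello/"))
# Subdomain: the document root is the symlink to the WordPress directory.
print(supercache_path("/srv/subdomains/gadgets/httpdocs",
                      "gadgets.example.com", "/2009/hello/"))
```

The gzip variant is the same path with .gz appended, checked only when the client sends Accept-Encoding: gzip.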
PHP Exclusive Single Process Mutex
When running PHP via cron, there are situations where you want only a single instance of a script running at any one time; multiple processes shouldn’t be allowed. For example, every 5 minutes a process is launched to poll for weather updates and publish them to Twitter. If, for some reason, this process takes more than 15 minutes (because Twitter is very slow, say), I don’t want more processes to build up. At a rate of 12 per hour, I would exhaust the MySQL connections on my box in a couple of hours.
So, here’s the solution I used:
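<?php
// Single-instance guard: holds an exclusive, non-blocking lock on a .pid file
// for the lifetime of the object.
class pid {
    protected $filename;
    protected $fp;
    public $already_running = false;

    public function __construct() {
        // One lock file per script, stored next to this class file.
        $this->filename = dirname(__FILE__) . '/' . basename($_SERVER['PHP_SELF']) . '.pid';
        $this->fp = fopen($this->filename, 'w+');

        // LOCK_NB makes flock() return immediately instead of waiting for the lock.
        if ($this->fp === false || !flock($this->fp, LOCK_EX | LOCK_NB)) {
            echo "FAILED lock $this->filename\n";
            $this->already_running = true;
            if ($this->fp !== false) {
                fclose($this->fp);
            }
        } else {
            echo "Acquired lock $this->filename\n";
        }
    }

    public function __destruct() {
        // Only the instance that holds the lock releases it.
        if (!$this->already_running) {
            echo "Releasing lock $this->filename\n";
            flock($this->fp, LOCK_UN);
            fclose($this->fp);
        }
    }
}
?>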
It works fine from the command line but, for some reason, doesn’t work when invoked via Apache over the web. Since I’m only using it for cron jobs, this is good enough for now. If you know why flock() with LOCK_EX and LOCK_NB won’t work when invoked through Apache, let me know! I’m running PHP 5.2.6 (cli) and Apache/2.2.9 (Unix).
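For cron jobs written in Python, the same guard can be sketched with fcntl.flock. This is an illustrative port, not the class above: the `SingleInstance` name and the lock file name are hypothetical, and it assumes a Unix-like system where the fcntl module is available.

```python
import fcntl
import os
import tempfile


class SingleInstance:
    """Hold an exclusive, non-blocking lock on a file until released."""

    def __init__(self, path):
        self.path = path
        self.already_running = False
        # "a" creates the file if needed without truncating it.
        self.fp = open(path, "a")
        try:
            # LOCK_NB: fail immediately instead of waiting for the lock.
            fcntl.flock(self.fp, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            # Someone else holds the lock: report it rather than block.
            self.already_running = True
            self.fp.close()

    def release(self):
        # Only the holder of the lock unlocks and closes the file.
        if not self.already_running and not self.fp.closed:
            fcntl.flock(self.fp, fcntl.LOCK_UN)
            self.fp.close()


# Hypothetical lock file for the weather-polling job.
lock_path = os.path.join(tempfile.gettempdir(), "weather-poll.pid")
guard = SingleInstance(lock_path)
if guard.already_running:
    print("another instance holds the lock, exiting")
else:
    print("lock acquired, doing the work")
    guard.release()
```

As with the PHP version, the operating system drops the lock when the process exits, so a crashed job does not leave a stale lock behind.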