Graffiti for the new millennium

Posted 11 Jan 2007 by Dean Harding

I was walking down the street today when I saw this piece of graffiti. Normally I just ignore graffiti, but this one caught my eye:

[Image: MySpace graffiti]
(Note: I pixellated the personally-identifying bits)

The geek in me just had to go visit the page, but it wasn't all that interesting really.

Seems a bit dumb to me, though. After all, if I were the owner of the wall this person chose to write their "ad" on, I'd be pretty upset. By leaving a MySpace URL, they may as well have just left their name and address!

Looking through the person's comments, though, their ad apparently worked: a couple of their "friends" added themselves because they "saw [the] link on a wall in Neutral Bay."

I don't think this'll catch on, though. After all, it shouldn't be too hard to get the cops knocking on this guy's door...

Path too long?

Posted 10 Jan 2007 by Dean Harding

Why do so many people think the 260-character limit on path names is an operating system limitation? It’s not!

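The operating system can access paths up to about 32,767 characters long, provided the application calls the Unicode versions of the file APIs and uses the \\?\ prefix. Individual components of a path are limited (typically to 255 characters on NTFS), but the entire path is not limited to 260.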

Many applications assume the limit is 260 (thank you, MAX_PATH!). It’s really quite hard trying to explain to my parents that the reason Excel can’t open their file (that they just saved, by the way) is that the entire path name is too long. “What’s a path?” they’ll ask. “I just saved it in My Documents,” they say. “Copy the file to where?” they ask. As far as they’re concerned, the file system is made up of everything under My Documents, and that’s it.

In my opinion, this problem is much exacerbated by the fact that the System.IO namespace in .NET also has the exact same crazy limitation. Try to access files where the FullName is > 260 characters, and you get PathTooLongExceptions all over the place. And there’s nothing you can do about it, except write some sort of native file wrapper class that works properly – who’s going to do that?!
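For what it’s worth, the core of such a wrapper is small. Here is a minimal sketch, assuming Windows and a fully qualified drive-letter or UNC path: the Unicode file APIs accept long paths when the name starts with the \\?\ prefix (\\?\UNC\ for network shares). The LongPath class and its method names are made up for illustration; CreateFileW is the real Win32 function, called here through P/Invoke.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical helper class; not part of System.IO.
static class LongPath
{
    private const string ExtendedPrefix = @"\\?\";
    private const string ExtendedUncPrefix = @"\\?\UNC\";

    private const uint GENERIC_READ = 0x80000000;
    private const uint FILE_SHARE_READ = 0x00000001;
    private const uint OPEN_EXISTING = 3;

    // Converts "C:\dir\file.txt" or "\\server\share\file.txt" into the
    // extended-length form. Assumes the input is already fully qualified.
    public static string ToExtendedLength(string fullPath)
    {
        if (fullPath == null) throw new ArgumentNullException("fullPath");
        if (fullPath.StartsWith(ExtendedPrefix, StringComparison.Ordinal))
            return fullPath; // already in extended-length form
        if (fullPath.StartsWith(@"\\", StringComparison.Ordinal))
            return ExtendedUncPrefix + fullPath.Substring(2);
        return ExtendedPrefix + fullPath;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern SafeFileHandle CreateFileW(
        string lpFileName, uint dwDesiredAccess, uint dwShareMode,
        IntPtr lpSecurityAttributes, uint dwCreationDisposition,
        uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    // Opens an existing file for reading, bypassing the managed path checks.
    // Windows only: the P/Invoke call is resolved when this method runs.
    public static FileStream OpenRead(string fullPath)
    {
        SafeFileHandle handle = CreateFileW(
            ToExtendedLength(fullPath), GENERIC_READ, FILE_SHARE_READ,
            IntPtr.Zero, OPEN_EXISTING, 0, IntPtr.Zero);
        if (handle.IsInvalid)
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return new FileStream(handle, FileAccess.Read);
    }
}
```

Paths in this form are passed to the file system almost verbatim, so they need to be absolute, use backslashes only, and contain no “.” or “..” segments. A real wrapper would also need writing, enumeration, deletion and so on, which is exactly why nobody wants to write one.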

Microsoft had the perfect opportunity to remove this lame limitation with .NET. But it’s still there! Hopefully future versions can fix it.

MARS

Posted 09 Jan 2007 by Dean Harding

I’m talking about the SQL Server 2005 feature, not the planet :)

MARS (short for “Multiple Active Result Sets”) is a nifty feature in SQL Server 2005, but it’s easy to misuse. By default, MARS is disabled in SQL Server 2005, and you must enable it using the “MultipleActiveResultSets=True” parameter in your connection string.
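If you’d rather not hand-edit the string, a sketch like the following sets the flag through SqlConnectionStringBuilder from System.Data.SqlClient. The MarsConnectionString class is a name I made up, and the server and database values are placeholders.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical helper; only SqlConnectionStringBuilder is a real ADO.NET type.
static class MarsConnectionString
{
    // Builds a connection string with MARS switched on explicitly.
    // Integrated security is an assumption; use whatever login you need.
    public static string Build(string server, string database)
    {
        SqlConnectionStringBuilder builder = new SqlConnectionStringBuilder();
        builder.DataSource = server;
        builder.InitialCatalog = database;
        builder.IntegratedSecurity = true;
        builder.MultipleActiveResultSets = true; // stays off unless you ask for it
        return builder.ConnectionString;
    }
}
```

Calling MarsConnectionString.Build("localhost", "TestDb") gives a string you can pass straight to the SqlConnection constructor.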

This is for good reason: there are only two situations where I would recommend using MARS. The first is for transactional consistency. The second is the canonical “doing an update for each row of a result.” And even then, the first is the only one I would put into production code.

The thing is, MARS can severely impact the performance of your application. Connections are actually relatively cheap, and with connection pooling they’re almost free. In order for MARS to work, however, the server has to open multiple “sessions” per connection. A session is a server-side object, and SQL Server will pool up to 9 of them for you. Now, I’m obviously not on the SQL Server team, but I am assured that sessions are very expensive objects (certainly much more expensive than connections).

So what’s the point of MARS, if opening multiple connections is cheaper than reusing the same one? Transactions are the answer. Here’s sample code for a situation where MARS is invaluable (and if you need these semantics in your application, then the MARS feature alone should convince you to upgrade to SQL Server 2005):


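// Requires: using System.Data.SqlClient;
string connectionString = "...;MultipleActiveResultSets=True;...";

using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand("SELECT ...", conn))
{
    conn.Open();

    // This line creates the transaction shared by the reader and the updates.
    using (SqlTransaction trans = conn.BeginTransaction())
    {
        cmd.Transaction = trans;
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                using (SqlCommand cmd2 = new SqlCommand(
                      "UPDATE ...", conn, trans))
                {
                    cmd2.ExecuteNonQuery(); // Important line
                }
            }
        }

        // Commit once the reader is closed. If an exception is thrown first,
        // disposing the uncommitted transaction rolls it back.
        trans.Commit();
    }
}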

The line marked “Important line” is the one that needs MARS. Without MARS enabled, that line would throw the dreaded “There is already an open DataReader associated with this command” exception.

The line that creates the transaction is important too: without it, it would be far more efficient to simply open a second connection to the database to perform the UPDATE statement. (I suppose if it’s just a one-off test application, it might be OK, and a little cleaner, to just use MARS. But certainly in a production program, I would only use it when transactions were involved.)
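As a rough sketch of that second-connection alternative (no MARS, no shared transaction), something like this would do. The TwoConnectionUpdate class is hypothetical, and selectSql and updateSql stand in for the elided “SELECT ...” and “UPDATE ...” statements.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical helper: reads on one pooled connection, updates on another.
static class TwoConnectionUpdate
{
    // Returns the number of rows read from the SELECT.
    public static int Run(string connectionString, string selectSql, string updateSql)
    {
        if (connectionString == null) throw new ArgumentNullException("connectionString");
        if (selectSql == null) throw new ArgumentNullException("selectSql");
        if (updateSql == null) throw new ArgumentNullException("updateSql");

        int rowsRead = 0;
        using (SqlConnection readConn = new SqlConnection(connectionString))
        using (SqlConnection writeConn = new SqlConnection(connectionString))
        {
            readConn.Open();
            writeConn.Open();

            using (SqlCommand select = new SqlCommand(selectSql, readConn))
            using (SqlDataReader reader = select.ExecuteReader())
            {
                while (reader.Read())
                {
                    rowsRead++;
                    using (SqlCommand update = new SqlCommand(updateSql, writeConn))
                    {
                        // A real UPDATE would add parameters taken from the
                        // current row of the reader here.
                        update.ExecuteNonQuery();
                    }
                }
            }
        }
        return rowsRead;
    }
}
```

Note that the connection string here does not need MultipleActiveResultSets at all. The trade-off is that the two connections are separate sessions, so the reads and the updates are not covered by one transaction, and they can block each other on locks.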

In summary: if you don’t absolutely 100% need MARS (and the above example shows the only situation I can think of where that is true), then don’t use it. But if you do need MARS, it’s an invaluable feature!

Phishing Filters

Posted 03 Jan 2007 by Dean Harding

The problem with phishing filters is that they analyse the page to detect a fraudulent site (ignoring URL-scanning, which most of them do as well; but I reckon many people will turn that off due to privacy concerns).

There have now been reports of phishing sites using a Flash movie to simulate a web page, in order to get past the filters.

Unfortunately, Flash files are very hard to scan for suspicious content. The data presented to the browser (or the Flash plugin, at least) is too low-level: basically just a bunch of “draw a line from (x1, y1) to (x2, y2)” instructions, whereas HTML at least consists of “draw an input box”-style instructions.

I guess it just goes to show that whatever we try to do, the spammers will always be one step ahead. Anyway, it’s another reason to have a Flash blocking plugin installed on your browser :)

Firebird “Embedded” and web sites

Posted 02 Jan 2007 by Dean Harding

Can I make a simple suggestion:

Do not use Firebird in “Embedded” mode in a website

In fact, I could even extend that to any embedded database, but particularly Firebird. You should stick with a server-mode database (Firebird in super-server mode is pretty fantastic for simple databases).

I know it’s quite tempting to use embedded Firebird in a website, especially if you’re running on a hosted service that otherwise charges for database access. But there are a couple of problems with this model:

1. File permissions are hard to get right. Firebird requires read/write access to the data file. This isn’t a huge problem if you use the App_Data directory available to ASP.NET 2.0 sites, but not everybody has such a nifty little feature.
2. The biggest problem by far, and this is especially true for hosted services, is IIS6 worker process recycling. If your service provider recycles your worker process on any kind of schedule (personally, I think that’s a bad idea, but some admins can be quite superstitious about such things) then you can run into problems during the recycle operation.

Point #2 is the killer. You see, when IIS6 recycles a worker process, what actually happens is that it starts a new process and sends new requests there. It’s called “overlapped recycling.” The old process continues to run until it has finished servicing all of its existing requests. So there is a short time when there are two worker processes running. If the new worker process tries to open a connection to your database file while the old one is still running (which is quite possible under heavy-ish load) then you’re going to have problems. Remember that Firebird in embedded mode is basically running as a super-server, and does not allow multiple connections to the file from different processes.

Note that while you can change this behaviour, I wouldn’t recommend it – it’s a bit of a performance killer to queue up new requests while the application is restarting.

Things may work OK for a while, or you might only get intermittent errors (after all, you’ll only run into problems under heavy load at the time of a worker process recycle) but you will have problems.

One other possible workaround (other than disabling the overlapped recycle) would be to try to open the database in your Application_Start event, and just keep trying until you succeed. Don’t let any other requests connect to the file before then, either. Obviously, this is no good for performance, but at least it’ll work.
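A minimal sketch of that retry loop might look like the following. StartupRetry is a hypothetical helper, and unlike the “keep trying until you succeed” described above, it gives up after a fixed number of attempts so that a permanently broken database doesn’t hang the application forever.

```csharp
using System;
using System.Threading;

// Hypothetical helper for use from Application_Start.
static class StartupRetry
{
    // Calls open() until it stops throwing, sleeping between attempts.
    // After maxAttempts failures the last exception is rethrown.
    public static T UntilSuccess<T>(Func<T> open, int maxAttempts, TimeSpan delay)
    {
        if (open == null) throw new ArgumentNullException("open");
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return open();
            }
            catch (Exception)
            {
                if (attempt >= maxAttempts) throw; // out of attempts
                Thread.Sleep(delay);               // old worker may still hold the file
            }
        }
    }
}

// Usage sketch for Global.asax, assuming the Firebird .NET provider's
// FbConnection and a connectionString defined elsewhere:
//
// void Application_Start(object sender, EventArgs e)
// {
//     StartupRetry.UntilSuccess(delegate
//     {
//         using (FbConnection conn = new FbConnection(connectionString))
//         {
//             conn.Open();
//             return true;
//         }
//     }, 60, TimeSpan.FromSeconds(1));
// }
```

The attempt count and delay are arbitrary. You’d want them long enough to outlast the old worker process finishing its last requests.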