Help test a new URL shortening / redirection service

Date June 28, 2008

What?  Another URL redirection service?  What’s the point?  Well, there are a couple of points.

1) I hadn’t done one yet, and it looked like a simple yet difficult enough challenge to get me thinking about coding again.

2) I didn’t see anything that offered usage statistics.  I tried one that claimed to offer them, but it appeared broken.  I’m pretty sure I’ve heard of some others that do, but I couldn’t find them.

So, today I put up http://ewerl.com. It’s a play on the word “ewe” pronounced as “you”, as in “YOU R L” (URL).  Yeah, I probably shouldn’t have to explain it, but the picture of cartoon sheep should help drive the point home.  Otherwise one might read it as “e-whirl”.  Not what I was going after  :)

I asked for some feedback on Twitter and already saw someone poking XSS holes in the supplied data.  Completely my fault, as I wasn’t cleaning the data for security (see, needing to get my head back in the game, so to speak).  I’m using the PHP PDO library, and binding data with prepared statements, so I’m not *as* worried about SQL injection, but the XSS was a bit of an eye opener (thanks, whoever you were).  That’s plugged temporarily while I come up with a better long term solution. It’s *workable* now, but I want to see if there’s a better approach than what I took.
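For anyone curious what the usual fix for this kind of hole looks like, here is a minimal sketch. It is written in Python rather than the PHP the site runs on (in PHP, htmlspecialchars() plays the same role as html.escape() here), and it is not the code behind ewerl.com. The function names and the markup are made up for illustration. The idea is to reject anything that isn’t an http or https URL, then escape the value every time it is written into a page.

```python
import html
from urllib.parse import urlsplit

# Assumption for this sketch: only plain web links are accepted.
ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url):
    """True when the URL uses a scheme we are willing to redirect to."""
    return urlsplit(url).scheme.lower() in ALLOWED_SCHEMES


def render_link(submitted_url):
    """Return an HTML link for a user-submitted URL, escaped for output."""
    if not is_allowed_url(submitted_url):
        raise ValueError("only http and https URLs are accepted")
    # Escape <, >, & and both quote characters so the value cannot
    # break out of the attribute or the element body.
    safe = html.escape(submitted_url, quote=True)
    return '<a href="{0}">{0}</a>'.format(safe)


if __name__ == "__main__":
    print(render_link('http://example.com/?q="><script>alert(1)</script>'))
```

Escaping on output (rather than stripping things on input) keeps the stored URL intact, so the redirect still goes where the submitter intended.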

On the stats end, any URL you shorten with the service has stats publicly viewable by adding “/stats” at the end of the URL.  If you’d prefer to get them in RSS form, add “/rss” instead of “/stats”.  The ewerl.com/faq/ page has a link to an example stats page and that has a link to the RSS feed.
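If you want to build those addresses from a script, the rule is simple enough to sketch in a few lines of Python. The short code “abc123” below is a made-up example, not a real ewerl.com link, and the helper names are my own.

```python
def stats_url(short_url):
    """Public stats page for a shortened URL: append /stats."""
    return short_url.rstrip("/") + "/stats"


def rss_url(short_url):
    """RSS version of the same stats: append /rss instead."""
    return short_url.rstrip("/") + "/rss"


if __name__ == "__main__":
    example = "http://ewerl.com/abc123"  # hypothetical short URL
    print(stats_url(example))
    print(rss_url(example))
```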

If there are any features you’d like to see in a service like this, please comment here or drop me an email.  Please test it out, bang on it, poke holes in it, and send me any feedback (good or bad) that you have.

Thanks!

Lessons learned from a reddit overload

Date June 27, 2008

Two evenings ago I wrote a post about browsers still not having upload progress meters.  The blog post was voted up on reddit, and the server got slammed.  So slammed, in fact, that it was unusable for a few hours while I investigated the problem.  I didn’t know the post was on reddit, but I knew I was getting some traffic.  Unfortunately, the day before, I’d installed ‘piwik’ - a tracking/analytics package.  Given that was the most recent change, I spent some time looking at that avenue first.

Actually, I spent most of my time trying to stop Apache so that I could have a usable machine.  I’d then make a change, restart Apache, and within 5 seconds the machine would be unusable, and it’d take 2-3 minutes for the hastily typed ‘httpd stop’ command to do its thing.  Then I’d start again.  I spent some time trying some MySQL tuning options.  I shut off my Tomcat processes and a separate SOLR Jetty server process.  The machine’s only got 1 gig of RAM, and has ‘only’ a 1.7 GHz processor, so luxuries like Java apps couldn’t be running while I sorted this out.

Finally I realized my APC cache wasn’t on.  I’d put it on the server months and months ago, or thought I had.  I’d moved servers, and cannot remember if I’d reinstalled it or not.  The php.ini file had it listed in there, but commented out.  Turning it on and restarting didn’t work - APC had been compiled against a different PHP API version.  I think what had happened was I’d upgraded from PHP 5.1.3 to 5.2.5 and the internal API structure was different enough that APC didn’t work anymore, and I must have commented it out in the php.ini file.  It’s been so long that I’d simply forgotten.

So, trying to recompile APC was an issue in itself, because “pecl install apc” didn’t work.  It was already installed as far as pecl was concerned.  “pecl uninstall apc”, then “pecl install apc”, then wait for 6 minutes for it to grind all the way up to where it complained about not being able to find ‘apxs’.  I had an ‘apxs’ on the machine, and in my haste, I just copied that one into the path where the compiler could find it.  Everything compiled and APC started up.  Except it wasn’t caching anything (except itself).  The apxs I’d copied was a ‘psa’ one - specific to Plesk’s administrative Apache version.  So, that took an extra half hour to sort out (actually, I was taken away to something else, so my brother Mark stepped in and recompiled APC for me with a new apxs).

Turning on APC and getting Apache running again solved most of the problems, even though the load on the machine still hovered between 15 and 20 for the next hour or two.  By this point much of the traffic spike had died down.  I guess people coming to a non-responsive server eventually stop trying!

What I’d ended up doing at the top of the Wordpress index.php file was a rand() call to allow 40% of the traffic in during the ‘bad’ times whilst this was being sorted out.  That let some people in, and gave the other 60% of visitors a terse “server is under heavy load, try back later” message.  Sorry if that was what you experienced yesterday, but it was the best I could do at that point.
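The trick itself is only a couple of lines.  Here is a rough sketch of the idea in Python (the real thing was a rand() call at the top of a PHP file, and this is not that code).  The function names and the 503 status code are assumptions for the example; the 40% figure and the message text come from above.

```python
import random

# Fraction of requests allowed through while the server is struggling.
ADMIT_FRACTION = 0.40


def should_admit(rng=random):
    """Admit roughly ADMIT_FRACTION of requests, chosen at random."""
    return rng.random() < ADMIT_FRACTION


def handle_request(rng=random):
    """Return (status, body) for one incoming request."""
    if should_admit(rng):
        return 200, "render the page as usual"
    # Assumption: a 503 tells browsers and crawlers the outage is temporary.
    return 503, "server is under heavy load, try back later"


if __name__ == "__main__":
    rng = random.Random()
    served = sum(1 for _ in range(1000) if handle_request(rng)[0] == 200)
    print("served", served, "of 1000 simulated requests")
```

The point of doing the check before anything else loads is that a turned-away request costs almost nothing, which leaves the machine some room to serve the requests that do get in.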

We (Mark and I) had spent a bit of time tracing through some xdebug outputs of Wordpress logs during this time (I turned on xdebug cachegrind output for about 10 seconds and got quite a few dump files!).  Wordpress itself is just horribly inefficient under the hood.  Somehow I always knew it, but it’s done for the sake of flexibility.  At least that’s what I tell myself.  Anyway, I’d spent some valuable time tracing through some of the function calls and I just couldn’t believe how wasteful it is in there.  Obviously the APC code cache helps immensely, but that’s only part of the solution.  However, given the ecosystem around Wordpress, it’s hard to immediately jump ship to something else.

Anyway, that’s the quick analysis of what happened yesterday.  There were over 8000 visits from reddit alone, and a few thousand from various other sources.  However, as I said, about 60% of visitors were getting ‘server overload’ messages during the worst of it, so many of those visitors wouldn’t have been counted.

By the way, the Wordpress “super cache” plugin thing did absolutely 0 in the wake of hundreds of concurrent reddit visitors.  Not sure if I had set it up wrong, but it looked like there was only one way to set it up.  See comment below - I didn’t finish configuring properly - my fault.

Lesson learned, I’m more prepared now than I was a few days ago.  I doubt I’ll see that much of a surge again any time soon, but I’m ready for it.  Bring it on (in moderation, of course).

On a completely unrelated note, if you’re looking to hire someone, or looking for a new position, check out the new http://webdevjobs.com job board.  Thanks.

Why do browsers still not have file upload progress meters?

Date June 25, 2008

It’s 2008.

Firefox 3 was just released - years of work, thousands of bugs fixed, new features thrown in.  No file upload progress meter.

IE7’s been out for a while, but I don’t have a copy here handy to test.  My memory is telling me it doesn’t have one, or if it does, it’s very unobtrusive and not immediately apparent.

Opera 9.5 was recently released.  No file upload progress meter.

Konqueror 3.5 - no file upload progress meter.

It’s too late for me to check Safari2 - but I don’t think it’s got one.  I don’t have Safari3 on a Mac, so *maybe* it’s got one, but I doubt it.  I think I’d have heard about it if it does.

This current tirade stems from implementing a file upload progress meter in PHP5.  Yes, PHP5.2 has some hook, and there’s a PECL extension.  However, there’s *0* documentation on it, and the few pages Google turns up are mostly people saying “it doesn’t work!” and a few “hey it works!” with little in between.  I have one other option, I think - the APC cache system apparently also has file upload progress functionality in it now.  I was hoping to avoid it, because 1) I think it might have conflicted with the ioncube encoder on the machine and 2) I was really hoping to use an extension that just did the one thing I needed, not a bunch of other stuff as well.

I realize this is partially a PHP issue I’m ranting about, but it’s ultimately a hacky workaround to a basic piece of functionality that browsers should support.  The browser is pushing the file up - it has a notion of how many bytes have been pushed out to the network.  It is in a position to know the accurate size of the file in question.  Showing a local status of that information would be a faster experience, rather than adding a layer of AJAX calls to an upload screen.

I was at a developer panel in Ann Arbor back in 2003 and someone from MS was on the panel.  He had been on the IE team at one point, and I remember making the suggestion to him to pass the ‘upload progress bar’ back to the IE team.  It was a bit of an odd response - it wasn’t a ‘no’, but a bit of a bemused look, as if to wonder why people would want that.  He didn’t *say* that, and was cordial enough to take the suggestion, but it obviously did bugger all.  I was sort of figuring IE would have included it in IE7, then FF would eventually make it available as an extension, adding yet “one more extension” to the list of ‘gotta haves’, then they’d talk about how innovative FF is.  :)  Or perhaps even Opera might have just slipped it in under the radar in the past 5 years.

Alas, it seems that people that write browsers are somehow oblivious to the need app developers face in this area.  Perhaps they all live on petabit connections and/or only do development with local servers, and uploading 10 meg files takes 2 seconds.  In the real world, for most of us, it’s considerably longer and more painful, both on the development side testing, and for end users uploading.

One hundred push ups?

Date June 25, 2008

Can I do it?  I don’t know for certain, but I’m going to give it a try.  This might replace my ‘run a marathon’ ‘new year resolution’, as it’s more attainable, and I need to take things a step at a time (no pun intended).

A few days ago I was invited to join some other PHPers in taking this challenge.  The ‘support group’ is on Facebook here.  I’m only 2 days into it, and feeling sore all over.  But I do think this is something I’ll be able to keep up.  Doing 10 on the first day was a stretch, but 10 today is more doable, and I hit 20 today in my last set.

Join up at our Facebook site if you’re interested in taking the challenge with us, or just go to hundredpushups.com directly and take it from there.

I’m putting this in the PHP category as well to let other PHPers (via planetphp) who don’t yet know about the challenge hear the word.

Railo - new life for ColdFusion?

Date June 23, 2008

I have to admit, I’ve never been into ColdFusion.  I used it on a couple of small projects back in the early 2000s (was just brought in to finish/fix existing code), but never really dug it.  Not that it was *bad*, but the early versions really smacked of watered-down development.  Everything’s a “tag”, so it’s “simple”.  Yet most problems that were facing developers (and still do) aren’t ‘simple’ problems.  The problems stem from project communication, client interactions, etc.

A noted CF expert - Hal Helms - had drilled into my head that most projects fail not because of technical incompetence but communication issues.  Yes, he wasn’t the first to come up with that insight, but he was the first person I’d heard put it succinctly.  I use the same examples today when talking with clients/coworkers/whomever.  I’ve rarely been on a project that ‘failed’ (missed deadlines, over budget, whatever) because I or someone on the project didn’t know how to open a file, or read from a db, or other ‘simple’ things.  Many have ‘failed’ because of misunderstanding on the part of one or more parties.

So, CF’s focus on the ‘tag’ approach early on turned me off from exploring it further.  On top of that, it was rather expensive, and closed source to boot.  By 2003, I’d more or less written off CF as one of those techs that would continue to get marginalized in the webdev world, like Perl.  Whatever the technical or cost/benefit merits, when judged against the bigger communities and toolsets of PHP, .Net and Java, CF was destined to be a footnote.

However, a few things changed.  Macromedia bought Allaire, then Adobe bought Macromedia.  CF started compiling down to Java.  IDEs with CF integration continued to get updated.  What was happening?  Perhaps people’d put so much money in to CF already they were willing to help support it through its dark days?  I don’t know for sure, but CF has seemed to hold its own during the last few years, and seems to be going through something of a renaissance.  Which brings me to Railo.

I just heard about Railo today (which inspired this hasty post).  Railo is a third party CFML engine - it’s created not by Adobe but by a company in Switzerland.  They have a free Community Edition, and a reasonably priced ‘Enterprise’ version.  They are coming out with a GPL (2?  3?) version later this year, which will be hosted at jboss.org, if I’m reading that correctly.  I expect JBoss will probably integrate it, which will open up the world of CFML to a new audience of Java developers.

All in all, still very interesting to see.  I have talked to 3 companies this year using ColdFusion, and I don’t think I talked to any for 2 years before that.  While I’m hardly a bellwether of tech adoption, it was still a bit eye-opening to me to have run into that many in a short time (2-3 months).

On a related note, if you’ve got a ColdFusion opening, why not post it over at http://webdevjobs.com?

By the way, will this have any effect on Groovy adoption at Java shops?  Seems like it might have a splintering effect on Java devs looking for ‘alternative’ Java tech.  Any thoughts on this?

Car mileage update

Date June 20, 2008

Had a decent amount of driving again in the last couple weeks:

419.6 miles - 12.33 gallons to fill up (@ $3.929 - not over the $4 some of you are paying, but still painful!)

That’s 34.03 mpg for the last fill up.  I had some highway driving, which seems to really help kick it up past the 30-31 I typically get now.

I’ve written about this before, but it bears repeating some.  SLOW DOWN and you’ll get much better mileage.  2 summers ago I was getting 25 on average.  I now get 30 on average.  That’s a 20% improvement.
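If you want to run the same numbers for your own car, the arithmetic is short.  Here’s a small Python sketch using the figures above; the function names are just my own labels.

```python
def mpg(miles, gallons):
    """Miles per gallon for one tank."""
    return miles / gallons


def percent_improvement(old, new):
    """Relative change from old to new, as a percentage of old."""
    return (new - old) / old * 100


if __name__ == "__main__":
    last_fill = mpg(419.6, 12.33)  # this fill up
    print("last fill up: %.2f mpg" % last_fill)
    print("25 -> 30 mpg: %.0f%% better" % percent_improvement(25, 30))
    # Fuel cost per mile at $3.929 a gallon
    print("cost per mile: $%.3f" % (12.33 * 3.929 / 419.6))
```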

I’ve started keeping track of my mileage over at http://fuelfrog.com.  You might be able to see my mpg chart over at http://www.fuelfrog.com/users/mgkimsal/fuels/dashboard - not sure if you can see that if you’re not logged in to fuelfrog yourself.

Speaking at Codestock

Date June 15, 2008

Exciting news - I’ll be presenting an introduction to Grails at the upcoming Codestock conference in Knoxville this August!  The site doesn’t have full details yet, but I was just notified this morning that my submission was accepted.  I’d actually submitted 3 options - my SOLR presentation, a “Continuous Integration with PHP” topic, and an introduction to Grails.  The Grails topic was selected.

I’d like to thank Alan Stevens for the invitation to submit in the first place.  I was a bit hesitant at first because the conference seemed very .net oriented.  It’s being sponsored by the area .net user group, which makes sense.  Alan let me know that they were looking for cross-platform topics, not just .net ones.  However, it seems mine may be the only topic that’s not directly related to Microsoft technologies.  James Avery is presenting “10 Open Source tools you should use” - not sure if those are 10 tools in general, or 10 tools aimed at Windows developers (either way, I’m sure he’ll have a good list!).  There’s another presentation on Mono and ASP.net.  Mine is the only Java-based presentation though.  I hope it’s not too much of a ‘fish out of water’ thing.

Why don’t ecommerce companies offer tinyurl-like services?

Date June 12, 2008

I just saw someone tweet that they’d received something in the post, and linked to a URL shortener service.  It redirected to Amazon.  Now I realize that Twitter is a pretty new service, but with mobile rising, I think we’ll see a need for short URLs more and more.  Couple that with the slightly extra privacy you get with a shorter URL (someone needs to actually visit the link to know you’re pointing to furry handcuffs, for example), and the mindshare Amazon would keep by having “amazon.com/6hjw89eh9e7hds”, and the extra metrics they’d be able to capture with that (add a user key in the short URL) and this makes sense to me.  The top of every Amazon product page would have a “Short URL” property available to cut/paste/whatever.

Someone should embed this in their ecommerce system to acknowledge and embrace Twitter, Plurk and the coming wave of microblogging platforms.

PHP addslashes alternatives comparison

Date June 12, 2008

My brother Mark has put together a comparison of addslashes() alternatives over at his blog.  He starts off with:

I’ve seen a lot of people talking about mysql_real_escape_string() vs addslashes() vs addcslashes(). There seems to be a lot of real confusion about what these functions do (even with the php.net manual around), especially when it comes to character sets. I feel that some people are being scared into using some escaping methods with which they are not very familiar. So, I’ve decided to lay it all out in a few charts so there is no confusion about what each function does and how each can help protect against SQL injection attacks.

Read on if you’re interested in this sort of thing, and to get his final conclusion.

Randal Schwartz podcast up

Date June 12, 2008

I had the pleasure of talking with Randal Schwartz about his latest passion - Seaside - over on WebDevRadio.com.  Check it out.