Jim Blackler

Safely adding events to DOM elements in Javascript for IE, Firefox, Opera and Safari

Posted by jimblackler on Sep 29, 2007

Recently I looked into some reported problems with my word game site Qindar.net and the Safari browser. This was a bit easier for me since Apple released a Windows version of Safari (which, albeit arguably surplus to requirements, is actually a very nice, usable browser on Windows). I discovered that the technique I was using to work out which method of event attachment to use was flawed and was failing for Safari. So I refined it slightly to fix the problem.

The problem is that Javascript on Firefox, Opera and Safari supports the “W3C DOM Level 2 event binding mechanism”, which uses a function on DOM elements called addEventListener. Internet Explorer, however, uses a technique that apparently predates that standard, employing a function called attachEvent. In addition, the names of the events are different. For instance, IE uses event names such as “onmousemove” and “onmouseup”, but the other browsers omit the “on” and name these events “mousemove” and “mouseup”.

Curiously, Opera is the only browser to support both styles.

The simplest and safest way of working out which one to use is to test for the existence of a function called addEventListener. I quite like this method because it works on the latest versions of the big four browsers, and on IE 6, without having to do any browser version probing.

For instance, here is how to add focus and focus lost events to a page in a way that will work on all modern browsers:

[Javascript]
if (window.addEventListener != null)
{ // Method for browsers that support addEventListener, e.g. Firefox, Opera, Safari
    window.addEventListener("focus", FocusFunction, true);
    window.addEventListener("blur", FocusLostFunction, true);
}
else
{ // e.g. Internet Explorer (also would work on Opera)
    window.attachEvent("onfocus", FocusFunction);
    document.attachEvent("onfocusout", FocusLostFunction); // focusout only works on document in IE
}
[/Javascript]

This is how to add mouse events:

[Javascript]
if (document.addEventListener != null)
{ // e.g. Firefox, Opera, Safari
    document.addEventListener("mousemove", MouseMoveFunction, true);
    document.addEventListener("mouseup", MouseUpFunction, true);
}
else
{ // e.g. Internet Explorer (also would work on Opera)
    document.attachEvent("onmousemove", MouseMoveFunction);
    document.attachEvent("onmouseup", MouseUpFunction);
}
[/Javascript]

To remove the mouse events, I recommend…

[Javascript]
if (document.removeEventListener != null)
{ // e.g. Firefox, Opera, Safari
    document.removeEventListener("mousemove", MouseMoveFunction, true);
    document.removeEventListener("mouseup", MouseUpFunction, true);
}
else
{ // e.g. Internet Explorer (also would work on Opera)
    document.detachEvent("onmousemove", MouseMoveFunction);
    document.detachEvent("onmouseup", MouseUpFunction);
}
[/Javascript]
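
The add and remove patterns above can be folded into a pair of small helpers. This is only a sketch: the names addEvent and removeEvent are made up for illustration, and it assumes the IE event name is always the W3C name with “on” in front, and that both APIs are used on the same object.

```javascript
// Hypothetical helper names; they are not part of the code above.
// `target` is any object exposing either the W3C or the IE event API.
function addEvent(target, name, handler) {
    if (target.addEventListener != null) {
        // W3C style (e.g. Firefox, Opera, Safari): no "on" prefix.
        // The third argument is true, matching the examples above.
        target.addEventListener(name, handler, true);
    } else {
        // IE style: same event, but the name carries an "on" prefix.
        target.attachEvent("on" + name, handler);
    }
}

function removeEvent(target, name, handler) {
    if (target.removeEventListener != null) {
        // Must use the same third argument that was passed when adding.
        target.removeEventListener(name, handler, true);
    } else {
        target.detachEvent("on" + name, handler);
    }
}
```

With these in place, addEvent(document, "mousemove", MouseMoveFunction) would cover both branches of the mouse example. The focus lost case does not fit the sketch as written, because there IE needs a different event name (“onfocusout”) on a different object (document).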

I personally pray there comes a time when these kinds of workarounds are not required. In the meantime, this will have to do.

Posted in Web development || 2 Comments »

Scraping text from Wikipedia using PHP

Posted by jimblackler on Sep 25, 2007

Wikipedia has grown from one of many interesting websites to being one of the most famous sites on the Internet. An enormous amount of volunteer time has been invested over the years, and the payoff is what we have today: a wealth of factual data in one place.

When Wikis were a new concept, many predicted they would descend into chaos as they grew. In the case of Wikipedia the reverse is true. It seems to become increasingly well organised as the site develops. Rather than becoming more jumbled, the natural development of article conventions and the more planned use of standardised templates has created an increasingly neat and consistent structure.

This careful organisation of the prose leads to the interesting possibility of extracting more structured data from Wikipedia for alternative purposes, while staying true to the letter and spirit of the GFDL under which the material is licensed.

There’s the potential for a kind of semantic reverse engineering of article content. HTML pages could be scraped, and pages scoured for hints as to the meaning of each text fragment.

Applications could include loading articles about a variety of subjects into structured databases. Subjects for this treatment could include countries, people, chemical elements, diseases, you name it. These databases could then be searched by a variety of applications.

I’ve knocked up a simple page that gives a kind of quasi-dictionary definition when a word is entered. It looks at the first sentence of the Wikipedia article, which typically describes the article topic concisely.

I’ll show here how the basic page scrape works, which is actually very easy with PHP, its HTML reading abilities and the power of xpath.

$html = @file_get_contents("http://en.wikipedia.org/wiki/France"); will pull down the HTML content of the Wikipedia article on France.
$dom = new DOMDocument(); @$dom->loadHTML($html); will read the HTML into a DOM for querying.
$xpath = new DOMXPath($dom); will create an XPath object for querying that DOM.
$results = $xpath->query('//div[@id="bodyContent"]/p'); will find the paragraphs that are direct children of the div with the id “bodyContent”. The first of these is where the text of a Wikipedia article always starts.

I then perform some more processing on the results, including contingencies in case any of the steps fail. For instance, to make the definitions snappier to read, I strip any text in brackets, either round or square. There’s also some additional logic to pick the first topic in the list if the page lists multiple subjects (a “disambiguation” page). Predicting the Wikipedia URL for a given topic also involves a small amount of processing.

Anyway, when you ask the page “what is France”, it will reply:

France, officially the French Republic, is a country whose metropolitan territory is located in Western Europe and that also comprises various overseas islands and territories located in other continents.

Can’t argue with that!
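
As a rough illustration of the bracket stripping and URL prediction mentioned above, here is a sketch. The WhatIs application itself is PHP; this is JavaScript, the function names stripBrackets and wikipediaUrl are made up, and the title rules in wikipediaUrl (trim, underscores for spaces, capitalise the first letter) are a guess rather than what the application actually does.

```javascript
// Hypothetical helpers; names and rules are assumptions, not the WhatIs source.

// Remove bracketed text, round or square, then tidy leftover spacing.
function stripBrackets(text) {
    var previous;
    do {
        previous = text;
        // Remove innermost (...) or [...] groups, plus any whitespace
        // before them; repeating the pass handles nested brackets.
        text = text.replace(/\s*(\([^()\[\]]*\)|\[[^()\[\]]*\])/g, "");
    } while (text !== previous);
    // Collapse runs of whitespace and trim both ends.
    return text.replace(/\s{2,}/g, " ").replace(/^\s+|\s+$/g, "");
}

// Guess the article URL for a topic (assumed rules, see the note above).
function wikipediaUrl(topic) {
    var title = topic.replace(/^\s+|\s+$/g, "").replace(/\s+/g, "_");
    title = title.charAt(0).toUpperCase() + title.slice(1);
    return "http://en.wikipedia.org/wiki/" + encodeURIComponent(title);
}
```

For example, stripBrackets("France (French: [fra]), officially the French Republic[1], is a country.") gives "France, officially the French Republic, is a country.", and wikipediaUrl("france") gives "http://en.wikipedia.org/wiki/France".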

Edit, 1st March: By request, here is the source of the WhatIs application. It will work in any LAMP environment but the .sln file is for VS.PHP under Visual Studio 2005.

Source of WhatIs

Posted in Web development || 13 Comments »
