Searching a topic map with Ontopia

In Web Application Development with Ontopia – 3. Creating the JSPs I claimed that:

“The next blog post will explain how to add a search page to our application.”

Well, turns out it didn’t. I actually never wrote a post on that topic. Heck, the blog has been more or less dead for much of the last year.

A lot of things have happened in my life during the last year (moving to L.A., starting a new job in the US, etc.), but there sure hasn’t been much blogging going on. I do hope to get better at that, but doubt there will be too much on topic maps as I don’t spend much time tinkering about with this great technology anymore.

Anyway… I’ve recently received a couple of e-mails requesting information on how to query a topic map with Ontopia, and figured I’d better put my answer in a blog post for anyone to read.

tolog search

The easiest way to query a topic map with Ontopia is to use tolog and search topic names and occurrence values with a simple query. An example query would be:

        select $TOPIC, $TYPE, $REL from
            value-like($OBJ, "my search query", $REL),

            { topic-name($TOPIC, $OBJ) |
              occurrence($TOPIC, $OBJ) },

            direct-instance-of($TOPIC, $TYPE)

        order by $REL desc, $TOPIC?

Try it here.

The result set from executing this query contains the topic, its type and the relevance score of the hit, sorted by relevance.

You can incorporate this into JSP files and make use of the taglibs from the tutorial for printing out results, linking to topic/type pages, etc.

Example files that will hopefully help you get started (ZIP).

If you want to go further with the search (filtering etc.), you will probably have to do more of it in Java code, using the APIs that Ontopia provides.
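As a starting point for the Java route, here is a minimal sketch of how a user-supplied search term could be kept out of the query text itself. It assumes tolog's `%name%` parameter syntax and that the query processor in your Ontopia version offers an `execute(String, Map)` overload; the class name `TologSearch` and the parameter name `term` are made up for illustration, so check both assumptions against the Ontopia documentation before relying on them.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Ontopia.
final class TologSearch {

    // Same query as above, with the literal search string replaced by
    // a tolog parameter reference (%term%).
    static final String QUERY =
        "select $TOPIC, $TYPE, $REL from "
      + "value-like($OBJ, %term%, $REL), "
      + "{ topic-name($TOPIC, $OBJ) | occurrence($TOPIC, $OBJ) }, "
      + "direct-instance-of($TOPIC, $TYPE) "
      + "order by $REL desc, $TOPIC?";

    private TologSearch() {}

    // Builds the argument map for QUERY; blank input is rejected early.
    static Map<String, Object> arguments(String userInput) {
        if (userInput == null || userInput.trim().isEmpty()) {
            throw new IllegalArgumentException("search term must not be empty");
        }
        Map<String, Object> args = new HashMap<>();
        args.put("term", userInput.trim());
        return args;
    }
}
```

With a `QueryProcessorIF proc` obtained via `QueryUtils.getQueryProcessor(topicMap)`, as in the servlet further down, the call would then be along the lines of `proc.execute(TologSearch.QUERY, TologSearch.arguments(input))`.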

Topic Maps Query Service – YQL for Topic Maps?

As I was playing with YUI3 and YQL (an awesome abstraction above and across web services: select * from internet) last weekend, I noticed that there is an open data table for SPARQL search.

Not knowing whether such a thing already exists for topic maps, I created a simple Topic Maps search servlet that can be used to query any topic map available on the web – as long as both the query language and the topic map format used are supported by Ontopia.

It’s just a prototype created for fun, but here are some examples of what could be done with such a service:

The results are presented as JTM 1.1-ish documents.
No need to run a topic maps engine locally, and the results could easily be integrated into a web app using e.g. YUI.

It’s very buggy (e.g. might try to cast Float to TopicIF – sigh), but feel free to try it out at http://billy-corgan.com/yql?query={query}&topicmap={uri-to-topicmap}.

Here’s the Java servlet code:

package com.topicobserver.topicmaps.yql;

import java.io.IOException;
import java.io.PrintWriter;
import java.util.Collection;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import net.ontopia.infoset.core.LocatorIF;
import net.ontopia.infoset.impl.basic.URILocator;
import net.ontopia.topicmaps.core.AssociationIF;
import net.ontopia.topicmaps.core.OccurrenceIF;
import net.ontopia.topicmaps.core.TMObjectIF;
import net.ontopia.topicmaps.core.TopicIF;
import net.ontopia.topicmaps.core.TopicMapIF;
import net.ontopia.topicmaps.core.TopicMapReaderIF;
import net.ontopia.topicmaps.core.TopicNameIF;
import net.ontopia.topicmaps.query.core.InvalidQueryException;
import net.ontopia.topicmaps.query.core.QueryProcessorIF;
import net.ontopia.topicmaps.query.core.QueryResultIF;
import net.ontopia.topicmaps.query.utils.QueryUtils;
import net.ontopia.topicmaps.utils.ImportExportUtils;
import net.ontopia.utils.OntopiaRuntimeException;

import org.json.simple.JSONObject;

public class TopicMapSearch extends HttpServlet {

	private static final long serialVersionUID = 1L;

	private HttpServletRequest request;
	private HttpServletResponse response;

	private TopicMapIF topicMap;
	private String searchQuery;
	private String topicMapUri;

	public void doGet(HttpServletRequest req, HttpServletResponse res)
			throws ServletException, IOException {

		this.request = req;
		this.response = res;

		this.searchQuery = request.getParameter("query");
		this.topicMapUri = request.getParameter("topicmap");

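		// Missing parameters are a client error, so answer 400 rather than 500.
		if(searchQuery == null || "".equals(searchQuery) ||
				topicMapUri == null || "".equals(topicMapUri)) {
			response.sendError(HttpServletResponse.SC_BAD_REQUEST);
			return;
		}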

		this.processRequest();
	}

	private void processRequest() throws IOException {

		response.setContentType("application/json");

		try {
			TopicMapReaderIF reader = ImportExportUtils.getReader(new URILocator(topicMapUri));
			topicMap = reader.read();
		} catch (IOException e1) {
			// Report the failed read instead of returning an empty 200 response.
			response.sendError(500);
			return;
		}

		// Execute search. The writer is only fetched once there is a result,
		// since sendError() commits the response and nothing can follow it.
		String json;
		try {
			json = this.searchTopicMap();
		} catch (InvalidQueryException e) {
			response.sendError(500);
			return;
		} catch (OntopiaRuntimeException e) {
			response.sendError(500);
			return;
		}

		PrintWriter out = response.getWriter();
		out.print(json);
		out.close();
	}

	private String searchTopicMap()
			throws OntopiaRuntimeException, InvalidQueryException {

		QueryProcessorIF proc = QueryUtils.getQueryProcessor(topicMap);
		QueryResultIF result = proc.execute(searchQuery);

		String[] variables = result.getColumnNames();
		Object[] row = new Object[result.getWidth()];

		TMObjectIF currentTmObject;

		// OK, so I wrote this before I went looking for json.simple.JSONObject
		// and I am too lazy to refactor just for fun.
		int i = 0;
		StringBuffer json = new StringBuffer("{ \"result\": [\n");
		while (result.next()) {
			result.getValues(row);
			json.append((i++ > 0) ? ",\n" : "\n");
			json.append("[");
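			boolean first = true;
			for (int ix = 0; ix < variables.length; ix++) {
				// Known bug: values that are not topic map objects (e.g. the
				// Float relevance score from value-like) fail this cast.
				currentTmObject = (TMObjectIF) row[ix];
				if (currentTmObject == null) {
					continue;
				}
				// Track the first emitted element rather than the column index,
				// so a skipped null column does not leave a stray leading comma.
				json.append(first ? "" : ",\n");
				first = false;
				json.append("{ \"variable\": \"" + jsonEncode(variables[ix]) + "\",");
				json.append(" \"tm_object\": {" + jsonify(currentTmObject) + "}}");
			}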
			json.append("]");
		}
		json.append("]\n");

		json.append(", \"query\": \"" + jsonEncode(searchQuery) + "\"");
		json.append(", \"topicmap\": \"" + jsonEncode(topicMapUri) + "\"");

		json.append("}");

		result.close();

		return json.toString();
	}

	private StringBuffer jsonify(TMObjectIF tmObject) {

		StringBuffer json = new StringBuffer();

		if (tmObject instanceof TopicIF) {
			json.append(jsonify((TopicIF) tmObject));
		} else if(tmObject instanceof OccurrenceIF) {
			json.append(jsonify((OccurrenceIF) tmObject));
		} else if(tmObject instanceof AssociationIF) {
			json.append(jsonify((AssociationIF) tmObject));
		} else {
			// Could be TopicName etc...
			// Could've cared had I been paid to do this.
			json.append("\"construct\": \"unknown\"");
			json.append(",\"string-value\": \"" + jsonEncode(tmObject.toString()) + "\"");
		}

		Collection<LocatorIF> iris = tmObject.getItemIdentifiers();
		json.append(json.length() > 0 ? "," : "");
		json.append("\"item_identifiers\" : ["   + locatorsAsCsv(iris) + "]");

		return json;
	}

	// in the lack of util methods to transform TopicIF to json object
	private StringBuffer jsonify(TopicIF topic) {
		StringBuffer json = new StringBuffer();
		Collection<LocatorIF> sis =  topic.getSubjectIdentifiers();
		json.append("\"construct\": \"topic\"" +
				  ", \"subject_identifiers\" : [" + locatorsAsCsv(sis) + "]" +
				  ", \"names\": [" + namesAsJson(topic)   + "]" );
		return json;
	}

	private StringBuffer jsonify(OccurrenceIF occurrence) {
		StringBuffer json = new StringBuffer();
		json.append("\"construct\": \"occurrence\"" +
				  ", \"value\": \""  + jsonEncode(occurrence.getValue()) + "\"");
		return json;
	}

	private StringBuffer jsonify(AssociationIF association) {
		StringBuffer json = new StringBuffer();
		json.append("\"construct\": \"association\"");
		return json;
	}

	private StringBuffer namesAsJson(TopicIF topic) {
		StringBuffer json = new StringBuffer();
		Collection<TopicNameIF> names = topic.getTopicNames();
		if(names == null || names.size() == 0) {
			return json;
		}
		for( TopicNameIF name : names) {
			json.append("{ \"value\": \"" + jsonEncode(name.getValue()) + "\"");
			json.append(", \"item_identifiers\": [" + locatorsAsCsv(name.getItemIdentifiers()) + "]");
			Collection<TopicIF> scope = name.getScope();
			json.append(", \"scope\": [");
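			// Separate the identifiers of different scope topics with commas;
			// appending them back to back would produce invalid JSON.
			boolean firstScope = true;
			for( TopicIF scopeTopic : scope ) {
				StringBuffer ids = locatorsAsCsv(scopeTopic.getItemIdentifiers());
				if (ids.length() == 0) {
					continue;
				}
				json.append(firstScope ? "" : ",").append(ids);
				firstScope = false;
			}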
			json.append("]");
			json.append("},");
		}
		json.deleteCharAt(json.length()-1);
		return json;
	}

	private StringBuffer locatorsAsCsv(Collection<LocatorIF> locators) {
		StringBuffer csvStr = new StringBuffer();
		if(locators == null || locators.size() == 0) {
			return csvStr;
		}
		for(LocatorIF locator : locators) {
			if(locator != null) {
				csvStr.append("\"" + jsonEncode(locator.getExternalForm()) + "\",");
			}
		}
		// Nothing was appended if every locator was null.
		if (csvStr.length() > 0) {
			csvStr.deleteCharAt(csvStr.length()-1); // strip off last comma
		}
		return csvStr;
	}

	private String jsonEncode(String str) {
		return JSONObject.escape(str);
	}

}

(and no, I never did create a YQL table for this)
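Most of the fiddly parts of the servlet are the comma bookkeeping in the helper methods. A minimal sketch of how that could be handled with the JDK's `StringJoiner` instead: `JsonArrays`, `quote` and `stringArray` are hypothetical names, and the escaping is a simplified stand-in for `JSONObject.escape`, not a complete JSON encoder.

```java
import java.util.StringJoiner;

// Hypothetical helper for rendering lists of strings as JSON arrays.
final class JsonArrays {

    private JsonArrays() {}

    // Wraps s in double quotes, escaping backslashes, quotes and
    // control characters (simplified compared to a real JSON library).
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    // Renders the values as a JSON array of strings. Nulls are skipped,
    // and an empty input yields "[]" without any trailing-comma cleanup.
    static String stringArray(Iterable<String> values) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (String v : values) {
            if (v != null) {
                joiner.add(quote(v));
            }
        }
        return joiner.toString();
    }
}
```

The joiner supplies the brackets and separators itself, so there is no last comma to strip and no special case for empty collections.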

PHP is not Java but Java is not a Singleton

In a reply to my Twitter rant about PHP’s lack of method overloading a while ago, I was pointed to PHP Advent 2008 / PHP is not Java by Luke Welling. The post discusses how PHP code may turn unnecessarily complex when Java-style design is applied to it. Well, it tries to.

The point of the article is valid: you should definitely make use of the language’s built-in features instead of inventing your own. It uses a very bad example, though, and that’s what this rant is all about. Now, I am very late to the party here, but still want to point out the misconception.

Welling wrote:

In case you are not familiar with it, the singleton pattern is a general solution to many situations where you want only one instance of a particular class. [...]

class Singleton
{
    private static $myObject;

    private function Singleton()
    {
        // Providing a constructor to eliminate default public one.
    }

    public static function getInstance()
    {
        if (! isset(Singleton::$myObject) )
        {
            Singleton::$myObject = new MyObject();
        }

        return Singleton::$myObject;
    }

    private function __clone()
    {
    }
}

True.

An experienced PHP developer is much more likely to implement the same pattern as follows:

global $myObject;

if (!isset($myObject)) {
    $myObject = new MyObject();
}

False.

An experienced PHP developer would know that the Singleton pattern could still be used because it makes sure you only create 1 instance of the MyObject class. In Luke Welling’s example, you would always have to remember to check if there was an existing instance of $myObject available, else you’d end up with more than 1 instance of the class. Even worse: what if $myObject was not set, but MyObject had already been instantiated somewhere in your code? $myOtherObject, for instance? You could end up with $stillAnotherObject. More than 1 database connection, etc.

Hence, the experienced PHP developer would make use of the Singleton pattern, and do this instead:

global $myObject;

if (!isset($myObject)) {
    $myObject = Singleton::getInstance();
}

Or just

$myObject = Singleton::getInstance();

He’d then rest assured that even if MyObject had already been instantiated, $myObject would now reference the single instance.
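Since the original article frames this as a Java habit, here is roughly what the same guarantee looks like on the Java side. It is a sketch only: `SharedConnection` is a hypothetical stand-in for something like a database connection, and the counter exists purely to make the single instantiation observable.

```java
// Hypothetical class illustrating the single-instance guarantee.
final class SharedConnection {

    // Counts constructor calls so the guarantee can be checked.
    private static int created = 0;

    // Holder idiom: the JVM initialises INSTANCE lazily and thread-safely
    // the first time Holder is referenced.
    private static final class Holder {
        static final SharedConnection INSTANCE = new SharedConnection();
    }

    // Private constructor: code outside this class cannot call
    // new SharedConnection().
    private SharedConnection() {
        created++;
    }

    static SharedConnection getInstance() {
        return Holder.INSTANCE;
    }

    static int timesCreated() {
        return created;
    }
}
```

However many variables end up holding the result of `SharedConnection.getInstance()`, they all reference the same object, which is exactly what a bare global variable cannot promise.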

(And I am still annoyed by the lack of method overloading in PHP5.x — it’s just stupid!)

Moving to Los Angeles

The reason this place has been really quiet lately (and I haven’t found a VPS yet) is that I’ve been quite busy moving to the US, meeting relatives and interviewing for jobs there.

Luckily, the largest uncertainty is no longer an issue. Come May, I will be joining Yahoo! Music’s team of great engineers!

In the future I promise not to blog about such stuff here, but if you are interested in what our lives are like in the US you might want to check out our new blog at living-in-the.us.

Downtime

We’re currently in the middle of moving to the USA. As a result, I’ve had to wave goodbye to my fiber-enabled home.

I’ve unplugged my Ubuntu server, am changing web hosts and therefore expect this blog, and the recently re-launched billy-corgan.com (should now redirect here), to experience some downtime during the upcoming weeks.

Billy-Corgan.com Re-Launched as Topic Maps Based Website

Billy-Corgan.com was the first public web site I ever created and ran. I first started playing with it (and web techs) in 1998/99. In 2000, I moved it from GeoCities to its own domain name. The site had its golden age in 2003/2004, with up to 4 million page views / month. I believe that is a rather high number for a non-popish fan site. Back then it was a PHP/MySQL driven site.

In 2005 I decided to stop maintaining Billy-Corgan.com due to various reasons, the most prominent ones being a lack of time and a decreasing level of devotion (esp. with regards to the online community and drama that follows).

For the last 5 years the site has therefore contained little to no information. Up until 2 days ago, it only contained the Machina II MP3s (Smashing Pumpkins released this album on the Internet, for free, back in 2000!). At the same time, there have been hundreds of daily visitors (Google Analytics stats). And the old MySQL database has been kept intact on my backup devices. Therefore, I recently decided to re-launch the site with some of the “static” content (lyrics, discography, photos + MP3s as before). No need for it to remain empty, right?

Moving to Topic Maps

So: what to do when putting some old database content on the web? I did not want to create a huge new web site and spend a lot of time writing (plain) PHP scripts and SQL queries. Didn’t have time for that right now.

Well, obviously I chose to create a Topic Maps based web app, with Ontopia being my preferred Topic Maps engine.

I started by creating a couple of new database views (to make for a simpler mapping) and a few stored procedures for “sanitizing”  some of the data (used in the views’ SQL). From there, the remaining tasks were pretty simple:

  1. Create an LTM file (my preferred format) containing the ontology (concepts like Song, Person, Composer-Of, etc.). ~100 lines of LTM.
  2. Write a DB2TM mapping file, specifying which columns are mapped to which Topic Maps concepts. 136 lines of XML.
  3. Write JSP files — as discussed in my previous post on Web App Development with Ontopia. Ended up with 10 specialized JSPs.

UI Functionality

I also wanted to add some “fun” functionality by creating an Ajax enabled photo gallery. I did explore some pre-built galleries such as Galleriffic, but ended up building my own using a combination of jQuery and jQuery cycle. The album degrades gracefully by not requiring JavaScript support — all links work without JS (example: 1979 vs. Zero).

Further, I implemented an audio “player” for the MP3s based on the HTML5 <audio /> element. At this time the browser support is very limited, though, as these are MP3s and not e.g. OGG. In the absence of MP3 audio support I fall back to using Flowplayer’s audio plugin (Flash based). I’ve also played with some CSS3 properties like border-radius, as seen on the index page (provided your browser supports either -webkit-border-radius, -moz-border-radius or border-radius (why so many?)).

The result can be viewed at Billy-Corgan.com.

MySQL BLOB to File in 20+ Lines of PHP Code

One of the nice things about PHP is how extremely quick it is to hack together a script that will save your day.

Today, I needed to throw together a script for converting some images stored as BLOBs in an old MySQL database of mine, to JPEG files stored on the file system (obviously, I knew that my database only contained JPEGs). It was a one time job, and so I set out to create a quick and dirty script to get the job done.

The job only took me about 20 lines of code, thanks to PHP’s built-in MySQL and file handling support. I’ve cleaned it up a bit before posting it here, though, so now it takes a whopping 27 lines due to the verbosity of the somewhat cleaned-up code.

In cases like this, hard coding the DB credentials and directory paths won’t make me lose sleep. Neither does doing it the procedural way. The script did its job, and I got what I wanted … without spending a single calorie too much.

<?php
$conn = mysql_connect('localhost', 'myuser', 'mypass');
mysql_select_db('mydb', $conn);

// returns rows of id, img_name and img_data
$sql = 'select * from images';
$queryResult = mysql_query( $sql, $conn );

while( $row = mysql_fetch_object( $queryResult )) {
	processRow( $row );
}

mysql_close($conn);

function processRow( $row ) {
	$basedir = 'images/';
	$imagename = makeSafeImageName( $row->img_name );
	// in my case, file names are not unique, so I add the id
	$filename = $basedir . $imagename . '-' . $row->id . '.jpg';
	file_put_contents( $filename, $row->img_data );
	echo 'done dumping ' . $filename ."\n";
}

function makeSafeImageName( $source ) {
	return strtolower(preg_replace('/([^\w\d\-_]+)/', '-', $source));
}
?>

(Execute from the command line with php my-script.php, or via a web server.)
