The Ontology Myth
For the past year, I have been observing a phenomenon in the US market: the spread of the ontology ‘myth.’
Ontologies are important elements for understanding text through semantic analysis, but they are insufficient (and often not even necessary) to solve the problem of handling unstructured knowledge. Nonetheless, according to this ‘myth,’ if you have a complete ontology, you don’t need anything else: semantic technology should be able to do everything automatically, including typical knowledge management activities such as automatic categorization and the discovery of knowledge and of relationships between data. This assumption lacks substance. I understand why the idea has spread (in the end, we are all searching for fast, automatic solutions), but it is important to explain the reality, which is completely different from the utopian view that some would have you believe.
An ontology is a structured, formal representation of the knowledge relating to a certain domain or micro-universe. For example, think about an ontology of computers. If we simplify to the extreme (and in doing so risk being inaccurate), such an ontology contains:
- classes of concepts (for example, computers and manufacturers)
- sub-classes (netbooks, notebooks, desktops, servers)
- individual models (Eee 120, Air, Yours 500, Xps 170, W100 Booklet)
- attributes associated with these concepts
- relationships between the concepts (for example, each computer model has a manufacturer)
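The elements listed above can be sketched as a small data structure. This is only an illustration, not a standard ontology format: the type names (`OntologyClass`, `Individual`, `Relationship`) are hypothetical, the pairing of each model with a sub-class is a guess made for the example, and "Manufacturer A" is a placeholder rather than a real manufacturer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OntologyClass:
    """A class of concepts; setting `parent` makes it a sub-class."""
    name: str
    parent: Optional["OntologyClass"] = None

    def ancestors(self):
        """Return the names of all parent classes, nearest first."""
        names, node = [], self.parent
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


@dataclass
class Individual:
    """A concrete instance of a class, with optional attributes."""
    name: str
    cls: OntologyClass
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A (subject, predicate, object) link between two individuals."""
    subject: str
    predicate: str
    obj: str


# Classes and sub-classes named in the text.
computer = OntologyClass("computer")
manufacturer = OntologyClass("manufacturer")
subclasses = {
    name: OntologyClass(name, parent=computer)
    for name in ("netbook", "notebook", "desktop", "server")
}

# ASSUMPTION: the text lists these models but not their sub-classes;
# the pairing below is invented for illustration only.
models = [
    Individual("Eee 120", subclasses["netbook"]),
    Individual("Air", subclasses["notebook"]),
    Individual("Yours 500", subclasses["desktop"]),
    Individual("Xps 170", subclasses["desktop"]),
    Individual("W100 Booklet", subclasses["netbook"]),
]

# ASSUMPTION: "Manufacturer A" is a placeholder, not a real company.
maker = Individual("Manufacturer A", manufacturer)
relationships = [Relationship(models[0].name, "made_by", maker.name)]
```

Even in a sketch this small, several modeling decisions (whether a model is an individual or a class of its own, which attributes to record, how to name a relationship) could reasonably have been made differently.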
Although it may seem otherwise, an ontology is always subjective, because there are no rules that dictate exactly how an ontology should be built (at most, there are some sensible practices born of experience). An ontology’s structure depends in part on the content it has to represent, and there are many ways to represent the same content.
In a project whose objective is to categorize content or extract knowledge (tags, entities, concepts, relationships, triples or similar), an ontology can be very useful: when made correctly, it captures the domain knowledge as if from the mind of an expert, and that knowledge is also needed to set up the customized analysis (semantic network and rules) that achieves the desired results.
For example, in the hypothetical computer ontology mentioned earlier, if we already have the lists of individual models, we can add them to the semantic network, and if we already have the relationships, it will be easier to develop the extraction rules. A well-made ontology saves time in the implementation phase and yields better results, but a poorly made ontology wastes a lot of time. Quite frankly, it is better to have nothing than a bad ontology.
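As a rough sketch of how a list of models taken from an ontology might seed an extraction step, the snippet below does a plain whole-word lookup. It is not how any particular semantic engine works: the function name `tag_models`, the lexicon and the mapping of models to sub-classes are all assumptions made for illustration.

```python
import re

# ASSUMPTION: a hypothetical slice of the ontology, model name -> sub-class.
MODEL_TO_SUBCLASS = {
    "Eee 120": "netbook",
    "Air": "notebook",
    "Xps 170": "desktop",
}


def tag_models(text, lexicon=MODEL_TO_SUBCLASS):
    """Return (model, sub-class, start offset) for each whole-word mention."""
    hits = []
    for model, subclass in lexicon.items():
        # Escape the name so it is matched literally, on word boundaries.
        pattern = r"\b" + re.escape(model) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((model, subclass, match.start()))
    # Report mentions in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[2])


review = tag_models("The Eee 120 is lighter than the Xps 170.")
noise = tag_models("Open a window for some fresh air.")
```

The first call finds the two models, which is where the ontology saves time. The second call tags the ordinary word "air" as a notebook: the list of names supplies vocabulary, but deciding which mentions actually count still takes analysis work that the ontology does not contain.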
Moreover, although a well-made ontology can be useful in the initial phases of a project, it cannot replace the work of defining the analysis, nor the work the customer has to do to indicate precisely what knowledge should be extracted from the documents being analyzed. In some cases, it is enough to represent the knowledge contained in the ontology in another form (a conversion that can also be automated) without doing anything else. Other cases require additional work, to a greater or lesser extent, and above all that work is specific to the project rather than generic.
With this clarified, it remains to add that creating a valid ontology (one that is useful in real projects) is not simple: it requires experience, along with vision and a capacity for synthesis and compromise (without these, it can become a never-ending task).
Unfortunately, nearly all the independently made ontologies we see are of poor quality: they are thrown together, unbalanced and too theoretical or, at the opposite extreme, too specific. Even after such an attempt to make something helpful and usable, the results are often such that you have to start over again.
In some cases, however, even well-made ontologies are useless because they describe concepts and relationships that have nothing to do with the problem at hand: good work in terms of quality, but a waste of resources if it is not useful.
In summary: a well-made ontology can help you work more efficiently and obtain better results quickly. A poorly made ontology not only creates more work; in the end, it will probably be useless anyway. Still, creating a good ontology is not easy, and the costs are not negligible (don’t trust anyone who tells you that they can make a good ontology in just two days).
I hope that this post helps you understand the pros and cons of ontologies a little better. There is already a lot of confusion and exaggeration around semantic technology; we don’t need to listen to more misleading information.