When I present our software solutions to a company (they are based on a semantic technology that uses a rich and vast semantic network), I usually find myself in front of an audience that clearly understands the advantages of this approach. Yet the concerns and doubts they raise often cloud the decision-making process and lead to an incorrect evaluation of the actual return on investment.

Whether they are raised by IT managers, KM workers or software developers, the concerns fall into two categories: first, the costs of setting up and maintaining the semantic network; second, the costs of the infrastructure required to sustain a level of performance that satisfies operational needs.

There are many reasons behind these concerns, but two factors stand out. On one hand, there is the effective (though often inaccurate) marketing carried out by the makers of systems based on keyword technology. They have almost succeeded in convincing the market that a complex problem such as information management can be solved with automatic shortcuts and that any alternative would be unaffordable. On the other hand, the majority of researchers in this sector are still skeptical about systems that are entirely semantic. This is mainly because, at least up to now, they have been unable to develop software that combines the advantages of deeper text comprehension with the performance needed to meet real-world demands, which further strengthens the position of the competition.

In the past ten years, many successful projects have been developed using our semantic technology. Therefore, I think it would be useful to use real data from our everyday experiences to help clear up the misconceptions which often cause people to make irrational decisions.

Costs of development

To add a new language to Cogito, two man-years of software development and 8-10 man-years of linguistic development are needed to refine the semantic network. You can quickly estimate the cost of such resources (if you are in Silicon Valley, divide your estimated total by 2!) and immediately see that the initial investment is considerable, yet affordable, considering that the cost is spread over all the implementations carried out over time.

Cogito’s standard semantic network permits horizontal management of content, so a significantly higher rate of precision and recall (compared to that obtained from a static system) is achieved with no need for further elaboration. Vertical implementations carry start-up costs, because the standard semantic network must be enriched with knowledge from a specific domain (the number of added concepts usually does not exceed 5,000); a linguist usually needs 20-30 working days to complete this task.

For those who believe that “languages constantly change and adding new terms can be costly,” may I remind you that even the most dynamic languages, such as English, grow by no more than 100-200 new terms in common use and fewer than 1,000 non-idiomatic expressions per year (in the worst-case scenario, this could mean about 10 working days per year).
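The effort figures quoted above can be combined into a rough per-implementation estimate. The following is only a back-of-the-envelope sketch: the man-year and working-day figures come from the text, while the number of working days in a man-year (220), the number of implementations and the function names are assumptions made for illustration.

```python
# Rough effort model built from the figures quoted in the text.
# ASSUMPTION: 220 working days per man-year (not stated in the article).
WORKING_DAYS_PER_YEAR = 220

SOFTWARE_MAN_YEARS = 2  # software development for a new language (from the text)


def language_setup_days(linguistic_man_years):
    """One-off effort, in working days, to add a new language.

    The text quotes 8-10 man-years of linguistic development.
    """
    return (SOFTWARE_MAN_YEARS + linguistic_man_years) * WORKING_DAYS_PER_YEAR


def effort_per_implementation(implementations, linguistic_man_years=10,
                              vertical_days=30, upkeep_days_per_year=10,
                              years=1):
    """Working days attributable to one vertical implementation.

    The shared cost (language setup plus yearly upkeep of new terms) is
    spread evenly over all implementations; the domain enrichment
    (20-30 days in the text) is paid once per implementation.
    The even split is a simplifying assumption.
    """
    shared = language_setup_days(linguistic_man_years) + upkeep_days_per_year * years
    return shared / implementations + vertical_days


if __name__ == "__main__":
    # Hypothetical portfolio sizes, using the worst-case figures throughout.
    for n in (1, 10, 50):
        print(n, "implementations:", effort_per_implementation(n), "days each")
```

Under these assumptions the shared investment quickly becomes a minor part of each project as the number of implementations grows, which is the point made above about spreading the initial cost.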

Those who criticize the complexity of managing a semantic network often have in mind the complexity of managing lists of entities such as people, places, companies and organizations. Traditional systems can recognize an entity only if it is present in a list, and this aspect is often erroneously confused with semantic network management. A good semantic engine recognizes an entity based on the semantic role it plays within a text; it therefore requires neither the creation nor the maintenance of lists. At the same time, it can also correctly recognize less frequent entities (which, for obvious reasons, would not have been inserted in a list).

Costs of infrastructure

Cogito can analyze more than 120 KB of text (about 40 pages) per second on a common single-processor server. This kind of speed, combined with its linear scalability and low cost, makes Cogito a practical solution even in situations in which large quantities (tens of millions) of documents must be analyzed.
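To put the throughput figure in perspective, here is a small sizing sketch. The 120 KB per second rate and the linear scalability come from the text; the average document size, the number of servers and the function name are hypothetical values chosen for illustration. Since the quoted rate is a lower bound ("more than 120 KB"), the result should be read as an upper bound on processing time.

```python
# Sizing sketch based on the throughput quoted in the text.
KB_PER_SECOND_PER_SERVER = 120  # "more than 120 KB" per second, single-processor server


def hours_to_analyze(documents, avg_kb_per_document, servers=1):
    """Estimated hours needed to analyze a document collection.

    ASSUMPTIONS: every document has the same average size, and throughput
    scales linearly with the number of servers (as the text states).
    """
    total_kb = documents * avg_kb_per_document
    seconds = total_kb / (KB_PER_SECOND_PER_SERVER * servers)
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical workload: 10 million documents of about 6 KB each
    # (roughly two pages at the 3 KB per page implied by the text).
    for servers in (1, 4, 10):
        hours = hours_to_analyze(10_000_000, 6, servers=servers)
        print(f"{servers} server(s): about {hours:.1f} hours")
```

The document size is the main unknown here; replacing it with a measured average from the actual collection gives a more realistic estimate.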

The development and maintenance costs of a semantic network are considerably lower than commonly assumed, and the improvement in the ability to manage information (even very complex information) is obvious even to those who are not experts in this sector. I am convinced that when these aspects are analyzed objectively, setting aside myths and obsolete information, the number of companies that adopt real semantic solutions will increase.


Author: Luca Scagliarini

