Automatic Categorization: No Magic, Just a Secret
The secret to a successful automatic categorization project lies not in choosing a powerful enough technology, but in the methodology used to implement the project: if the methodology is right, a powerful technology is indeed indispensable for success, but if the methodology is wrong, no technology will be able to help.
The most important element is the initial analysis phase, during which it is necessary to describe the core of the issue in a clear, objective and replicable way.
It is essential that the customer, typically an organization that needs to manage a considerable amount of knowledge (in general, various types of documents produced or acquired in the course of its work), explains its real needs to the supplier.
The supplier, of course, must commit to addressing those needs in the best possible way.
Described in this way, the situation is not so different from any other software development project. The difference in this case is that we must understand how to manage complex knowledge, which is never easy and cannot be improvised.
The first step is the most important one, and it requires a special effort from the customer, who needs to answer the following questions rationally:
Why do I need to categorize my documents?
Who are the people who best know the knowledge base I want to categorize?
If categorization is currently performed manually, what is the detailed process flow?
Which categories are truly important and relevant, in the sense that they make the content more useful and valuable?
If the category tree is already available, are all the categories really necessary?
What are the most objective criteria that place a specific document in one category and not in another?
Although the questions above are all simple, the answers are not easy to find quickly. This is where the experience of the supplier comes in, making the analysis phase a cooperation between customer and supplier.
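To make the question about objective criteria more concrete, here is a minimal sketch of what a clear and replicable criterion could look like once it is written down explicitly. It is only an illustration, not a description of any particular product: the category names, the terms and the `categorize` function are all hypothetical, and a real project would derive its criteria from the analysis carried out with the customer.

```python
import re

# Hypothetical categories and criteria, invented for illustration only.
# "all": terms that must all appear; "none": terms that exclude the category.
RULES = {
    "invoice": {"all": {"invoice", "amount"}, "none": {"draft"}},
    "contract": {"all": {"agreement", "parties"}, "none": set()},
}


def tokenize(text):
    """Return the set of lowercase alphabetic words found in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def categorize(text, rules=RULES):
    """Return the sorted list of categories whose criteria the text satisfies."""
    words = tokenize(text)
    matched = []
    for category, rule in rules.items():
        has_required = rule["all"] <= words        # every required term present
        has_excluded = bool(rule["none"] & words)  # any excluding term present
        if has_required and not has_excluded:
            matched.append(category)
    return sorted(matched)


if __name__ == "__main__":
    print(categorize("Invoice no. 12: total amount due"))
    print(categorize("Draft invoice, amount pending"))
    print(categorize("This agreement binds both parties"))
```

The point of the sketch is not the matching technique, which is deliberately naive, but the fact that each criterion is explicit: two people applying the same rule to the same document reach the same category, which is what makes the criterion objective and replicable.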
First of all, the supplier needs to discuss the problem with the customer before offering a solution. Moreover, the expertise of the supplier must extend beyond the strictly technical aspects of the technology: the customer is generally not a knowledge expert, and therefore cannot easily and immediately identify the basic categories (or knowledge domains) on which the success of the project depends.
If the analysis phase is performed correctly, the most important step toward the success of the project is already done: this is, in fact, the only path that leads to an effective system, one that delivers real advantages in terms of costs and value.