CyberOSINT: a Strategic Opportunity in the Enterprise Data World
Cyber OSINT, the point where open source intelligence and cyber security meet, will be a strategic opportunity not only for security analysts but especially for the enterprise. This was just one of the topics discussed at a conference where I presented last week. The event was organized by ISS World and chaired by Stephen Arnold, who recently published a very interesting report on intelligence gathering tools for open-source content, “CyberOSINT: Next-Generation Information Access”.
The report is a must-read, and I’d like to highlight three points that I think are especially important for anyone dealing with text analysis:
1) The capability for predicting and visualizing critical events in real time will become paramount, and any solution for this will include text analytics at its core. Here, technologies will compete in three important aspects:
- Effective understanding of both meaning and context
- Flexibility to adapt the output of the analysis to the specifics of the event being monitored
- Rich output: real decision making requires more than a simple statistical distribution of keywords.
Text analytics software based on linguistics and leveraging rich, human-created knowledge can be extremely effective, especially in a specific domain where it can embed deep domain knowledge. Because the software processes the huge amount of available information, the analyst gets a significant head start in information acquisition.
Being able to customize the output of automatic analysis by way of powerful linguistic development tools will give a real edge over less sophisticated, statistics-based software. These tools will provide analysts with the flexibility required to adapt the output of the analysis to the specific requirements of the moment (different taxonomies, a richer set of extracted data, a greater capability to use inference) by leveraging the software's natively richer and deeper understanding of the meaning and context of the text analyzed.
2) Openness and the ability to exchange information with other applications will continue to be extremely important.
The availability of APIs with standard output is very important in this scenario, for several reasons. First of all, it would minimize the complexity in the process of passing information to other systems. Second, it would easily enable a richer, more comprehensive, human-like understanding of text, thus expanding the possibilities for anyone engaged in developing innovative tools and software.
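To make the first point concrete, here is a minimal sketch of what "standard output" could look like in practice. It is not taken from the report or from any specific product: the `Entity` and `AnalysisResult` types, their field names, and the helper functions are all hypothetical, chosen only to show how a plain JSON payload lets a downstream system consume analysis results without knowing anything about the engine that produced them.

```python
import json
from dataclasses import dataclass, asdict, field


# Hypothetical result types: the field names are illustrative assumptions,
# not the schema of any real text analytics API.
@dataclass
class Entity:
    text: str    # surface form found in the document
    type: str    # e.g. "ORGANIZATION", "PERSON"
    start: int   # character offset where the mention begins
    end: int     # character offset where the mention ends


@dataclass
class AnalysisResult:
    document_id: str
    categories: list = field(default_factory=list)
    entities: list = field(default_factory=list)


def to_standard_json(result: AnalysisResult) -> str:
    """Serialize the analysis to plain JSON so any consumer can read it."""
    return json.dumps(asdict(result), sort_keys=True)


def entities_of_type(payload: str, entity_type: str) -> list:
    """Consumer side: needs only the JSON payload, not the engine itself."""
    data = json.loads(payload)
    return [e["text"] for e in data["entities"] if e["type"] == entity_type]


if __name__ == "__main__":
    result = AnalysisResult(
        document_id="doc-1",
        categories=["cyber security"],
        entities=[
            Entity("ISS World", "ORGANIZATION", 0, 9),
            Entity("Stephen Arnold", "PERSON", 35, 49),
        ],
    )
    payload = to_standard_json(result)
    print(entities_of_type(payload, "ORGANIZATION"))
```

The design choice illustrated here is simply that the producer and the consumer share nothing except the serialized format, which is what keeps the cost of passing information to other systems low.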
3) Because human beings will remain central to the process of intelligence, smooth and seamless integration with visualization tools or user-friendly and innovative interfaces is strategic and not just icing on the cake.
No matter how rich or insightful the data, it will still have to be processed by humans and presented in a form that will make it more easily accessible for analysts. In addition, aligning the user experience with the most popular web or mobile applications the analyst is using in everyday life could prove extremely valuable. The effect would be a smooth overlap between information processing in the private and professional spheres, a lower barrier to entry and limited training required to become proficient in effectively accessing information. This is why I think we will see more text analytics companies pairing up with next-generation information access technologies for mobile.
The event offered great exposure to these developments, and I believe that, as long as the market continues to stay open to innovation, it could be the first real example of what big data could actually mean for the real world.