SAS has announced the acquisition of privately held Teragram, the leader in natural language processing (NLP) and advanced linguistic technology. The acquisition will enhance SAS’ own robust text mining and analytical BI offerings, and extend them to enterprise and mobile search.

More than a decade ago, SAS was among the first companies to recognise the importance of text mining, the analysis of text and other unstructured data such as Web pages, documents, email, images and other information not stored in a structured database. Today, SAS leads this important and growing space.
“The addition of Teragram’s domain expertise and NLP technology will change the landscape of the BI and analytics markets,” says SAS CEO Jim Goodnight. “Teragram’s technologies augment, strengthen and extend SAS’ ability to combine structured and unstructured data – not only in our text mining solution but embedded across the entire SAS Enterprise Intelligence Platform – to drive better answers faster.”
Teragram, a 40-person firm, will be run as a SAS company. Terms of the acquisition deal were not disclosed. Teragram’s NLP technology is well-established, with a customer base including CNN, NYTimes Digital, Sony, Wolters Kluwer, the World Bank and Yahoo!.
“As the data explosion continues, companies need an intelligent way to make sense of it all, whether data is in structured databases or in the huge variety of unstructured sources,” says Yves Schabes, President of Teragram. “Teragram and its technology fit perfectly into SAS’ analytics and text mining efforts, as SAS continues to innovate in this rapidly growing market.
“We’re pleased to join a company that delivers the software businesses need to blend structured and unstructured data and reach better, timelier and more accurate decisions.”
Teragram’s NLP technologies help turn text – in many languages and from many sources – into usable information. NLP enables richer data processing at the level of words, linguistic relations and word meanings. Teragram has developed and maintains large annotated dictionaries containing several hundred million words in more than 30 languages.
Teragram’s advanced categorisation technologies classify documents instantly according to custom criteria that can be applied throughout the organisation. This gives users faster and more accurate access to documents organised by the specific topics that interest them, regardless of the original document’s location.
For enterprise search, Teragram’s NLP technologies scan structured corporate databases and unstructured sources including text-based reports and Web pages to provide comprehensive answers from these multiple information sources.
“With today’s multinational companies and distributed workforces, as well as tremendous amounts of data in disparate systems and formats, it’s more important than ever to get quick and accurate answers to key business questions,” says Schabes.
“Enterprise search is a competitive weapon for tapping an organisation’s existing data resources. Combining SAS’ business intelligence, data integration and advanced analytics with Teragram’s NLP technologies will deliver answers to search queries in seconds.”
Teragram’s sophisticated search capabilities deliver an easy-to-use environment for BI, extending the availability and use of BI throughout organisations. The combination of SAS and Teragram technologies provides indexing driven not just by a report’s header, but by its actual content and the metadata associated with it.
Teragram also brings SAS the next generation of mobile search, helping individuals scan information remotely and get answers faster. Using Teragram’s mobile search technology, individuals can store and retrieve information, connect to outside applications such as BI systems, and search databases from their BlackBerry, smart phone or other mobile device.
Business management expert Bill Jensen first decried the downsides of today’s information explosion back in 2001, in his book “Simplicity.” According to his research, echoed by others, the most conservative estimates currently show that business information is doubling every eighteen months. This data flood has only grown more pronounced in recent years, and much of this data lies outside traditional, “structured” databases.
According to estimates, unstructured data makes up as much as 70% of all business data. This unstructured data resides in customer comments and service notes, e-mail and chat threads, documents and surveys, blogs and RSS feeds, warranty claims, resumes, voicemail and phone logs, among other sources.
If businesses fail to include this unstructured data in their analyses – of customers, market opportunities, internal operations, supply chains and so on – they see only part of the picture, and can make bad decisions as a result. Powerful analytics like SAS’ can help organisations weave together structured and unstructured data to uncover hidden patterns and trends, and then use this insight to make better decisions, solve problems and take advantage of opportunities.