Samsung Research, the R&D hub of Samsung Electronics’ SET (end-products) business, has ranked first in two global artificial intelligence (AI) machine reading comprehension competitions in a row.
Last month, Samsung Research placed first in the MAchine Reading COmprehension (MS MARCO) competition held by Microsoft (MS). The same month, it also posted the best performance in TriviaQA, run by the University of Washington, demonstrating the strength of its AI algorithm.
As global competition in developing AI technologies intensifies, machine reading comprehension competitions such as MS MARCO are booming around the world. MS MARCO and TriviaQA are among the most actively researched and widely used of these competitions, along with Stanford University's SQuAD and DeepMind's NarrativeQA. Universities around the world and global AI firms, including Samsung, are competing in these challenges.
Machine reading comprehension is a task in which an AI algorithm analyses data and finds an optimum answer to a query on its own. In MS MARCO and TriviaQA, AI algorithms are tested on their ability to process the natural language of human Q&As and to handle written text in various types of documents. This requires more advanced technical capabilities than SQuAD, which only involves answering a simple question after reading a short Wikipedia paragraph.
For example, in MS MARCO, ten web documents are presented for a given query, and the AI algorithm must create an optimum answer from them. Queries are randomly selected from a million queries submitted by users of Bing (Microsoft's search engine). Answers are evaluated statistically by estimating how close they are to human answers. The test is designed to apply an AI algorithm to real-world problems.
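To make "how close they are to human answers" concrete, here is a minimal sketch of one word-overlap score of the kind often used for this sort of comparison: a simplified ROUGE-L F1 based on the longest common subsequence of words. The article does not name the exact metric, so the choice of ROUGE-L, the whitespace tokenisation, and the function names `lcs_length` and `rouge_l_f1` are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified ROUGE-L F1 between a machine answer and a human answer.

    Assumption: lowercased, whitespace-split tokens and a plain F1
    (official scorers may tokenise and weight differently).
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    human = "the cat sat on the mat"
    print(rouge_l_f1("the cat sat on the mat", human))  # identical answers
    print(rouge_l_f1("the cat", human))                 # partial overlap
    print(rouge_l_f1("dog", human))                     # no overlap
```

A score near 1 means the machine answer shares most of its word sequence with the human answer, while a score near 0 means little overlap. In practice such scores are averaged over many queries.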
Samsung Research took part in the competitions with ConZNet, an AI algorithm developed by the company’s AI Centre. ConZNet draws its capabilities from reinforcement learning, one of the more advanced machine learning techniques. Reinforcement learning advances machine intelligence by giving feedback on outcomes during the learning process, much as a carrot-and-stick strategy does. Cutting-edge AI technologies, including AlphaGo, are upgrading machine intelligence by applying reinforcement learning.
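The carrot-and-stick idea can be illustrated with a toy example. The sketch below is not ConZNet and says nothing about how Samsung trains it. It is a generic epsilon-greedy "multi-armed bandit", one of the simplest reinforcement learning settings, in which a learner tries actions, receives a reward of 1 (carrot) or 0 (stick), and gradually favours the actions that pay off. The function name `run_bandit` and all parameter values are hypothetical.

```python
import random


def run_bandit(reward_probs, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a toy bandit problem.

    reward_probs[i] is the (hidden) chance that action i earns a reward.
    Returns the learner's estimated value per action and pull counts.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n
    values = [0.0] * n  # running average reward observed per action

    def pull(arm):
        # Feedback signal: 1.0 is the carrot, 0.0 is the stick.
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]

    # Try every action once so each estimate has some evidence.
    for arm in range(n):
        pull(arm)

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore a random action
        else:
            arm = max(range(n), key=lambda i: values[i])  # exploit best so far
        pull(arm)

    return values, counts


if __name__ == "__main__":
    values, counts = run_bandit([0.2, 0.8, 0.5])
    print("estimated values:", values)
    print("times each action was chosen:", counts)
```

Over time the learner chooses the highest-reward action most often, which is the sense in which feedback on outcomes shapes behaviour.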
As global competition in developing AI technologies has accelerated recently, contests have become widespread not only in machine reading comprehension but also in computer vision (technologies to analyse characters and images) and visual Q&A, which solves problems using recognised images of characters. The Beijing branch of Samsung Research won the International Conference on Document Analysis and Recognition (ICDAR) competition hosted by the International Association for Pattern Recognition (IAPR) in March, placing it in the top tier for global computer vision tests. ICDAR is the most influential competition in Optical Character Recognition (OCR) technologies.
Jihie Kim, head of the Language Understanding Lab at Samsung Research, says that the question of how AI technologies understand human dialogue and queries to suggest an optimum answer is one of the hot topics in the AI industry.
Kim explains that the Language Understanding Lab at Samsung Research AI Centre is also striving to develop the technology behind an AI algorithm that can talk with people naturally and propose solutions to a problem.
“We are developing an AI algorithm to provide answers to user queries in a simpler and more convenient way in real life. Active discussion is underway in Samsung to adopt the ConZNet AI algorithm for products, services, customer response and technological development.”
MS MARCO and TriviaQA are among the top five global competitions in machine reading comprehension. AI algorithms are tested on whether they can understand and analyse questions to offer answers. The tests are built from internet users’ queries and search results. The ConZNet algorithm developed by the Language Understanding Lab at Samsung Research is upgrading its intelligence by considering real user environments. The algorithm takes into account natural language as it is actually used, such as how people phrase queries and answers online.