Machine Reading

An NLP research group at the UCL Computer Science department teaching machines how to read.

The amount of published information is growing rapidly. Much of this information comes in the form of unstructured text which cannot easily be searched, mined, visualized or, ultimately, acted upon. The principal goal of our group is to build machines that can read and "understand" this textual information, converting it into interpretable structured knowledge to be leveraged by humans and other machines alike.

To achieve our goal we work at the intersection of Natural Language Processing and Machine Learning. We rely heavily on statistical methods of various flavours.

Our group is part of the UCL Computer Science department, affiliated with CSML and based in the London Media Technology Campus. We are organizing the South England Natural Language Processing Meetup. Get in touch if you're interested in attending.

If you are interested in doing a PhD with us, please have a look at these instructions.

  • Sebastian Riedel Reader

    Sebastian works in NLP and Machine Learning. He is particularly interested in helping machines to read more accurately by leveraging knowledge gathered through reading more accurately.

  • V. Ivan Sanchez 4th year PhD Student

    I am working on learning interpretable models, such as decision trees and Bayesian networks, from Matrix Factorization models. I'm interested in probabilistic graphical models. I'm funded by CONACYT.

  • Matko Bošnjak 3rd year PhD Student

    Matko's interests include both natural and unnatural language processing, and their interplay. Specifically, he's enjoying differentiable abstract machines and interpreters, code induction, and trainable combinations of neural networks and code. When tired of unnatural language, he can be found enjoying a good question answering model.

  • George Spithourakis 3rd year PhD Student

    I am working on Multi-Instance Text Regression and learning weakly supervised word embeddings. I am interested in structured prediction, distributional semantics, neural models and optimisation. My secondary supervisor is Steffen Petersen and I am funded by the Farr Institute of Health Informatics Research.

  • Ingolf Becker 3rd year PhD Student

    Ingolf researches the intersection of NLP and Information Security. His work applies topic models, sentiment analysis and statistical tests to transcripts on security topics, attempting to automatically infer conflicts between security and business processes.

  • Pontus Stenetorp Senior Research Associate

    Pontus works somewhere at the intersection of Natural Language Processing and Machine Learning. He is particularly interested in representation learning and is currently funded by a machine reading grant from the Allen Foundation.

  • Johannes Welbl 2nd year PhD Student

    I'm interested in Machine Learning and NLP; my research focusses on Reading Comprehension and Knowledge Base Inference. Currently I work on multi-step Reading Comprehension, a scenario in which a model combines multiple facts to arrive at an answer.

  • Jeff Mitchell Research Associate

    I'm interested in using machine reading technology to extract and verify facts from raw text.

  • Pasquale Minervini Research Associate

    Pasquale is interested in Machine Reading, and how to leverage background knowledge in representation learning algorithms. He is currently funded by a machine reading grant from the Allen Foundation.

  • Tom Crossland PhD Student

    Tom is an astrophysicist working with the MR group and the Mullard Space Science Laboratory, interested in Machine Learning applications to his original subject area. He is currently working on automatic measurement extraction from scientific literature, with a view to applying the results to galactic archaeology.

  • Ed Grefenstette Honorary Reader

    Ed is interested in teaching machines to understand and communicate using language (formal and natural), and in both neural and symbolic reasoning (and the intersection thereof). He is involved with UCLMR's research activities alongside a full-time role in industry.

  • stat-nlp-book is an interactive Statistical NLP book in Python, used for our StatNLP course from 2016 on.
  • stat-nlp-book-scala is an interactive Statistical NLP book in Scala, used for our StatNLP course in 2015/16.
  • wolfe is a framework for building rich machine learning models, based on functional programming, factor graphs, optimization and composition.
  • ucleed is a biomedical event extractor that ranked first in several tracks of the BioNLP 2011 shared task.
  • thebeast is a Markov Logic inference and learning engine.
  • What's Wrong With My NLP? is a visualizer for NLP problems.