Sebastian Ruder

I'm a research scientist at DeepMind. I blog about Machine Learning, Deep Learning, and Natural Language Processing.

Unsupervised Cross-lingual Representation Learning
cross-lingual

This post expands on the ACL 2019 tutorial on Unsupervised Cross-lingual Representation Learning. It highlights key insights and takeaways and provides updates based on recent work, particularly unsupervised deep multilingual models.

  • Sebastian Ruder
20 min read
The State of Transfer Learning in NLP
transfer learning

This post expands on the NAACL 2019 tutorial on Transfer Learning in NLP. It highlights key insights and takeaways and provides updates based on recent work.

  • Sebastian Ruder
15 min read
EurNLP
events

The first European NLP Summit (EurNLP) will take place in London on October 11, 2019. It is an opportunity to foster discussion and collaboration between researchers in and around Europe.

  • Sebastian Ruder
2 min read
NAACL 2019 Highlights
events

This post discusses highlights of NAACL 2019. It covers transfer learning, common sense reasoning, natural language generation, bias, non-English languages, and diversity and inclusion.

  • Sebastian Ruder
8 min read
Neural Transfer Learning for Natural Language Processing (PhD thesis)
transfer learning

This post discusses my PhD thesis Neural Transfer Learning for Natural Language Processing and some new material presented in it.

  • Sebastian Ruder
1 min read
AAAI 2019 Highlights: Dialogue, reproducibility, and more
events

This post discusses highlights of AAAI 2019. It covers dialogue, reproducibility, question answering, the Oxford style debate, invited talks, and a diverse set of research papers.

  • Sebastian Ruder
11 min read
The 4 Biggest Open Problems in NLP
natural language processing

This is the second post based on the Frontiers of NLP session at the Deep Learning Indaba 2018. It discusses 4 major open problems in NLP.

  • Sebastian Ruder
10 min read
10 Exciting Ideas of 2018 in NLP
transfer learning

This post gathers 10 ideas that I found exciting and impactful this year, and that we'll likely see more of in the future. For each idea, it highlights 1-2 papers that execute it well.

  • Sebastian Ruder
8 min read
EMNLP 2018 Highlights: Inductive bias, cross-lingual learning, and more
events

This post discusses highlights of EMNLP 2018. It focuses on talks and papers dealing with inductive bias, cross-lingual learning, word embeddings, latent variable models, language models, and datasets.

  • Sebastian Ruder
11 min read
HackerNoon Interview
natural language processing

In this post, fast.ai fellow Sanyam Bhutani interviews me. It covers my background, advice on getting started with NLP, writing technical articles, and more.

  • Sebastian Ruder
7 min read
A Review of the Neural History of Natural Language Processing
language models

This post expands on the Frontiers of Natural Language Processing session organized at the Deep Learning Indaba 2018. It discusses major recent advances in NLP focusing on neural network-based methods.

  • Sebastian Ruder
29 min read
ACL 2018 Highlights: Understanding Representations and Evaluation in More Challenging Settings
natural language processing

This post discusses highlights of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018). It focuses on understanding representations and evaluating in more challenging scenarios.

  • Sebastian Ruder
18 min read
NLP's ImageNet moment has arrived
natural language processing

Big changes are underway in the world of NLP. The long reign of word vectors as NLP's core representation technique has seen an exciting new line of challengers emerge. These approaches demonstrated that pretrained language models can achieve state-of-the-art results and herald a watershed moment.

  • Sebastian Ruder
15 min read
Tracking the Progress in Natural Language Processing
natural language processing

Research in ML and NLP is moving at a tremendous pace, which is an obstacle for people wanting to enter the field. To make working with new tasks easier, this post introduces a resource that tracks the progress and state-of-the-art across many tasks in NLP.

  • Sebastian Ruder
2 min read
Highlights of NAACL-HLT 2018: Generalization, Test-of-time, and Dialogue Systems
natural language processing

This post discusses highlights of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2018). It focuses on generalization, the Test-of-Time awards, and dialogue systems.

  • Sebastian Ruder
15 min read
An overview of proxy-label approaches for semi-supervised learning
semi-supervised learning

While unsupervised learning is still elusive, researchers have made a lot of progress in semi-supervised learning. This post focuses on a particularly promising category of semi-supervised learning methods that assign proxy labels to unlabelled data, which are used as targets for learning.
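
To make this concrete, below is a minimal sketch of self-training, the simplest proxy-label method: the model's own confident predictions on unlabelled data are used as targets for further training. The classifier, confidence threshold, and number of rounds are illustrative assumptions of mine, not details from the post.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def self_train(X_lab, y_lab, X_unlab, threshold=0.9, rounds=5):
        """Self-training: repeatedly add confidently pseudo-labelled examples."""
        model = LogisticRegression()
        X, y = X_lab, y_lab
        for _ in range(rounds):
            model.fit(X, y)
            if len(X_unlab) == 0:
                break
            probs = model.predict_proba(X_unlab)
            confident = probs.max(axis=1) >= threshold
            if not confident.any():
                break  # nothing passes the confidence threshold
            # The model's own predictions become proxy labels.
            X = np.vstack([X, X_unlab[confident]])
            y = np.concatenate([y, model.classes_[probs[confident].argmax(axis=1)]])
            X_unlab = X_unlab[~confident]
        return model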

  • Sebastian Ruder
19 min read
Text Classification with TensorFlow Estimators
tensorflow

This post is a tutorial that shows how to use TensorFlow Estimators for text classification. It covers loading data using Datasets, using pre-canned estimators as baselines, word embeddings, and building custom estimators, among other topics.
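
To give a flavour of the API, here is a hedged sketch of a pre-canned estimator trained on averaged word embeddings; the toy data, feature name, and sizes are my own illustrative choices, and the tutorial goes well beyond this.

    import numpy as np
    import tensorflow as tf

    # Toy data: padded word-id sequences and binary labels.
    word_ids = np.random.randint(0, 10000, size=(1000, 20), dtype=np.int64)
    labels = np.random.randint(0, 2, size=(1000,), dtype=np.int64)

    # An embedding column averages the word embeddings of each document.
    words = tf.feature_column.categorical_column_with_identity(
        'words', num_buckets=10000)
    column = tf.feature_column.embedding_column(words, dimension=50)

    classifier = tf.estimator.DNNClassifier(
        hidden_units=[100], feature_columns=[column], n_classes=2)

    def input_fn():
        # Datasets feed (features, labels) batches to the estimator.
        dataset = tf.data.Dataset.from_tensor_slices(({'words': word_ids}, labels))
        return dataset.shuffle(1000).batch(32)

    classifier.train(input_fn=input_fn)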

  • Sebastian Ruder
13 min read
Requests for Research
transfer learning

It can be hard to find compelling topics to work on and know what questions to ask when you are just starting as a researcher. This post aims to provide inspiration and ideas for research directions to junior researchers and those trying to get into research.

  • Sebastian Ruder
13 min read
Optimization for Deep Learning Highlights in 2017
optimization

Different gradient descent optimization algorithms have been proposed in recent years, but Adam is still the most commonly used. This post discusses the most exciting highlights and most promising recent approaches that may shape the way we will optimize our models in the future.
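
One of the approaches discussed is SGDR, stochastic gradient descent with warm restarts (Loshchilov & Hutter, 2017), which anneals the learning rate along a cosine and periodically resets it. A minimal sketch with a fixed cycle length (the paper also lengthens cycles over time):

    import math

    def sgdr_lr(step, cycle_len=1000, lr_max=0.1, lr_min=0.001):
        """Cosine annealing with warm restarts: the learning rate decays
        from lr_max to lr_min over each cycle, then jumps back to lr_max."""
        t = step % cycle_len  # position within the current cycle
        return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / cycle_len))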

  • Sebastian Ruder
15 min read
Word embeddings in 2017: Trends and future directions
word embeddings

Word embeddings are an integral part of current NLP models, but no approach has yet superseded the original word2vec. This post focuses on the deficiencies of word embeddings and how recent approaches have tried to resolve them.

  • Sebastian Ruder
17 min read
Multi-Task Learning Objectives for Natural Language Processing
multi-task learning

Multi-task learning is becoming increasingly popular in NLP, but it is still not well understood which tasks are useful. As inspiration, this post gives an overview of the most common auxiliary tasks used for multi-task learning in NLP.
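
Mechanically, auxiliary tasks typically enter training as extra loss terms added to the main objective; a minimal sketch, with the weights as illustrative hyperparameters:

    def multi_task_loss(main_loss, aux_losses, weights):
        """Weighted sum of the main task loss and auxiliary task losses.

        aux_losses and weights are parallel lists; each weight controls
        how strongly its auxiliary task shapes the shared parameters."""
        return main_loss + sum(w * l for w, l in zip(weights, aux_losses))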

  • Sebastian Ruder
16 min read
Highlights of EMNLP 2017: Exciting datasets, return of the clusters, and more
natural language processing

This post discusses highlights of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). These include exciting datasets, new cluster-based methods, distant supervision, data selection, character-level models, and many more.

  • Sebastian Ruder
10 min read
Learning to select data for transfer learning
domain adaptation

Domain adaptation methods typically seek to identify features that are shared between the domains or learn representations that are general enough to be useful for both domains. This post discusses a complementary approach to domain adaptation that selects data that is useful for training the model.
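
As one simple instantiation of the idea (my own sketch, not code from the post), source examples can be ranked by how similar their term distribution is to the target domain, keeping only the closest ones for training:

    import numpy as np

    def js_divergence(p, q):
        """Jensen-Shannon divergence between two term distributions."""
        p, q = p / p.sum(), q / q.sum()
        m = 0.5 * (p + q)
        def kl(a, b):
            mask = a > 0
            return np.sum(a[mask] * np.log(a[mask] / b[mask]))
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def select_source_data(source_term_counts, target_term_dist, k):
        """Keep the k source examples whose term distribution is closest
        to the target domain (smallest JS divergence)."""
        scores = [js_divergence(c, target_term_dist) for c in source_term_counts]
        return np.argsort(scores)[:k]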

  • Sebastian Ruder
3 min read
Deep Learning for NLP Best Practices
natural language processing

Neural networks are widely used in NLP, but many details such as task or domain-specific considerations are left to the practitioner. This post collects best practices that are relevant for most tasks in NLP.

  • Sebastian Ruder
23 min read
An Overview of Multi-Task Learning in Deep Neural Networks
multi-task learning

Multi-task learning is becoming more and more popular. This post gives a general overview of the current state of multi-task learning. In particular, it provides context for current neural network-based methods by discussing the extensive multi-task learning literature.
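
The most common neural variant the post discusses is hard parameter sharing, where a shared hidden layer feeds several task-specific output layers. A minimal, purely illustrative numpy forward pass:

    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_hidden, n_tags, n_classes = 100, 64, 10, 2

    # Shared layer: updated by the losses of all tasks.
    W_shared = rng.normal(size=(d_in, d_hidden))
    # Task-specific heads: each updated only by its own task's loss.
    W_tagging = rng.normal(size=(d_hidden, n_tags))
    W_sentiment = rng.normal(size=(d_hidden, n_classes))

    def forward(x):
        h = np.tanh(x @ W_shared)              # shared representation
        return h @ W_tagging, h @ W_sentiment  # one set of logits per task

    logits_tag, logits_sent = forward(rng.normal(size=(4, d_in)))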

  • Sebastian Ruder
29 min read
Transfer Learning - Machine Learning's Next Frontier
transfer learning

Deep learning models excel at learning from a large number of labeled examples, but typically do not generalize to conditions not seen during training. This post gives an overview of transfer learning, motivates why it warrants our attention, and discusses practical applications and methods.

  • Sebastian Ruder
28 min read
Highlights of NIPS 2016: Adversarial learning, Meta-learning, and more
meta-learning

The Conference on Neural Information Processing Systems (NIPS) is one of the top ML conferences. This post discusses highlights of NIPS 2016 including GANs, the nuts and bolts of ML, RNNs, improvements to classic algorithms, RL, Meta-learning, and Yann LeCun's infamous cake.

  • Sebastian Ruder
12 min read
A survey of cross-lingual word embedding models
cross-lingual

Monolingual word embeddings are pervasive in NLP. To represent meaning and transfer knowledge across different languages, cross-lingual word embeddings can be used. Such methods learn representations of words in a joint embedding space.
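
Many of these methods learn a linear map from one monolingual embedding space into another using a seed translation dictionary; when the map is constrained to be orthogonal, the optimum has a closed form via the SVD (orthogonal Procrustes). A minimal sketch, assuming the rows of X and Y are embeddings of translation pairs:

    import numpy as np

    def procrustes(X, Y):
        """Orthogonal map W minimising ||X W - Y||_F for a seed dictionary
        in which rows X[i] and Y[i] embed a translation pair."""
        U, _, Vt = np.linalg.svd(X.T @ Y)
        return U @ Vt

    # Usage: W = procrustes(X_seed, Y_seed) maps source vectors into the
    # target space; translations are then nearest neighbours of x @ W.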

  • Sebastian Ruder
41 min read
Highlights of EMNLP 2016: Dialogue, deep learning, and more
natural language processing

This post discusses highlights of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016). These include work on reinforcement learning, dialogue, sequence-to-sequence models, semantic parsing, natural language generation, and many more.

  • Sebastian Ruder
4 min read
On word embeddings - Part 3: The secret ingredients of word2vec
word embeddings

Word2vec is a pervasive tool for learning word embeddings. Its success, however, is mostly due to particular architecture choices. Transferring these choices to traditional distributional methods makes them competitive with popular word embedding methods.
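
A key result in this line of work (Levy & Goldberg, 2014) is that skip-gram with negative sampling implicitly factorises a PMI matrix shifted by log k, where k is the number of negative samples. A minimal sketch of the resulting shifted positive PMI (SPPMI) matrix, assuming every word and context occurs at least once:

    import numpy as np

    def sppmi(counts, k=5):
        """Shifted positive PMI from a word-context co-occurrence matrix."""
        total = counts.sum()
        p_w = counts.sum(axis=1, keepdims=True) / total  # word marginals
        p_c = counts.sum(axis=0, keepdims=True) / total  # context marginals
        with np.errstate(divide='ignore'):
            pmi = np.log((counts / total) / (p_w * p_c))
        # Shift by log k and clip at zero, as in Levy & Goldberg (2014).
        return np.maximum(pmi - np.log(k), 0.0)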

  • Sebastian Ruder
9 min read
LxMLS 2016 Highlights
events

The Lisbon Machine Learning School (LxMLS) is an annual event that brings together researchers and graduate students in ML, NLP, and Computational Linguistics. This post discusses highlights, key insights, and takeaways from the 6th edition of the summer school.

  • Sebastian Ruder
14 min read
On word embeddings - Part 2: Approximating the Softmax
word embeddings

The softmax layer is a core part of many current neural network architectures. When the number of output classes is very large, such as in the case of language modelling, computing the softmax becomes very expensive. This post explores approximations to make the computation more efficient.
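
As a taste of these approximations, here is a hedged numpy sketch of sampled softmax: the loss is estimated from the target class plus a few sampled negatives, with each logit corrected by the log of its proposal probability (accidental hits of the target among the negatives are ignored for brevity):

    import numpy as np

    def sampled_softmax_loss(h, W, target, n_samples, q, rng):
        """Approximate -log softmax(W @ h)[target] with sampled negatives.

        h: hidden state (d,); W: output embeddings (V, d); q: proposal
        distribution over the V classes (sums to 1). Scoring n_samples + 1
        classes instead of all V is what makes a large vocabulary cheap."""
        negatives = rng.choice(len(q), size=n_samples, replace=False, p=q)
        classes = np.concatenate(([target], negatives))
        # Correcting by log q keeps the estimate approximately unbiased.
        logits = W[classes] @ h - np.log(q[classes])
        logits -= logits.max()  # numerical stability
        return -np.log(np.exp(logits[0]) / np.exp(logits).sum())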

  • Sebastian Ruder
33 min read
On word embeddings - Part 1
word embeddings

Word embeddings popularized by word2vec are pervasive in current NLP applications. The history of word embeddings, however, goes back a lot further. This post explores the history of word embeddings in the context of language modelling.

  • Sebastian Ruder
15 min read
An overview of gradient descent optimization algorithms
optimization

Gradient descent is the preferred way to optimize neural networks and many other machine learning algorithms but is often used as a black box. This post explores how many of the most popular gradient-based optimization algorithms such as Momentum, Adagrad, and Adam actually work.
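
As a concrete example, the Adam update (Kingma & Ba, 2015), one of the algorithms the post walks through, in a minimal numpy sketch:

    import numpy as np

    def adam_step(params, grads, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
        """One Adam update: moving averages of the gradient (m) and its
        square (v), bias-corrected, then an adaptive per-parameter step."""
        m = b1 * m + (1 - b1) * grads
        v = b2 * v + (1 - b2) * grads**2
        m_hat = m / (1 - b1**t)  # bias correction for the first moment
        v_hat = v / (1 - b2**t)  # bias correction for the second moment
        params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
        return params, m, v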

  • Sebastian Ruder
28 min read