
About Me

My name is Stephen Merity, though I'm most commonly referred to as Smerity. I'm a senior research scientist working on deep learning in San Francisco with Salesforce Research via the MetaMind acquisition.

I've been lucky enough to work with fascinating people and groups over the years including Google Sydney, Freelancer.com, the Schwa Lab at the University of Sydney, the team at Grok Learning, the non-profit Common Crawl, and IACS @ Harvard. You can read my full history in my resume or on LinkedIn. Feel free to contact me directly at smerity@smerity.com or stalk me on my various social networks!


Regularizing and Optimizing LSTM Language Models (2017) [pdf][code]
Stephen Merity*, Nitish Shirish Keskar*, James Bradbury, Richard Socher (* equal contribution)

Revisiting Activation Regularization for Language RNNs (2017) [pdf]
Stephen Merity, Bryan McCann, Richard Socher

Quasi-Recurrent Neural Networks (2016) [pdf][code]
James Bradbury*, Stephen Merity*, Caiming Xiong, Richard Socher (* equal contribution)

The WikiText Long Term Dependency Language Modeling Dataset (2016) [dataset]
Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher

Pointer Sentinel Mixture Models (2016) [pdf]
Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher

Dynamic Memory Networks for Visual and Textual Question Answering (2016) [pdf]
Caiming Xiong*, Stephen Merity*, Richard Socher (* equal contribution)
ICML 2016

Integrated Tagging and Pruning via Shift-Reduce CCG Parsing (2011) [pdf][bib]
Stephen Merity (supervisor: Dr James R. Curran)
Honours Thesis (First Class + University Medal), The University of Sydney, Sydney, Australia

Best Student Presentation: Frontier Pruning for Shift-Reduce CCG Parsing (2011) [pdf][bib][presentation]
Stephen Merity and James R. Curran
Proceedings of the 2011 Australasian Language Technology Association Workshop, ALTA 2011

Accurate Argumentative Zoning with Maximum Entropy models (2009) [pdf][bib]
Stephen Merity, Tara Murphy and James R. Curran
Proceedings of the 2009 Workshop on Text and Citation Analysis for Scholarly Digital Libraries, ACL-IJCNLP 2009, pages 19-26


New Methods for Memory, Attention, and Efficiency in Neural Networks (November 2016)
Netflix, Los Gatos (California)

The Pointer Sentinel Mixture Model (October 2016)
Stanford University's Deep Learning Reading Group, Palo Alto (California)

The Frontiers of Memory and Attention in Deep Learning (September 2016)
Quora, Mountain View (California)

Dynamic memory networks for visual and textual question answering (April 2016)
NVIDIA GPU Technology Conference 2016, San Jose (California)

Dynamic memory networks for visual and textual question answering (March 2016)
Strata + Hadoop World 2016, San Jose (California)

Using the whole web as your dataset (Video Link) (July 2015)
Dato Science Summit & Dato Conference 2015, San Francisco (California)

Internet Scale Analytics With Common Crawl (Video Link) (May 2015)
Big Data, Analytics & Machine Learning Israeli Innovation Conference, Tel Aviv

A Web Worth of Data: Common Crawl for NLP (Video Link) (April 2015)
Text By The Bay, San Francisco (California)

Common Crawl for NLP (November 2014)
Web-Scale Natural Language Processing in Northern Europe, Oslo

Experiments in web scale data (November 2014)
Big Data Beers, Berlin

AWS at Common Crawl (October 2014)
Advanced Amazon Web Services, San Francisco (California)

Measuring the impact: Google Analytics (July 2014)
Open Data Bay Area, San Francisco (California)

Machine Learning Made Scary (All New Content from Cyberdyne Systems & Aperture Labs) (March 2013)
Sydney DataPreneurs, Sydney

Data Science for Managers (February 2013)
General Assembly, Sydney

Start-up Metrics (Smetrics?): Inspiration from Dave McClure & David Jones (December 2012)
Incubate.org.au, Sydney

Machine Learning for your Robotic Army: A Crash Course using Python's Scikit-Learn (October 2012)
Sydney Python (SyPy), Sydney


A true "About Me" would have to mention the National Computer Science School (NCSS), a summer school run at the University of Sydney for talented high school students from all over Australia. I have had the privilege of going there eight times (first in 2007 as a student, then in 2008 as a returning student, and finally as a tutor in 2009, 2010, 2011, 2012, 2013 and 2014), and it has been one of the most delightful experiences of my entire life.

Over the course of a little over a week, students are introduced to a programming language, taken step by step through a set of educational challenges, and then work together to launch a fully fledged working product. This product has been a search engine, a social network, a group of maze-navigating robots and every variation in between!

If you're a student or know someone of that general age, sign up for NCSS or the NCSS Challenge!