Important note: There are far too many papers for me to have accurately selected all of the interesting ones! Every time I go through the list again, I find an additional set of papers for me to read. This is dangerous as this is already a formidable list ;)
If you're interested in tackling the list of papers yourself, check out the ICLR 2017 Conference Track submissions. Bonus points if your over-eagerness to read all the papers crashes the web server again ;)
When I add (biased), it's because I'm an author on the paper or a colleague of one of the authors. I will note that I believe my bias has a good foundation - my colleagues and I produce good work ;)
Tying word vectors has been shown to be insanely effective for language models. This advantage likely extends to other specific tasks as well.
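For readers who haven't seen the trick, here is a minimal NumPy sketch of one common reading of "tying": the same matrix is used for both the input word embedding and the output softmax classifier. The toy sizes, the helper names (`embed`, `output_logits`, `softmax`) and the `tanh` stand-in for the recurrent network are all assumptions made for illustration, not details taken from any of the papers.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_size = 10, 4  # toy sizes, chosen only for illustration

# One shared matrix acts as both the input embedding and the output classifier.
embedding = rng.normal(size=(vocab_size, hidden_size))

def embed(token_id):
    """Look up the input vector for a token (row of the shared matrix)."""
    return embedding[token_id]

def output_logits(hidden):
    """Score every vocabulary word by reusing the same shared matrix."""
    return embedding @ hidden

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

# Stand-in for the recurrent network: a single nonlinearity (assumption).
hidden = np.tanh(embed(3))
probs = softmax(output_logits(hidden))

# Tying halves the embedding/classifier parameter count in this toy setup.
tied_params = embedding.size
untied_params = 2 * vocab_size * hidden_size
print(tied_params, untied_params)
```

Note that this form of tying only works directly when the hidden size matches the embedding size; otherwise an extra projection is needed.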
The "sparse things are better things" category:
Training recurrent neural networks is still fraught with terror. I've written previously about orthogonality in RNN weights. These works explore the recurrence within RNNs through these lenses.
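To give a rough feel for why orthogonality matters, here is a small NumPy sketch under simplifying assumptions: a purely linear recurrence with no inputs or nonlinearity, a toy hidden size, and an orthogonal matrix obtained from a QR decomposition. The helper name `orthogonal_matrix` is hypothetical. It only illustrates that an orthogonal recurrent matrix preserves the norm of the hidden state across many steps, while an unconstrained Gaussian matrix generally does not.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size = 8  # toy size, chosen only for illustration

def orthogonal_matrix(n, rng):
    """Return an n x n orthogonal matrix: the Q factor of a random Gaussian matrix."""
    q, _ = np.linalg.qr(rng.normal(size=(n, n)))
    return q

W_orth = orthogonal_matrix(hidden_size, rng)
W_rand = rng.normal(size=(hidden_size, hidden_size))  # unconstrained comparison

h = rng.normal(size=hidden_size)
h_orth, h_rand = h.copy(), h.copy()

# Apply the (linear) recurrence for 50 time steps with each matrix.
for _ in range(50):
    h_orth = W_orth @ h_orth
    h_rand = W_rand @ h_rand

norm_start = np.linalg.norm(h)
norm_orth = np.linalg.norm(h_orth)   # stays equal to norm_start up to rounding
norm_rand = np.linalg.norm(h_rand)   # typically drifts far from norm_start
print(norm_start, norm_orth, norm_rand)
```

Real RNNs add inputs and nonlinearities, so this is an intuition pump rather than a faithful model of training dynamics.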
Interested in saying hi? ^_^