I am a research scientist at the Allen Institute for Artificial Intelligence in Seattle, Washington, where I work on natural language processing and machine learning as part of the Aristo Project. Prior to that, I was a researcher at the Institute for Natural Language Processing (IMS) at the University of Stuttgart in Germany, where I received my PhD in October 2018. Before that, I received my B.A. from the University of Rochester in upstate New York (USA).
In the summer of 2016, I taught a Master's seminar on Semantic Parsing (my thesis topic) at the University of Stuttgart. The slides and course materials can be found here.
Some miscellaneous notes and musings: Number Theory Meets Computability Theory (see also blog post); other lecture notes: Notes on Language Models, Attention and Transformers; Negation as Failure; Mixing Logic and Deep Learning: The Logic as Loss Function Approach; Introduction to Probability. Upcoming: Formal Techniques for Neural-symbolic Modeling (recently taught at ESSLLI 2023).
Recent talks by me and my extended group: a brief (10-minute) introduction to Natural Language Understanding (NLU) and Language Modeling (intended for a non-technical audience); an overview of my work on diagnostic testing of neural models; Pushing the Limits of Rule Reasoning in Transformers (AAAI 2022); Breakpoint Transformers (EMNLP 2022); Learning to Decompose (EMNLP 2022); Decomposed Prompting (ICLR 2023).
I recently started converting some of my research notes into blog posts, in the hope that someone might find them useful (or, even better, that someone might correct me when I’m wrong, since many of the topics covered fall outside my area of expertise).
Note: For the most up-to-date versions of my papers, please refer to the arXiv versions (unless stated otherwise).
Nora Kassner, Oyvind Tafjord, Ashish Sabharwal, Kyle Richardson, Hinrich Schütze and Peter Clark. (2023) Language Models with Rationality (work in progress) [arxiv]
Zeming Chen, Qiyue Gao, Antoine Bosselut, Ashish Sabharwal, Kyle Richardson (2023) DISCO: Distilling Counterfactuals with Large Language Models. (ACL 2023) [arxiv] [code]
Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal (2023) Decomposed Prompting: A Modular Approach for Solving Complex Tasks (ICLR 2023) [arxiv] [code] [poster] [slides]
Gregor Betz, Kyle Richardson. (2023) Probabilistic coherence, logical consistency, and Bayesian learning: Neural language models as epistemic agents. PLOS ONE (journal) [publisher] [data/resources]
Kyle Richardson, Ronen Tamari, Oren Sultan, Dafna Shahaf, Reut Tsarfaty and Ashish Sabharwal. (2022) Breakpoint Transformers for Modeling and Tracking Intermediate Beliefs. (EMNLP 2022) [arxiv] [code] [slides]
Ben Zhou, Kyle Richardson, Xiaodong Yu and Dan Roth. (2022) Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts (EMNLP 2022) [arxiv] [data/code]
Matthew Finlayson, Kyle Richardson, Ashish Sabharwal, Peter Clark (2022) What Makes Instruction Learning Hard? An Investigation and a New Challenge in a Synthetic Environment (EMNLP 2022) [arxiv] [code/data]
Gregor Betz, Kyle Richardson. (2022) Judgement Aggregation, Discursive Dilemma and Reflective Equilibrium: Neural Language Models as Self-Improving Doxastic Agents. Frontiers in Artificial Intelligence. [publisher]
Aarohi Srivastava et al. (+441 authors) (2022) Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models [arxiv] [resources]
Tushar Khot, Kyle Richardson, Daniel Khashabi, Ashish Sabharwal (2022) Learning to Solve Complex Tasks by Talking to Agents (Findings of ACL) [arxiv] [code/data] [slides] [poster]
Kyle Richardson, Ashish Sabharwal (2022) Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability (AAAI 2022) [arxiv] [code/data] [slides] [poster]
Daniel Khashabi, Shane Lyu, Sewon Min, Lianhui Qin, Kyle Richardson, Sameer Singh, Sean Welleck, Hannaneh Hajishirzi, Tushar Khot, Ashish Sabharwal, Yejin Choi (2022) Prompt Waywardness: The Curious Case of Discretized Interpretation of Continuous Prompts (NAACL 2022) [arxiv] [slides]
Ronen Tamari, Kyle Richardson, Aviad Sar-Shalom, Noam Kahlon, Nelson F. Liu, Reut Tsarfaty and Dafna Shahaf (2022) Dyna-bAbI: unlocking bAbI’s potential with dynamic synthetic benchmarking (*SEM 2022) [arxiv] [code/data]
Gregor Betz, Kyle Richardson. (2022) DeepA2: A Modular Framework for Deep Argument Analysis with Pretrained Neural Text2Text Language Models (*SEM 2022) [arxiv] [demo] [dataset] [model]
Hai Hu, He Zhou, Zuoyu Tian, Yiwen Zhang, Yina Patterson, Yanting Li, Yixin Nie, Kyle Richardson. (2021) Investigating Transfer Learning in Multi-lingual Pre-trained Language Models through Chinese Natural Language Inference. Findings of ACL [code/data] [arxiv] [acl anthology]
Gregor Betz, Christian Voigt, Kyle Richardson. (2021) Thinking Aloud: Dynamic Context Generation Improves Zero-Shot Reasoning Performance of GPT-2. Work in progress. [arxiv]
Ben Zhou, Kyle Richardson, Qiang Ning, Tushar Khot, Ashish Sabharwal, Dan Roth. (2021) Temporal Reasoning on Implicit Events from Distant Supervision. Proceedings of the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021) [arxiv] [code] [data] [leaderboard] [slides]
Tushar Khot, Daniel Khashabi, Kyle Richardson, Peter Clark, Ashish Sabharwal (2021) Text Modular Networks: Learning to Decompose Tasks in the Language of Existing Models. Proceedings of the 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2021) [arxiv] [code/data] [demo] [slides] [poster]
Gregor Betz, Christian Voigt, Kyle Richardson. (2021) Critical Thinking for Language Models. Proceedings of the International Conference on Computational Semantics (IWCS 2021) [arxiv] [data] [models] [blog] [proceedings] [video]
Sumithra Bhakthavatsalam, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Peter Clark (2021) Think you have Solved Direct-Answer Question Answering? Try ARC-DA, the Direct-Answer AI2 Reasoning Challenge. Technical note. [arxiv] [data]
Liang Xu, Hai Hu, Xuanwei Zhang, Lu Li, Chenjie Cao, Yudong Li, Yechen Xu, Kai Sun, Dian Yu, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li, Yina Patterson, Zuoyu Tian, Yiwen Zhang, He Zhou, Shaoweihua Liu, Zhe Zhao, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson, and Zhenzhong Lan. (2020) CLUE: A Chinese Language Understanding Evaluation Benchmark. Proceedings of the International Conference on Computational Linguistics (COLING) [arxiv] [website/leaderboard] [code/data] [proceedings]
Niket Tandon, Keisuke Sakaguchi, Bhavana Dalvi, Dheeraj Rajagopal, Peter Clark, Michal Guerquin, Kyle Richardson and Eduard Hovy. (2020) A Dataset for Tracking Entities in Open Domain Procedural Text. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) [proceedings] [arxiv] [dataset] [code]
Hai Hu, Kyle Richardson, Liang Xu, Lu Li, Sandra Kubler, Lawrence S. Moss. (2020) OCNLI: Original Chinese Natural Language Inference. Findings of EMNLP [arxiv] [code/data] [leaderboard] [acl_anthology]
Sumithra Bhakthavatsalam, Kyle Richardson, Niket Tandon, Peter Clark (2020) Do Dogs have Whiskers? A New Knowledge Base of hasPart Relations. Technical note. [arxiv] [data]
Atticus Geiger, Kyle Richardson, Christopher Potts (2020) Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation. Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP) [arxiv] [proceedings] [data]
Kyle Richardson, Ashish Sabharwal (2020). What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge. Transactions of the Association for Computational Linguistics (TACL) [arxiv] [journal] [code/data] [slides (EMNLP 2020)]
Peter Clark, Oyvind Tafjord, Kyle Richardson (2020). Transformers as Soft Reasoners over Language. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) [arxiv] [proceedings] [demo] [data] [data generator code]
Hai Hu, Qi Chen, Kyle Richardson, Atreyee Mukherjee, Lawrence S. Moss, Sandra Kuebler (2020). MonaLog: A Lightweight System for Natural Language Inference Based on Monotonicity. Proceedings of the Society for Computation in Linguistics (SCiL 2020) [arxiv] [proceedings] [data]
Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal (2020). Probing Natural Language Inference Models through Semantic Fragments. Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI) [arxiv] [aaai] [code/data] [slides]
Peter Clark, Oren Etzioni, Daniel Khashabi, Tushar Khot, Bhavana Dalvi Mishra, Kyle Richardson, Ashish Sabharwal, Carissa Schoenick, Oyvind Tafjord, Niket Tandon, Sumithra Bhakthavatsalam, Dirk Groeneveld, Michal Guerquin, Michael Schmitz (2020). From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An Overview of the Aristo Project. AI Magazine. [arxiv] [New York Times, GeekWire]
Kyle Richardson (2018) New Resources and Ideas for Semantic Parser Induction. PhD Thesis, Institute for Natural Language Processing (IMS), Faculty of Computer Science, Electrical Engineering and Information Technology, University of Stuttgart, Germany. [opus] [slides] [code/data] [handout]
Kyle Richardson (2018) A Language for Function Signature Representations. Brief technical note. [arxiv] [data]
Kyle Richardson, Jonathan Berant and Jonas Kuhn (2018). Polyglot Semantic Parsing in APIs. Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) [arxiv] [data] [notes] [code] [slides] [video]
Kyle Richardson, Sina Zarrieß and Jonas Kuhn (2017). The Code2Text Challenge: Text Generation in Source Code Libraries. Proceedings of the International Natural Language Generation Conference (INLG) [arxiv] [paper] [inlg_slides] [resources]
Kyle Richardson, Jonas Kuhn (2017). Function Assistant: A Tool for NL Querying of APIs. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) [arxiv] [paper] [demo] [resources] [code] [poster]
Kyle Richardson, Jonas Kuhn (2017). Learning Semantic Correspondences in Technical Documentation. Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL) [arxiv] [paper] [notes] [data] [acl_poster] [stuttgart slides] [code]
Kyle Richardson, Jonas Kuhn. (2016) Learning to Make Inferences in a Semantic Parsing Task. Transactions of the Association for Computational Linguistics (TACL) [paper] [data] [acl_slides] [video] [extended version (from thesis)] [based partly on cky/kbest implementation from here]
Cleo Condoravdi, Kyle Richardson, Vishal Sikka, Asuman Suenbuel, and Richard Waldinger (2015) Natural Language Access to Data: It Takes Common Sense! Twelfth International Symposium on Logical Formalizations of Commonsense Reasoning (Commonsense-15), AAAI Spring Symposium. [demo] [link]
Cleo Condoravdi, Kyle Richardson, Vishal Sikka, Asuman Suenbuel, and Richard Waldinger (2014) Deduction for Natural Language Access to Data. University of Coimbra CS Technical Reports, CISUC/TR 2014-02. Presented at the Joint Workshop on Natural Language and Computer Science (NLCS) and Natural Language Services for Reasoners (NLSR).
Kyle Richardson and Jonas Kuhn (2014) UnixMan Corpus: A Resource for Language Learning in the Unix Domain. Proceedings of the Language Resources and Evaluation Conference (LREC). [link] [data]
Sina Zarrieß and Kyle Richardson. (2013) An Automatic Method for Building a Data-to-Text Generator. Proceedings of the 14th European Workshop on Natural Language Generation (ENLG) [link]
Richard Waldinger, Danny Bobrow, Cleo Condoravdi, Amar Das, Kyle Richardson. (2011) Accessing Structured Health Information through English Queries and Automatic Deduction. Proceedings of the AAAI Spring Symposium on Health Communications.