License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following:
DOI: 10.4230/LIPIcs.SNAPL.2017.4
URN: urn:nbn:de:0030-drops-71357
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2017/7135/


Ernst, Michael D.

Natural Language is a Programming Language: Applying Natural Language Processing to Software Development

pdf-format:
LIPIcs-SNAPL-2017-4.pdf (0.5 MB)


Abstract

A powerful, but limited, way to view software is as source code alone. Treating a program as a sequence of instructions enables it to be formalized and makes it amenable to mathematical techniques such as abstract interpretation and model checking.

A program consists of much more than a sequence of instructions. Developers make use of test cases, documentation, variable names, program structure, the version control repository, and more. I argue that it is time to take the blinders off of software analysis tools: tools should use all these artifacts to deduce more powerful and useful information about the program.

Researchers are beginning to make progress towards this vision. This paper gives, as examples, four results that find bugs and generate code by applying natural language processing techniques to software artifacts. The four techniques use as input error messages, variable names, procedure documentation, and user questions. They use four different NLP techniques: document similarity, word semantics, parse trees, and neural networks.
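As a rough illustration of the first of these NLP techniques, document similarity, the sketch below compares two pieces of text by the cosine similarity of their word-count vectors. It is a generic textbook formulation, not the method used in the paper; the function names, the sample strings, and the pairing of an error message with option documentation are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity of the two texts' word-count vectors (0.0 to 1.0)."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # Only words that occur in both texts contribute to the dot product.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical inputs: documentation for an option and two candidate error messages.
option_doc = "maximum number of connections the server accepts"
good_message = "error: number of connections exceeds the maximum"
poor_message = "error: invalid file path"

print(cosine_similarity(option_doc, good_message))
print(cosine_similarity(option_doc, poor_message))
```

Under these assumptions, the message that shares vocabulary with the documentation scores higher than the unrelated one. A practical tool would likely weight terms (for example with TF-IDF) and normalize word forms rather than rely on raw counts.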

The initial results suggest that this is a promising avenue for future work.

BibTeX - Entry

@InProceedings{ernst:LIPIcs:2017:7135,
  author =	{Michael D. Ernst},
  title =	{{Natural Language is a Programming Language: Applying Natural Language Processing to Software Development}},
  booktitle =	{2nd Summit on Advances in Programming Languages (SNAPL 2017)},
  pages =	{4:1--4:14},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-032-3},
  ISSN =	{1868-8969},
  year =	{2017},
  volume =	{71},
  editor =	{Benjamin S. Lerner and Rastislav Bod{\'i}k and Shriram Krishnamurthi},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2017/7135},
  URN =		{urn:nbn:de:0030-drops-71357},
  doi =		{10.4230/LIPIcs.SNAPL.2017.4},
  annote =	{Keywords: natural language processing, program analysis, software development}
}

Keywords: natural language processing, program analysis, software development
Collection: 2nd Summit on Advances in Programming Languages (SNAPL 2017)
Issue Date: 2017
Date of publication: 05.05.2017

