Overview
This workshop is intended to encourage interdisciplinary research between NLP and Software Engineering. We invited a range of researchers with both NLP and SE backgrounds to come together, discuss their research, establish datasets, tasks, and baselines, and generally help the field build momentum.
Recent updates
New!
Please add suggested resources and datasets to this spreadsheet. We'll review and prioritize them in the closing session.
New!
Tuesday sessions were rearranged; please check the schedule.
Call for participation
Get the current call for participation as a PDF.
Submissions
Please submit your 1–2 page paper here:
https://www.easychair.org/conferences/?conf=nlse2015
Important dates
- Sept 1, 2015
- 1-2 page paper due
- Oct 1, 2015
- Program finalized and sent out to attendees
- Oct 25–27, 2015
- Workshop dates
- Dec 15, 2015
- Final report produced
Attendee list
Name | Institution |
---|---|
Abram Hindle | University of Alberta |
Alvin Cheung | University of Washington |
Andrian Marcus | University of Texas at Dallas |
Ashish Vaswani | University of Southern California, Information Sciences Institute |
Baishakhi Ray | University of Virginia |
Ben Snyder | University of Wisconsin |
Chang Liu | Ohio University |
Charles Sutton | University of Edinburgh |
Chris Quirk | Microsoft Research, Redmond, Washington |
Collin McMillan | University of Notre Dame |
Dana Movshovitz-Attias | Carnegie Mellon University |
Daniel Tarlow | Microsoft Research, Cambridge, England |
David Chiang | University of Notre Dame |
Dawn Lawrie | Loyola University Maryland |
Denys Poshyvanyk | College of William & Mary |
Earl Barr | University College London |
Gagan Bansal | University of Washington |
Giriprasad Sridhara | IBM |
Graham Neubig | Nara Institute of Science and Technology |
Jane Cleland-Huang | DePaul University |
Jennifer D'Souza | University of California, Davis |
Jim Donlon | National Science Foundation |
Luke Zettlemoyer | University of Washington |
Mark Marron | Microsoft Research, Redmond, Washington |
Martin Monperrus | University of Lille |
Martin White | College of William & Mary |
Mirella Lapata | University of Edinburgh |
Nate Kushman | Massachusetts Institute of Technology |
Patrick Wagstrom | IBM Watson |
Percy Liang | Stanford University |
Prem Devanbu | University of California, Davis |
Ray Mooney | University of Texas at Austin |
Razvan Bunescu | Ohio University |
Sol Greenspan | National Science Foundation |
Sonia Haiduc | Florida State University |
Srini Iyer | University of Washington |
Tao Xie | University of Illinois |
Tatiana Korelsky | National Science Foundation |
Tien N. Nguyen | Iowa State University |
Venera Arnaoudova | Washington State University |
Vincent Hellendoorn | University of California, Davis |
Vladimir Filkov | University of California, Davis |
William Cohen | Carnegie Mellon University |
Yi Wei | Microsoft Research, Cambridge, England |
Yoav Artzi | Cornell University |
Zhendong Su | University of California, Davis |
Zhilin Yang | Carnegie Mellon University |
Workshop program
This three-day workshop will be held from Sunday, October 25 through Tuesday, October 27, 2015. All sessions will be in the large lecture hall in Microsoft Building 99, room 99/1919.
Program overview
Sunday, October 25
- 1:00pm – 2:30pm
-
Tutorial session: n-gram and Neural Network Language Modeling
- 2:30pm – 3:00pm
- coffee break
- 3:00pm – 4:30pm
-
Tutorial session: Software Mining and Software Datasets
- 4:30pm – 5:30pm
- break for check in, etc.
- 6:00pm – 8:30pm
- Dinner, catered at Microsoft Building 99
Monday, October 26
- 8:00am – 9:00am
- Light breakfast in 99/1919
- 9:00am – 10:15am
-
Software tools and processes
(Organizers: Premkumar Devanbu and Chris Quirk)
(Scribe: Jennifer D’Souza)
09h00 – 09h30 Opening session and introduction
09h30 – 10h15 Keynote by Charles Sutton
- 10:15am – 10:30am
- Coffee break
- 10:30am – 11:15am
-
Software tools and processes (cont'd)
10h30 – 11h15 Open Discussion
- What kind of collaborations would help move this area forward?
- Are there other questions/problems that remain unexplored?
- What data resources are needed?
- Are there benchmarks or evaluation contests that are needed?
- 11:15am – 12:00pm
-
Data repositories
(Organizers: Premkumar Devanbu and Chris Quirk)
(Scribe: Zhilin Yang)
11h15 – 11h45 Talk by Tien Nguyen: The BOA Code Repository and Infrastructure
11h45 – 12h00 Questions/Discussion
- 12:00pm – 1:00pm
- Lunch break
- 1:00pm – 2:15pm
- Mutual introductions: Two minute madness
- 2:15pm – 2:30pm
- Coffee break
- 2:30pm – 4:00pm
-
Code and Program Modeling
(Organizers: Charles Sutton and Tien Nguyen)
(Scribe: Vincent Hellendoorn)
2h30 – 3h00 Talk by Earl Barr, University College London: Inference Problems in Software Engineering
3h00 – 3h30 Talk by Daniel Tarlow, Microsoft Research
3h30 – 4h00 Open Discussion
- Applications of code and program modeling?
- When are different levels of information appropriate, e.g., lexical, syntactic, semantic?
- Calling all hammers: What are NLP modeling techniques that are ripe for carrying over?
- What’s special about software? Can we just keep importing standard NLP methods willy-nilly or do we need methods that are SWE-specific?
- 4:00pm – 4:15pm
- Coffee break
- 4:15pm – 5:15pm
-
Ontologies and Understanding of Software Semantics
(Organizers: Dana Movshovitz-Attias and Tao Xie)
(Scribe: Zhilin Yang)
4:15pm – 4:35pm Talk by Jane Cleland-Huang: Leveraging Software Project Knowledge to Build Ontology
4:35pm – 5:15pm Open Discussion
- What are software engineering tasks that can benefit from a software ontology or semantic understanding?
- What are the unique characteristics of software entities that make them more or less amenable to semantic analysis?
- Emerging NLP techniques that can be leveraged for software semantic understanding. Which ones have been successfully used by the participants?
- Open problems or challenges in this area.
- Available software ontologies or related resources.
- Evolving software ontologies. What changes can cause a software ontology to evolve over time? What are methods for accommodating such changes?
- What kind of collaborations would help move this area forward?
- 5:15pm – 6:00pm
- Free time
- 6:00pm – 9:00pm
- Dinner at Luc in the Madison Park neighborhood of Seattle; transportation from Microsoft to the restaurant, and back to Microsoft and the hotel, is provided
Tuesday, October 27
- 8:00am – 9:00am
- Light breakfast in 99/1919
- 9:00am – 10:30am
-
Information Retrieval in Software Engineering
(Organizers: Denys Poshyvanyk and Dana Movshovitz-Attias)
(Scribe: Martin White)
09h00 – 09h20 Talk by Andrian Marcus: Overview of Text Retrieval Applications in Software Engineering
09h20 – 10h30 Open Discussion (sample questions are below)
Examples of successful ideas (applications of IR in SE): participants talk for 1 minute to give examples of their prior research projects
- Open problems or grand challenges?
- Emerging Information Retrieval techniques or approaches?
- What kind of collaborations would help move this area forward?
- What datasets and resources are available?
- Challenges in reproducibility of the experiments
- 10:30am – 10:45am
- Coffee break
- 10:45am – 12:15pm
-
Natural Language Programming and Semantic Parsing
(Organizers: Ray Mooney and Chris Quirk)
(Scribe: Gagan Bansal)
A set of 5-minute talks by:
- Ray Mooney
- Chris Quirk
- Gagan Bansal
- Yoav Artzi
- Percy Liang
- Srini Iyer
- Nate Kushman
- Yi Wei
Open forum discussion, seeded by particular topics:
- Data
- Compute cycles
- Techniques: Is it parsing? Is it MT? Is it program synthesis?
- Tools
- Evaluation metrics
- What are the relevant connections in SE?
- Dialog -- interaction strategies
- End-user debugging
- Where do we focus first? End users? Programmers? Power users?
- Formalism / representation, especially for end users
- Adding I/O tuples or programming by demonstration
- 12:15pm – 1:15pm
- Lunch break
- 1:15pm – 2:45pm
-
Language Generation from Code
(Organizers: Dawn Lawrie and Graham Neubig)
(Scribe: Vincent Hellendoorn)
Mood setting talk: Survey of Methods to Generate Natural Language from Source Code, Graham Neubig (Slides)
Each of the other researchers working on this topic will be asked to produce a controversial statement or question to fuel discussion. Discussion will also touch on data sets and broader impacts. These researchers will be asked about two weeks before the workshop to think about their controversial statement.
- 2:45pm – 3:00pm
- Coffee break
- 3:00pm – 4:30pm
- Closing session
Venue
The workshop will be held at Microsoft Research in Redmond, WA.
The address is:
Microsoft, Bldg 99
14820 NE 36th Street
Redmond, WA 98052-6399
USA
Directions to visit the venue are available here.
Accommodations
We have reserved a block of rooms at the Courtyard in Bellevue Redmond.
To book, please call (800) 321-2211 and ask for the Microsoft NL + SE Room Block Oct2015 at the Courtyard in Bellevue Redmond. Alternatively, you can use this online reservation link.
Please book by October 5th to ensure that you receive the block rate.
The address is:
Courtyard Bellevue/Redmond
14615 NE 29th Place
Bellevue, WA 98007
USA
Travel
Support for travel and accommodation is offered through a grant from the National Science Foundation. Covered costs include hotel accommodation and local transit. Our budget can cover travel expenses up to:
- USD 1900 for international attendees
- USD 1000 for US West Coast attendees
- USD 1300 for attendees from other parts of the US
Attendees from the Puget Sound area should contact organizers if they need travel support.
Reimbursement
- Include paid receipts for airfare, hotel, airport shuttle, and meals.
- Note: no alcohol can be reimbursed.
- Use the UC Davis form here, and sign as "Non Employee".
- Maximum budgeted amount for academic invitees: $1000 for US West Coast attendees, $1300 for attendees from other parts of the US, and $1900 for international attendees.
- Send to Jane Ryan, Dept of Computer Science, UC Davis, Davis, CA 95616; mark (Attn: NSF Workshop)
- Be sure to include return postal address, full name, and telephone number.
Sponsors
The workshop is sponsored by the US National Science Foundation (through Sol Greenspan and Tatiana Korelsky) and by Microsoft Research.
Steering committee
- Charles Sutton (Edinburgh)
- Daniel Tarlow (Microsoft Research, Cambridge)
- Dawn Lawrie (Loyola)
- Denys Poshyvanyk (William & Mary)
- Ray Mooney (University of Texas Austin)
- Tao Xie (University of Illinois)
- William Cohen (Carnegie Mellon University)
Organizers
Prem Devanbu (UC Davis) and Chris Quirk (Microsoft Research) along with Dana Movshovitz-Attias (CMU) organized the workshop, under the guidance and direction of Sol Greenspan and Tatiana Korelsky from the NSF.