Software Engineering Research Exploiting Naturally Produced Artefacts

Over many years, software engineering research has used a range of practitioner-generated content, e.g. interviews, surveys, project documents and ethnography. More recently, an emerging body of software engineering research has begun to use grey literature. One frequently occurring type of grey literature is the blog post. Whilst there are prospective benefits to the use of grey literature and blog posts in software engineering research, there are also concerns about the quality of such material.

Aims

We have established a long-term research programme, SERENPA, to explore opportunities to exploit naturally produced artefacts from practitioners in software engineering research. The aims for this research programme are:

  1. To investigate the desirability and feasibility of using naturally occurring evidence produced by practitioners in software engineering research.
  2. To develop theory, methodology and other resources (such as software tools) to support the use of natural evidence.
  3. To conduct and report empirical investigations that use natural evidence.
  4. To evaluate the actual benefits of natural evidence for software engineering research.
  5. To review and incorporate related research from other fields (e.g. health sciences).
  6. To contribute to the research and practice communities.

To scope the resources of the SERENPA programme, we focus initially on the use of a particular kind of natural evidence: blog-like documents and their content.

Methods

We have developed, or are developing, a range of resources for software engineering researchers in relation to blog-like content, including:

  • a review of related research
  • the identification of a range of problems and challenges in the use of blog-like content
  • definitions
  • data models of content
  • process models of the generation of content
  • models of threats to the quality of content
  • criteria for assessing quality
  • heuristics for the online search and subsequent selection of content
  • guidelines for using blog-like content in research
  • software tools

Results

We have published several peer-reviewed papers since the formation of the research programme. These papers: introduce a methodology for qualitatively analysing blog-like content, summarise the benefits of and threats to the use of blog-like content, propose a preliminary set of guidelines for the use of blog-like content, recommend heuristics for the search and post-search selection of blog-like content, and report preliminary findings on the degree to which bloggers cite research on software testing.

Immediate future work

  • Preparing manuscripts on credibility criteria for assessing the quality of blog-like content (this work is based on a literature review and a survey of software engineering practitioners)
  • Developing and evaluating a set of guidelines for the use of blog-like content in software engineering research
  • Refining definitions and data models of blog-like content, and process models of the generation of blog-like content

Longer-term work

We plan to widen the scope of the research programme to consider other kinds of naturally occurring evidence, such as other types of grey literature.

Keywords

social media, blog post, qualitative analysis, case study, survey study, case-survey, systematic review, grey literature, multi-vocal literature review