  1. Barany, A. ; Damsa, C. (Ed.)
    Regular expression (regex)-based automated qualitative coding helps reduce researchers’ effort in manually coding text data without sacrificing the transparency of the coding process. However, researchers using regex-based approaches struggle with low recall (a high false negative rate) during classifier development. Advanced natural language processing techniques, such as topic modeling, latent semantic analysis, and neural network classification models, help solve this problem in various ways. The latest advance in this direction is the discovery of the so-called “negative reversion set (NRS)”, in which false negative items appear more frequently than in the negative set. This helps regex classifier developers identify missing items more quickly and thus improve classification recall. This paper simulates the use of NRS in real coding scenarios and compares the number of items that must be manually coded under NRS sampling and under random sampling during classifier refinement. Results on one data set with 50,818 items and six associated qualitative codes show that, on average, NRS sampling reduces the required manual coding size by 50% to 63% compared with random sampling.
  2. Barany, A. ; Damsa, C. (Ed.)
    Analysis of policy ecosystems can be challenging due to the volume of documentary and ethnographic data and the complexity of the interactions that define the ecology of such a system. This paper uses climate change adaptation policy as a case study with which to explore the potential for QE methods to model policy ecosystems. Specifically, it analyzes policies and draft policies constructed by three different categories of governmental entity—nations, state and local governments, and tribal governments or Indigenous communities—as well as guidance for policy makers produced by the United Nations Intergovernmental Panel on Climate Change and other international agencies, as a first step toward mapping the ecology of climate change adaptation policy. This case study is then used to reflect on the strengths of QE methods for analyzing policy ecosystems and areas of opportunity for further theoretical and methodological development. 
  3. Barany, A. ; Damsa, C. (Ed.)
    In quantitative ethnography (QE) studies, which often involve large datasets that cannot be entirely hand-coded by human raters, researchers have used supervised machine learning approaches to develop automated classifiers. However, QE researchers are rightly concerned with the amount of human coding that may be required to develop classifiers that achieve the high levels of accuracy that QE studies typically require. In this study, we compare a neural network, a powerful traditional supervised learning approach, with nCoder, an active learning technique commonly used in QE studies, to determine which technique requires the least human coding to produce a sufficiently accurate classifier. To do this, we constructed multiple training sets from a large dataset used in prior QE studies and designed a Monte Carlo simulation to test the performance of the two techniques systematically. Our results show that nCoder can achieve high predictive accuracy with significantly less human-coded data than a neural network.
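
The recall problem described in the first abstract can be illustrated with a minimal Python sketch. It is not the NRS procedure from that paper, which the abstract does not specify; it only shows how a regex classifier's recall and false negative rate might be measured against human codes. The function names, the example texts, the human codes, and both patterns are hypothetical and chosen purely for illustration.

```python
import re


def regex_code(texts, pattern):
    """Apply a regex classifier: an item is coded positive if the pattern matches."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [bool(rx.search(t)) for t in texts]


def recall_and_fnr(predicted, human):
    """Return (recall, false negative rate) relative to human codes.

    Returns (None, None) when the human coder marked no item as positive,
    because recall is undefined in that case.
    """
    tp = sum(p and h for p, h in zip(predicted, human))
    fn = sum((not p) and h for p, h in zip(predicted, human))
    positives = tp + fn
    if positives == 0:
        return None, None
    recall = tp / positives
    return recall, 1.0 - recall


# Hypothetical items and human codes for a made-up code about "data".
texts = [
    "We need more data on this",
    "The dataset is too small",
    "Let's meet tomorrow",
    "I checked the numbers again",
]
human = [True, True, False, True]

# A narrow first-draft pattern misses two positive items (false negatives).
narrow = regex_code(texts, r"\bdata\b")
narrow_recall, narrow_fnr = recall_and_fnr(narrow, human)

# A refined pattern, written after inspecting the missed items, recovers them.
refined = regex_code(texts, r"\bdata(set)?s?\b|\bnumbers?\b")
refined_recall, refined_fnr = recall_and_fnr(refined, human)

print(narrow_recall, narrow_fnr)
print(refined_recall, refined_fnr)
```

In this toy setting the missed items are found by reading every negative item. The sampling strategies compared in the first abstract concern how to locate such items with less manual coding when the data set is too large to inspect in full.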
