Since chatbots have moved into the public realm, there has been interest in evaluating how users interact with them. For example, after analyzing conversations with the chatbot Jabberwacky, one study found that conversation topics and styles varied widely (De Angeli & Brahnam, 2008). Users displayed attitudes ranging from pleasant to nasty and derogatory, and they often switched styles and personalities during a conversation with the chatbot. In one experiment, users continued to abuse chatbots longer than they would abuse another human. Some of these reactions did not appear to be driven by specific reasons.
Other studies have identified specific reasons for user disenchantment with chatbots. One research team conducted an observational study with an ELIZA-style chatbot, using a systematic but subjective evaluation (Kirakowski, O’Donnell, & Yiu, 2007). Fourteen college students each interacted with the chatbot for 3 minutes, after which they were given a copy of their conversation transcript and asked to identify unnatural examples. In essence, participants compared the human-chatbot interaction to human-human interaction, identifying both general differences in interaction and specific discrepancies: “maintenance of themes, failure to respond to a question, appropriately responding to social cues (questions), use of formal or colloquial language, greetings and personality, offers a cue, phrases delivered at inappropriate times, damage control” (Kirakowski et al., 2007). Other users may find a tool unsatisfactory because it does not answer questions accurately (Abu Shawar & Atwell, 2005).
The specific reasons cited in these studies give researchers a focus for their efforts to improve human-chatbot interaction. The following section presents different ways in which developers are attempting to build better chatbots.