The majority of current chatbots use a rule-based approach to interact with the user. To imitate human conversation, developers build chatbots by combining a linguistic model with computational algorithms. Depending on the chatbot, the rules can be based on simple textual pattern matching or on more complex inference mechanisms (De Angeli & Brahnam, 2008).
A chatbot analyzes the user's input and sends it to a dialog manager, which enables the chatbot to return contextual information to the user. Behind such dialog managers there generally lies a complex rule-based system (Gandhe & Traum, 2007). Chatbots treat utterances as sequences of tokens (sentences divided into parts), and response-selection methods are applied to these tokens, such as random bot, nearest context, segmented-nearest context, and segmented-random. A random bot replies with a random utterance from a list of replies (Gandhe & Traum, 2007). A nearest-context chatbot possesses local coherence: rather than choosing a random reply, it uses a vector space model to choose the reply whose context is nearest to the current ongoing dialog. A segmented-nearest context chatbot tries to use the broader context of the dialog as well as local coherence. A segmented-random chatbot uses only global coherence: it selects randomly among the utterances that match the broader context of the dialog.
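The difference between the random bot and the nearest-context method described above can be illustrated with a minimal Python sketch. It is not the implementation of Gandhe and Traum (2007): the toy corpus, the bag-of-words representation, the cosine similarity measure, and all function names are assumptions chosen only to show the idea of selecting a reply by proximity in a vector space.

```python
import math
import random
from collections import Counter

# Hypothetical toy corpus of (context, reply) pairs; not taken from the cited work.
CORPUS = [
    ("hello there how are you", "I am fine, thanks. How are you?"),
    ("what is the weather like today", "It looks sunny outside."),
    ("can you recommend a good book", "You might enjoy a classic novel."),
]


def bag_of_words(text):
    """Represent text as a sparse term-frequency vector (token -> count)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def random_bot(rng=random):
    """Reply with a random utterance, ignoring the dialog so far."""
    return rng.choice(CORPUS)[1]


def nearest_context_bot(dialog_context):
    """Reply with the utterance whose stored context is nearest to the dialog."""
    query = bag_of_words(dialog_context)
    best = max(CORPUS, key=lambda pair: cosine(query, bag_of_words(pair[0])))
    return best[1]


if __name__ == "__main__":
    print(random_bot())
    print(nearest_context_bot("how is the weather today"))
```

In this sketch the random bot can return any stored reply, whereas the nearest-context bot returns the reply attached to the weather context, because that context shares the most terms with the query.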
Pattern matching in chatbots means that the chatbot determines the most relevant keywords in the last phrase entered by the user and generates a response according to “if-then-else” rules (Zdravkova, 2000). This ability can be enhanced by repeating slightly modified replies and by including statements that define the direction of the conversation with the user (Zdravkova, 2000).
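Keyword-driven “if-then-else” rules of this kind can be sketched in a few lines of Python. The rule table, the default reply, and the function names below are hypothetical and serve only as an illustration; they are not taken from Zdravkova (2000).

```python
import re

# Hypothetical keyword rules, ordered by priority; the first matching rule wins.
RULES = [
    ({"hello", "hi"}, "Hello! What would you like to talk about?"),
    ({"mother", "father", "family"}, "Tell me more about your family."),
    ({"sad", "unhappy"}, "Why do you feel that way?"),
]

# Fallback statement that keeps the conversation going when no keyword matches.
DEFAULT_REPLY = "Please go on."


def extract_keywords(utterance):
    """Lowercase the utterance and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", utterance.lower()))


def reply(utterance):
    """Apply the rules in order: if a keyword is present, then answer, else continue."""
    words = extract_keywords(utterance)
    for keywords, response in RULES:
        if words & keywords:
            return response
    return DEFAULT_REPLY


if __name__ == "__main__":
    print(reply("I had an argument with my mother"))
    print(reply("The sky is blue"))
```

Matching on whole tokens rather than substrings avoids false hits such as finding "hi" inside "this". The default reply plays the role of a statement that steers the conversation when no rule applies.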
AIML (Artificial Intelligence Markup Language) files contain the rules for chatbots (A.L.I.C.E. foundation website, 2009). AIML, first used with ALICE, is an XML-based language. It allows the use of categories and topics, and has iterative capabilities and logic functionality (Shawar et al., 2007). An AIML file consists of data objects made up of units called topics and categories. Categories are the lowest-level units and act as pattern-matching rules: each pairs a pattern that is matched against the user input with a template that is used to generate the response. Topics are optional top-level units that group a related set of categories.
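The relationship between topics, categories, patterns, and templates can be made concrete with a small example. The sketch below embeds a hypothetical AIML fragment in a Python string, reads it with the standard-library XML parser, and matches user input against the patterns. It is a deliberate simplification and not a full AIML interpreter: only the `*` wildcard is handled, templates are treated as plain text, and the priority given to the active topic is an assumption made for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical AIML fragment: two topic-less categories and one inside a topic.
AIML_SOURCE = """
<aiml version="1.0.1">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>MY NAME IS *</pattern>
    <template>Nice to meet you.</template>
  </category>
  <topic name="WEATHER">
    <category>
      <pattern>IS IT RAINING</pattern>
      <template>I cannot see outside, sorry.</template>
    </category>
  </topic>
</aiml>
"""


def load_categories(source):
    """Return a list of (topic, pattern, template) rules from AIML text."""
    root = ET.fromstring(source)
    rules = []
    for category in root.findall("category"):
        rules.append((None, category.findtext("pattern"), category.findtext("template")))
    for topic in root.findall("topic"):
        for category in topic.findall("category"):
            rules.append(
                (topic.get("name"), category.findtext("pattern"), category.findtext("template"))
            )
    return rules


def normalize(utterance):
    """Uppercase the input and drop punctuation, as AIML patterns are uppercase."""
    return " ".join(re.findall(r"[A-Z0-9']+", utterance.upper()))


def pattern_to_regex(pattern):
    """Translate a pattern into a regex; '*' stands for one or more words."""
    parts = [r".+" if token == "*" else re.escape(token) for token in pattern.split()]
    return re.compile(" ".join(parts))


def respond(utterance, rules, topic=None):
    """Try categories of the active topic first, then topic-less categories."""
    text = normalize(utterance)
    for wanted in (topic, None):
        for rule_topic, pattern, template in rules:
            if rule_topic == wanted and pattern_to_regex(pattern).fullmatch(text):
                return template
    return None


if __name__ == "__main__":
    rules = load_categories(AIML_SOURCE)
    print(respond("Hello!", rules))
    print(respond("Is it raining?", rules, topic="WEATHER"))
```

In this sketch a category placed inside a topic is considered only while that topic is active, which shows how topics group related categories, while categories outside any topic remain available throughout the dialog.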