Literature Review


Previous research has defined multimedia and interactivity in a variety of ways, and the relationship between the two has been debated at length over the years. The first portion of this paper will analyze numerous definitions and identify the most complete definition of each term. Beyond classifying multimedia and interactivity, it is also important to note the fields of study in which researchers have applied these concepts, as well as what they found. Thus, the latter portion will analyze past research conducted in the fields of journalism, marketing, academia, and advertising, respectively.

Multimedia Defined

Brunye, Taylor, Rapp, and Spiro (2007) defined multimedia as any presentation that utilizes multiple formats in either one or multiple sensory modalities (visual, auditory, etc.). Mayer and Moreno (2003) considered multimedia to be the combination of words and pictures, where the pictures can be either static (illustrations, graphs, charts, etc.) or dynamic (animation, video, interactive illustrations, etc.). It is important to note, however, that some multimedia presentations contain no text at all, but rather a combination of two other formats (an audio slide show without captions, for example). Therefore, while Mayer and Moreno have a more narrowly defined idea of multimedia, the broader definition from Brunye et al. successfully includes non-traditional, non-text presentations and is therefore the most complete definition to date.

Interactivity Defined

Shedroff (1994) fleshed out the differences between passive and interactive media in terms of levels of interactivity. He envisioned a continuum of interactivity, with passive experiences on the far left and interactive ones on the far right. Ultimately, Shedroff argued that the difference, which in turn defines interactivity, is how much control is given to the audience and the amount of choice that control provides. His considerations for interactivity included feedback, control, (co-) creativity, productivity, communications, and adaptivity.

Building on a similar concept of user control, Wise and Reeves (2007) used the terms passive and active media, defining passive media as a user’s reaction to content controlled by someone else, and active media as the manipulation of content controlled by the user’s deliberate actions.

Yun (2007) referenced and built upon a similar definition from two separate researchers, Andersen and Sims, who in 1997 both defined interactivity in terms of the user’s sense of control. Yun also noted an earlier definition of interactivity from Sheizaf Rafaeli, who in 1988 characterized interactive communication as sequential messages responding to an initial message, otherwise known as feedback in Shedroff’s model. Yun continued by stating that while multimedia, or multimodal, presentations can give the impression of being interactive, they normally lack the core characteristics of interactivity (namely active involvement, nonlinear story structure, and role taking).

Chung (2008) and others distinguished interactivity from interaction (the traditional relationship between two or more people), and defined it as either medium interactivity (user-to-system, U2S) or human interactivity (user-to-user, U2U). She argued that the presence of a technology channel in the U2S and U2U models distinguishes interactivity from traditional interaction. Chung continued by organizing interactivity by content, dividing features into three tiers: low (audio and video downloads, photo galleries, etc.), middle (content submission features, polls, etc.), and high (e-mail links, message boards, chat features, etc.).

Deuze (2003) also focused on content when defining interactivity, but from more of a user interface (UI) mindset. He used the terms navigational (hyperlinks, menu bars, etc.), adaptive (content submission tools, polls, etc.), and functional (message boards, chat features, etc.) to describe different types of interactivity. It can be argued that Chung’s and Deuze’s definitions are interchangeable: low-navigational, middle-adaptive, and high-functional. Furthermore, Yun (2007) divided his definition of interactivity into two sub-categories: human-to-human and human-to-computer. These terms can also be considered analogous to Chung’s usage of human interactivity and medium interactivity, respectively. Liu and Shrum (2002) included a third sub-category, user-to-message (U2M), which is related to Shedroff’s (1994) perspective in that it is the ability of the human to control and manipulate the message, or content.

Clearly, there are many competing definitions of interactivity. Liu and Shrum noted this inconsistency and the importance of creating a universal definition for use in future analysis on the subject. They worked to combine the three traditional sectors of interactivity (U2S, U2U, and U2M) to create a three-dimensional construct of the term. Specifically, their definition states interactivity to be “the degree to which two or more communication parties can act on each other, on the communication method, and on the messages and the degree to which such influences are synchronized” (Liu and Shrum, 2002, p. 54). Moreover, they specified three dimensions of interactivity: active control, two-way communication, and synchronicity. Active control is similar to Shedroff’s 1994 definition of user control and how it directly influences the experience. Two-way communication refers to the capacity for reciprocal communication seen in the U2U and U2M models, but in the U2M model it refers to the companies behind the message rather than just the message. Lastly, synchronicity is characterized by the time lapse between the input and the response, and whether or not the two can be considered nearly simultaneous. Because it incorporates not only the three types of interactivity afforded to the user, but also the three means by which users can interact, the researcher believes that Liu and Shrum’s definition is the most complete to date.

Multimedia in Journalism

In 2004, researchers Steve Outing and Laura Ruel used eye tracking to study information recall in text and multimedia stories. Specifically, they wanted to analyze the difference in user retention between a text story and a multimedia package. Therefore, they utilized two stories from The New York Times, each of which was presented in two formats, one for print (text) and one for online (multimedia). Outing and Ruel found that, on average, people were more likely to correctly answer questions regarding names and places if they had viewed the text version. On the other hand, test subjects who had viewed the multimedia presentation were more likely to correctly recall information regarding a confusing process or procedure. While this study setting is similar in scope to mine, they chose to compare and contrast a non-multimedia format with a multimedia format. In my case, I aim to analyze two multimedia formats, one with passive elements and the other with interactive capabilities.

Two years earlier, Ruel conducted a similar eye tracking study with Nora Paul, analyzing differences in information recall, stickiness (time spent on the site), and satisfaction between users viewing a static HTML site and users viewing an interactive Flash site (Ruel and Paul, 2002). They found that Flash users spent on average two minutes longer on the site than HTML users, and found their experience to be more enjoyable than the latter group did. Furthermore, Flash users correctly answered a higher number of unaided recall questions regarding the content, whereas there was no significant difference between the groups on aided recall questions. Again, while an HTML site may be considered passive and a Flash site considered interactive, they are not considered multimedia in and of themselves, which is the main difference between this study and mine.

Chung (2008) researched patterns within the journalism sector amongst “engaged” readers, defined as those who utilize the interactive features on news organizations’ Web sites. Chung studied the use of a variety of U2S and U2U interactive features, such as polls, e-mail links, hyperlinks, download options, chat features, etc., and found that specific user characteristics largely determine the usage of interactive features. Specifically, gender, perceived Internet skill, and perceived credibility of online news were all positive predictors of interactive usage. It is important to note that Chung’s findings are based on readers of a small, Midwestern U.S. city newspaper who opted in to the study from an online advertisement. This study pool may therefore be completely different from a randomized study set in a different location, such as an urban area or university setting. The study was also conducted in 2005, which may explain why she generalized her findings to assume that online audiences are not using interactive features extensively. Further research needs to be done within different study settings to verify whether Chung’s findings are accurate half a decade later across all engaged media consumers.

While Chung researched interactivity within journalism on a broad level, other researchers focused on certain interactive elements, such as the use of hyperlinks (Ruel and Wojdynski, 2009), pictures (Wise and Reeves, 2007), and audio slide shows (Ruel, Holman and Wojdynski, 2009).

Deuze (2003) categorized hyperlinks as a type of navigational interactivity, which was the focus of Ruel and Wojdynski’s 2009 study. While they analyzed a variety of data sets, of particular interest to my research study is their evaluation of user attitudes toward the website (user satisfaction levels) and story recall (retention rates). While they found that a greater number of links led to more time spent on the site, there was no significant correlation between hyperlink density and user attitude or story recall. Therefore, the data suggest that navigational interactivity does not affect user satisfaction levels or retention rates. This finding supports my decision not to distinguish interactive from passive multimedia solely based on the presence of playback control, hyperlinks, and other limited forms of navigational interactivity.

Wise and Reeves (2007) utilized the notion of a user reacting to media versus controlling its onset to study the cognitive and emotional processing of pictures. They found that physiological arousal was greater when subjects clicked through the images (active interaction, following Shedroff’s 1994 stipulation of user control). However, subjects gave higher arousal ratings for pictures when the computer controlled their onset (passive interaction). They offered several hypotheses for these seemingly contradictory findings, such as the possibility that the task of pointing and clicking increases physiological arousal, while the inherent knowledge of user control diminishes the user’s surprise and therefore negatively affects their arousal ratings of the content. One notable section of the article highlights the need for future research to address how interacting with media affects memory, which is one of the two conditions analyzed in this qualitative study. They hypothesize that active interaction may result in worse memory, since they found that computer onset (passive media) elicits orienting (the brain’s ability to mobilize cognitive resources for potential action), which results in the encoding of stimuli and helps to improve memory recall (which corresponds to the researcher’s hypothesis).

Ruel, Holman and Wojdynski (2009) evaluated four conditions of audio slide shows across a variety of dependent measures, of which attitudes, content recall, and perceived involvement are of particular interest. While I defined an audio slide show as a passive media format, Ruel et al. added limited interactivity to two of the four conditions, including chapter tabs (Deuze’s 2003 navigational interactivity definition) and a scrubber bar to control the playback (Shedroff’s 1994 interactivity definition). They did not find any significant effects for the presence of either interactive component in relation to the user’s attitude toward the website. This goes against my hypothesis that interactive media will heighten user satisfaction levels (if user satisfaction can be treated as interchangeable with user attitude). Furthermore, their data suggest that users had more correct recollections of the passive media conditions than the interactive ones, which also contradicts my hypothesis. It is important to note, however, that I aim to compare and contrast a wide range of multimedia stimuli, including sites on the opposite ends of Shedroff’s spectrum (ranging from 1 to 4 in Figure 1), whereas the four conditions identified in this study are nearly replicas of one another with small variations to include the most basic of interactive elements (ranging from 2 to 3 in Figure 1). Therefore, it will be interesting to compare my data to theirs when taking this difference into account.

Multimedia in Marketing

Brunye, Taylor, Rapp, and Spiro (2007) analyzed differences in retention rates (specifically, order verification, free recall, and format recall) of users who used a text-only, picture-only, or multimedia (defined as text-and-picture) instructional assembly pamphlet. They discovered that test subjects using the multimedia pamphlet consistently showed better recall and higher accuracy. However, they also noted that the majority of multimedia users incorrectly guessed that they were given the picture-only version. Brunye et al. noted that this might be because while the user benefited from reading the text of the multimedia version, he or she only remembered the visual component. This is an important area for further research because it raises the question of whether a user finds a visual stimulus more important to his or her learning capability than text, or whether an accurate and complete definition of “multimedia” is not widely known amongst end users.

The researchers also posed the question of whether multimedia was more effective due to the repetition provided by combining a complementary text description and picture. They tried to control for this difference by providing two copies of the text description on the text-only version, and two copies of the photo on the picture-only version. While this is an interesting technique, it could be argued that it was done in vain because a user’s brain may immediately recognize the duplication and disregard the second copy. By contrast, the combination of a photo and text description is not visibly identical regardless of the content contained within each medium, and therefore the user’s brain would not register duplication of content. This issue is of importance in that one must distinguish between multimedia in which several formats are combined to display identical information and multimedia in which several formats are combined to display supplemental information. For instance, some multimedia presentations need to be viewed as a whole to understand the message (supplemental), while others could be split apart as stand-alone elements where interacting with one or the other would be suitable (identical).

Multimedia in Academia

Mayer and Moreno (2003) researched solutions to reduce cognitive load in multimedia learning environments through a series of studies. My analysis of retention rates draws specifically from this study, and generally from the education sector, as “meaningful learning” is defined in this article as being assessed through retention tests (recall what was presented) and transfer tests (apply the newfound knowledge to solve another problem). Mayer and Moreno constructed a cognitive theory of multimedia learning, in which multimedia requires the use of two separate sensory channels – the auditory/verbal channel and visual/pictorial channel – but the processing capacity available in each channel is limited. Furthermore, multimedia learning requires substantial cognitive processing in both channels, as well as in working memory, and therefore could be prone to cognitive overload. One of their findings, the redundancy effect, states that narration alone is more effective than narration duplicated by on-screen text. This is interesting in comparison to the research method of Brunye et al., who purposely duplicated content on the text-only and picture-only versions to mimic the “duplication” from combining the text with the image. However, Mayer and Moreno might argue that this led to an increased cognitive load and negatively affected the data collected from the two single-format versions. Thus, further research needs to be done regarding the debate over duplicated versus complementary multimedia elements and their effect on learning.

Multimedia in Advertising

Liu and Shrum (2002) analyzed the effectiveness of interactivity in advertising campaigns. They found that interactivity enhances users’ self-efficacy, which results in better learning from their elevated confidence in themselves and enhanced motivation to learn. Furthermore, they noted that the user control evident in interactivity improves user satisfaction since the user has a direct impact on the experience. While they found overarching advantages for the use of interactivity, they also noted two personal variables that might negatively impact a user’s interactive experience. First, each person has a unique desire for control, and those who prefer little control will most likely feel uncomfortable in situations where they are afforded too much freedom. On the other hand, users with a high desire for control tend to show signs of depression when they are unable to control events in their lives, and would therefore have high satisfaction rates from interactive experiences. Thus, satisfaction levels should arguably vary depending on the user’s personality. Second, Liu and Shrum included computer-mediated communication apprehension (CMCA) as a trait that will greatly impact satisfaction levels in response to interactive presentations. Those who have high levels of CMCA, they argue, tend to avoid interaction and are less likely to enjoy such experiences. This demographic would include novice users of technology as well as those who deal with general communication anxiety, regardless of communication method. Chung’s finding that perceived Internet skill enhances the interactive experience is also consistent with their suggestion that confidence issues, potentially stemming from CMCA, can hamper results. Therefore, these findings suggest that a pre-test questionnaire needs to gauge personality traits and perceived comfort level with technology in order to weigh these factors against my hypothesis regarding user satisfaction.


As seen from the wide range of definitions and observations regarding multimedia and interactivity, it is important that both terms be clearly defined when analyzing their effectiveness. To date, Brunye et al.’s 2007 definition of multimedia as multiple formats utilizing one or more modalities is arguably the most complete representation. Moreover, Liu and Shrum’s 2002 definition of interactivity as a three-dimensional construct can also be considered the most accurate. Therefore, these two interpretations will be utilized in my research when selecting passive and interactive multimedia stimuli.

Furthermore, the majority of past research analyzing multimedia and interactivity has focused on a single sector: Deuze (2003), Wise and Reeves (2007), Chung (2008), Ruel et al. (2009, 2009) in journalism; Shedroff (1994), Bucy et al. (2007) and Yun (2007) in psychology; Brunye et al. (2006) in marketing; Mayer and Moreno (2003) in academia; Liu and Shrum (2002) and McMillan et al. (2002) in advertising. Therefore, user satisfaction rates and retention levels may differ depending on the context of the presentation and the intentions of the end user. For the purpose of this study, I will build upon the research in the journalism sector.

About the Researcher

Tracy Boyer is an award-winning multimedia technology strategist, specializing in the intersection of digital media and interactive technology. Currently, Tracy is the first MBA/MSIS dual master’s candidate at UNC-Chapel Hill, where she is studying General Management at Kenan-Flagler Business School and Human-Computer Interaction in the School’s Information Science program. Boyer is currently the managing editor of Innovative Interactivity, a widely read multimedia blog that she founded in 2007.

Previously, she reported on malnutrition in Honduras with The Pulitzer Center, was a multimedia producer at, served as the UNC correspondent for and interned with The Atlanta Journal-Constitution. In 2007, she was selected to participate in the Poynter Summer Fellowship. Boyer graduated with a multimedia degree from UNC’s School of Journalism and Mass Communication.
