Planning. Pre-production preparation consisted of analyzing what was relevant for the class and designing the content to fit the length restrictions. A comprehensive script was drafted that followed a step-by-step approach in which the details of each instance were discussed extensively to fit the existing content knowledge of all students. However, due to time constraints, the narrator did not reflect on the material or adjust to the requirements of the script. The narrator drafted the presented material quickly, and the style of teaching in the video mirrored that of in-class education, which has been shown to result in lower engagement than educational videos that are planned with video production in mind (Guo et al. 2014). For example, the nature of the shot material made it impossible to segment the video further into sub-segments, which decreased the number of event boundaries and thus reduced the possibility of mental structuring (Zacks et al. 2007). In addition, the material was not reviewed after the first take, which further decreased the quality of the shot material. This was expected to be reflected negatively in the student feedback.

Method. A modified version of the talking head method (Guo et al. 2014) was used with a green screen enhancement. The talking head method was deemed suitable because student engagement has been shown to be high with this method. The green screen allowed the post-production editor to choose the type of background and graphics that would support learning efficiency while minimizing distractions. Minimizing background distractions reduced extraneous cognitive load (Van Merrienboer et al. 2002) and directed student attention to the teacher and the presented graphics, which is in line with the coherence effect, in which better transfer is achieved when extraneous material is excluded (Clark and Mayer 2008). The green screen method retains the personalization benefits of the talking head method, with the added bonus of not having to switch between presentation methods, as graphics can be presented naturally near the narrator. This is in line with the spatial contiguity effect (Clark and Mayer 2008), according to which graphics and text should be placed near one another to reduce unnecessary scanning. While having a green screen background and not presenting a tight shot of the narrator’s head increases the formality of the video’s feel, it was speculated that the benefits of being able to see the lecturer, and thus eliminating the need to switch between headshots and graphics content, outweighed the negative effects. As stated by Guo et al. (2014), a picture-in-picture view might work better than constant switching. Because the narrator was discussing specific, complicated topics that would have made little sense without constant graphical support, a tight shot was not perceived to be the most efficient method.
This modified talking head method could have been further enhanced by having the narrator interact with the objects shown next to him to provide a sense of interaction (e.g. weatherman style), but due to time constraints this was not possible, and the graphics were simply added afterwards to support the narration.

Control. Control was provided through micro-level activities (Merkt et al. 2011) in the form of the basic control functions of the YouTube media player. This required no pre-training, as the target group can generally be assumed to be familiar with these functions.

Segmentation. The topic discussion video was the first part of a two-part complementary video series whose parts were distinctly divided from one another, which is in line with recommended segmenting (e.g. Clark and Mayer 2008). The topic discussion video alone had a duration of four minutes and forty-five seconds, which is in line with the recommended duration (Guo et al. 2014).

Visual elements. The narrator in the topic discussion video was not instructed to show enthusiasm (Guo et al. 2014) or non-verbal immediacy (Houser et al. 2007), but to act naturally. Instructing the narrator to behave in a manner that encompassed these features was not feasible, as it would have taken a significant amount of preparation and possibly scripted narration. However, the absence of these features was assessed in the questionnaire and was anticipated to be reflected negatively in the student feedback, which would confirm the need to take them into consideration when creating educational videos.

Graphics were used to support the narrator. Relational graphics (Clark and Mayer 2008) were used in the form of a table to depict relationships between quantitative elements (e.g. price and quantity). Text (Clark and Mayer 2008) was used only when necessary to depict complicated formulas that would cause unnecessary cognitive load if provided verbally. Highlighting (Paik and Schraw 2013; Mayer 2005; Atkinson 2002; Craig et al. 2002; Jeung et al. 1997) was applied by increasing the brightness of the discussed parts of a formula or graphic, directing student attention to specific parts of the video and reducing the need to scan for this information. Schematic animation was used minimally (Cheon et al. 2014; Tversky et al. 2002) to represent drawn motion (Ng et al. 2013) and connections between elements within the video.

Audio. As there was no possibility to prepare the narrator thoroughly for the shooting, only the high audio quality (Reeves and Nass 1996) could be assumed to be perceived as beneficial. However, the audio quality suffered slightly from a noticeable echo in the room, which could not be removed in post-production. It was expected that not applying the audio – narration elements (Appendix 1) to the video would result in lower satisfaction with the video content. In particular, the lack of a conversational style (Clark and Mayer 2008; Beck et al. 1996), a native speaker (Atkinson et al. 2005; Mayer et al. 2003), and enthusiasm (Guo et al. 2014) was expected to be perceived negatively.

Hardware and software details. Video was shot with a Sony ILCE-5100 camera, which provided high-quality Full HD video recording. Audio was captured with a Zoom H5 audio recorder fitted with a Zoom SGH-6 Shotgun Microphone Capsule, which provided professional-quality narration recording. A standard three-point lighting setup with three YongNuo YN-600 LED lights was used to illuminate the narrator. Post-processing was done with two video editing programs: Adobe Premiere was used to cut and combine video and audio, and Adobe After Effects was used to key out the green screen and add graphics. The post-processing was done by the author, who has several years of experience with these programs.

During the shooting, it was hard to get the narrator to keep his or her gaze on the camera because the notes were presented behind the camera. Breaking eye contact diminishes the personalization of the experience and makes the video look less professional. Equipment such as teleprompters and Eyedirect (Bloom 2011) could have helped to alleviate this issue.



Atkinson, R. 2002. Optimizing Learning From Examples Using Animated Pedagogical Agents. Journal of Educational Psychology, 94, 416–427.

Atkinson, R., Mayer, R. E., and Merrill, M. 2005. Fostering social agency in multimedia learning: Examining the impact of an animated agent’s voice. Contemporary Educational Psychology, 30(1), 117-139.

Bloom, P. 2011. Eyedirect: Solving The Problem Of Getting People To Look Straight Into The Camera. (website) (29.4.2015).

Cheon, J., Chung, S., Crooks, S., Song, J. and Kim, J. 2014. An Investigation of the Effects of Different Types of Activities during Pauses in a Segmented Instructional Animation. Journal of Educational Technology & Society, 17(2).

Clark, R. C., and Mayer, R. E. 2008. E-Learning and the science of instruction. San Francisco: Pfeiffer.

Craig, S. D., Gholson, B. and Driscoll, D. M. 2002. Animated Pedagogical Agents in Multimedia Educational Environments: Effects of Agent Properties, Picture Features, and Redundancy. Journal of Educational Psychology, 94, 428-434.

Guo, P., Kim, J. and Rubin, R. 2014. How Video Production Affects Student Engagement: An Empirical Study of MOOC Videos. (website) (29.4.2015)

Houser, M., Cowan, R. and West, D. 2007. Investigating a new education frontier: Instructor communication behavior in CD-ROM Texts—Do traditionally positive behaviors translate into this new environment?. Communication Quarterly, 55(1), 19-38.

Jeung, H., Chandler, P. and Sweller, J. 1997. The role of visual indicators in dual sensory mode instruction. Educational Psychology, 17, 329-343.

Mayer, R. 2005. The Cambridge handbook of multimedia learning. Cambridge, U.K.; New York: Cambridge University Press.

Merkt, M., Weigand, S., Heier, A. and Schwan, S. 2011. Learning with videos vs. learning with print: The role of interactive features. Learning and Instruction, 21(6), 687-704.

Ng, H., Kalyuga, S. and Sweller, J. 2013. Reducing transience during animation: a cognitive load perspective. Educational Psychology, 33(7), 755-772.

Paik, E. and Schraw, G. 2013. Learning with Animation and the Illusion of Understanding. Journal of Educational Psychology, 105, 278-289.

Reeves, B. and Nass, C. 1996. The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places. New York: Cambridge University Press.

Tversky, B., Morrison, J. B., and Bétrancourt, M. 2002. Animation: can it facilitate? International Journal of Human Computer Studies, 57, 247-262.

Van Merrienboer, J. J. G., Schuurman, J. G., de Croock, M. B. M. and Paas, F. 2002. Directing learners’ attention during training: effects on cognitive load, transfer test performance and training efficiency. Learning and Instruction, 12, 11-37.

Zacks, J. M., Speer, N. K., Swallow, K. M., Braver, T. S., and Reynolds, J. R. 2007. Event perception: A mind–brain perspective. Psychological Bulletin, 133, 273–293.