When thinking about Massive Open Online Courses (MOOCs), it is easy to get caught up in how incredible it is that so much high-quality content is being offered for free to anyone in the world. While free, high-quality educational content is truly a great story, there is another impact just under the surface, in MOOC data, that may prove far more significant in the long run. The ability of MOOC providers to generate massive amounts of useful data from course enrollments ranging from 10,000 to 100,000 students will enable them not only to lower the cost of education, but also to improve outcomes. Via optimization, experimentation and personalization, MOOCs have an opportunity to dramatically improve the quality of education in a way that has not been possible until now.

Big Data can help MOOCs transform education through optimization, experimentation and personalization.

Optimization

Like most tech startups, MOOC providers like Coursera collect data on every action (or inaction) by a student – when a student pauses a video, increases playback speed, answers a quiz question, revises an assignment, or comments in a forum. This microscopic level of data, when collected at the scale that MOOCs operate on, is perhaps the most valuable resource for identifying the root causes of student failure. As Daphne Koller, cofounder of Coursera, points out: “If two students in a university class of a hundred give a wrong answer, you would never notice, but when two thousand people give the same wrong answer, it’s kind of hard to miss.”

The widespread reach of MOOCs makes it possible to optimize courses by recognizing areas where content is insufficient to deliver on learning goals – something that would be impossible to identify in a classroom with 20, or even 200, students. If a test question is frequently answered incorrectly, or if students lose focus at a specific point in the course, data can direct the course creators back to the curriculum to add or modify content so as to remove the “swiss cheese gaps in the foundation,” as Salman Khan of Khan Academy so eloquently puts it. More than anything, data and scale will give teachers and instructors actionable feedback on what is, and what is not, working.
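To make the idea of spotting a shared wrong answer concrete, here is a minimal sketch in Python. The data format (tuples of question ID, chosen answer, and whether it was correct) and the function name `common_wrong_answers` are assumptions made for illustration; they do not reflect how Coursera or Khan Academy actually store or analyze responses.

```python
from collections import Counter

def common_wrong_answers(responses, min_count=2):
    """Return (question_id, answer, count) for wrong answers given at least
    min_count times, most frequent first.

    `responses` is a hypothetical format: (question_id, answer, is_correct).
    """
    counts = Counter(
        (question_id, answer)
        for question_id, answer, is_correct in responses
        if not is_correct
    )
    return [
        (question_id, answer, n)
        for (question_id, answer), n in counts.most_common()
        if n >= min_count
    ]

# Made-up sample data: two students pick the same wrong option on q1.
sample = [
    ("q1", "B", False),
    ("q1", "B", False),
    ("q1", "A", True),
    ("q2", "C", False),
]
flagged = common_wrong_answers(sample, min_count=2)
print(flagged)
```

With real enrollment numbers the threshold would be far higher, and a cluster around one wrong option is a hint that the related lesson may need another look rather than proof that it does.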

Salman Khan of Khan Academy & Daphne Koller of Coursera have access to massive amounts of useful student data.

Experimentation

If you read about online startups, you may have heard of the concept of A/B testing. The concept is simple: rather than have product managers make the final decisions on the best way to achieve a given user goal, it is far more efficient to try multiple versions of the same page, and then let the data show which version works best. This data-driven methodology is only possible when you have substantial scale – hundreds or even thousands of users passing through the different versions of your pages – so that you can be confident that the results are statistically significant. Education, which typically is done in classrooms of 20, or perhaps sections with a few hundred, has not previously offered the scale necessary to attempt this strategy. MOOC data can change that.
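The link between scale and significance can be illustrated with a small sketch. The code below uses a standard two-proportion z-test, which is one common way to compare two versions of a page or lesson; it is not necessarily the method any MOOC provider uses. The function name and the pass counts are invented for illustration.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two success rates.

    Returns (z, p_value). Uses the pooled-proportion standard error.
    """
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("rates are all 0% or all 100%; the test is undefined")
    z = (successes_b / n_b - successes_a / n_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: version A has a 60% pass rate, version B 70%.
# Classroom scale: 20 students per version.
z_small, p_small = two_proportion_z_test(12, 20, 14, 20)
# MOOC scale: 2,000 students per version, same pass rates.
z_large, p_large = two_proportion_z_test(1200, 2000, 1400, 2000)

print(f"20 per version:   z={z_small:.2f}, p={p_small:.3f}")
print(f"2000 per version: z={z_large:.2f}, p={p_large:.2e}")
```

The same ten-point difference is indistinguishable from chance with 20 students per version, but is very unlikely to be chance with 2,000 per version, which is the sense in which scale makes experimentation possible.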

And MOOC providers are taking advantage of this scale to experiment, generating unexpected results. A concrete example: you might expect that students would learn better when lessons were offered in color. However, when Sebastian Thrun – founder of Udacity – A/B tested a color lesson vs. a black and white lesson, he found that “Test results were much better for the black-and-white version… that surprised me.” In this particular case, there could be many theories as to why the black and white version produced better results, but in many ways it doesn’t even matter – data can guide us to the best decisions.

Thrun’s A/B Test at Udacity revealed insightful data about student engagement.

Andrew Ng used A/B testing at Coursera to experiment with email reminders to increase engagement. Originally, Coursera would email students to remind them of upcoming course deadlines. After some testing, Coursera replaced these reminders when it found that emails summarizing students’ recent activity on the site resulted in an engagement boost of “several percentage points.”

Using such experimentation, MOOCs can continuously improve courses, but more importantly, they can improve the entire educational process, and allow data rather than intuition to drive results. Whereas most of the classes you took in college likely changed very slightly from semester to semester, this new data will allow MOOC professors and administrators to evolve their classes on a weekly basis. Although a typical A/B test focuses on a very narrow hypothesis (e.g. blue button vs. green button), over time hundreds of such tests could eventually evolve the very nature of online courses themselves.

Personalization

Anyone who works in analytics online will be able to tell you that no single web experience will work for all users – people are too different, and each of us has unique preferences regarding how we consume content and make decisions. It’s the same with education. Online courses driven by software make it possible to remove the “one size fits all” model and instead deliver content to students in the format and method that most suits the way that they learn.

David Kuntz, Head of Research at Knewton explains: “Many educators believe that different students learn in different ways. Some learn best by reading text, others by watching a demonstration, others by playing a game, and still others by engaging in a dialogue. A student’s ideal mode may change, moreover, at each stage in a course—or even at different times during the day. A video lecture may be best for one lesson, while a written exercise may be best for the next. By monitoring how students interact with the teaching system itself—when they speed up, when they slow down, where they click — a computer can learn to anticipate their needs and deliver material in whatever medium promises to maximize their comprehension and retention.”

This type of personalization is likely some time away, but it’s easy to envision how your past learning experiences could be used to improve the way you are taught in the future. This becomes even more significant when we consider online learning becoming a part of the K-12 education system, where it would be possible to build a learning profile that takes into account many years of your life. In this new model, content and delivery can be better customized for each student, so if one student requires visual aids to learn, whereas another student prefers to read, and another student requires an assignment, each will get what they need in order to learn and succeed.

Personalization can help modify content to suit the different learning needs of individuals.

Big Data, not Video, will Transform Education

It’s easy to think that the most significant aspect of MOOCs is the OOC, the open online courses. However, that misses the bigger picture. Open courseware (OCW) has been around for nearly a decade, and high-quality content has been available for free or at a low cost for much of that time. Content, while important, is only part of the story. Of greater importance is the M in MOOC – the fact that the courses are massive, and offer enough scale to test, optimize and improve educational outcomes in a way that has not been possible before in human history. Just as Amazon has changed how shopping occurs, and Google has built algorithms that know what you are looking for before you even search for it, the dream with online education is that through experimentation and optimization, we will be able to offer personalized educational experiences that are far more effective than anything traditional educational systems have offered before.