Anyone Can Do Data Science

The animated film Ratatouille (2007) tells a story about a passion for cooking. Chef Auguste Gusteau, one of the main characters, is known for his famous motto: “Anyone can cook.”

A similar belief is held by the wonderful people behind the Data Science Program (DSP). We believe “anyone can do data science.”

The Dilemma We Faced

Both the business and public sectors need data scientists, a lot of them, and they need them now. However, it takes years to train a data scientist with in-depth skills and experience across a variety of fields, such as statistics, data mining, machine learning and data visualization.

To solve this dilemma, we took a new approach: team training.

First Things First: Where Are The Teachers?

In designing any course in data science, the first challenge we encountered was finding the right lecturers.

Our courses focused on hands-on practical training, not scholarly debates. Therefore, we needed to find people who met the following criteria:

  1. Are familiar with some cutting-edge data kung-fu in a particular field
  2. Have industry experience
  3. Know how to teach effectively

It turned out that Number 3 was the hardest condition to meet. However, through a few rounds of open invitations, we were fortunate enough to recruit several industry veterans well suited for the job: Jerry, Rafe and Johnson.

We gave the lecturers a challenge: If you only have three hours for XYZ, what will you teach? (XYZ can be statistics, algorithms or any other data science-related field.)

We asked these data gurus to work together on course development. Finally, they came back with an innovative answer: the Data Team Training.

Course Planning on the Whiteboard

The Data Team Training

In December 2013, we launched the first course, the Data Team Training, a six-weekend course providing on-the-job training in the basics of data science.

From a highly competitive sign-up list and through a heated discussion, the “admission committee” hand-picked 33 people with a variety of specific skills, including data engineering, data analysis, visual design, and storytelling.

It is noteworthy that women accounted for 24% of the enrollment (8 out of 33). It was not just about gender equality, but also about having the magic of a female touch while framing and solving problems.

By skills and by gender, we divided these professionals into “Delta Force”-style teams. Each team was composed of four types of positions: data hygienist, data analyst, visual designer and campaigner. The team composition was inspired by this great article from Harvard Business Review.

Some of the trainees had advanced degrees in statistics or computer science, some were well-versed in Photoshop, Illustrator or d3.js, and others were seasoned product managers by trade.

In addition to the differences in skills, these professionals came from various industries, including telecommunications, healthcare, finance, insurance, e-commerce, advertising, high-tech manufacturing, research institutions and news media.

To promote the importance of data literacy in public policy, we offered full scholarships to three selected employees of government and non-profit organizations.

Over the course of six weekends, we taught these professionals how to ask the right questions, feel the data, learn some data-oriented skills, and conduct role-play problem-solving group projects.

First, the lecturers walked the trainees through the scientific method, how to “feel the data”, some of the most important concepts in statistics and data mining, and a framework for data visualization. These lectures were coherent and loaded with real-world data from government and business.

Role-Playing Group Projects

Then, the fun part began.

We asked these teams to choose their roles, such as a government agency, a non-profit organization, or maybe an entrepreneurial startup. Then we asked them to design a “minimum viable (data) product” to solve a real-world problem, using open data and any other data source they could manage to get their hands on.

The teams had to survive three brutal rounds of sales pitches to gain support from their fellow trainees. Since they were all working professionals, they had to set up their own team meetings between classes and during the evenings, face-to-face or via Google Hangouts.

The Team Project

The Data Fiesta

The climactic finale of this six-weekend course was what we called the Data Fiesta.

Held on the last day of the course, the Data Fiesta was the time for the teams to shine. The teams had to demonstrate what they had learned and achieved by presenting a data product.

The Data Fiesta was a public event, and the trainees had known it since Day 1.

The design of the event gave the trainees both the motivation and the pressure to exceed expectations, and to do so under a pretty tight deadline.

The results were well received. Over 100 people attended the Data Fiesta. The audience enjoyed the team presentations and got to eat some finger food with the new graduates of the DSP.

A Scene from Data Fiesta

What We Learned

Through the preparations and execution of the first-ever DSP course, the Data Team Training, we have learned two things:

  1. It takes a team to teach teams.
  2. Anyone can do data science.

The DSP Team

Behind the scenes, the Data Science Program is the fruit of a huge team effort. It was brought to life only because we have a fantastic team, composed of always-joking data scientists, innovative marketing talent, and hard-driving administrative staff.

We also proved that, given just enough exposure to a carefully designed education program, anyone, even someone who has trouble with Excel, can do data science and feel proud of it.