It’s about time we stop using the term Citizen Data Science.

Amateur. Interim. Acting. Citizen. These are all terms that imply someone only has temporary control until the “real people” who know what they are doing take over. Today’s data-driven culture has produced great demand for people to fill open data science positions. Numerous software vendors and startups are working to improve the data ecosystem and lower the entry point for data scientists, so the background and skill set required for these jobs have come down in recent years as technology has advanced. The data science community should encourage more individuals to get better training in statistics and computer science techniques. Let’s stop using the term Citizen Data Science; it alienates those who want to get into this field.

Given that the data science field no longer requires a PhD, some researchers are labeling candidates without one “citizen data scientists” because they understand the applications, but not the underlying math and statistics. This is an unfair label, because technology continues to automate processes that previously were manual.

For example, software engineers create software with other software, like Visual Studio or a text editor. Their tools convert lines of code into machine code (i.e. 1’s and 0’s), run it and show the output. How many engineers actually know machine code well enough to create great new applications? They may have learned basic machine language in school but not necessarily how to write applications in it. It can be done, but it would take a person years to complete what a computer does in seconds. So is it fair to call them citizen software engineers because they don’t write machine code by hand?

The best part about technology is that it’s constantly evolving to improve our lives every day, often at an overwhelming pace. If technology weren’t designed to produce advancements that make our lives easier, we would all be using command line tools and writing in Fortran, LISP and COBOL. No thank you.

Part of this explosive growth has been powered by cloud infrastructure vendors like Amazon Web Services, Microsoft Azure, Rackspace, and many others. Just ten or fifteen years ago, nearly every company was still on-premises, running expensive data centers and paying through the nose for ERP software. In addition, these companies had to hire many full-time employees just to keep operations running. Barely a decade ago, the software, hardware and personnel needed to keep these systems running likely cost tens of millions of dollars just to get started. Now, with cloud hosting infrastructure, companies can start operations with thousands of dollars, not millions.

Netflix, for example, which is hosted on Amazon Web Services, accounts for roughly 35% of peak downstream internet traffic in North America by some estimates and doesn’t even have its own data center. Netflix does employ thousands of infrastructure engineers and developers to sustain its operation, but it does not own any hardware or expensive software like Oracle to keep its operations running. Yury Izrailevsky, the company’s VP of cloud and platform engineering, wrote in a blog post, “Supporting such rapid growth would have been extremely difficult out of our own data centers; we simply could not have racked the servers fast enough.” So if they don’t manage their own hardware technology, should we call them a citizen cloud application?

Of course not. Netflix made the move to AWS because of its lower cost, higher efficiency and unparalleled scale. Netflix could host its own applications, but the limits of its own data centers would hold back the growth of the business. The exact same thing can be said about the term Citizen Data Science. What were archaic workflows ten years ago are now automated by advances in technology. So why do we punish new entrants in this field with a terrible title like Citizen Data Scientist?

Data Science is a field, not a position. So it’s time to put a stop to the effort to label the quasi-data scientist. If individuals aren’t well versed in statistics but are in charge of finding trends in data for their company, let’s call them Data Analysts. Heck, even the term Business Analyst is better than Data Scientist if they don’t have the skill set. Technology will continue to improve workflows, making the jobs we do today obsolete.

Guess how many open jobs there are with the title “Citizen Data Scientist” on LinkedIn Jobs, Indeed, ZipRecruiter, The Ladders and Glassdoor? None. Why do you think that is? Because no one wants to be labeled with such a condescending title. And no company wants to hire someone who doesn’t know what they are doing.

With MOOC platforms like Udemy, Udacity, edX and Coursera, people (myself included) from around the globe are learning how to improve the world around us using data. A worldwide movement is underway, and it’s very exciting to be a part of. So let’s make everyone feel involved. The fact that data scientists are no longer required to have an advanced degree will benefit the industry rather than harm it.

The data science industry will continue to evolve, and it needs more men and women to apply their passion for data and gain the skills to become data scientists. Using the term Citizen Data Scientist discourages people from pursuing a career in data science, which can hurt the development of our data-driven culture. The data science field should value diversity in both quantitative and qualitative acumen. We could have used the terms citizen computer scientists or citizen statisticians. But parts of several industries came together to form a new field, data science. So for a field that itself developed outside of a traditional education track, it’s about time we stop using this condescending, inappropriate, and hypocritical label.
