By Bob Violino
Contributing writer, InfoWorld
Data science involves using scientific methods, algorithms, and systems to extract insights from structured and unstructured data. As a discipline, data science synthesizes mathematics, statistics, computer science, domain knowledge, and other inputs to analyze events and trends.
In a world gone digital, data scientists are among the most sought-after IT professionals. Fundamentally, a data scientist should be able to write clean code and use statistics to derive insights from data.
According to the career site Indeed.com, data scientists not only combine mathematics and computer science but must understand the industry they serve. Data scientists use unstructured data to produce reports and solutions related to their field.
According to Indeed, data scientists should be familiar with cloud computing, statistics, advanced mathematics, machine learning, data visualization tools, query languages, and database management. The ability to program with Python and R is generally expected.
Career site Glassdoor says the estimated total pay for a data scientist is $120,660 per year in the United States, with an average annual salary of $99,672.
The staffing firm Robert Half notes that landing a job in data science, particularly at the entry level, is not an insurmountable challenge. Despite recent cutbacks, recruiting in the technology sector remains active, with IT employers hiring at or beyond pre-pandemic levels.
“As businesses accelerate their digital transformation, data scientists are needed across all major business sectors—from technology and manufacturing to financial services and healthcare—as well as organizations in academia, government, and the nonprofit sector,” says Robert Half. “That’s because organizations of all types need to turn numbers into recommended strategies and actions.”
To find out what’s involved in becoming a data scientist, we spoke with Daryl Kang, data scientist at mobility-as-a-service provider Uber Technologies.
Kang earned a Bachelor of Arts degree from the University of California, Los Angeles, where he majored in business economics with a minor in accounting. “I was a first-generation college student,” he says. “I graduated summa cum laude in 2.5 years, which allowed me the financial wherewithal to pursue graduate school.”
Kang went on to pursue a Master of Science degree in data science at Columbia University. Qualifying for the data science program required a foundation in math, probability, statistics, and computer science.
“I was originally motivated to pursue a career in banking and finance,” Kang says. “Having graduated with a degree in economics, I had assumed this to be the most natural career path.”
However, during a gap year after finishing college, Kang had the opportunity to work on personal projects that aligned with his passions. “I was motivated to major in economics after being inspired by the book, Freakonomics,” he says. “It showed me the power of data in answering questions that were universally applicable to any field.”
Around this time, Kang also discovered a passion for programming, after “running into the ceiling of what was possible with Excel,” he says. He devoted several months to learning how to program through free online courses.
“This set me on a clear path to eventually discovering the field of data science, and with it the clarity of recognizing it as a continuation of my passion for economics,” Kang says. “At this point, I was determined to pursue my graduate studies in data science to make the career switch.”
Growing up in Malaysia, Kang says he experienced a strict public education system, “where discipline was a key value that was instilled in me. This definitely set the stage for building a strong work ethic that helped in my data science career, since the role can be demanding.”
In addition, Kang’s experience in a liberal arts program at UCLA helped foster a sense of appreciation for other fields of study, and a general desire for learning. “This gave me the discipline, but more importantly the passion, to pursue continuous learning that is essential to keeping up with the field of data science,” he says.
Kang also notes that starting from a non-technical background helps him empathize with non-technical stakeholders, an ability he draws on to communicate effectively in his role.
Kang’s first exposure to working in data science came in an internship with the entertainment company Viacom (now Paramount). He spent seven months working as a data scientist intern. “This was my first real experience with data science in the industry,” he says. “I worked on predicting box office revenues.”
The experience was instrumental in helping Kang bridge the gap between academia and industry. He was able to identify the gaps in his skill sets that he would need to close in order to succeed in applied data science, he says.
In 2018, Kang joined the media company Forbes as a data scientist, focusing mainly on building recommendation systems. One example was a system that recommends trending news articles to writers in the newsroom.
“There was a heavy emphasis on back-end engineering, and it gave me an opportunity to better improve my software engineering skills,” Kang says. “It was also an opportunity to experience the end-to-end lifecycle of delivering a data product, from setting up the back-end infrastructure, to parsing insights from the data, to surfacing those insights to the end user.”
To be effective in his role at Forbes, Kang needed to have a solid grounding in Python and software architecture.
After about three years at the company, Kang joined Uber as a data scientist in a role heavily focused on product analytics. “I worked specifically on merchant growth and acquisition. This meant that the deliverables were focused more on informing business decisions and making product recommendations.” Kang notes that data engineering was also a significant part of the role. “Data from a multitude of sources had to be consolidated to properly communicate the state of the business.”
At Uber, Kang says he has had to be well-versed in experiment design, “which forms a core part of Uber’s principles in making data-driven decisions.”
“Meetings, unsurprisingly, are a key part of the week,” Kang says. “These are opportunities to deliver reports, presentations, and build empathy for stakeholders.” Oftentimes these stakeholders are product managers, though it is not uncommon to collaborate with other job functions such as user experience researchers, product designers, or engineers.
“Depending on the projects at hand, the rest of the time could be spent doing analytics—for example running descriptive analytics to prepare a monthly performance report or diagnostic analysis to investigate a change in a metric—crafting presentations, or more specifically defining the narrative and arriving at recommendations,” Kang says.
“One of my favorite memories from my time at Forbes was from mentoring a team of graduate students through their capstone project as part of an industry outreach program,” Kang says. “It was refreshing to play the role of mentor for the first time, and it was as much a learning experience for me as it was for the students. That the team also won first place in the end-of-semester capstone showcase competition was just the icing on the cake.”
“Fortune favors the bold,” Kang says. “Many things seem insurmountable at the onset but will ease with time and repetition. Also, it’s important to know the difference between a positive and negative challenge. Quitting the wrong pursuits enables us to focus on the things that matter.”
Practically speaking, Kang recommends that anyone interested in data science start by learning Python and statistics. “If you’re undeterred and curious enough, you will naturally fall into the fields of data science and machine learning next.”
Copyright © 2022 IDG Communications, Inc.