Today’s article is about the two main traits a data scientist must have to become a to-die-for data scientist.
What’s a to-die-for data scientist? It’s basically a data scientist that every company wants/NEEDS to have on their team. You are the ultimate data science package: you understand difficult concepts, you know the basic techniques, you can analyze the data from a fresh perspective, and you use your brain in a systematic manner.
You get recruitment emails every day and LinkedIn requests from recruiters every week asking if you’re ‘available to chat’. Everyone seems to want some of your cool analysis action.
So – what are the two traits that you need to become a to-die-for data scientist?
Is it a Degree or Years of Experience? (Spoiler: No.)
In many fields, there’s constantly talk about how you can’t get far without a degree in the field – whether it’s a Bachelor’s, Master’s or even a PhD. You need to know all the fundamentals; you need to know the rules, or rather, the “laws” of the fields.
And sometimes, that is true.
Can you imagine trying to develop a new theory in Physics without an understanding of how things work right now? Bizarre.
Or could you imagine trying to create a medicine for dogs without knowing how to put together the chemical compounds to concoct an effective medicine?
There are many professions where experience and foundational knowledge in the field are vital for you to perform well, but there are also many fields where that just isn’t true.
Fortunately, Data Science is one of them.
Don’t get me wrong: you will 100% benefit from playing around with data sets and teaching yourself standard data science techniques.
It is, of course, super important to know and understand both basic and advanced data science techniques, but it’s not important for the same reasons as for the Chemists and Physicists.
>> You want to be skilled at data science techniques because they are tips and tricks; they are guidelines developed to help you analyze a problem. <<
If you want to analyze something, the real question to ask is “What do I want to know?” Either you have a concrete answer, in which case you know exactly what to look for, or you say, “I’m not sure yet, I want to uncover things hidden in the data.”
Either way, you benefit greatly from knowing the standard approaches. If you know the standard techniques for testing things, you can just pick the most appropriate one and have your solution in no time.
Knowing the general guidelines gives you a plan of attack, similar to how in blackjack you can bring a basic strategy card that tells you precisely when to hit and when not to.
But just knowing the standard methods of approach is not really what makes a great data scientist.
The 2 Key Traits
A great data scientist is so much more than that. In particular, they possess two distinctive traits. These two traits belonging to every successful data scientist are: 1) creative thinking and 2) logical thinking.
Don’t roll your eyes. Stay with me here.
To become a fantastic, all-around data scientist, you need to be able to do more than just apply standard techniques. Much more importantly, you should be able to create an approach to analyzing your data that is both logical and creative.
Essential Data Scientist Trait #1: Creative Thinking
Without creativity, there is just, frankly, nothing. You’ll look at the data and that’s just what you’re going to see: data.
Creativity allows you to warp information, twist it, manipulate it, transform it into something valuable. Often, as a data scientist, what you’re given is just a whole load of messy, complicated, noisy, raw data, and it is your job to find something valuable to present as a finding.
If someone gives you a huge mountain of raw data and numbers, you need to be able to go through all this data, work your magic and draw some sort of conclusion. For example, “I’ve discovered that panthers are much faster than sloths, but only on land. Here is the data to back that claim up, and here is my complete analysis report breaking down each aspect.”
And you honestly can’t do any of that without creativity. You need to be able to look at the data you’ve been given and play around with it enough that you are able to discover something new.
Essential Data Scientist Trait #2: Logical Thinking
But having just creativity isn’t enough. You need to reel in that creativity to keep it from going absolutely haywire. In order to reel it in, you need logic.
You need logic to process the information yourself and ask if the conclusions you’re drawing make sense.
Think the conclusions all the way through from A to Z. Does the information make sense given what you know about the data, the variables, the factors?
It is very important to remain skeptical of your findings and to test everything extremely thoroughly to ensure that you’re not finding a false-positive or simply seeing a pattern where there just isn’t one.
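One common way to check whether a difference you spotted could just be chance is a permutation test. The sketch below is only an illustration under my own assumptions, not a required method: the function name `permutation_p_value` and its parameters are made up for this example. It shuffles the group labels many times and counts how often a difference at least as large as the observed one shows up by accident.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate how often a difference in means this large arises by chance.

    Hypothetical helper for illustration: a two-sided permutation test
    on the absolute difference between the two group means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the group labels carry no meaning
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

A small p-value suggests the pattern is unlikely to be pure noise; a large one is a hint that you may be seeing something that isn’t there.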
Data Science without logic and creativity is just plain old data. Both logic and creativity are necessary when you want to properly process your data and evaluate your outcome.
>> With creativity and logic, you can find real reason and answers behind your data. You can look for things that are hidden, buried deep within the statistics, and find real gems. <<
Being able to identify and differentiate between what is important, and what is background noise, will help you get that ground-breaking solution, rather than being embarrassed when you find something that isn’t actually there.
Logic and Creativity Test
Challenge time! I’ll give you a problem, and I want you to walk through it with me as we follow the process of using our brains to be both logical and creative.
Key Question: Does the result you have in front of you make sense?
Let’s say you have a data set with one week of data for public transportation usage in a busy city.
1. Our Assumption: You would probably assume that there is an increase in public transportation usage in the mornings and evenings, given standard working hours.
2. Our Problem: You find that this normal spike didn’t occur on Friday. The data shows that on Friday, people used public transportation on a schedule similar to Saturday and Sunday.
Is it right to conclude that everybody works from home on Friday? Perhaps some people don’t like coming into work on Friday, or there’s a lot of car-sharing going on, so coworkers can head to a bar or the game after work on Friday without having to leave their vehicle at the office?
3. Using Logic: Sure, there are many different possible explanations that may pop into your head (there’s that creativity kicking in!), but logically, the first thing you should do is to look for the date of the Friday in question. You find the Friday in the data and see – oh, it was a public holiday on Friday!
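Here is a minimal sketch of that check. Everything in it is made up for illustration: the ridership numbers, the 25% threshold, the choice of commuting hours, and the `PUBLIC_HOLIDAYS` calendar are all assumptions, not real data. It flags weekdays whose rides are not concentrated in commuting hours and then looks the date up in the holiday calendar.

```python
from datetime import date, timedelta

# Hypothetical holiday calendar; a real analysis would use an official source.
PUBLIC_HOLIDAYS = {date(2024, 3, 29)}

# Assumed commuting hours (start of each hour, 24-hour clock).
PEAK_HOURS = (7, 8, 16, 17)


def peak_share(hourly_counts):
    """Fraction of a day's rides that fall in the assumed commuting hours."""
    total = sum(hourly_counts)
    if total == 0:
        return 0.0
    return sum(hourly_counts[h] for h in PEAK_HOURS) / total


def explain_flat_weekdays(week, threshold=0.25):
    """Find weekdays without a commuter spike and tag the holidays among them."""
    findings = {}
    for day, counts in week.items():
        is_weekday = day.weekday() < 5  # Monday is 0, Sunday is 6
        if is_weekday and peak_share(counts) < threshold:
            findings[day] = "public holiday" if day in PUBLIC_HOLIDAYS else "unexplained"
    return findings


# Made-up hourly profiles: commuter spikes on workdays, flat usage otherwise.
workday = [400 if h in PEAK_HOURS else 50 for h in range(24)]
flat_day = [80] * 24

monday = date(2024, 3, 25)
week = {monday + timedelta(days=i): workday for i in range(4)}      # Mon-Thu
week.update({monday + timedelta(days=i): flat_day for i in (4, 5, 6)})  # Fri-Sun

findings = explain_flat_weekdays(week)
```

With this toy data, only the Friday is flagged, and it comes back tagged as a public holiday. Had the date not been in the calendar, it would be tagged "unexplained", which is your cue to keep digging.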
It’s so key to be able to look at your results logically in a step-by-step process and to try to dispute your hypothesis by asking yourself if your conclusions make sense. Is this really a good, plausible explanation?
Even more important, however, is the ability to think creatively.
What if Friday hadn’t been a public holiday?
You’d now need to uncover the real reason for the drop on Friday, if there even is one. Perhaps it was by some weird random chance that many people decided to take a long weekend.
Improbable, yes. Impossible, no.
4. Using Creativity: Sit back and think about possible reasons for a little bit. If you find it hard to sit and be creative, take a break from your analysis instead. Maybe go outside, and take a nice refreshing walk. Perhaps you end up at the bus station to go get some groceries and clear your mind.
As you wait for your bus to arrive, you notice that the bus is late. You wait for the next bus, but it also doesn’t show. You get annoyed and think “this stupid public transportation system, they’re probably on strike again”, and then it hits you.
What if the train conductors were on strike on Friday and people weren’t able to get to work the regular way? What if people took other methods of transportation to get to work and avoided the trains completely?
When you get back to your analysis, you look at some other indicators and deduce there’s a pretty high likelihood that this is what happened. You look it up online, and see that there was indeed a strike on that specific Friday!
The ability to think creatively and come up with several reasons for something happening is very important.
You have to be constantly letting your mind develop its ideas, connecting the dots and figuring out the puzzle.
Have you ever watched or read a Sherlock Holmes story where Inspector Lestrade asks Sherlock, “Any theories?”, and Sherlock replies, “So far, I’ve got 7”?
That has got to be you.
In the first stage of analysis, you don’t know the answer to your data. But as the stages of analysis progress, creativity and logic play instrumental roles.
Being able to expand your field of probable answers, and then using logic and reason in addition to some analytical testing, will allow you to quickly weed out the wrong or inconsistent theories, and stick with the stronger ones.
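To make that concrete, here is a rough sketch of the idea using the Friday example: write each theory down together with a check it has to pass, then let the evidence eliminate the ones that don’t hold up. The indicator names, the numbers, and the thresholds are all hypothetical, picked only to illustrate the process.

```python
def weed_out(hypotheses, evidence):
    """Keep only the hypotheses whose check passes against the evidence."""
    return [name for name, check in hypotheses if check(evidence)]


# Made-up indicators for the Friday in question, each expressed as a
# ratio to a typical weekday (1.0 means "same as usual").
evidence = {
    "is_public_holiday": False,
    "train_rides_vs_typical": 0.15,
    "bus_rides_vs_typical": 1.30,
    "road_traffic_vs_typical": 1.20,
}

# Each theory is paired with what we would expect to see if it were true.
hypotheses = [
    ("public holiday", lambda e: e["is_public_holiday"]),
    (
        "everyone stayed home",
        lambda e: e["bus_rides_vs_typical"] < 0.8
        and e["road_traffic_vs_typical"] < 0.8,
    ),
    (
        "train strike",
        lambda e: e["train_rides_vs_typical"] < 0.5
        and (e["bus_rides_vs_typical"] > 1.0 or e["road_traffic_vs_typical"] > 1.0),
    ),
]

surviving = weed_out(hypotheses, evidence)
```

With these invented numbers, only the train strike theory survives. The creative part is coming up with the list of theories and what each would predict; the logical part is being honest about which ones the data actually supports.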
Always remember that the computer or the program is not responsible for doing the data analysis portion.
You are responsible for doing the data analysis. You’re the Data Scientist!
The computer’s job is to crunch numbers fast, and provide you with results and the visualizations that you want to see. Your job is to draw conclusions from those results, to find patterns in the visualization, and then to test to see if your hypothesis is correct.
Being able to use the computer effectively to calculate and show the most relevant results, and then being able to look at those results, and think – that’s what makes a truly great data scientist.
Want more free help on getting started with data science?
If becoming a data scientist sounds like something you’d like to do, and you’d like to learn more about how you can get started, check out my free “How To Get Started As A Data Scientist” Workshop.
We go through everything we’ve covered in this blog post in more detail, dispel some common misconceptions, and give you a roadmap and checklist of what you need to do to get started working as a Data Scientist.