Today, I want to lay out exactly how I learned Python programming and became a successful data scientist. I like to think of it as my Cinderella Story because I feel like through hardship and lots of confusion came an evolution of sorts, and I feel like a princess frolicking in the field of data science now.
My Not-So-Tragic Backstory
In high school, I really wanted to get into programming because of the typical reasons a 15-year-old boy wanted to get into programming:
- I thought it was super cool
- I wanted to look like the hackers in the movies
But I always found it a little bit too intimidating, and I was always too busy with stuff like League of Legends and Mario Kart Double Dash to ever fully commit to learning programming. So that never happened in high school.
When I got to college, I decided to take the programming language C because:
- I needed the credits
- the 15-year-old inside of me who really wanted to be a cool hacker was like ‘yes… finally’ – wait till the ladies get a whiff of this coolness
So it happened – I took my first class in C. It was a lot of programming, and I hadn’t developed a Programmer’s Mindset yet so it was hard for me to understand a lot of the concepts.
When I first started programming in C, I was constantly flooded with errors. I heard every rejection possible from C: “type identifier expected,” “segmentation fault,” “syntax error,” “undefined reference,” etc. It literally rained errors on my programming parade.
In C, you have to be crazy clear in your language. Let’s say you write something like “a = 5” and “b = 3” and you want the program to return “a + b”. You ask a seemingly simple question, “What is a + b?”, and C just absolutely won’t have any of it until you’ve declared what type a and b are, wrapped everything in a main function, and spelled out exactly how to print the result.
Another weird thing about C is that you have to set aside memory before reading any user input. It means if you want to ask for someone’s name, you have to reserve that space before you ask for the name so the program can save it in the right spot. You also can’t reserve too little space, because then C can crash (or quietly overwrite something else in memory). If you reserve too much space, C just hogs all that space for no use.
>> You can imagine it kind of like if you invite 2 friends over for a barbecue and you reserve 100 parking spaces for the occasion. That kind of just makes you a jerk. <<
But after the initial struggles of getting down the essentials, I kind of got into the groove of it all. I would sit at my computer with a nice beer and some classic rock music playing in the back, and would just write programs. (Yes, I did feel extremely awesome and smooth.)
After C, I took C++. Both are relatively lower-level programming languages compared to a language like Python. Low-level programming basically just means closer to the computer: less “human level” and easy to understand, and more on the processor level.
I developed a lot of interest in programming and the mighty power of being able to code anything I could come up with in my mind.
My Glass Slipper: Python
With C and C++ as my basis, I took a course in numerical simulations in mathematics, particularly focusing on non-linear partial differential equations. (“Nobody knows what that means.” – My Girlfriend, 2017)
Unfortunately, neither C nor C++ was a good fit for simulation development. I looked all over the internet to find the best programming language, and stumbled upon Python. I looooooove me some Python. In particular, Python was a great choice because of its widespread use in the scientific community and all of the external libraries (numpy and scipy for mathematical implementations and matplotlib for data visualization) that have already been prepared and make simulating with Python that much simpler.
I learned two things: Python was so easy to learn, and Python was so much fun to play with. Honestly speaking, learning Python blew my mind.
Development in Python went much faster than in C and C++, and the little details like a crazy number of brackets, semicolons, and static type declarations no longer cluttered my screen.
Python was fun to program in because I always felt like I was speaking to my computer in plain old English, rather than trying to imitate its own made-up language.
Example: Python vs. C++
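Here’s a tiny sketch of what I mean. The numbers and the greet helper are made up for illustration. It’s the “a + b” question and the “ask for a name” problem from my C days, written in Python. In C++, the addition alone would need an include, type declarations, a main function, semicolons, and a pair of braces, and in C I had to reserve the memory for the name up front.

```python
# Adding two numbers: no type declarations, no semicolons, no main().
a = 5
b = 3
total = a + b
print("a + b =", total)


# Storing a name: Python sizes the string for me, however long it is.
def greet(name):
    """Return a greeting for a name of any length (hypothetical helper)."""
    return f"Hello, {name}!"


print(greet("Cinderella"))
```

In a real script I’d get the name from input() instead of hard-coding it; either way, there’s no space to reserve beforehand.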
So even after my course finished, I stayed loyal and stuck with Python.
I’ve created my best and by far most intricate programs in Python, and it took me much less time to learn than its lower-level counterparts, C and C++. I can only recommend Python as a coding language for anyone and everyone looking to get into programming and data science.
Finding my Prince: Data Science
Don’t worry – now we’re getting into the Data Science part.
Long story short: After graduating from university, I decided I didn’t want to stay in Physics anymore, because the graduate options just weren’t very interesting. You either go deep into research, which has a bad reputation unless you’re really into that sort of thing, or you work in industry with optics or semiconductors, neither of which is a favorite of mine.
I decided I wanted to do something more practical, where I could quickly see the “fruits of my labor” or “programs of my labor,” if you will. (Hehe.)
I tried out an internship in the field of Internet of Things at a big consulting company, but that really was just not the right fit for me. I decided not to spend my time at large companies, and instead, find something that could be challenging and rewarding personally for me.
I stumbled completely by accident onto data analysis and soon after, got completely swept away in the incredible world of machine learning. I spent as much time as I could learning absolutely everything. All my free time was funneled into learning subjects related to data analytics: be that through books, videos or research papers (mostly a combination of all of them).
>> At one point though, learning wasn’t enough, I wanted to do something. <<
** Disclaimer: this part is a little nerdy. **
I looked around for free datasets to play with, finding Kaggle in the process, which by the way, is a great source for getting all sorts of random data including mushroom classifications, football games stats, nutrition facts of foods, and many more.
I also started playing with web scraping and APIs, because the free data available to download:
- wasn’t enough
- wasn’t the right data
- I wanted more
- I wanted my own data
I wanted the really good stuff like stock market and social media data, not mushroom classifications.
I played and tested so much, and it was really enjoyable (yes, I’m a little geeky), and I was feeling pretty confident about how good I was at it. So I set my sights on professional (aka getting paid $$$$) challenges.
I started freelancing online doing programming and data analytics (completely undercharging, by the way), and got a job as a data scientist for an investment firm that wanted to do machine learning with weather analysis.
It was a great starting project, and a lot of my job was focused on scraping, mining, and processing data (which, I won’t lie, is a huge part of data analysis) as well as doing things like building our database. I also worked on creating programs that could easily grab the desired data and return it in a wonderful format that could be fed directly into a machine learning algorithm.
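To give a flavor of that kind of program, here’s a minimal sketch and not the actual code from that job. The weather sample, the column names, and the load_features function are all made up for illustration. It takes raw CSV text, throws out incomplete records, and returns plain numeric rows, which is the shape most machine learning algorithms expect.

```python
import csv
import io

# Made-up sample of raw weather data; real data would come from a file or an API.
RAW_CSV = """date,temp_c,humidity,rain_mm
2017-01-01,3.5,81,0.0
2017-01-02,,79,2.4
2017-01-03,5.1,85,7.9
"""


def load_features(raw_text):
    """Parse CSV text into numeric feature rows, skipping incomplete records."""
    features = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        values = [row["temp_c"], row["humidity"], row["rain_mm"]]
        if any(v == "" for v in values):
            continue  # drop rows with missing measurements
        features.append([float(v) for v in values])
    return features


rows = load_features(RAW_CSV)
print(rows)  # two complete rows remain; the one with a missing temperature is dropped
```

Dropping incomplete rows is just the simplest choice for a sketch; depending on the data, filling in the gaps can be the better call.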
I wasn’t doing as much analysis as I’d wanted though. With my background in Physics, my strengths lie in pattern discovery and in understanding data and noise.
I didn’t want to just build our dataset: I wanted a huge dataset (> 3 gigabytes) that I could analyze and find cool/weird things in.
My Happily Ever After
After a while, I finally found the perfect company, and that’s where I am now, doing data analytics in esports. (#dreamjob much?)
I work remotely, which means I can wake up at 9am and have a nice cup of coffee before starting my work. I exercise, I make chili, and I play League of Legends during the day (my 15-year-old self would be so proud).
And all that – the freedom, the flexibility, the money – stems from learning Python.
The Moral: Always be trying to better yourself and your knowledge. If you don’t know something, figure it out, ask questions – find the answer. Keep asking yourself:
- Are there things that others in this field know how to do that I don’t?
- Are these things critical?
My path was in no way clear. It involved trying lots of things out, and sticking with and digging deeper into the fields I really liked is what brought me to where I am now.
>> What was most important was confidence, and I gained that primarily through playing around with data sets and testing, which I can only strongly recommend to aspiring data scientists. <<
The worst thing that can happen is that it doesn’t work out the first time. You learn why, you improve, and you try again.
It can only ever go up from there.
Want more free help on getting started with data science?
If becoming a data scientist sounds like something you’d like to do, and you’d like to learn more about how you can get started, check out my free “How To Get Started As A Data Scientist” Workshop.
We go through everything we’ve covered in this blog post in more detail, dispel some common misconceptions, and give you a roadmap and checklist of what you need to do to get started working as a Data Scientist.