Data science has boomed over the past few years, and more and more people are looking to enter the field. There’s currently an interesting debate about how much you actually need to know before you can work for a company as a junior data scientist. Many say that you should have a deep, fundamental understanding of mathematics and statistics, with at least an undergraduate degree in statistics and preferably a graduate degree and at least one paper under your name. To these people, the journey to becoming a competent data scientist is one that should take years and can’t be done through simple online courses. However, more and more data scientists are now self-educated, and companies are starting to value those who have their own projects under their belt. In this article, I’ll list some ways to get started with your own data science projects that you can put on your CV and that companies will value.
Learn from Kaggle Projects
Kaggle is by far the biggest platform for data science projects. Every day, people upload new datasets, and new competitions take place where data scientists compete to build the best model for a certain dataset. There’s a lot of value both in competing in Kaggle competitions and in reading other people’s Kernels and learning from their methods. Kaggle also has an active discussion forum, which is very helpful.
Read Data Science Blogs and Watch Videos
In the spirit of sharing information, many data scientists and machine learning experts write blogs where they share methods and tutorials for free. There are loads of data science tutorials on Medium.com, for example, and many of them are written by people working professionally as data scientists. Check out YouTube as well: just a simple search for ‘data science’ will return all kinds of helpful and clear tutorials.
Try Bootcamps and Online Learning Platforms
If you are willing to spend a little money, there are also plenty of courses on data science these days. Datacamp, Dataquest and Coursera offer data science courses, along with other platforms like Udemy. Most of these cost less than $100, a tiny fraction of what a college education would cost you. Of course, you can reasonably argue that they’re nowhere near as complete as a college education, but they do definitely have significant value.
Check Out Alibaba’s Big Data Courses
In addition to the resources mentioned above, Alibaba also has numerous courses on big data that are available here. The beginner courses package, “Data development programming languages,” is an excellent tutorial if you’re just getting started and want to learn some of the main tools of the trade (Python and MySQL). Once you’ve completed that, you can move on to more advanced courses. Upon completing all the courses you should have a very solid foundation in data science.
Don’t Be Afraid to Experiment
Once you’ve got a handle on the fundamentals, try to think of what would be an exciting project for you and go do it. Don’t be afraid to fail! See if you can scrape some interesting data that no one has tried to analyse before, and then train a model on it. There is an absolute plethora of data out there on the internet, and much of it is freely available. You never know what you’ll discover by picking some certain area, gathering data, and analysing it.
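To make the "gather data, then train a model" step concrete, here is a minimal sketch in plain Python. The numbers are made up for illustration (imagine they came from listings you collected yourself), and the helper names fit_line and predict are hypothetical, not part of any library. It fits a straight line by ordinary least squares, which is about the simplest model you can train.

```python
# Hypothetical, hand-made data standing in for something you gathered
# yourself, e.g. apartment size (square metres) vs. monthly rent.
sizes = [30.0, 45.0, 60.0, 75.0, 90.0]
rents = [600.0, 850.0, 1100.0, 1350.0, 1600.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(sizes, rents)
estimated_rent = predict(model, 50.0)  # estimate for a 50 square metre flat
```

In a real project you would more likely reach for a library such as scikit-learn and hold back part of the data to test the model, but the workflow is the same: collect, fit, predict, and check.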
What to Do If You Have Limited Computing Power
Many data science projects, and machine learning projects in particular, require a GPU, and that’s expensive. However, you don’t necessarily need a state-of-the-art machine to do your own machine learning projects. With Alibaba Cloud, and especially its Elastic Compute Service, it’s possible to rent a machine by the hour and pay only for the time you use. Even if you use several hours’ worth of compute time, it will usually be far cheaper than buying your own GPU, and it’s easy and convenient.
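A quick way to check the rent-versus-buy argument for your own situation is to work out the break-even point. The sketch below uses placeholder figures, not real Alibaba Cloud or hardware prices, and the function names are hypothetical; plug in the current numbers yourself.

```python
def rental_cost(hours, hourly_rate):
    """Total cost of renting a machine for a given number of hours."""
    return hours * hourly_rate


def break_even_hours(purchase_price, hourly_rate):
    """Hours of rental at which renting costs as much as buying."""
    return purchase_price / hourly_rate


# Placeholder assumptions, NOT real prices: replace with current figures.
gpu_purchase_price = 1000.0  # assumed one-off cost of buying a GPU
gpu_hourly_rate = 1.0        # assumed hourly cost of a rented GPU machine

project_cost = rental_cost(20, gpu_hourly_rate)
hours_to_break_even = break_even_hours(gpu_purchase_price, gpu_hourly_rate)
```

Under these assumed figures, a 20-hour project costs a small fraction of the purchase price, and buying only pays off after many hundreds of hours of use.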
About the Author
Moises Alicante is a part-time editor for OutwitTrade, a product review publication.