and don't know where to start? Well, this video
is perfect for you, because I am going to talk about how I would learn data science in 2023 if
I were to start all over again. I did a version of this video in 2022, and I like to refresh
it every year with my updated thoughts based on the industry trends in the data science job
family. So in this video I'm going to summarize six steps that I would take to learn data science.
We'll talk about what research you need to do, then we'll jump into what skill set
you need to learn, followed by what you need to do after you're done with all of that. As we
go through the steps, I suggest you take notes. I'm going to link a Notion template below,
which you can use when researching different data science roles and the data science job
family, and for creating your study plan.

The number one thing that I would do to become a data
scientist in 2023 is research the role. When I started to learn data science, I
heard that data science is the sexiest job family, thanks to the Harvard Business Review, and
I jumped in. I wish I had done the research to understand the data science job family a
little bit more, to understand what the job family encompasses and what the different options
in it are, before I jumped into one. In 2023 there are many roles within the data science job
family, and I'm going to list a few of them: data engineer, machine learning engineer, data
scientist, product data scientist, and data analyst, just to name a few. So if you are just
starting out and don't know anything about these job families, I would suggest that you read up
on each of these roles and try to understand what they do before jumping into any of the roles
in the data science job family, because this step will give you a really good idea of what the
expectations of the role are and what you will enjoy doing.

So let's say you have decided that you want to become
a data scientist generalist. You can think of it like a full-stack engineer who does back end,
front end, the middle, and everything in between. You can think of it as a full-stack data
scientist who does pretty much everything. They know statistics and machine learning, they know
coding, and they can take a business problem and apply data science to solve it and add value to
the business. So let's say you figured out that the data scientist role is what you want to do.
You're not done researching yet. What I would suggest you do is go read the job descriptions for
the data scientist role at the different companies that you are interested in. As I mentioned in
some of my previous videos, data scientist is a slightly difficult job family to be in,
because the definition of a data scientist varies from company to company. This
is why you need to pay a lot of attention to this step: your study plan, your roadmap, is going
to look different depending on what companies and what specific roles you're trying to target.
Let's say Amazon is the company that you want to target, and you want to become a data scientist
there. You will go to the job descriptions for the open data scientist roles at Amazon and
try to understand what skill sets are required. Second, you will go to LinkedIn, look at people
who currently work or have worked as a data scientist at Amazon, and look
at their educational background (what did they study?) and then the types of projects
they've worked on as data scientists. In doing this research we're
basically already doing data science, specifically statistics: we're collecting a sample of
people and trying to understand what their educational history is and what type
of projects they do. This is also a good time to look at whether there are specific certificates
these people have, or specific degree programs or boot camps these people
took that helped them. This will get you a lot of information.

So let's say you've done your research and you're ready for the third step. The third thing I
would do is learn the fundamentals of data science. You probably were not expecting
to hear that, because a lot of the other advice you have heard tells you to
start coding. I strongly believe that in order to be a good data scientist, you
need to have solid theoretical knowledge of statistics and machine learning before you get
to the coding part, and there's a reason why: coding languages are a way to apply data science;
they are not the data science itself. You can ask any data scientist who is currently working in
the industry or going to school what data science is, and they will tell you data
science is pretty much statistics, domain knowledge, and machine learning knowledge. That's why
it's important for you to build those fundamentals before you start getting into coding. Now, one
thing I would say here: if you hate coding, then I would not pursue data science, because data
science does require coding. And if you have never coded in your life, I would suggest
you try out coding first before you jump into the theory. But theory is so, so important, and it
builds the foundation for data science, so I would suggest you learn statistics and
the machine learning fundamentals before you jump into coding. We're doing this step,
but we're not going to go too deep; we're going to stay at a high level. You're going to try to
understand what statistics actually is and what machine learning
actually is. Before you jump into the next part, if somebody comes and asks you what the
difference between linear and logistic regression is, you should be able to explain it. That's
the level of knowledge I suggest you have before you jump into the coding part. In the coding
part, when you apply the knowledge, you will get to learn statistics and
machine learning much more deeply, but for the initial learning period I would suggest sticking
to the fundamentals: statistics, mathematics, and machine learning. These are the three areas
that I would suggest building your fundamentals on. Here is how I would approach it. For math, I
would get very comfortable with linear algebra. For statistics, I would get pretty comfortable
with probability distributions, hypothesis testing, and Bayesian versus frequentist; those are
the basics in statistics. Third, I would suggest you go into machine learning
and try to understand what the types of machine learning are (supervised,
unsupervised, and reinforcement), and then within each, try to understand what regression is,
what logistic regression is, what classification is, and what a decision tree is. We're not going
too deep, but we're understanding it well enough that we can explain theoretically what they are,
because we're going to come back to it again.

Now a word from our sponsor, Simplilearn. If you're trying to
start your career in data science and are looking for a structured program, then Simplilearn's
data science bootcamp might be a good fit for you. The program is developed in partnership with
Caltech, which is ranked number nine in the US and in my opinion gives it credibility. It's
a six-month, cohort-based bootcamp with 25-plus hands-on projects. Going through the curriculum,
I really like that it starts with building a foundation in statistics and machine learning and
then jumps into coding and teaches Python, SQL, and R, with hands-on experience in three
different domains. I also really like that it focuses on interview prep, which is much needed
because we all know data science interviews are not easy. In this program you will learn from
global data science faculty who have a combined 40 years of experience. The cohort is starting
soon and has limited seats. I'm linking the bootcamp below; check it out, it might be a good fit
for you.

Now back to the video. So let's say you've built your theory and you have a solid foundation in
statistics and machine learning.
Now it's time for you to go to the next step, which is learn to code. There are many languages
you can learn for data science. For simplicity's sake, I would suggest starting with SQL
and Python. I personally started with SQL and R. R has a very steep learning curve and is not as
intuitive as Python, but R has a lot of ready-to-go statistical methods that you can
use for a very quick analysis. But R cannot be productionized, whereas Python
can live in a production environment. I would suggest you start with Python and SQL. Python
is super intuitive, and across the industry I have seen Python taking off because
it's easy to understand for everybody involved in a project, including software engineers and
data scientists.

For learning SQL, you can go with any learning resource;
it's pretty straightforward. You just need to understand how to join different data
sets (inner, right, left, and self joins, which is a weak point for me), how to do subqueries,
and how to do window functions. For Python, there are a lot of courses out there.
Here I would like you to pay special attention and focus on learning Python that is specific
to data analysis. For example, there are several libraries focused on data science,
data analysis, and machine learning, such as pandas, NumPy, scikit-learn, and Matplotlib. Just be
mindful when you're picking your learning resource for Python: make sure it has a data science
focus, because there's a ton of learning material out there aimed at software engineers and
non-software engineers alike. I would suggest you focus on Python that is targeted toward
data science.

At this point you have a really good understanding of the fundamentals of data science,
which are statistics, machine learning, and math, and you have coding knowledge, which is SQL and
Python. Now it's time to apply the skills you have learned so far and turn them into a project.
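
To make the SQL topics mentioned above (joins, subqueries, window functions) a bit more concrete, here is a minimal sketch that runs one of each from Python using the built-in sqlite3 module. The customers and orders tables and every value in them are made up for illustration, and the window-function query assumes the SQLite bundled with your Python is version 3.25 or newer.

```python
import sqlite3

# In-memory database with two tiny, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 30.0), (3, 2, 10.0);
""")

# LEFT JOIN: keep every customer, even those with no orders.
orders_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.id) AS n_orders
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.id
""").fetchall()

# Subquery: orders whose amount is above the overall average.
above_average = conn.execute("""
    SELECT id, amount
    FROM orders
    WHERE amount > (SELECT AVG(amount) FROM orders)
    ORDER BY id
""").fetchall()

# Window function: running total of spend per customer, in order-id order.
running_totals = conn.execute("""
    SELECT customer_id, id,
           SUM(amount) OVER (PARTITION BY customer_id ORDER BY id) AS running_total
    FROM orders
    ORDER BY customer_id, id
""").fetchall()

print(orders_per_customer)
print(above_average)
print(running_totals)
conn.close()
```

If you swap the LEFT JOIN for a plain JOIN, the customer with no orders drops out of the first result, which is a quick way to see the difference between the join types for yourself.
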
There is a possibility that while you were learning all those things, Python and SQL,
through whatever learning resource you used, you already did some projects, so you have
some hands-on experience. But this step is specifically to build your portfolio and to get
you more hands-on knowledge of how to do things. Start building your project portfolio, and
remember the first step, or the second step, where we were looking at different people
on LinkedIn who are working at your target company in that role. If you have already written down
what types of projects they do in their role as a data scientist, this will give you a
very good idea of what types of projects to target and what focus areas you need to have
in your project portfolio. If you're looking for data, there are a lot of free resources
available with tons of data and problem sets that you can use to build your
project portfolio. To list a few: Kaggle, Google Dataset Search, and the US Census Bureau. And you don't
have to be limited to the data that is available on these platforms. You can make your
own data. For example, you can look at your purchase history on your credit card, download
that data, turn it into a data science problem, and do a project on it, identifying your
purchase trends: for example, how likely are you to buy a coffee if it's raining? What I'm trying
to say here is that you can look at different data sets, make your own problem, and build
projects around it. Build at least five projects. Get your data from Kaggle or Google Dataset
Search, or make your own data. Target a domain that you are interested in, and also a domain
you want to get into. For example, if you want to be a data scientist in e-commerce, then you
would pick a data set that is related to that and solve a problem that is related to e-commerce.

I would also suggest, if you have the option, building an online project portfolio. Link it on
your LinkedIn, link it on your resume, and build a GitHub portfolio. This is optional. I
personally didn't do a GitHub portfolio, but if I were to do it again without any experience, I
would build a GitHub portfolio so recruiters can look at it and have additional information on my
skill set that they probably don't have on other candidates who don't have a GitHub portfolio.

The
reason you're trying to become a data scientist is to get a job. If you are doing it just
because you're curious, that's great, but most people who are trying to become a data scientist
want to get a job. So the sixth step that I would recommend is to prepare for interviews.
The reason I say this is that a lot of people discount how much work interviewing is. Knowing
the skill is different from doing it in an interview setting, where you are under pressure and
have to answer in a time-constrained manner. I would start practicing on a platform like LeetCode
or StrataScratch (I'm going to link them below); that is for SQL and Python. Then, for your
fundamentals and your theoretical knowledge, I would start mock interviewing and practicing
with a friend. Have them ask you questions, so that you are ready for the interview itself.
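
As one example of a fundamentals question a friend might ask in a mock interview, the linear-versus-logistic-regression difference mentioned earlier can be sketched in plain Python. This is only an illustration under stated assumptions: the study-hours data is invented, the helper names (fit_linear, fit_logistic, sigmoid) are hypothetical, and the logistic fit uses simple gradient descent rather than what a library such as scikit-learn would do internally.

```python
import math

# Hypothetical toy data: hours studied by six students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 55.0, 61.0, 64.0, 70.0, 73.0]  # continuous target
passed = [0, 0, 1, 0, 1, 1]                    # binary target

def fit_linear(x, y):
    """Least-squares line for one feature: y ~ intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, lr=0.1, steps=5000):
    """Gradient descent on log loss: P(y = 1) = sigmoid(intercept + slope * x)."""
    intercept, slope = 0.0, 0.0
    n = len(x)
    for _ in range(steps):
        errors = [sigmoid(intercept + slope * a) - b for a, b in zip(x, y)]
        intercept -= lr * sum(errors) / n
        slope -= lr * sum(e * a for e, a in zip(errors, x)) / n
    return intercept, slope

# Linear regression predicts a number, and nothing bounds that number.
lin_intercept, lin_slope = fit_linear(hours, scores)
predicted_score = lin_intercept + lin_slope * 20.0

# Logistic regression predicts a probability, always between 0 and 1.
log_intercept, log_slope = fit_logistic(hours, passed)
p_pass_1h = sigmoid(log_intercept + log_slope * 1.0)
p_pass_6h = sigmoid(log_intercept + log_slope * 6.0)

print("predicted score after 20 hours:", predicted_score)
print("P(pass) after 1 hour:", p_pass_1h)
print("P(pass) after 6 hours:", p_pass_6h)
```

The short version you could say out loud: linear regression fits a line to predict a continuous value, while logistic regression passes that same kind of line through a sigmoid to predict the probability of a yes-or-no outcome.
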
I've created a detailed video on how I prepare for interviews and what my process is. You can go
and watch it here; I'm going to link it somewhere here. It goes into a lot more detail than I'm
going into in this video. Hopefully by the end of this process you are able to go into an
interview and perform well enough to get a job offer. That being said, I do want to mention that
over the last few years, given that there's so much interest in the job family, it has become
more and more competitive, so don't be discouraged if you don't get a job on the first try. So
this is the process that I would use if I were to learn data science all over again. Did any of
these steps resonate with you or surprise you? Let me know in the comments. With that, thank you
so much for watching this video, and I will see you in a different one. Have a beautiful day. Bye!
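
The coffee-and-rain project idea from the portfolio step can be sketched in a few lines of Python. Everything here is an assumption for illustration: the daily log is invented (in a real project it would come from your own card statement joined with a weather lookup), and coffee_rate is a hypothetical helper name. With only a handful of days, the numbers mean nothing statistically; the point is the shape of the question.

```python
# Hypothetical daily log: was it raining, and did I buy a coffee that day?
days = [
    {"date": "2023-03-01", "raining": True,  "bought_coffee": True},
    {"date": "2023-03-02", "raining": True,  "bought_coffee": True},
    {"date": "2023-03-03", "raining": True,  "bought_coffee": False},
    {"date": "2023-03-04", "raining": True,  "bought_coffee": True},
    {"date": "2023-03-05", "raining": False, "bought_coffee": False},
    {"date": "2023-03-06", "raining": False, "bought_coffee": True},
    {"date": "2023-03-07", "raining": False, "bought_coffee": False},
    {"date": "2023-03-08", "raining": False, "bought_coffee": False},
]

def coffee_rate(records, raining):
    """Share of days with a coffee purchase, among days matching the rain flag."""
    subset = [r for r in records if r["raining"] == raining]
    if not subset:
        return None  # no days of this kind, so the rate is undefined
    return sum(r["bought_coffee"] for r in subset) / len(subset)

rate_rainy = coffee_rate(days, raining=True)   # estimate of P(coffee | rain)
rate_dry = coffee_rate(days, raining=False)    # estimate of P(coffee | no rain)

print("coffee rate on rainy days:", rate_rainy)
print("coffee rate on dry days:", rate_dry)
```

A natural next step for a portfolio project would be to collect more days and run a hypothesis test on the difference between the two rates, which ties back to the statistics fundamentals.
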
